PostgreSQL Sink Connector for Confluent Cloud
The Kafka Connect PostgreSQL Sink connector for Confluent Cloud moves data
from an Apache Kafka® topic to a PostgreSQL database. It writes data from a topic in
Kafka to a table in the specified PostgreSQL database. Table auto-creation and
limited auto-evolution are supported.
Important
After this connector becomes generally available, Confluent Cloud Enterprise customers will need to
contact their Confluent Account Executive for more information about using
this connector.
Features
The PostgreSQL sink connector provides the following features:
- Modes: This connector inserts and upserts Kafka records into a PostgreSQL database.
- Schemas: The connector supports Avro, JSON Schema, and Protobuf input data formats. Schema Registry must be enabled to use a Schema Registry-based format.
- Table and column auto-creation: auto.create and auto.evolve are supported. If tables or columns are missing, they can be created automatically. Table names are created based on Kafka topic names.
- Primary key support: Supported PK modes are kafka, none, and record_value. These are used in conjunction with the PK Fields property.
You can manage your full-service connector using the Confluent Cloud API. For details, see the Confluent Cloud API documentation.
Refer to Cloud connector limitations for
additional information.
Caution
Preview connectors are not currently supported and are not recommended for
production use.
Quick Start
Use this quick start to get up and running with the Confluent Cloud PostgreSQL sink
connector. The quick start provides the basics of selecting the connector and
configuring it to stream events to a PostgreSQL database.
- Prerequisites
- Authorized access to a Confluent Cloud cluster on Amazon Web Services (AWS), Microsoft Azure (Azure), or Google Cloud Platform (GCP).
- Authorized access to a PostgreSQL database.
- The database and Kafka cluster should be in the same region. If you use a different region, be aware that you may incur additional data transfer charges.
- Public inbound traffic access (0.0.0.0/0) must be allowed for the preview version of this connector.
- The Confluent Cloud CLI installed and configured for the cluster. See Install the Confluent Cloud CLI.
- Schema Registry must be enabled to use a Schema Registry-based format (for example, Avro, JSON_SR (JSON Schema), or Protobuf).
- Kafka cluster credentials. You can use one of the following ways to get credentials:
- Create a Confluent Cloud API key and secret. To create a key and secret, go to Kafka API keys in your cluster, or autogenerate the API key and secret directly in the UI when setting up the connector.
- Create a Confluent Cloud service account for the connector.
Using the Confluent Cloud GUI
Step 2: Add a connector.
Click Connectors. If you already have connectors in your cluster, click Add connector.
Step 3: Select your connector.
Click the PostgreSQL Sink connector icon.
Step 4: Set up the connection.
Note
- Make sure you have all your prerequisites completed.
- An asterisk ( * ) designates a required entry.
Complete the following and click Continue.
- Select one or more topics.
- Enter a Connector Name.
- Select an Input message format (data coming from the Kafka topic): AVRO, JSON_SR (JSON Schema), or PROTOBUF. A valid schema must be available in Schema Registry to use a schema-based message format.
- Enter your Kafka Cluster credentials. The credentials are either the API key and secret or the service account API key and secret.
- Enter your PostgreSQL database connection details.
- Select one of the following insert modes:
  - INSERT: Use the standard INSERT row function. An error occurs if the row already exists in the table.
  - UPSERT: This mode is similar to INSERT. However, if the row already exists, the UPSERT function overwrites column values with the new values provided.
- Enter a Table name format. This is a format string to use for the destination table name. This property may contain ${topic} as a placeholder for the originating topic name. For example, kafka_${topic} for the topic orders maps to the table name kafka_orders.
- Select your Database timezone.
- Select whether to automatically create a table if none exists.
- Select whether to automatically create columns in the table if a Kafka record contains fields that have no matching column.
- Enter the maximum size for batched records. A typical entry here is 1000.
- Enter the maximum number of tasks the connector can run. See Confluent Cloud connector limitations for additional task information.
- Select a PK mode. Supported modes are listed below:
  - kafka: Kafka coordinates are used as the primary key. Must be used with the PK Fields.
  - none: No primary keys used.
  - record_value: Fields from the Kafka record value are used. This must be a struct type.
- Enter the PK fields values. This is a list of comma-separated primary key field names. The runtime interpretation of this property depends on the PK mode selected. Options are listed below:
  - kafka: Must be three values representing the Kafka coordinates. If left empty, the coordinates default to __connect_topic,__connect_partition,__connect_offset.
  - none: PK Fields not used.
  - record_value: Used to extract fields from the record value. If left empty, all fields from the value struct are used.
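The Table name format substitution described above can be sketched in a few lines of Python. This is only an illustration of the placeholder behavior, not the connector's own code; the helper name `table_name` is hypothetical. It relies on the fact that Python's `string.Template` happens to use the same `${name}` placeholder syntax.

```python
from string import Template


def table_name(table_name_format: str, topic: str) -> str:
    """Expand the ${topic} placeholder in a table name format string.

    Hypothetical helper for illustration; the connector performs its own
    substitution internally.
    """
    return Template(table_name_format).substitute(topic=topic)


# The example from the text: format "kafka_${topic}" with topic "orders".
print(table_name("kafka_${topic}", "orders"))  # kafka_orders

# A format without a placeholder sends every topic to the same table name.
print(table_name("all_events", "orders"))  # all_events
```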
Configuration properties that are not shown in the Confluent Cloud UI use the default
values. For default values and property definitions, see
JDBC Sink Connector Configuration Properties.
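To make the difference between the INSERT and UPSERT insert modes from Step 4 concrete, the following Python sketch builds the kind of SQL statement each mode corresponds to in PostgreSQL. It is an illustration under assumptions: the function name `build_statement` is hypothetical, identifiers are left unquoted for readability, and the exact SQL the connector generates may differ. The UPSERT form uses PostgreSQL's `INSERT ... ON CONFLICT ... DO UPDATE` syntax.

```python
def build_statement(table, columns, pk_columns, insert_mode):
    """Return a parameterized PostgreSQL statement for the given insert mode.

    Hypothetical helper: identifiers are not quoted or validated here,
    so this is for illustration only.
    """
    cols = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {table} ({cols}) VALUES ({placeholders})"

    if insert_mode == "INSERT":
        # A plain INSERT fails if a row with the same key already exists.
        return sql

    if insert_mode == "UPSERT":
        if not pk_columns:
            raise ValueError("UPSERT needs at least one primary key column")
        conflict = ", ".join(pk_columns)
        non_key = [c for c in columns if c not in pk_columns]
        if not non_key:
            # Nothing to overwrite when every column is part of the key.
            return f"{sql} ON CONFLICT ({conflict}) DO NOTHING"
        # Overwrite non-key columns with the incoming (EXCLUDED) values.
        updates = ", ".join(f"{c} = EXCLUDED.{c}" for c in non_key)
        return f"{sql} ON CONFLICT ({conflict}) DO UPDATE SET {updates}"

    raise ValueError(f"unsupported insert mode: {insert_mode}")


print(build_statement("kafka_orders", ["user_id", "stars"], ["user_id"], "UPSERT"))
```

With UPSERT, a primary key is needed so that PostgreSQL can detect the conflicting row, which is why the sketch rejects an empty key list.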
Step 5: Launch the connector.
Verify the connection details and click Launch.
Step 6: Check the connector status.
The status for the connector should go from Provisioning to Running.
Step 7: Check the results in PostgreSQL.
Verify that new records are being added to the PostgreSQL database.
Tip
When you launch a connector, a Dead Letter Queue topic is automatically created. See Dead Letter Queue for details.
See also
For an example that shows fully-managed Confluent Cloud connectors in action with Confluent Cloud ksqlDB, see the Cloud ETL Demo. This example also shows how to use Confluent Cloud CLI to manage your resources in Confluent Cloud.
Using the Confluent Cloud CLI
Complete the following steps to set up and run the connector using the Confluent Cloud CLI.
Step 1: List the available connectors.
Enter the following command to list available connectors:
ccloud connector-catalog list
Step 2: Show the required connector configuration properties.
Enter the following command to show the required connector properties:
ccloud connector-catalog describe <connector-catalog-name>
For example:
ccloud connector-catalog describe PostgresSink
Example output:
Following are the required configs:
connector.class: PostgresSink
input.data.format
name
kafka.api.key
kafka.api.secret
connection.host
connection.port
connection.user
connection.password
db.name
tasks.max
topics
Step 3: Create the connector configuration file.
Create a JSON file that contains the connector configuration properties. The following example shows required and optional connector properties:
{
"connector.class": "PostgresSink",
"name": "PostgresSinkConnector_0",
"input.data.format": "AVRO",
"kafka.api.key": "****************",
"kafka.api.secret": "****************************************************************",
"connection.host": "database-4.<host-id>.us-east-2.rds.amazonaws.com",
"connection.port": "5432",
"connection.user": "postgres",
"connection.password": "**************",
"db.name": "postgres",
"topics": "postgresql_ratings",
"insert.mode": "UPSERT",
"db.timezone": "UTC",
"auto.create": "true",
"auto.evolve": "true",
"pk.mode": "record_value",
"pk.fields": "user_id",
"tasks.max": "1"
}
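If you prefer to generate the configuration file rather than write it by hand, a small script can check for the required properties listed in Step 2 before writing the JSON. The following is a minimal sketch: the helper names `missing_required` and `write_config` are hypothetical, and the angle-bracket values are placeholders you would replace with your own credentials and host.

```python
import json

# Required properties, as listed in the Step 2 example output.
REQUIRED = [
    "connector.class", "input.data.format", "name",
    "kafka.api.key", "kafka.api.secret",
    "connection.host", "connection.port",
    "connection.user", "connection.password",
    "db.name", "tasks.max", "topics",
]


def missing_required(config):
    """Return the required property names that are absent or empty."""
    return [key for key in REQUIRED if not config.get(key)]


def write_config(config, path):
    """Write the configuration to a JSON file after a basic check."""
    missing = missing_required(config)
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))
    with open(path, "w") as handle:
        json.dump(config, handle, indent=2)


# Placeholder values (assumptions); property names match the example above.
example_config = {
    "connector.class": "PostgresSink",
    "name": "PostgresSinkConnector_0",
    "input.data.format": "AVRO",
    "kafka.api.key": "<my-kafka-api-key>",
    "kafka.api.secret": "<my-kafka-api-secret>",
    "connection.host": "<my-database-host>",
    "connection.port": "5432",
    "connection.user": "postgres",
    "connection.password": "<my-database-password>",
    "db.name": "postgres",
    "topics": "postgresql_ratings",
    "insert.mode": "UPSERT",
    "db.timezone": "UTC",
    "auto.create": "true",
    "auto.evolve": "true",
    "pk.mode": "record_value",
    "pk.fields": "user_id",
    "tasks.max": "1",
}

# Usage: write_config(example_config, "postgresql-sink-config.json")
```

Note that all values, including the port and task count, are written as strings, matching the example configuration.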
Note the following property definitions:
- "connector.class": Identifies the connector plugin name.
- "name": Sets a name for your new connector.
- "input.data.format": Sets the input message format (data coming from the Kafka topic). Valid entries are AVRO, JSON_SR (JSON Schema), or PROTOBUF. You must have Confluent Cloud Schema Registry configured if using a schema-based message format.
- "topics": Identifies the topic name or a comma-separated list of topic names.
- "insert.mode": Enter one of the following modes:
  - INSERT: Use the standard INSERT row function. An error occurs if the row already exists in the table.
  - UPSERT: This mode is similar to INSERT. However, if the row already exists, the UPSERT function overwrites column values with the new values provided.
- "db.timezone": Name of the timezone the connector uses when inserting time-based values. Defaults to UTC.
- "auto.create" (tables) and "auto.evolve" (columns): (Optional) Sets whether to automatically create tables or columns if they are missing relative to the input record schema. If not entered in the configuration, both default to false.
- "pk.mode": Supported modes are listed below:
  - kafka: Kafka coordinates are used as the primary key. Must be used with the PK Fields.
  - none: No primary keys used.
  - record_value: Fields from the Kafka record value are used. This must be a struct type.
- "pk.fields": A list of comma-separated primary key field names. The runtime interpretation of this property depends on the pk.mode selected. Options are listed below:
  - kafka: Must be three values representing the Kafka coordinates. If left empty, the coordinates default to __connect_topic,__connect_partition,__connect_offset.
  - none: PK Fields not used.
  - record_value: Used to extract fields from the record value. If left empty, all fields from the value struct are used.
- "tasks.max": Maximum number of tasks the connector can run. See Confluent Cloud connector limitations for additional task information.
Configuration properties that are not shown in the Confluent Cloud UI use the default
values. For default values and property definitions, see
JDBC Sink Connector Configuration Properties.
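The way "pk.fields" is interpreted for each "pk.mode" can be summarized in a short Python sketch. This mirrors the rules described in the property definitions and is not the connector's implementation; the function name `resolve_pk_fields` is hypothetical, and rejecting field names that are absent from the record value is an assumption added for illustration.

```python
# Defaults for pk.mode "kafka", as stated in the property definitions.
KAFKA_DEFAULTS = ["__connect_topic", "__connect_partition", "__connect_offset"]


def resolve_pk_fields(pk_mode, pk_fields, value_fields):
    """Return the primary key column names for a pk.mode / pk.fields pair.

    pk_fields is the comma-separated string from the configuration;
    value_fields is the list of field names in the record value struct.
    """
    names = [name.strip() for name in pk_fields.split(",") if name.strip()]

    if pk_mode == "none":
        # pk.fields is not used; the table has no primary key.
        return []

    if pk_mode == "kafka":
        if not names:
            return list(KAFKA_DEFAULTS)
        if len(names) != 3:
            raise ValueError("kafka mode needs exactly three field names")
        return names

    if pk_mode == "record_value":
        if not names:
            # An empty pk.fields means all fields of the value struct.
            return list(value_fields)
        # Assumption for this sketch: unknown field names are an error.
        unknown = [name for name in names if name not in value_fields]
        if unknown:
            raise ValueError("fields not in record value: " + ", ".join(unknown))
        return names

    raise ValueError(f"unsupported pk.mode: {pk_mode}")


# The example configuration: pk.mode "record_value", pk.fields "user_id".
print(resolve_pk_fields("record_value", "user_id", ["user_id", "stars"]))  # ['user_id']
```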
Step 4: Load the configuration file and create the connector.
Enter the following command to load the configuration and start the connector:
ccloud connector create --config <file-name>.json
For example:
ccloud connector create --config postgresql-sink-config.json
Example output:
Created connector PostgresSinkConnector_0 lcc-ix4dl
Step 5: Check the connector status.
Enter the following command to check the connector status:
ccloud connector list
Example output:
ID | Name | Status | Type
+-----------+--------------------------+---------+------+
lcc-ix4dl | PostgresSinkConnector_0 | RUNNING | sink
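If you want to check the status from a script, the table shown in the example output can be parsed with a few lines of Python. This is a rough sketch that assumes the pipe-separated layout shown above; the CLI output format may differ between versions, and the function name `parse_connector_list` is hypothetical.

```python
def parse_connector_list(output):
    """Parse the pipe-separated connector table into a list of dicts.

    Assumes the four-column layout from the example output:
    ID | Name | Status | Type
    """
    rows = []
    for line in output.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Skip the separator line (no pipes) and the header row.
        if len(cells) != 4 or cells[0] == "ID":
            continue
        rows.append(dict(zip(["id", "name", "status", "type"], cells)))
    return rows


sample = """\
ID | Name | Status | Type
+-----------+--------------------------+---------+------+
lcc-ix4dl | PostgresSinkConnector_0 | RUNNING | sink
"""

for connector in parse_connector_list(sample):
    print(connector["name"], connector["status"])  # PostgresSinkConnector_0 RUNNING
```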
Step 6: Check the results in PostgreSQL.
Verify that new records are being added to the PostgreSQL database.
Tip
When you launch a connector, a Dead Letter Queue topic is automatically created. See Dead Letter Queue for details.
Next Steps
See also
For an example that shows fully-managed Confluent Cloud connectors in action with Confluent Cloud ksqlDB, see the Cloud ETL Demo. This example also shows how to use Confluent Cloud CLI to manage your resources in Confluent Cloud.