PostgreSQL Sink Connector for Confluent Cloud¶

The Kafka Connect PostgreSQL Sink connector for Confluent Cloud moves data from an Apache Kafka® topic to a PostgreSQL database. It writes data from a topic in Kafka to a table in the specified PostgreSQL database. Table auto-creation and limited auto-evolution are supported.

Important

After this connector becomes generally available, Confluent Cloud Enterprise customers will need to contact their Confluent Account Executive for more information about using this connector.

Features¶

The PostgreSQL sink connector provides the following features:

Modes: This connector inserts and upserts Kafka records into a PostgreSQL database.
Schemas: The connector supports Avro, JSON Schema, and Protobuf input data formats. Schema Registry must be enabled to use a Schema Registry-based format.
Table and column auto-creation: auto.create and auto.evolve are supported. If tables or columns are missing, they can be created automatically. Table names are created based on Kafka topic names.
Primary key support: Supported PK modes are kafka, none, and record_value. The PK mode is used in conjunction with the PK Fields property.

You can manage your full-service connector using the Confluent Cloud API. For details, see the Confluent Cloud API documentation.

Refer to Cloud connector limitations for additional information.

Caution

Preview connectors are not currently supported and are not recommended for production use.

Quick Start¶

Use this quick start to get up and running with the Confluent Cloud PostgreSQL sink connector. The quick start provides the basics of selecting the connector and configuring it to stream events to a PostgreSQL database.

Prerequisites

Authorized access to a Confluent Cloud cluster on Amazon Web Services (AWS), Microsoft Azure (Azure), or Google Cloud Platform (GCP).
Authorized access to a PostgreSQL database.
The database and Kafka cluster should be in the same region. If you use a different region, be aware that you may incur additional data transfer charges.
Public inbound traffic access (0.0.0.0/0) must be allowed for the preview version of this connector.
The Confluent Cloud CLI installed and configured for the cluster. See Install the Confluent Cloud CLI.
Schema Registry must be enabled to use a Schema Registry-based format (for example, Avro, JSON_SR (JSON Schema), or Protobuf).

Kafka cluster credentials. You can use one of the following ways to get credentials:
- Create a Confluent Cloud API key and secret. To create a key and secret, go to Kafka API keys in your cluster, or autogenerate the API key and secret directly in the UI when setting up the connector.
- Create a Confluent Cloud service account for the connector.

Using the Confluent Cloud GUI¶

Step 1: Launch your Confluent Cloud cluster.¶

See the Quick Start for Apache Kafka using Confluent Cloud for installation instructions.

Step 2: Add a connector.¶

Click Connectors. If you already have connectors in your cluster, click Add connector.

Step 3: Select your connector.¶

Click the PostgreSQL Sink connector icon.

Step 4: Set up the connection.¶

Note

Make sure you have all your prerequisites completed.
An asterisk ( * ) designates a required entry.

Complete the following and click Continue.

Select one or more topics.
Enter a Connector Name.
Select an Input message format (data coming from the Kafka topic): AVRO, JSON_SR (JSON Schema), or PROTOBUF. A valid schema must be available in Schema Registry to use a schema-based message format.
Enter your Kafka Cluster credentials. The credentials are either the API key and secret or the service account API key and secret.
Enter your PostgreSQL database connection details.
Select one of the following insert modes:
- INSERT: Use the standard INSERT row function. An error occurs if the row already exists in the table.
- UPSERT: This mode is similar to INSERT. However, if the row already exists, the UPSERT function overwrites column values with the new values provided.
Enter a Table name format. This is a format string to use for the destination table name. This property may contain ${topic} as a placeholder for the originating topic name. For example, kafka_${topic} for the topic orders maps to the table name kafka_orders.
Select your Database timezone.
Select whether to automatically create a table if none exists.
Select whether to automatically create columns in the table if they are missing relative to the Kafka record schema.
Enter the maximum size for batched records. A typical entry here is 1000.
Enter the maximum number of tasks the connector can run. See Confluent Cloud connector limitations for additional task information.
Select a PK mode. Supported modes are listed below:
- kafka: Kafka coordinates are used as the primary key. Must be used with the PK Fields property.
- none: No primary keys are used.
- record_value: Fields from the Kafka record value are used. This must be a struct type.
Enter the PK fields values. This is a list of comma-separated primary key field names. The runtime interpretation of this property depends on the PK mode selected. Options are listed below:
- kafka: Must be three values representing the Kafka coordinates. If left empty, the coordinates default to __connect_topic,__connect_partition,__connect_offset.
- none: PK Fields not used.
- record_value: Used to extract fields from the record value. If left empty, all fields from the value struct are used.
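
The Table name format substitution described in the list above can be sketched in a few lines of Python. This is only an illustration of the documented ${topic} placeholder behavior, not the connector's implementation; the helper name resolve_table_name is hypothetical.

```python
def resolve_table_name(table_name_format: str, topic: str) -> str:
    """Replace every ${topic} placeholder with the originating topic name."""
    return table_name_format.replace("${topic}", topic)


# The example from the text: the format kafka_${topic} applied to the topic orders.
print(resolve_table_name("kafka_${topic}", "orders"))  # prints "kafka_orders"
```

A format string without the placeholder resolves to the same table name for every topic, which is worth keeping in mind when more than one topic is selected.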

Configuration properties that are not shown in the Confluent Cloud UI use the default values. For default values and property definitions, see JDBC Sink Connector Configuration Properties.

Step 5: Launch the connector.¶

Verify the connection details and click Launch.

Step 6: Check the connector status.¶

The status for the connector should go from Provisioning to Running.

Step 7: Check the results in PostgreSQL.¶

Verify that new records are being added to the PostgreSQL database.

You can manage your full-service connector using the Confluent Cloud API. For details, see the Confluent Cloud API documentation.

Tip

When you launch a connector, a Dead Letter Queue topic is automatically created. See Dead Letter Queue for details.

Using the Confluent Cloud CLI¶

Complete the following steps to set up and run the connector using the Confluent Cloud CLI.

Note

Make sure you have all your prerequisites completed.

Step 1: List the available connectors.¶

Enter the following command to list available connectors:

ccloud connector-catalog list

Step 2: Show the required connector configuration properties.¶

Enter the following command to show the required connector properties:

ccloud connector-catalog describe <connector-catalog-name>

For example:

ccloud connector-catalog describe PostgresSink

Example output:

Following are the required configs:
connector.class: PostgresSink
input.data.format
name
kafka.api.key
kafka.api.secret
connection.host
connection.port
connection.user
connection.password
db.name
tasks.max
topics

Step 3: Create the connector configuration file.¶

Create a JSON file that contains the connector configuration properties. The following example shows required and optional connector properties:

{
  "connector.class": "PostgresSink",
  "name": "PostgresSinkConnector_0",
  "input.data.format": "AVRO",
  "kafka.api.key": "****************",
  "kafka.api.secret": "****************************************************************",
  "connection.host": "database-4.<host-id>.us-east-2.rds.amazonaws.com",
  "connection.port": "5432",
  "connection.user": "postgres",
  "connection.password": "**************",
  "db.name": "postgres",
  "topics": "postgresql_ratings",
  "insert.mode": "UPSERT",
  "db.timezone": "UTC",
  "auto.create": "true",
  "auto.evolve": "true",
  "pk.mode": "record_value",
  "pk.fields": "user_id",
  "tasks.max": "1"
}
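
Before loading a configuration file, it can help to confirm that none of the required properties is missing. The following Python sketch checks a configuration against the required properties listed in the example output of Step 2. The function names are hypothetical, and the check only looks for missing or empty values; it does not validate them against the connector.

```python
import json

# Required properties, copied from the example `describe` output in Step 2.
REQUIRED_PROPERTIES = (
    "connector.class",
    "input.data.format",
    "name",
    "kafka.api.key",
    "kafka.api.secret",
    "connection.host",
    "connection.port",
    "connection.user",
    "connection.password",
    "db.name",
    "tasks.max",
    "topics",
)


def missing_required(config: dict) -> list:
    """Return the required property names that are absent or empty."""
    return [key for key in REQUIRED_PROPERTIES if not config.get(key)]


def check_config_file(path: str) -> list:
    """Load a JSON configuration file and report missing required properties."""
    with open(path, encoding="utf-8") as handle:
        return missing_required(json.load(handle))


# An intentionally incomplete configuration, for illustration only.
incomplete = {
    "connector.class": "PostgresSink",
    "name": "PostgresSinkConnector_0",
    "tasks.max": "1",
}
print(missing_required(incomplete))
```

Calling check_config_file with the path of your own configuration file returns an empty list when every required property is present.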

Note the following property definitions:

"connector.class": Identifies the connector plugin name.
"name": Sets a name for your new connector.
"input.data.format": Sets the input message format (data coming from the Kafka topic). Valid entries are AVRO, JSON_SR (JSON Schema), or PROTOBUF. You must have Confluent Cloud Schema Registry configured if using a schema-based message format.
"topics": Identifies the topic name or a comma-separated list of topic names.

"insert.mode": Enter one of the following modes:
- INSERT: Use the standard INSERT row function. An error occurs if the row already exists in the table.
- UPSERT: This mode is similar to INSERT. However, if the row already exists, the UPSERT function overwrites column values with the new values provided.
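
To make the difference between the two insert modes concrete, the Python sketch below builds the kind of SQL statement each mode corresponds to. PostgreSQL expresses an upsert as INSERT ... ON CONFLICT ... DO UPDATE. The exact statements the connector issues are not shown in this document, so treat the output as an assumption; the function name and the stars column are hypothetical.

```python
def build_statement(table, columns, pk_columns, insert_mode):
    """Build an illustrative parameterized statement for the given insert mode."""
    column_list = ", ".join(f'"{name}"' for name in columns)
    placeholders = ", ".join("?" for _ in columns)
    insert = f'INSERT INTO "{table}" ({column_list}) VALUES ({placeholders})'

    if insert_mode == "INSERT":
        # A plain INSERT fails if a row with the same primary key already exists.
        return insert
    if insert_mode == "UPSERT":
        if not pk_columns:
            raise ValueError("UPSERT needs at least one primary key column")
        conflict_target = ", ".join(f'"{name}"' for name in pk_columns)
        non_key = [name for name in columns if name not in pk_columns]
        if not non_key:
            # Nothing to overwrite when every column is part of the key.
            return f"{insert} ON CONFLICT ({conflict_target}) DO NOTHING"
        # EXCLUDED refers to the row that was proposed for insertion.
        updates = ", ".join(f'"{name}" = EXCLUDED."{name}"' for name in non_key)
        return f"{insert} ON CONFLICT ({conflict_target}) DO UPDATE SET {updates}"
    raise ValueError(f"unsupported insert mode: {insert_mode}")


print(build_statement("postgresql_ratings", ["user_id", "stars"], ["user_id"], "UPSERT"))
```

The UPSERT branch needs a conflict target, which is why this sketch rejects UPSERT when no primary key columns are given.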

"db.timezone": Name of the timezone the connector uses when inserting time-based values. Defaults to UTC.
"auto.create" (tables) and "auto.evolve" (columns): (Optional) Sets whether to automatically create tables or columns if they are missing relative to the input record schema. If not entered in the configuration, both default to false.

"pk.mode": Supported modes are listed below:
- kafka: Kafka coordinates are used as the primary key. Must be used with the "pk.fields" property.
- none: No primary keys are used.
- record_value: Fields from the Kafka record value are used. This must be a struct type.

"pk.fields": A list of comma-separated primary key field names. The runtime interpretation of this property depends on the pk.mode selected. Options are listed below:
- kafka: Must be three values representing the Kafka coordinates. If left empty, the coordinates default to __connect_topic,__connect_partition,__connect_offset.
- none: PK Fields not used.
- record_value: Used to extract fields from the record value. If left empty, all fields from the value struct are used.
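
The way "pk.fields" is interpreted under each "pk.mode" can be summarized in a short Python sketch. It follows the rules in the list above. The function name is hypothetical, and the error raised when a requested field is absent from the record value is an assumption for illustration, not documented connector behavior.

```python
KAFKA_COORDINATE_DEFAULTS = ["__connect_topic", "__connect_partition", "__connect_offset"]


def resolve_pk_fields(pk_mode, pk_fields, value_fields):
    """Return the primary key column names for a pk.mode / pk.fields pair.

    pk_fields is the comma-separated string from the configuration;
    value_fields lists the field names of the record value struct.
    """
    names = [name.strip() for name in pk_fields.split(",") if name.strip()]

    if pk_mode == "none":
        return []  # pk.fields is not used
    if pk_mode == "kafka":
        if not names:
            return list(KAFKA_COORDINATE_DEFAULTS)
        if len(names) != 3:
            raise ValueError("kafka mode needs exactly three column names")
        return names
    if pk_mode == "record_value":
        if not names:
            return list(value_fields)  # all fields of the value struct
        missing = [name for name in names if name not in value_fields]
        if missing:
            # Assumption: a field missing from the record value is treated as an error.
            raise ValueError(f"fields not found in record value: {missing}")
        return names
    raise ValueError(f"unsupported pk.mode: {pk_mode}")


print(resolve_pk_fields("record_value", "user_id", ["user_id", "stars"]))  # prints ['user_id']
print(resolve_pk_fields("kafka", "", ["user_id", "stars"]))
```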

"tasks.max": Maximum number of tasks the connector can run. See Confluent Cloud connector limitations for additional task information.

Configuration properties that are not set in the configuration file use the default values. For default values and property definitions, see JDBC Sink Connector Configuration Properties.

Step 4: Load the configuration file and create the connector.¶

Enter the following command to load the configuration and start the connector:

ccloud connector create --config <file-name>.json

For example:

ccloud connector create --config postgresql-sink-config.json

Example output:

Created connector PostgresSinkConnector_0 lcc-ix4dl

Step 5: Check the connector status.¶

Enter the following command to check the connector status:

ccloud connector list

Example output:

ID          |       Name               | Status  | Type
+-----------+--------------------------+---------+------+
lcc-ix4dl   | PostgresSinkConnector_0  | RUNNING | sink
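
If you want to check the status from a script, the table shown above can be parsed with a few lines of Python. This sketch assumes the output has exactly the layout of the example (pipe-separated columns and a separator row made of + and - characters); the real CLI output may differ between versions, and the function name is hypothetical.

```python
SAMPLE_OUTPUT = """\
ID          |       Name               | Status  | Type
+-----------+--------------------------+---------+------+
lcc-ix4dl   | PostgresSinkConnector_0  | RUNNING | sink
"""


def parse_connector_list(output: str) -> list:
    """Turn the pipe-separated connector table into a list of dictionaries."""
    rows = []
    for line in output.splitlines():
        if "|" not in line:
            continue  # skip the separator row and blank lines
        cells = [cell.strip() for cell in line.split("|")]
        if cells[0] == "ID" or len(cells) != 4:
            continue  # skip the header row and anything unexpected
        rows.append(dict(zip(("id", "name", "status", "type"), cells)))
    return rows


for connector in parse_connector_list(SAMPLE_OUTPUT):
    print(connector["name"], connector["status"])  # prints "PostgresSinkConnector_0 RUNNING"
```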

Step 6: Check the results in PostgreSQL.¶

Verify that new records are being added to the PostgreSQL database.

You can manage your full-service connector using the Confluent Cloud API. For details, see the Confluent Cloud API documentation.

Tip

When you launch a connector, a Dead Letter Queue topic is automatically created. See Dead Letter Queue for details.


Next Steps¶