What is Cluster Linking?
Cluster Linking allows you to directly connect clusters together and mirror
topics from one cluster to another without the need for Connect.
Cluster Linking makes it much easier to build multi-datacenter, multi-cluster,
and hybrid cloud deployments.
Unlike Replicator and MirrorMaker 2, Cluster Linking
does not require running Connect to move messages from one cluster to another,
and it preserves offsets from the source cluster on the destination cluster. We call
this “byte-for-byte” replication. Whatever is on the source is mirrored
precisely on the destination cluster.
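To make the offset-preservation point concrete, the following is a minimal toy model in Python, not Confluent's implementation. It treats a partition log as a list of (offset, value) pairs. The function names `mirror`, `reproduce`, and `read_from`, and the sample offsets, are hypothetical and chosen only for illustration. The sketch assumes a source log whose oldest records have already been removed, so its first offset is 100 rather than 0.

```python
# Toy model: a partition log is a list of (offset, value) pairs.
# All names here are illustrative; none are part of any Kafka or Confluent API.

def mirror(source_log):
    """Byte-for-byte style copy: every record keeps its source offset."""
    return list(source_log)

def reproduce(source_log):
    """Consume-and-re-produce style copy: the destination assigns
    fresh offsets starting at 0, so source offsets are not kept."""
    return [(new_offset, value)
            for new_offset, (_, value) in enumerate(source_log)]

def read_from(log, offset):
    """Return the values a consumer sees when it resumes at `offset`."""
    return [value for record_offset, value in log if record_offset >= offset]

# Source log after older records were removed: offsets start at 100.
source = [(100, "a"), (101, "b"), (102, "c")]

# Offset a consumer group had committed on the source cluster.
committed = 101

print(read_from(mirror(source), committed))     # ['b', 'c']
print(read_from(reproduce(source), committed))  # []
```

In the mirrored log, the committed offset 101 still points at record "b", so a consumer that moves to the destination can resume where it left off. In the re-produced log the records sit at offsets 0 through 2, so the same committed offset no longer identifies the same record.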
Cluster Linking and Built-in Multi-Region Replication can be combined to create a highly available,
durable, and distributed global eventing fabric. Use Built-in Multi-Region Replication when automatic client
failover (low RTO) or RPO=0 is required on some topics. Use Cluster Linking
when network quality is questionable, data centers are very far apart,
or RTO goals can tolerate client reconfiguration.
Note
The destination cluster must be running Confluent Server. The source cluster can be either Confluent Server or Kafka 2.4+.
Cluster Linking is being introduced in Confluent Platform 6.0.0 as a preview feature.
Use Cases and Architectures
The following use cases can be achieved by the configurations and architectures shown. These deployments are demonstrated in Cluster Linking Demo (Docker)
and Cluster Linking Tutorial.
Topic Data Sharing
Use Case: Share the data in a handful of topics across two Kafka clusters.
- source cluster
- destination cluster
For topic sharing, data moves from the source to the destination cluster by means of a cluster link.
Mirror topics are associated with a cluster link. Consumers on the destination cluster can read from local,
read-only, mirrored topics to read messages produced on the source cluster. If an original topic on the source cluster
is removed for any reason, you can stop mirroring that topic, and convert it to a read/write topic on the destination.
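The mirror topic lifecycle described above can be sketched as a small state model. This is a hedged illustration in Python, not the broker's actual logic. The class `MirrorTopic`, its methods, and the topic and link names are all hypothetical. It assumes only what the text states: a mirror topic belongs to a cluster link, is read-only while mirroring, and becomes a regular read/write topic once mirroring is stopped.

```python
class MirrorTopic:
    """Hypothetical model of a mirror topic on the destination cluster."""

    def __init__(self, name, link_name):
        self.name = name
        self.link_name = link_name  # the cluster link this topic is tied to
        self.mirroring = True       # read-only while this is True
        self.records = []

    def replicate(self, record):
        """Record arriving from the source cluster over the cluster link."""
        if not self.mirroring:
            raise RuntimeError("mirroring has been stopped for this topic")
        self.records.append(record)

    def produce(self, record):
        """Write attempted by a producer on the destination cluster."""
        if self.mirroring:
            raise PermissionError("mirror topic is read-only while mirroring")
        self.records.append(record)

    def stop_mirroring(self):
        """Convert the mirror topic to a regular read/write topic."""
        self.mirroring = False


topic = MirrorTopic("orders", link_name="example-link")
topic.replicate("order-1")         # data flows in from the source cluster

try:
    topic.produce("local-write")   # rejected: the topic is still a mirror
    rejected = False
except PermissionError:
    rejected = True

topic.stop_mirroring()             # for example, the source topic was removed
topic.produce("order-2")           # local writes are now accepted
```

After these steps the topic holds both the mirrored record and the locally produced one, which matches the case where the source topic is gone and the destination topic takes over as a writable topic.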
Cluster Migration
Use Case: Move from an on-premises Kafka cluster to a Confluent Cloud Kafka cluster,
or from an older version to a newer version. The native offset preservation you get
by running Confluent Server on the brokers makes this much easier to do with Cluster Linking
than with Connect-based methods.
Hybrid Cloud Architectures
Use Case: Deploy an ongoing data funnel for a few topics from
an on-premises environment to Confluent Cloud. Cluster Linking provides an
architecture that tolerates network partitions (losing a network
connection momentarily does not materially affect the data on any particular
cluster), whereas a stretch cluster requires a highly reliable
and robust network.