What are Self-Balancing Clusters?
Confluent Platform deployments can run hundreds of brokers, manage thousands of topics and
produce billions of messages per hour. Every day brokers die, new topics are
created and deleted, and partitions must be reassigned to balance the workload.
This can overload teams tasked with managing Confluent Platform runtime operations.
Self-Balancing automates your resource workload balancing, provides failure detection and
self-healing, and allows you to add or decommission brokers as needed, with no
manual tuning required.
Self-Balancing offers:
- Fully automated load balancing
- Auto-monitoring of clusters for imbalances based on a large set of parameters, configurations, and runtime variables
- Continuous metrics aggregation and rebalancing plans, generated instantaneously in most cases, and executed automatically
- Automatic triggering of rebalance operations based on simple configurations you set on Confluent Control Center or in Kafka
server.properties
files.
You can choose to auto-balance Only when brokers are added or Anytime, which rebalances for any uneven load.
- At-a-glance visibility into the state of your clusters, and the strategy and progress of auto-balancing through a few key metrics
How it Works
Self-Balancing Clusters optimize Kafka load awareness. Resource usage within a cluster can be
heterogeneous. Kafka out-of-the-box does not provide an automated process to improve cluster
balance, but rather requires manual calculation of how to reassign partitions.
Self-Balancing simplifies this process by moving data to spread the cluster load evenly.
Self-Balancing defines the meaning of “even” based on built-in goals, although you can provide
optional input through configurations.
Architecture of a Self-Balancing Cluster
Kafka brokers collect metrics and feed the data to an internal topic on the controller, which is on the lead broker for the cluster.
The controller, therefore, plays a key role in cluster balancing. You can think of the controller as the node on which Self-Balancing is actively running.
The metrics are processed and decisions are made, based on goals.
This data is fed to other internal Kafka topics for monitoring and potential actions, such as generating a load balancing plan or triggering a rebalance.
Tip
You can view all topics, including internal topics, by using the kafka-topics --list
command while Confluent Platform is running, for example:
kafka-topics --list --bootstrap-server localhost:9092
. Self-Balancing internal topics are prefixed with _confluent_balancer_
.
Enabling Self-Balancing Clusters
- For each broker in your cluster, set
confluent.balancer.enable=true
to enable Self-Balancing and make sure that this line is uncommented.
Confluent Platform ships with Self-Balancing set to enabled in the example file: $CONFLUENT_HOME/etc/server.properties
. To learn more, see confluent.balancer.enable.
- To make Self-Balancing accessible for Configuration and Monitoring from Control Center, configure the Control Center cluster with REST endpoints
to enable HTTP servers on the brokers, as described in Required Configurations for Control Center.
What defines a “balanced” cluster and what triggers a rebalance?
At a high level, Self-Balancing distinguishes between two types of imbalances:
- Intentional operational actions, such as adding or removing brokers. In these cases, Self-Balancing saves the operator time and manual steps that would otherwise be required.
- Ongoing cluster operations (such as a hot topic or partition). This balancing is ongoing throughout the life of the cluster.
These map to the two high level configuration options you have with Self-Balancing enabled:
- Rebalance only when a broker is added
- Rebalance anytime (for any uneven load, including changes in number of available brokers)
The first case is clear-cut. If brokers are added, self-healing occurs to
redistribute data to the new broker or offload data from the missing broker.
The second case is more nuanced. To achieve ongoing cluster and data balance,
Self-Balancing Clusters optimize on a number of goals and also avoid unnecessary movements
if rebalancing would not materially improve cluster performance. Goals include
considerations for replica placement and capacity, replication factors and
throughput, multiple metrics on topics and partitions, leadership, rack
awareness, disk usage and capacity, processor usage and capacity, network
throughputs per broker, numerous load distribution targets, and more.
Self-Balancing Clusters employ continuous monitoring and data collection to track performance
against these goals, and generate plans for rebalancing. A rebalance may or may
not be triggered based on the implications of all weighted factors for the
topology at a given moment.
Tip
In both cases, the cluster also rebalances when a broker is removed by user request (either from the
command line or the Control Center),
or if a broker goes missing some period of time (as specified by confluent.balancer.heal.broker.failure.threshold.ms).
Therefore, even with Self-Balancing set to Rebalance only when a broker is added, the cluster will rebalance for missing brokers unless you
intentionally set properties to prevent this (for example, if confluent.balancer.heal.broker.failure.threshold.ms
is set to -1
).
What happens if the lead broker (controller) is removed or lost?
In a multi-broker cluster, one of the brokers is the leader or controller and
plays a key role in Self-Balancing (as described in Architecture of a Self-Balancing Cluster). What happens if the
broker where the controller is running is intentionally removed or crashes?
- There is no impact to cluster integrity.
- When the controller is removed or lost, a new controller is elected.
- A broker removal request persists once it is made. If another broker becomes the controller,
the new controller restarts and resumes the broker removal process, possibly with some delay.
- The new leader will pick up the remove broker
request and complete it.
- If the lead broker is lost during an in-progress “add broker” operation, the “add broker” operation will not complete, and is marked as failed.
If the new controller does not pick this up, you may need to restart the broker you were trying to add.
See also, Troubleshooting.
How do the brokers leverage Cruise Control?
Confluent Self-Balancing Clusters leverages Cruise Control
for continuous metrics aggregation and reporting, reassignment algorithms and plans, and some
rebalance triggers. Unlike Cruise Control, which has to be managed separately from Kafka, Self-Balancing
is built into the brokers, meaning that data balancing is supported out of the box without additional
dependencies. The result is custom-made, automated load balancing optimized for your Kafka clusters
on Confluent Platform, and designed to work seamlessly with other components like Tiered Storage and Multi-Region Clusters.
Limitations
- Self-Balancing does not support JBOD (just a bunch of disks), also known as spanning, to make multiple disks appear as one.
- If a broker contains the only replica of a partition, Self-Balancing will block elective attempts
to remove the broker to prevent potential data loss (the broker removal operation will fail).
The best way to remedy this is to increase replication factors on topics and internal topics
on the broker you want to remove, as described in Configure replication factors for Self-Balancing in the tutorial.
- Attempts to remove a broker immediately after cluster startup (while Self-Balancing is initializing) can fail due to insufficient metrics,
and attempts to remove the lead broker can also fail at another phase. The solution is to retry broker removal after a period of time.
If the broker is a controller, you must run broker removal from the command line, as it may not be available on the Control Center.
To learn more, see Broker removal attempt fails during Self-Balancing initialization in Troubleshooting.
Configuration and Monitoring
Self-Balancing Clusters are self-managed, and can be enabled while the cluster is running. In
most cases, you should not have to tinker with the defaults. Simply enable Self-Balancing
either before or after you start the cluster, and allow it to auto-balance as
needed.
In the example server.properties
file that ships with Confluent Platform,
confluent.balancer.enable is set to true
, which means Self-Balancing is on.
Using Control Center
You can change these Self-Balancing settings from the Confluent Control Center (http://localhost:9021/) while the cluster is running:
Select a cluster, click Cluster settings and select the Self-balancing tab.
The current settings for Self-Balancing are shown.
Click Edit Settings.
Make changes and click Save.
The updated settings are put into effect and reflected on the Self-balancing tab.
To view the property names for the Self-Balancing settings in effect (while editing or monitoring them), select Show raw configs.
Metrics for Monitoring a Rebalance
Confluent Platform exposes several metrics through Java Management Extensions (JMX) that are useful for monitoring rebalances initiated by Self-Balancing:
- The incoming and outgoing byte rate for reassignments is tracked by
kafka.server:type=BrokerTopicMetrics,name=ReassignmentBytesInPerSec
and kafka.server:type=BrokerTopicMetrics,name=ReassignmentBytesOutPerSec
.
These metrics are reported by each broker.
- The number of pending and in-progress reassignment tasks currently tracked by Self-Balancing are tracked by
kafka.databalancer:type=Executor,name=replica-action-pending
and kafka.databalancer:type=Executor,name=replica-action-in-progress
.
These metrics are reported from the broker with the active data balancer instance (the controller).
- The maximum follower lag on each broker is tracked by
kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica
.
Reassigning partitions will cause this metric to jump to a value corresponding to the size of the partition and then slowly decay as the reassignment progresses.
If this value slowly rises rather than slowly falling over time, the replication throttle is too low.
You can view the full list of metrics for Confluent Platform in Monitoring Kafka.
The Apache Kafka® documentation covers metrics reporting and monitoring by means of JMX
endpoints here.
To monitor Self-Balancing, set the JMX_PORT
environment variable before starting the
cluster, then collect the reported metrics using your usual monitoring tools.
JMXTrans, Graphite, and Grafana are a popular combination for collecting and
reporting JMX metrics from Kafka. Datadog is another popular monitoring solution.
Troubleshooting
Following is a list of problems you may encounter while working with Self-Balancing and how to solve for them.
Self-Balancing options do not show up on Control Center
When Self-Balancing Clusters are enabled, status and configuration options are available on
Control Center Cluster Settings > Self-balancing tab. If, instead, this tab
displays a message about Confluent Platform version requirements and configuring HTTP servers
on the brokers, this indicates something is missing from your configurations or that
you are not running the required version of Confluent Platform.
Solution: Verify that you have the following settings and update your
configuration as needed.
- confluent.balancer.enable must be set to
true
to enable Self-Balancing.
- In the Control Center properties file,
confluent.controlcenter.streams.cprest.url
must specify the associated URL for each broker in the cluster as REST endpoints for controlcenter.cluster
,
as described in Required Configurations for Control Center.
- Your clusters must be deployed on Confluent Platform 6.0.0 or later.
Broker metrics are not displayed on Control Center
This issue is not specific to Self-Balancing, but related to proper configuration of multi-broker clusters, in general.
You may encounter a scenario where Self-Balancing is enabled and displaying Self-Balancing options on Control Center, but broker metrics
and per-broker drill-down and management options are not showing up on the Brokers Overview page. The most likely
cause for this is that you did not configure the Metrics Reporter for Control Center. To do so, uncomment the following lines
in properties files for all brokers. For example, in $CONFLUENT_HOME/etc/server.properties
:
metric.reporters=io.confluent.metrics.reporter.ConfluentMetricsReporter
confluent.metrics.reporter.bootstrap.servers=localhost:9092
Solution: To fix this on running clusters, you will need to shut down Control Center and the brokers,
update the Metrics Reporter configurations for the brokers, and reboot.
This configuration is covered in more detail in the Self-Balancing tutorial at Enable the Metrics Reporter for Control Center.
Broker removal attempt fails during Self-Balancing initialization
Self-Balancing requires 15-20 minutes to initialize and collect metrics from brokers in
the cluster. If you attempt to remove broker before metrics collection
completes, the broker removal will fail almost immediately due to insufficient
metrics for Self-Balancing. This is the most common use case for “remove broker” failing.
The following error message will show on the command line or on Control Center,
depending on which method you used for the remove operation:
Self-balancing requires a few minutes to collect metrics for rebalancing plans. Metrics collection is in process. Please try again after 900 seconds.
Solution: The solution for this is to wait about 15 minutes, and retry the broker removal.
If you want to remove a controller, the same factors are at play, so you should
give some time for Self-Balancing to initialize before attempting a remove operation. In
this case, though, the “remove broker” operation can fail at a later phase after the target broker is already shut down. In an
offline state, the broker will no longer be accessible from Control Center. If this
occurs, wait for 15 minutes, then retry the remove broker operation from the command line. Once Self-Balancing has
initialized and had time to collect metrics, the operation should succeed, and
the rebalancing plan will run.
To learn more about Self-Balancing initialization, see Self-Balancing Initialization.
Broker removal cannot complete due to offline partitions
Broker removal can also fail in cases where taking a broker down will result in
having fewer online brokers than the number of replicas required in your configurations.
The broker status (available with kafka-remove-brokers --describe
)
will remain as follows, until you restart one or more of the offline brokers:
[2020-09-17 23:40:53,743] WARN [AdminClient clientId=adminclient-1] Connection to node -5 (localhost/127.0.0.1:9096) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)Broker 1 removal status:
Partition Reassignment: IN_PROGRESS
Broker Shutdown: COMPLETE
A short-hand way of troubleshooting this is to ask “how many brokers are down?”
and “how many replicas/replication factors must the cluster support?”
Partition reassignment (the last phase in broker removal)
will fail to complete in any case where you have n
brokers down, and your configuration
requires n + 1
or more replicas.
Alternatively, you can consider how many online brokers you need to support the required number of replicas.
If you have n
brokers online, these can support at most a total of n
replicas.
Solution: The solution is to restart the down brokers, and perhaps modify the cluster
configuration as a whole. This might include both adding brokers and modifying
replicas/replication factors (see example below).
Scenarios that lead to this problem can be a combination of under-replicated
topics and topics with too many replicas for the number of online brokers.
Having a topic with a replication factor of 1 does not necessarily lead to a
problem in and of itself.
A quick way to get an overview of configured replicas on a running cluster is to
use kafka-topics --describe
on a specified topic, or on the whole cluster
(with no topic specified). For system topics, you can scan the replication
factors and replicas on system properties (which generate system topics). The
Self-Balancing Tutorial covers these commands, replicas/replication factors, and
the impact of these configurations.
Too many excluded topics causes problems with Self-Balancing
Excluding too many topics (and by inference, partitions) can be counter-productive to maintaining
a well-balanced cluster. In internal tests, excluding system topics which accounted for
approximately 100 out of 500 total topics was enough to put Self-Balancing into a less than optimal state.
The manifest result of this is that the cluster constantly churns on rebalancing.
Solution: Reduce the number of excluded topics. The relevant configuration options to modify
these settings are:ref:sbc-config-exclude-topic-names and confluent.balancer.exclude.topic.prefixes.