Ingress |
50 megabytes per second (MBps) |
Maximum number of bytes that can be produced to the cluster in one second.
Available in the Confluent Cloud Metrics API as received_bytes (convert from bytes to MB).
If you exceed the maximum, producers will be throttled to keep ingress at or below the limit,
which will register as non-zero values for the producer client produce-throttle-time-max and
produce-throttle-time-avg metrics.
If you are self-managing Kafka, you can look at the producer outgoing-byte-rate metrics
and broker kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec metrics
to understand your throughput.
To reduce usage on this dimension, you can compress your messages.
lz4 is recommended for compression. gzip is not recommended because it incurs high
overhead on the cluster.
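For example, a minimal sketch of a Java producer configuration that enables lz4 compression
(the bootstrap server, security settings, and topic name are placeholders to substitute for
your own cluster):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class CompressedProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Placeholder endpoint; add your cluster address and security settings here.
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "<bootstrap-server>:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            // Compress batches on the producer side; lz4 keeps broker-side overhead low.
            props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("example-topic", "key", "value"));
            }
        }
    }

Compression is applied per batch on the producer, so larger batches also tend to compress better.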
|
Egress |
150 megabytes per second (MBps) |
Maximum number of bytes that can be consumed from the cluster in one second.
Available in the Confluent Cloud Metrics API as sent_bytes (convert from bytes to MB).
If you exceed the maximum, consumers will be throttled to keep egress at or below the limit,
which will register as non-zero values for the
consumer client fetch-throttle-time-max and fetch-throttle-time-avg metrics.
If you are self-managing Kafka, you can look at the consumer incoming-byte-rate metrics
and broker kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec metrics
to understand your throughput.
To reduce usage on this dimension, you can compress your messages
and ensure each consumer is only consuming from the topics it requires. lz4 is recommended
for compression. gzip is not recommended because it incurs high overhead on the cluster.
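As one way to watch for egress throttling from inside an application, you can read the
consumer's built-in metrics after polling; a minimal sketch, with placeholder bootstrap
server, group ID, and topic:

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class ThrottleMetricsCheck {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "<bootstrap-server>:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "example-group");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("example-topic"));
                consumer.poll(Duration.ofSeconds(1));
                // Non-zero fetch-throttle-time values indicate the cluster is throttling this consumer.
                consumer.metrics().forEach((name, metric) -> {
                    if (name.name().equals("fetch-throttle-time-avg")
                            || name.name().equals("fetch-throttle-time-max")) {
                        System.out.println(name.name() + " = " + metric.metricValue());
                    }
                });
            }
        }
    }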
|
Storage (pre-replication) |
AWS: Infinite; GCP and Azure: 10 terabytes (TB) |
Maximum bytes that can be stored on the cluster at one time, measured before replication.
Dedicated clusters on AWS have infinite storage [1].
Available in the Confluent Cloud Metrics API as retained_bytes (convert from bytes to TB). The API
value is post-replication, so divide by the replication factor of three to get pre-replication
storage usage.
If you exceed the maximum, the producers will be throttled to prevent additional writes, which
will register as non-zero values for the producer client produce-throttle-time-max and
produce-throttle-time-avg metrics.
If you are self-managing Kafka, you can look at how much disk space your cluster is using to
understand your storage needs.
To reduce usage on this dimension, you can
compress your messages
and reduce your retention settings.
lz4 is recommended for compression. gzip is not recommended because it incurs high
overhead on the cluster.
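For example, a sketch of lowering a topic's retention with the Kafka Admin client (the
topic name and the 24-hour retention value are placeholders, not recommendations):

    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.AlterConfigOp;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class ReduceRetention {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "<bootstrap-server>:9092");

            try (Admin admin = Admin.create(props)) {
                ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "example-topic");
                // Retain data for 24 hours (86,400,000 ms); shorter retention means less stored data.
                AlterConfigOp setRetention = new AlterConfigOp(
                        new ConfigEntry("retention.ms", "86400000"), AlterConfigOp.OpType.SET);
                admin.incrementalAlterConfigs(Map.of(topic, List.of(setRetention))).all().get();
            }
        }
    }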
|
Partitions (pre-replication) |
4,500 partitions |
Maximum number of partitions that can exist on the cluster at one time, before replication.
Available in the Confluent Cloud Metrics API as partition_count.
Attempts to create additional partitions beyond this limit will fail with an error message.
If you are self-managing Kafka, you can look at the
broker kafka.controller:type=KafkaController,name=GlobalPartitionCount metric
to understand your partition usage.
To reduce usage on this dimension, you can delete unused topics and create new topics with
fewer partitions. You can use the
Kafka Admin interface to increase the partition count of an existing topic
if the initial partition count is too low.
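For instance, a minimal sketch of growing an existing topic with the Kafka Admin client
(the topic name and target count are placeholders; partition counts can only be increased,
never decreased):

    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewPartitions;

    public class IncreasePartitions {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "<bootstrap-server>:9092");

            try (Admin admin = Admin.create(props)) {
                // Grow "example-topic" to a total of 12 partitions.
                admin.createPartitions(Map.of("example-topic", NewPartitions.increaseTo(12)))
                     .all().get();
            }
        }
    }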
Confluent Cloud clusters apply a limit to the number of partitions for a cluster. The
partition limit is determined by the cluster type and size, and refers to the
pre-replicated number of partitions. All topics that the customer creates, whether through
the Confluent Cloud UI, Confluent Cloud CLI, Kafka Admin API, Confluent Control Center,
Kafka Streams applications, or ksqlDB, count toward the cluster partition limit, as do
internal topics automatically created by Confluent Platform components such as ksqlDB,
Kafka Streams, Connect, and Control Center. The automatically created topics are prefixed
with an underscore (_). However, topics that are internal to Kafka itself (for example,
consumer offsets) are not visible in the Confluent Cloud UI, and do not count against
partition limits or toward partition billing.
|
Total client connections |
3,000 connections |
Maximum number of TCP connections to the cluster that can be open at one time.
Available in the Confluent Cloud Metrics API as active_connection_count.
If you exceed the maximum, new client connections may be refused. Producer and consumer clients
may also be throttled to keep the cluster stable. This throttling would register as non-zero
values for the producer client produce-throttle-time-max and produce-throttle-time-avg metrics
and consumer client fetch-throttle-time-max and fetch-throttle-time-avg metrics.
If you are self-managing Kafka, you can look at the
broker kafka.server:type=socket-server-metrics,listener={listener_name},networkProcessor={#},name=connection-count metrics
to understand how many connections you are using.
This value can vary widely based on several factors, including number of producer clients,
number of consumer clients, number of brokers in the cluster, partition keying strategy, produce
patterns per client, and consume patterns per client.
To reduce usage on this dimension, you can reduce the total number of clients connecting to the cluster.
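Because the Java producer is thread-safe, one common way to cut connection counts is to
share a single producer instance across the application instead of constructing one per
thread or per request; a minimal sketch (class and placeholder endpoint are illustrative):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.common.serialization.StringSerializer;

    public final class SharedProducer {
        // One producer, and therefore one set of broker connections, shared by the whole application.
        private static final Producer<String, String> INSTANCE = create();

        private static Producer<String, String> create() {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "<bootstrap-server>:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            return new KafkaProducer<>(props);
        }

        public static Producer<String, String> get() {
            return INSTANCE;
        }

        private SharedProducer() { }
    }

Each client instance maintains its own TCP connections to the brokers it talks to, so
consolidating clients directly lowers the open connection count.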
|
Connection attempts |
80 connection attempts per second |
Maximum number of new TCP connections to the cluster that can be created in one second.
If you exceed the maximum, connection attempts may be refused. Producer and consumer clients
may also be throttled to keep the cluster stable. This throttling would register as non-zero
values for the producer client produce-throttle-time-max and produce-throttle-time-avg metrics
and consumer client fetch-throttle-time-max and fetch-throttle-time-avg metrics.
If you are self-managing Kafka, you can look at the rate of change for
broker kafka.server:type=socket-server-metrics,listener={listener_name},networkProcessor={#},name=connection-count metrics
and client connection-creation-rate metrics
to understand how many new connections you are creating.
To reduce usage on this dimension, you can use longer-lived connections to the cluster.
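Two client settings that influence how often new connections are created are
connections.max.idle.ms (how long an idle connection is kept open) and the
reconnect.backoff.ms / reconnect.backoff.max.ms pair (how quickly clients retry broken
connections). A sketch with illustrative values, not tuned recommendations:

    import java.util.Properties;
    import org.apache.kafka.clients.CommonClientConfigs;

    public class LongLivedConnectionProps {
        // Base properties shared by producers and consumers; values are illustrative only.
        public static Properties base() {
            Properties props = new Properties();
            props.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, "<bootstrap-server>:9092");
            // Keep idle connections open rather than churning them (the client default is 9 minutes).
            props.put(CommonClientConfigs.CONNECTIONS_MAX_IDLE_MS_CONFIG, "540000");
            // Back off between reconnection attempts so restarting clients do not storm the cluster.
            props.put(CommonClientConfigs.RECONNECT_BACKOFF_MS_CONFIG, "1000");
            props.put(CommonClientConfigs.RECONNECT_BACKOFF_MAX_MS_CONFIG, "10000");
            return props;
        }
    }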
|
Requests |
15,000 requests per second |
Maximum number of client requests to the cluster in one second.
Available in the Confluent Cloud Metrics API as request_count.
If you exceed the maximum, requests may be refused. Producer and consumer clients may also be
throttled to keep the cluster stable. This throttling would register as non-zero values for
the producer client produce-throttle-time-max and produce-throttle-time-avg metrics and
consumer client fetch-throttle-time-max and fetch-throttle-time-avg metrics.
If you are self-managing Kafka, you can look at the
broker kafka.network:type=RequestMetrics,name=RequestsPerSec,request={Produce|FetchConsumer|FetchFollower} metrics
and client request-rate metrics to understand your
request volume.
To reduce usage on this dimension, you can adjust
producer batching configurations and consumer client batching configurations,
and shut down otherwise inactive clients.
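As a sketch of the kind of batching adjustments meant here (values are illustrative, not
recommendations): larger producer batches mean fewer produce requests, and larger consumer
fetches mean fewer fetch requests.

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.producer.ProducerConfig;

    public class BatchingProps {
        public static Properties producerBatching() {
            Properties props = new Properties();
            // Wait up to 50 ms and accumulate up to 64 KB per batch so each request carries more records.
            props.put(ProducerConfig.LINGER_MS_CONFIG, "50");
            props.put(ProducerConfig.BATCH_SIZE_CONFIG, "65536");
            return props;
        }

        public static Properties consumerBatching() {
            Properties props = new Properties();
            // Let the broker hold each fetch until at least 64 KB is available or 500 ms has passed.
            props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, "65536");
            props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, "500");
            return props;
        }
    }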
|