Concepts
An alert consists of a trigger and one or more actions. Triggers can be defined
for topics, brokers, consumer groups, and clusters. Each trigger is based on a
metric, with a condition and a value that together determine when the trigger
should fire. When the criteria are met, any actions associated with the trigger
are executed.
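As a rough illustration of how a trigger combines a metric, a condition, and a value, the following Python sketch models the evaluation step. It is not Control Center code: the `Trigger` class, the condition names, and the `latency_ms` metric are hypothetical and exist only to make the idea concrete.

```python
import operator
from dataclasses import dataclass

# Hypothetical condition names; Control Center's own labels may differ.
CONDITIONS = {
    "greater_than": operator.gt,
    "less_than": operator.lt,
    "equal_to": operator.eq,
    "not_equal_to": operator.ne,
}


@dataclass(frozen=True)
class Trigger:
    name: str
    component: str  # "topic", "broker", "consumer group", or "cluster"
    metric: str     # illustrative metric name
    condition: str  # key into CONDITIONS
    value: float    # threshold the metric value is compared against

    def should_fire(self, metric_value: float) -> bool:
        """Return True when the metric value meets the trigger's criteria."""
        return CONDITIONS[self.condition](metric_value, self.value)


latency_trigger = Trigger(
    name="high-latency",
    component="consumer group",
    metric="latency_ms",
    condition="greater_than",
    value=500.0,
)

print(latency_trigger.should_fire(750.0))  # True: 750 > 500
print(latency_trigger.should_fire(120.0))  # False: 120 is not > 500
```

Keeping the criteria as plain data (metric, condition, value) means the same evaluation routine can serve every component type.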
Supported actions include:
- Sending an email notification to one or more accounts
- Sending a Slack webhook notification
- Sending a PagerDuty webhook notification that creates an incident ticket
A trigger can be associated with any number of defined actions. When a trigger
fires, it executes all of its associated enabled actions for which the Max send
rate has not been exceeded. If the max send rate of a particular action has been
exceeded, the trigger event is added to a queue associated with the action and
is included in the action event the next time the action is executed. (An action
can report a set of triggers, not just one trigger.)
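The send-rate and queuing behavior described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the Control Center implementation: the `Action` class, the sliding-window interpretation of the max send rate (`max_sends` per `window_seconds`), and the `outbox` list that stands in for real notifications are all hypothetical. The sketch also assumes a queued event is delivered the next time a trigger fires after the rate limit has cleared.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Action:
    name: str
    max_sends: int         # assumed: sends allowed per rate window
    window_seconds: float  # assumed: length of the rate window
    enabled: bool = True
    sent_at: deque = field(default_factory=deque)  # timestamps of recent sends
    pending: list = field(default_factory=list)    # queued trigger events
    outbox: list = field(default_factory=list)     # stand-in for notifications

    def _rate_exceeded(self, now: float) -> bool:
        # Forget sends that have aged out of the rate window.
        while self.sent_at and now - self.sent_at[0] >= self.window_seconds:
            self.sent_at.popleft()
        return len(self.sent_at) >= self.max_sends

    def handle(self, event: str, now: float) -> None:
        if not self.enabled:
            return  # disabled actions are skipped entirely
        self.pending.append(event)
        if self._rate_exceeded(now):
            return  # the event stays queued for the next execution
        # One notification reports every queued event, not just the newest.
        self.outbox.append(list(self.pending))
        self.pending.clear()
        self.sent_at.append(now)


def fire(trigger_name: str, actions: list, now: float) -> None:
    """Run every action associated with the trigger that fired."""
    for action in actions:
        action.handle(trigger_name, now)


slack = Action("slack", max_sends=1, window_seconds=60)
fire("under-consumption", [slack], now=0)   # sent immediately
fire("high-latency", [slack], now=10)       # rate exceeded: queued
fire("under-consumption", [slack], now=70)  # sent together with the queued event
print(slack.outbox)
# [['under-consumption'], ['high-latency', 'under-consumption']]
```

The second notification carries two trigger events, which mirrors the point that an action can report a set of triggers rather than a single one.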
Note
Queuing does not occur while alerts (actions) are paused, because triggers are
ignored during that time.
The maximum number of triggered events per alert (default: 1000) is controlled
by the confluent.controlcenter.max.trigger.events.per.alert.config option.
Detection of anomalous events (triggering criteria) is decoupled from the alert
actions that should be taken when a triggering event occurs. This means that
triggers and actions are defined independently, which provides flexibility when
setting one or more actions to perform when a trigger fires.
Each time interceptor data is received by Control Center, metric values (such as
consumption difference and latency) of the corresponding time windows are
updated to reflect the new data. All newly updated metric values are then
checked against all configured triggers to determine whether a trigger should
fire.
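The update-then-check cycle described above might look like the following Python sketch. Everything in it is an assumption made for illustration: the 15-second window size, the `on_interceptor_data` function, the produced/consumed counters, and the definition of consumption difference as produced minus consumed are not taken from Control Center.

```python
from collections import defaultdict

WINDOW_SECONDS = 15  # assumed window size, for illustration only

# Window start time -> running totals for that window.
windows = defaultdict(lambda: {"produced": 0, "consumed": 0})


def consumption_difference(window):
    # Illustrative definition: messages produced but not yet consumed.
    return window["produced"] - window["consumed"]


# Each configured trigger: (name, metric function, predicate on the value).
triggers = [
    ("under-consumption", consumption_difference, lambda v: v > 100),
]


def on_interceptor_data(timestamp, produced=0, consumed=0):
    """Update the window the data belongs to, then re-check all triggers."""
    start = timestamp - (timestamp % WINDOW_SECONDS)
    window = windows[start]
    window["produced"] += produced
    window["consumed"] += consumed
    fired = []
    for name, metric, predicate in triggers:
        if predicate(metric(window)):
            fired.append((name, start))
    return fired


print(on_interceptor_data(1000, produced=500))  # [('under-consumption', 990)]
print(on_interceptor_data(1003, consumed=450))  # []
```

Only the window that received new data is re-evaluated. Because the window is chosen from the data's own timestamp, data for an old window is handled in exactly the same way as data near real time.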
Note
Interceptors can conceivably report data related to any time. Alerting works
across all time windows, not just those near real time.
Buffer for consumer group triggers (deprecated)
Tip
The buffer feature for this trigger has been deprecated.
It will be removed from Control Center in the 6.1.0 release.
Because of normal lag in the system, time windows close to real time frequently
have associated metric values that would be cause for concern if the time window
were further behind real time. For this reason, triggers for consumer groups
have an associated buffer value. The buffer lets you require an alertable state
to persist for a configurable period of time, which avoids prematurely
activating a consumer group trigger.
A triggered event that is within Buffer seconds of real time is not immediately
registered against actions. When the time window ultimately moves more than
Buffer seconds behind real time, any associated metric value that would still
cause a trigger to fire is then registered against any appropriate actions.
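The buffer rule can be summarized with a small Python sketch. It assumes a simple "greater than threshold" criterion and measures a window's age from its end time. The `WindowState` class and the `evaluate` function are hypothetical names, not part of Control Center.

```python
from dataclasses import dataclass


@dataclass
class WindowState:
    window_end: float    # end of the time window, in seconds
    metric_value: float  # latest metric value recorded for that window


def evaluate(window, threshold, buffer_seconds, now):
    """Return 'ok', 'deferred', or 'registered' for a consumer group trigger."""
    if window.metric_value <= threshold:
        return "ok"  # criteria not met, so there is nothing to register
    age = now - window.window_end
    if age <= buffer_seconds:
        return "deferred"  # still within Buffer seconds of real time
    return "registered"    # older than the buffer and still in violation


w = WindowState(window_end=100.0, metric_value=250.0)
print(evaluate(w, threshold=100.0, buffer_seconds=60, now=130.0))  # deferred
print(evaluate(w, threshold=100.0, buffer_seconds=60, now=170.0))  # registered

# If the consumer group catches up before the buffer elapses, nothing is registered.
w.metric_value = 40.0
print(evaluate(w, threshold=100.0, buffer_seconds=60, now=170.0))  # ok
```

The last call shows the purpose of the buffer: a window that looked alarming while close to real time produces no alert if its metric has recovered by the time the buffer has passed.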