You can use Ansible Playbooks for Confluent Platform to update the configuration of Confluent Platform components by
rerunning the provisioning playbook with an updated inventory file.
There are two deployment strategies: rolling and parallel. Rolling is the
default mode on running clusters.
Rolling deployment
In a rolling deployment, the playbook reconfigures and redeploys one node,
then validates its health checks before moving on to the next node. If the
deployment fails on a node, the playbook stops, and all remaining nodes
stay untouched and keep the old configuration.
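The stop-on-first-failure behavior described above can be illustrated with a small shell sketch. It only models the semantics and is not how the playbook is implemented; deploy_node, check_health, and the FAIL_ON variable are hypothetical stand-ins for the real deployment and health check steps.

```shell
#!/usr/bin/env bash
# Illustrative model of rolling-deployment semantics; NOT the playbook's code.
# deploy_node and check_health are hypothetical stand-ins.

deploy_node() { echo "deploy $1"; }

# Pretend the node named in FAIL_ON (hypothetical variable) fails its check.
check_health() { [ "$1" != "${FAIL_ON:-}" ]; }

rolling_deploy() {
  local node
  for node in "$@"; do
    deploy_node "$node"
    if ! check_health "$node"; then
      # Stop here: nodes later in the list are never touched.
      echo "health check failed on $node; remaining nodes left untouched" >&2
      return 1
    fi
  done
}
```

With FAIL_ON=broker2, calling rolling_deploy broker1 broker2 broker3 deploys the first two nodes and returns before broker3 is reached.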
The following reconfigurations are best handled with a rolling deployment:
- Simple property updates, such as the Kafka property log.retention.hours
- Java argument updates
- Environment variable updates
- Updating certificates that are signed by the same CA or intermediate CA
Parallel deployment
In a parallel deployment, the deployment steps happen across all nodes in
a component at once. This method saves time, but leads to a service-wide
simultaneous restart.
Because rolling deployments are less impactful and do not cause a service
disruption, they are generally the safer option, but they do not work for
every use case. Major authentication and encryption changes do not work in a
rolling redeployment because, taking authentication as an example, the first
node will be restarted with an updated authentication mechanism that is invalid against
the rest of the cluster.
The following reconfiguration use cases are best handled with a parallel
redeployment:
- Major authentication changes
- Updating certificates signed by a new CA or intermediate CA
- Enabling RBAC
To enable parallel deployment mode, set the following variable:
deployment_strategy: parallel
Or, to change the deployment mode for specific components only, set the corresponding component variables:
zookeeper_deployment_strategy: parallel
kafka_broker_deployment_strategy: parallel
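As a hedged sketch, a component-level variable could also be supplied on the command line as an Ansible extra variable instead of in the inventory. This assumes the hosts.yml and all.yml file names used elsewhere in this section; the function name and the ANSIBLE_PLAYBOOK override (useful only for dry runs) are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: run the playbook with a parallel strategy for one component only.
# Assumes hosts.yml and all.yml as in the other commands in this section.
# ANSIBLE_PLAYBOOK is a hypothetical override for dry runs.

deploy_component_parallel() {
  local component="$1"   # for example: zookeeper or kafka_broker
  "${ANSIBLE_PLAYBOOK:-ansible-playbook}" -i hosts.yml all.yml \
    --skip-tags package \
    -e "${component}_deployment_strategy=parallel"
}
```

For example, deploy_component_parallel kafka_broker passes kafka_broker_deployment_strategy=parallel and leaves other components on their default strategy.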
Failure Handling
The following options are available when a configuration update fails.
Note that many Confluent Platform components (especially Kafka) can handle single-node outages.
Roll back the node. After a deployment fails on a node, revert your inventory
file and redeploy on the node:
# Revert your inventory file and run the following command.
ansible-playbook -i hosts.yml all.yml \
--skip-tags package \
--limit <broken-node>
Try a new configuration on the broken node. Update your inventory file
again and redeploy on that node, then deploy against all nodes:
# Update your inventory file and run the following command.
ansible-playbook -i hosts.yml all.yml \
--skip-tags package \
--limit <broken-node>
# Now deploy against all nodes.
ansible-playbook -i hosts.yml all.yml \
--skip-tags package
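The two commands above can be chained so that the full deployment runs only if the single-node run succeeds. The following is a minimal sketch under that assumption; the function name and the ANSIBLE_PLAYBOOK override (for dry runs) are hypothetical, and the node name is whatever host failed in your inventory.

```shell
#!/usr/bin/env bash
# Sketch: retry on the broken node first, then roll out to all nodes.
# ANSIBLE_PLAYBOOK is a hypothetical override for dry runs.

retry_then_full_deploy() {
  local node="$1"
  local playbook_cmd="${ANSIBLE_PLAYBOOK:-ansible-playbook}"

  # Step 1: apply the updated inventory to the broken node only.
  "$playbook_cmd" -i hosts.yml all.yml --skip-tags package --limit "$node" || {
    echo "Deployment on $node failed; not continuing to all nodes." >&2
    return 1
  }

  # Step 2: the node is healthy again, so deploy against all nodes.
  "$playbook_cmd" -i hosts.yml all.yml --skip-tags package
}
```

Stopping after a failed first step keeps the rest of the cluster on the old configuration while you adjust the inventory again.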
If the change requires a parallel restart to take effect (for example, when enabling RBAC), run the following command:
ansible-playbook -i hosts.yml all.yml \
--skip-tags package \
-e deployment_strategy=parallel