Confluent Kafka Security
In a default Confluent Kafka installation, no encryption, authentication, or authorization is configured. All components communicate in plain text, and any client can read from or write to any topic. Depending on how critical the transferred information is, this can be a serious risk for the business, so it is crucial to have the security components configured and working properly.
Before jumping to the OAuthBearer mechanism, it is worth showing the security options that Confluent Kafka supports. Below, we classify the methods and components into three categories: authentication, authorization, and data encryption.
Note that data encryption works only in transit, between applications and brokers; the data sits unencrypted on the broker’s disk.
To use OAuthBearer with the callback implementations, SSL/TLS encryption must be enabled. Without encryption, only the default implementation for unsecured JSON Web Tokens works properly.
You have to create SSL keys and certificates and configure both the brokers and the applications. For details, check out the Confluent documentation.
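As a rough illustration of the application side of that setup, the sketch below collects the standard Kafka TLS client settings into a java.util.Properties object. The class name, truststore path, and password are placeholders rather than values from this article, and the broker-side settings and client keystores (needed for mutual TLS) are left out.

```java
import java.util.Properties;

public class SslClientConfig {

    // Builds the TLS-related client settings. The truststore is expected to
    // contain the CA certificate that signed the brokers' certificates.
    public static Properties sslProperties(String truststorePath, String truststorePassword) {
        Properties props = new Properties();
        props.setProperty("security.protocol", "SSL");
        props.setProperty("ssl.truststore.location", truststorePath);
        props.setProperty("ssl.truststore.password", truststorePassword);
        return props;
    }
}
```

The resulting properties would be merged with the usual producer or consumer settings (bootstrap servers, serializers) before creating the client.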
It is important to note that Kafka relies on ZooKeeper to store its metadata (maybe not for long), and that the ZooKeeper version in use at the time of writing does not support SSL/TLS, so access to this component and its network must be protected.
OAuthBearer for Authentication
Release 5.0.0 of Confluent Kafka added SASL OAuthBearer, a framework for authenticating to Kafka brokers using OAuth2 bearer tokens. The implementation can be customized through callbacks for token retrieval and validation, providing the flexibility required to integrate with different OAuth2 providers such as Keycloak, Auth0, and Okta.
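To make the client-side wiring more concrete, here is a minimal sketch of the settings that select the OAUTHBEARER mechanism and plug in a custom login callback handler. The property keys are standard Kafka client settings; the class name OAuthBearerClientConfig and the handler class passed in are hypothetical and stand in for whatever implementation you write.

```java
import java.util.Properties;

public class OAuthBearerClientConfig {

    // Returns the SASL settings for a client that authenticates with
    // OAuth2 bearer tokens over TLS. loginCallbackHandlerClass is the fully
    // qualified name of your own token-retrieval callback (an assumption here).
    public static Properties oauthProperties(String loginCallbackHandlerClass) {
        Properties props = new Properties();
        props.setProperty("security.protocol", "SASL_SSL");
        props.setProperty("sasl.mechanism", "OAUTHBEARER");
        props.setProperty("sasl.jaas.config",
                "org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required;");
        props.setProperty("sasl.login.callback.handler.class", loginCallbackHandlerClass);
        return props;
    }
}
```

A call such as OAuthBearerClientConfig.oauthProperties("com.example.OAuthLoginCallbackHandler") would be combined with the TLS truststore settings, since SASL_SSL still needs them.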
Below is a diagram of the components and callbacks that must be implemented for OAuth bearer token retrieval, along with simple implementation examples.
Login Callback for Token Retrieval
Server Callback Handler for Token Validation
Communication with OAuth2 Provider
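As one possible shape for this step, the sketch below builds an OAuth2 client-credentials token request with the JDK's java.net.http API (Java 11 or later). It is not the code used in this article: the class name, endpoint URL, client ID, secret, and scope are all placeholders, and it assumes the provider accepts HTTP Basic client authentication on its token endpoint.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class TokenRequestSketch {

    // Builds (but does not send) a client-credentials token request.
    public static HttpRequest tokenRequest(String tokenEndpoint, String clientId,
                                           String clientSecret, String scope) {
        // Client credentials travel in an HTTP Basic Authorization header.
        String credentials = clientId + ":" + clientSecret;
        String basic = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        // Form-encoded body asking for a token via the client_credentials grant.
        String body = "grant_type=client_credentials&scope="
                + URLEncoder.encode(scope, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(tokenEndpoint))
                .header("Authorization", "Basic " + basic)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .header("Accept", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), and the access_token field of the JSON response handed to the login callback. The same pattern applies to the introspection call made on the broker side, with the token passed as a form parameter instead of a grant type.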
Access Control List for Authorization
For authorization, Confluent Kafka has an implementation that uses ZooKeeper to store all the ACLs. By default, if a resource has no associated ACLs, no one is allowed to access it, except the super users configured in the super.users property of the broker's server.properties file.
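The default-deny rule can be modelled in a few lines. The sketch below is a deliberately simplified illustration, not Kafka's authorizer: it ignores operations, hosts, deny rules, and wildcards, and the class and method names are hypothetical. It only assumes that super.users holds a semicolon-separated list of principals such as User:admin;User:kafka.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class AclSketch {

    // Parses a super.users value such as "User:admin;User:kafka".
    public static Set<String> parseSuperUsers(String superUsers) {
        return Arrays.stream(superUsers.split(";"))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toSet());
    }

    // Default-deny: access is granted only to super users or to principals
    // explicitly listed for the resource. A resource with no ACL entry
    // is therefore closed to everyone else.
    public static boolean isAllowed(String principal, String resource,
                                    Map<String, Set<String>> allowAcls,
                                    Set<String> superUsers) {
        if (superUsers.contains(principal)) {
            return true;
        }
        Set<String> allowed = allowAcls.get(resource);
        return allowed != null && allowed.contains(principal);
    }
}
```

With an ACL that lists only User:app1 for Topic:orders, User:app1 is refused on Topic:payments (no ACL at all), while User:admin from super.users is allowed everywhere.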
With OAuthBearer authentication, the ACL authorizer implementation uses the principal name exposed by org.apache.kafka.common.security.oauthbearer.OAuthBearerToken (below) to allow or deny access to topics. In the callback handler, we used the sub (subject) claim from the OAuth token introspection response as the principal name, but this can be customized as necessary.
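The mapping from an introspection response to a principal name might look like the sketch below. It assumes the JSON response has already been parsed into a Map (the JSON library is left open) and relies only on the active, exp, and sub fields defined for token introspection; the class and method names are hypothetical, and this is not the handler used in this article.

```java
import java.time.Instant;
import java.util.Map;
import java.util.Optional;

public class IntrospectionPrincipal {

    // Returns the principal name for a valid token, or empty if the token
    // is inactive, expired, or carries no usable "sub" claim.
    public static Optional<String> principalOf(Map<String, Object> response, Instant now) {
        // "active" must be the boolean true for the token to be accepted.
        if (!Boolean.TRUE.equals(response.get("active"))) {
            return Optional.empty();
        }
        // "exp" is optional; when present it is seconds since the epoch.
        Object exp = response.get("exp");
        if (exp instanceof Number && ((Number) exp).longValue() <= now.getEpochSecond()) {
            return Optional.empty();
        }
        Object sub = response.get("sub");
        if (sub instanceof String && !((String) sub).isEmpty()) {
            return Optional.of((String) sub);
        }
        return Optional.empty();
    }
}
```

To use a different claim as the principal (for example a client ID or a username field), only the last lookup would change; the ACLs then have to be written against that value.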
Integrating and centralizing security is always a tricky challenge, but it generally pays off through better control and visibility for the security team, and Confluent Kafka is no exception.
To go deeper into these concepts, we recommend reading the Confluent Security Documentation.