Organizations need to protect Personally Identifiable Information (PII). As Event Streaming Architecture (ESA) becomes ubiquitous in the enterprise, the prevalence of PII within data streams will only increase. Data architects must be cognizant of how their data pipelines can leak sensitive data. In highly distributed systems, zero-trust networking has become an industry best practice; we can apply the same principle to Kafka by introducing message-level security.
A DevSecOps engineer with some Kafka experience can leverage Kafka Streams to protect PII by enforcing role-based access control with Open Policy Agent. Rather than implementing a separate REST API to handle message-level security, Kafka Streams can filter, or even transform, outgoing messages to redact PII while leveraging the native capabilities of Kafka.
In our proposed presentation, we will provide a live demonstration in which two consumers subscribe to the same Kafka topic but receive different messages based on the rules specified in Open Policy Agent. At the conclusion of the presentation, we will share a GitHub repository so that attendees have a sandbox environment for hands-on experimentation with message-level security.
The goal is to demonstrate how we can utilize Open Policy Agent to provide fine-grained access control for Kafka topics and the Kafka Streams API to filter outgoing PII messages based on the end user. The demo shows how three different Kafka users (`bobjones`, `alicesmith`, `johnhernandez`) authenticate to the Kafka broker and receive different results while consuming from the same topic (`pii`).
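Conceptually, each per-user Kafka Streams application consumes the `pii` topic and forwards only the records that user is entitled to see. The following is a minimal sketch of such a topology, not the demo's actual source: the JSON `username` field, the `pii-<user>` output-topic naming, and the broker address are illustrative assumptions, and the SASL client configuration is omitted.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import com.fasterxml.jackson.databind.ObjectMapper;

public class PiiFilterApp {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static void main(String[] args) {
        String user = args[0]; // e.g. "bobjones"

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "pii-filter-" + user);
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // assumed address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        // The user's SASL credentials would also be configured here.

        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, String>stream("pii")
               // Keep only the records belonging to the authenticated user.
               .filter((key, value) -> belongsTo(value, user))
               .to("pii-" + user); // assumed per-user output topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }

    private static boolean belongsTo(String json, String user) {
        try {
            // "username" is an assumed field in the generated PII records.
            return user.equals(MAPPER.readTree(json).path("username").asText());
        } catch (Exception e) {
            return false; // Drop malformed records rather than risk leaking them.
        }
    }
}
```

Because the filtering runs inside Kafka Streams, redaction is a natural extension: a `mapValues` step at the same point in the topology could mask individual fields instead of dropping whole records.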
## Technical Stack
- Confluent v5.5.2 (ZooKeeper, Broker, Control Center)
- Open Policy Agent
- Python 3.8.11
- Java 11
- Maven 3.8.1
This demo is bootstrapped using the `docker-compose.yaml` file, which starts up the following Docker containers:
- Confluent (ZooKeeper, Broker, Control Center)
- Open Policy Agent
- PII fake data generator using Python
- Kafka Streams for each of the three Kafka SASL users: `bobjones`, `alicesmith`, `johnhernandez`
To get started, run the following command:
```sh
# Prepend 'sudo' if required in your environment
docker-compose up -d
```
Please note that it is expected for the `kafka-streams-johnhernandez` container to show an `exit 0` status. This is because `johnhernandez` was not given access to any of the topics in `demo.rego`. Read the section labeled *Result for kafka-streams-johnhernandez* for more information.
Example output when running `docker-compose ps`:
To examine the messages received by each of the three Kafka users, run the following command (be sure to replace the `[container_name]` placeholder):

```sh
docker logs -f [container_name]
```
As expected, when connecting as `bobjones`, we will only be able to view the messages pertaining to him. The same goes for the other users.
### Result for kafka-streams-johnhernandez
As mentioned before, since we did not grant `johnhernandez` access to any of the topics via OPA, we receive a `TopicAuthorizationException` when attempting to consume messages.
In order for Kafka to integrate with Open Policy Agent, we created a derivative Docker image based on the `confluentinc/cp-server:5.5.2` image. We utilized this GitHub repository, which provides the custom authorization code that connects Kafka to OPA. From here, we can provide the following properties to the Kafka broker during startup:
```yaml
# The fully qualified Java class that handles authorization
KAFKA_AUTHORIZER_CLASS_NAME: tech.goraft.kafka.opa.OpaAuthorizer
# The OPA endpoint for handling authorization decisions
KAFKA_OPA_AUTHORIZER_URL: "http://opa:8181/v1/data/kafka/authz/allow"
# Deny requests if OPA cannot be reached (fail closed)
KAFKA_OPA_AUTHORIZER_ALLOW_ON_ERROR: "false"
# Cache authorization decisions to limit round trips to OPA
KAFKA_OPA_AUTHORIZER_CACHE_INITIAL_CAPACITY: 100
KAFKA_OPA_AUTHORIZER_CACHE_MAXIMUM_SIZE: 100
KAFKA_OPA_AUTHORIZER_CACHE_EXPIRE_AFTER_MS: 600000
```
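For context, these properties are environment variables set on the broker service in `docker-compose.yaml`. The excerpt below is a hedged sketch of how the services might be wired together: the service names follow from the `opa:8181` URL above and the `[registry_name]/kafka-opa` image built later in this document, while the OPA invocation and policy mount path are assumptions.

```yaml
# Hedged sketch of the relevant docker-compose.yaml wiring; the policy
# mount path and OPA command are assumptions, not the repo's actual file.
services:
  opa:
    image: openpolicyagent/opa
    command: run --server /policies/demo.rego
    volumes:
      - ./demo.rego:/policies/demo.rego
  broker:
    image: "[registry_name]/kafka-opa"  # derivative of confluentinc/cp-server:5.5.2
    depends_on:
      - opa
    environment:
      KAFKA_AUTHORIZER_CLASS_NAME: tech.goraft.kafka.opa.OpaAuthorizer
      KAFKA_OPA_AUTHORIZER_URL: "http://opa:8181/v1/data/kafka/authz/allow"
      KAFKA_OPA_AUTHORIZER_ALLOW_ON_ERROR: "false"
```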
The Rego query language provides a declarative way to define policies. For this demo, a single Rego file (`demo.rego`) handles access control for the various Kafka users. After a user authenticates to the Kafka broker, the defined OPA policies are checked to determine whether the user has access to the requested topic.
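As a rough illustration, a policy of the following shape could grant each user read access to specific topics. The `kafka.authz` package and `allow` rule follow from the `KAFKA_OPA_AUTHORIZER_URL` above, but this is a hedged sketch rather than the repository's actual `demo.rego`: the structure of the `input` document (e.g. `input.session.principal.name`) depends on the authorizer plugin and is assumed here.

```rego
# Minimal policy sketch; the input field names are assumptions about the
# authorizer's request shape, not a copy of demo.rego.
package kafka.authz

# Deny by default; only explicitly granted requests succeed.
default allow = false

# bobjones may read the topics that belong to him.
allow {
    input.session.principal.name == "bobjones"
    input.operation.name == "Read"
    startswith(input.resource.name, "pii-bobjones")
}
```

Under a deny-by-default rule like this, omitting any `allow` clause for `johnhernandez` is exactly what produces the `TopicAuthorizationException` described earlier.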
If you want to go beyond what is provided in the demo, the following sections describe how to build and push the applications individually. Please verify that the software listed in the Technical Stack section is installed on your machine.
Note: If you want to test your new image with the original demo, be sure to replace the appropriate image in the `docker-compose.yaml`.
- Change directory to `kafka-opa`: `cd kafka-opa`
- Build the Java application using Maven: `mvn clean install`
- Build the Docker image: `docker build . -t [registry_name]/kafka-opa`
- Push the Docker image: `docker push [registry_name]/kafka-opa`
- Change directory to `pii-datagen`: `cd pii-datagen`
- Build the Docker image: `docker build . -t [registry_name]/pii-datagen`
- Push the Docker image: `docker push [registry_name]/pii-datagen`
- Change directory to `kafka-streams-message-security`: `cd kafka-streams-message-security`
- Build the Docker image: `docker build . -t [registry_name]/kafka-streams-message-security`
- Push the Docker image: `docker push [registry_name]/kafka-streams-message-security`