When a Debezium connector is deployed to a Kafka Connect instance, it is sometimes necessary to keep database credentials hidden from other users of the Connect API.
Let’s recall what a connector registration request looks like for the Debezium MySQL connector:
{
  "name": "inventory-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "tasks.max": "1",
    "database.hostname": "mysql",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "dbz",
    "database.server.id": "184054",
    "database.server.name": "dbserver1",
    "database.whitelist": "inventory",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.inventory"
  }
}
The username and password are passed to the API as plain strings. Worse yet, anybody who has access to the Kafka Connect cluster and its REST API can issue a GET request to obtain the connector's configuration, including the database credentials:
curl -s http://localhost:8083/connectors/inventory-connector | jq .
{
  "name": "inventory-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.user": "debezium",
    "database.server.id": "184054",
    "tasks.max": "1",
    "database.hostname": "mysql",
    "database.password": "dbz",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.inventory",
    "name": "inventory-connector",
    "database.server.name": "dbserver1",
    "database.whitelist": "inventory",
    "database.port": "3306"
  },
  "tasks": [
    {
      "connector": "inventory-connector",
      "task": 0
    }
  ],
  "type": "source"
}
If one Kafka Connect cluster is shared by multiple connectors or teams, this behaviour can be undesirable for security reasons.
To solve this problem, KIP-297 ("Externalizing Secrets for Connect Configurations") was implemented in Kafka 2.0.
Externalization requires at least one implementation of the org.apache.kafka.common.config.provider.ConfigProvider interface. Kafka Connect provides the reference implementation org.apache.kafka.common.config.provider.FileConfigProvider, which reads secrets from a file. The available config providers are configured at the Kafka Connect worker level (e.g. in connect-distributed.properties) and are referred to from the connector configuration.
An example worker configuration would be this:
config.providers=file
config.providers.file.class=org.apache.kafka.common.config.provider.FileConfigProvider
The connector registration request then refers to it like so:
{
  "name": "inventory-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "tasks.max": "1",
    "database.hostname": "mysql",
    "database.port": "3306",
    "database.user": "${file:/secrets/mysql.properties:user}",
    "database.password": "${file:/secrets/mysql.properties:password}",
    "database.server.id": "184054",
    "database.server.name": "dbserver1",
    "database.whitelist": "inventory",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.inventory"
  }
}
Here, the placeholder ${file:/secrets/mysql.properties:user} says that the file config provider should be used, reading the properties file /secrets/mysql.properties and extracting the user property from it.
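The placeholder resolution described above can be illustrated with a small Python sketch. It is not Kafka's implementation: the regular expression, the `read_properties` helper and the `resolve` function are hypothetical names invented for this illustration, and the properties parsing is deliberately simplified (no escapes, no line continuations, no ':' inside the file path).

```python
import re
import tempfile
from pathlib import Path

# Simplified pattern for ${provider:path:key}. Assumption of this sketch:
# neither the provider name nor the path contains ':' or '}'.
PLACEHOLDER = re.compile(r"\$\{(?P<provider>[^:}]+):(?P<path>[^:}]*):(?P<key>[^}]+)\}")


def read_properties(path):
    """Parse a simple key=value properties file into a dict."""
    props = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def resolve(value, providers):
    """Replace each placeholder in value using the provider it names."""
    def substitute(match):
        provider = providers.get(match["provider"])
        if provider is None:
            return match.group(0)  # unknown provider: keep the placeholder as-is
        # Unknown key: keep the placeholder as-is as well.
        return provider(match["path"]).get(match["key"], match.group(0))
    return PLACEHOLDER.sub(substitute, value)


# Usage: a temporary file stands in for /secrets/mysql.properties.
with tempfile.TemporaryDirectory() as tmp:
    secrets = Path(tmp) / "mysql.properties"
    secrets.write_text("user=debezium\npassword=dbz\n", encoding="utf-8")
    providers = {"file": read_properties}  # plays the role of config.providers=file
    resolved_user = resolve(f"${{file:{secrets}:user}}", providers)
    resolved_password = resolve(f"${{file:{secrets}:password}}", providers)
    print(resolved_user, resolved_password)
```

The point of the sketch is the split of responsibilities: the connector configuration only names a provider, a path and a key, while the secret value itself is looked up on the worker when the configuration is resolved.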
The file config provider is probably the simplest possible implementation, and other providers that integrate with secret repositories or identity management systems can be expected to appear. Note, though, that the file config provider is sufficient for Kubernetes/OpenShift deployments, as Secret objects can be injected into cluster pods as files and thus consumed by it.
We’ve created a version of the Debezium tutorial example that demonstrates a deployment with externalized secrets. Note the two environment variables in the connect service of the Docker Compose file:
- CONNECT_CONFIG_PROVIDERS=file
- CONNECT_CONFIG_PROVIDERS_FILE_CLASS=org.apache.kafka.common.config.provider.FileConfigProvider
The debezium/connect image maps these environment variables directly to Kafka Connect worker properties.
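The naming rule behind this mapping can be sketched in a few lines of Python. This is only an approximation inferred from the two variables above (strip the CONNECT_ prefix, lowercase the rest, replace underscores with dots); the `worker_properties` function is a hypothetical helper, and the image's actual startup script may handle more cases.

```python
def worker_properties(env, prefix="CONNECT_"):
    """Derive worker property names from prefixed environment variables."""
    props = {}
    for name, value in env.items():
        if name.startswith(prefix):
            # CONNECT_CONFIG_PROVIDERS -> config.providers
            key = name[len(prefix):].lower().replace("_", ".")
            props[key] = value
    return props


# Sample environment: the two variables from the Compose file plus an unrelated one.
env = {
    "CONNECT_CONFIG_PROVIDERS": "file",
    "CONNECT_CONFIG_PROVIDERS_FILE_CLASS": "org.apache.kafka.common.config.provider.FileConfigProvider",
    "PATH": "/usr/bin",
}
props = worker_properties(env)
for key in sorted(props):
    print(f"{key}={props[key]}")
```

Under this rule the two variables yield the same config.providers and config.providers.file.class entries shown in the worker configuration earlier, while variables without the prefix are ignored.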
When you issue the REST call to get the connector configuration, you will see that the sensitive information is externalized and masked from unauthorized users:
curl -s http://localhost:8083/connectors/inventory-connector | jq .
{
  "name": "inventory-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.user": "${file:/secrets/mysql.properties:user}",
    "database.server.id": "184054",
    "tasks.max": "1",
    "database.hostname": "mysql",
    "database.password": "${file:/secrets/mysql.properties:password}",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.inventory",
    "name": "inventory-connector",
    "database.server.name": "dbserver1",
    "database.whitelist": "inventory",
    "database.port": "3306"
  },
  "tasks": [
    {
      "connector": "inventory-connector",
      "task": 0
    }
  ],
  "type": "source"
}
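A response like the one above can also be checked programmatically. The following Python sketch flags credential-like properties whose values are not placeholders; the `plaintext_credentials` function and the `SENSITIVE_HINTS` list are hypothetical and chosen only for this illustration, so a real check would need a list of sensitive property names that fits the connectors in use.

```python
import json
import re

# A value counts as externalized here only if it is a single ${...} placeholder.
PLACEHOLDER_ONLY = re.compile(r"^\$\{[^}]+\}$")

# Hypothetical heuristic: property names containing these words are credentials.
SENSITIVE_HINTS = ("password", "user")


def plaintext_credentials(connector_info):
    """Return the sorted credential-like config keys that hold literal values."""
    config = connector_info.get("config", {})
    return sorted(
        key
        for key, value in config.items()
        if any(hint in key.lower() for hint in SENSITIVE_HINTS)
        and not PLACEHOLDER_ONLY.match(value)
    )


# Trimmed-down versions of the two GET responses shown in this post.
plain = json.loads("""
{"name": "inventory-connector",
 "config": {"database.user": "debezium",
            "database.password": "dbz",
            "database.port": "3306"}}
""")
externalized = json.loads("""
{"name": "inventory-connector",
 "config": {"database.user": "${file:/secrets/mysql.properties:user}",
            "database.password": "${file:/secrets/mysql.properties:password}",
            "database.port": "3306"}}
""")

exposed_before = plaintext_credentials(plain)
exposed_after = plaintext_credentials(externalized)
print(exposed_before)
print(exposed_after)
```

For the first response the check reports both database.password and database.user; for the externalized one it reports nothing.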
Please refer to the README of the tutorial example for complete instructions.
About Debezium
Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.
Get involved
We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.