Deploying Debezium on OpenShift

The following describes how to set up the Debezium connectors for change data capture on Red Hat’s OpenShift container platform.

These instructions have been tested using the Minishift tool — allowing you to run a single node OpenShift instance locally on your machine — but should work equally well for OpenShift Online (Pro) or OpenShift Dedicated.

You can find a complete example of this set-up using Minishift in our examples repository.

Debezium Deployment

For setting up Kafka and Kafka Connect on OpenShift, a set of images provided by the EnMasse project can be used. One of the EnMasse components is Barnabas, "Kafka as a Service". It consists of enterprise grade configuration files and images that bring Kafka to OpenShift.

First we install the Kafka broker and Kafka Connect templates into our OpenShift project:

oc create -f https://raw.githubusercontent.com/EnMasseProject/barnabas/master/kafka-statefulsets/resources/openshift-template.yaml
oc create -f https://raw.githubusercontent.com/EnMasseProject/barnabas/master/kafka-connect/s2i/resources/openshift-template.yaml

Next we will create a Kafka Connect image with the Debezium connectors installed and then deploy a Kafka broker cluster and a Kafka Connect cluster:

# Deploy a Kafka broker
oc new-app -p ZOOKEEPER_NODE_COUNT=1 barnabas

# Build a Debezium image
export DEBEZIUM_VERSION=0.6.2
oc new-app -p BUILD_NAME=debezium -p TARGET_IMAGE_NAME=debezium \
    -p TARGET_IMAGE_TAG=$DEBEZIUM_VERSION barnabas-connect-s2i
mkdir -p plugins && cd plugins && \
for PLUGIN in {mongodb,mysql,postgres}; do \
    curl http://central.maven.org/maven2/io/debezium/debezium-connector-$PLUGIN/$DEBEZIUM_VERSION/debezium-connector-$PLUGIN-$DEBEZIUM_VERSION-plugin.tar.gz | tar xz; \
done && \
oc start-build debezium --from-dir=. --follow && \
cd .. && rm -rf plugins

After a while all parts should be up and running:

oc get pods
NAME                    READY     STATUS      RESTARTS   AGE
debezium-1-build        0/1       Completed   0          3m
debezium-2-build        0/1       Completed   0          3m
kafka-0                 1/1       Running     2          3m
kafka-1                 1/1       Running     0          2m
kafka-2                 1/1       Running     0          2m
kafka-connect-3-3v4n9   1/1       Running     1          3m
zookeeper-0             1/1       Running     0          3m

Alternatively, you can go to the "Pods" view of your OpenShift Web Console (https://myhost:8443/console/project/myproject/browse/pods) to confirm all pods are up and running:

openshift pods

Verifying the Deployment

Next we are going to verify whether the deployment is correct by emulating the Debezium Tutorial in the OpenShift environment.

First we need to start a MySQL server instance containing some example tables:

# Deploy pre-populated MySQL instance
oc new-app --name=mysql debezium/example-mysql:0.6

# Configure credentials for the database
oc env dc/mysql MYSQL_ROOT_PASSWORD=debezium  MYSQL_USER=mysqluser MYSQL_PASSWORD=mysqlpw

A new pod with MySQL server should be up and running:

oc get pods
NAME                             READY     STATUS      RESTARTS   AGE
...
mysql-1-4503l                    1/1       Running     0          2s
mysql-1-deploy                   1/1       Running     0          4s
...

Then we are going to register the Debezium MySQL connector to run against the deployed MySQL instance:

oc exec -i kafka-0 -- curl -X POST \
    -H "Accept:application/json" \
    -H "Content-Type:application/json" \
    http://kafka-connect:8083/connectors -d @- <<'EOF'

{
    "name": "inventory-connector",
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "tasks.max": "1",
        "database.hostname": "mysql",
        "database.port": "3306",
        "database.user": "debezium",
        "database.password": "dbz",
        "database.server.id": "184054",
        "database.server.name": "dbserver1",
        "database.whitelist": "inventory",
        "database.history.kafka.bootstrap.servers": "kafka:9092",
        "database.history.kafka.topic": "schema-changes.inventory"
    }
}
EOF

Kafka Connect’s log file should contain messages regarding execution of initial snapshot:

oc logs $(oc get pods -o name -l name=kafka-connect)

Now we can read change events for the customers table from the corresponding Kafka topic:

oc exec -it kafka-0 -- /opt/kafka/bin/kafka-console-consumer.sh \
    --bootstrap-server kafka:9092 \
    --from-beginning \
    --property print.key=true \
    --topic dbserver1.inventory.customers

You should see an output like the following (formatted for the sake of readability):

# Message 1
{
    "id": 1001
}

# Message 1 Value
{
    "before": null,
    "after": {
        "id": 1001,
        "first_name": "Sally",
        "last_name": "Thomas",
        "email": "sally.thomas@acme.com"
    },
    "source": {
        "name": "dbserver1",
        "server_id": 0,
        "ts_sec": 0,
        "gtid": null,
        "file": "mysql-bin.000003",
        "pos": 154,
        "row": 0,
        "snapshot": true,
        "thread": null,
        "db": "inventory",
        "table": "customers"
    },
    "op": "c",
    "ts_ms": 1509530901446
}

# Message 2 Key
{
    "id": 1002
}

# Message 2 Value
{
    "before": null,
    "after": {
        "id": 1002,
        "first_name": "George",
        "last_name": "Bailey",
        "email": "gbailey@foobar.com"
    },
    "source": {
        "name": "dbserver1",
        "server_id": 0,
        "ts_sec": 0,
        "gtid": null,
        "file": "mysql-bin.000003",
        "pos": 154,
        "row": 0,
        "snapshot": true,
        "thread": null,
        "db": "inventory",
        "table": "customers"
    },
    "op": "c",
    "ts_ms": 1509530901446
}
...

Finally, let’s modify some records in the customers table of the database:

oc exec -it $(oc get pods -o custom-columns=NAME:.metadata.name --no-headers -l app=mysql) \
    -- bash -c 'mysql -u $MYSQL_USER -p$MYSQL_PASSWORD inventory'

# E.g. run UPDATE customers SET email="sally.thomas@example.com" WHERE ID = 1001;

You should now see additional change messages in the consumer started before.

If you got any questions or requests related to running Debezium on OpenShift, please let us know via our user group or in the Debezium developer’s chat.

back to top