Debezium has received a huge improvement to the structure of its container images recently, making it extremely simple to extend its behaviour.

This is a small tutorial showing how you can for instance add Sentry, "an open-source error tracking [software] that helps developers monitor and fix crashes in real time". Here we’ll use it to collect and report any exceptions from Kafka Connect and its connectors. Note that this is only applicable for Debezium 0.9+.

We need a few things to have Sentry working, and we’ll add all of them and later have a Dockerfile which gets it all glued correctly:

  • Configure Log4j

  • SSL certificate for sentry.io, since it’s not by default in the JVM trusted chain

  • The sentry and sentry-log4j libraries

Log4j Configuration

Let’s create a file config/log4j.properties in our local project which is a copy of the one shipped with Debezium images and add Sentry to it. Note we added Sentry to log4j.rootLogger and created the section log4j.appender.Sentry, the rest remains as the original configuration:

kafka.logs.dir=logs

log4j.rootLogger=INFO, stdout, appender, Sentry

# Disable excessive reflection warnings - KAFKA-5229
log4j.logger.org.reflections=ERROR

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.threshold=INFO
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p  %X{dbz.connectorType}|%X{dbz.connectorName}|%X{dbz.connectorContext}  %m   [%c]%n

log4j.appender.appender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.appender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.appender.File=${kafka.logs.dir}/connect-service.log
log4j.appender.appender.layout=org.apache.log4j.PatternLayout
log4j.appender.appender.layout.ConversionPattern=%d{ISO8601} %-5p  %X{dbz.connectorType}|%X{dbz.connectorName}|%X{dbz.connectorContext}  %m   [%c]%n

log4j.appender.Sentry=io.sentry.log4j.SentryAppender
log4j.appender.Sentry.threshold=WARN

Sentry.io SSL certificate

Download the getsentry.pem file from sentry.io and put it in your project’s directory under ssl/.

The Dockerfile

Now we can glue everything together in our Debezium image:

  • Let’s first create a JKS file with our Sentry certificate; this uses a Docker multi-stage building process, where we are generating a certificates.jks which we’ll later copy into our Kafka Connect with Debezium stage

  • Copy log4j.properties into $KAFKA_HOME/config/log4j.properties

  • Copy the JKS file from the multi-stage build

  • Set ENV with the Sentry version and m5sums

  • Download Sentry dependencies, the script you see called docker-maven-download is a helper which we ship by default in our images. In this case we’re using it to download a JAR file from Maven Central and put it in the Kafka libs directory. We do that by setting the ENV var MAVEN_DEP_DESTINATION=$KAFKA_HOME/libs:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
FROM fabric8/java-centos-openjdk8-jdk:1.6 as ssl-jks

ARG JKS_STOREPASS="any random password, you can also set it outside via the arguments from docker build"

USER root:root

COPY /ssl /ssl

RUN chown -R jboss:jboss /ssl

USER jboss:jboss

WORKDIR /ssl

RUN keytool -import -noprompt -alias getsentry \
    -storepass "${JKS_STOREPASS}" \
    -keystore certificates.jks \
    -trustcacerts -file "/ssl/getsentry.pem"

FROM debezium/connect:0.10 AS kafka-connect

EXPOSE 8083

COPY config/log4j.properties "$KAFKA_HOME/config/log4j.properties"

COPY --from=ssl-jks --chown=kafka:kafka /ssl/certificates.jks /ssl/

ENV SENTRY_VERSION=1.7.23 \
    MAVEN_DEP_DESTINATION=$KAFKA_HOME/libs

RUN docker-maven-download \
        central io/sentry sentry "$SENTRY_VERSION" 4bf1d6538c9c0ebc22526e2094b9bbde && \
    docker-maven-download \
        central io/sentry sentry-log4j "$SENTRY_VERSION" 74af872827bd7e1470fd966449637a77

Build and Run

Now we can simply build the image:

$ docker build -t debezium/connect-sentry:1 --build-arg=JKS_STOREPASS="123456789" .

When running the image we have now to configure our Kafka Connect application to load the JKS file by setting KAFKA_OPTS: -Djavax.net.ssl.trustStore=/ssl/certificates.jks -Djavax.net.ssl.trustStorePassword=<YOUR TRUSTSTORE PASSWORD>.

Sentry can be configured in many ways, I like to do it via environment variables, the minimum we can set is the Sentry DSN (which is necessary to point to your project) and the actual running environment name (i.e.: production, staging).

In this case we can configure the variables: SENTRY_DSN=<GET THE DNS IN SENTRY’S DASHBOARD>, SENTRY_ENVIRONMENT=dev.

In case you’d like to learn more about using the Debezium container images, please check our tutorial.

And that’s it, a basic a recipe for extending our Docker setup using Sentry as an example; other modifications should also be as simple as this one. As an example how a RecordTooLarge exception from the Kafka producer would look like in this setup, see the picture below:

Sentry Exception example

Conclusion

Thanks to the recent refactor of the Debezium container images, it got very easy to amend them with your custom extensions. Downloading external dependencies and adding them to the images became a trivial task and we’d love to hear your feedback about it!

If you are curious about the refactoring itself, you can find the details in pull request debezium/container-images#131.

Renato Mefi

Renato Mefi is a Staff Engineer at Usabilla (SurveyMonkey), where he innovates mostly around Kafka, Docker and DevOps in general, has contributed to Kafka, Debezium and others. He lives in Amsterdam and cycles for fun and commute.

   


About Debezium

Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.

Get involved

We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve ours existing connectors and add even more connectors. If you find problems or have ideas how we can improve Debezium, please let us know or log an issue.