I’m happy to announce that Debezium 0.2.1 is now available. The MySQL connector has been significantly improved: it can now monitor and produce change events for HA MySQL clusters using GTIDs, it performs a consistent snapshot when starting up the first time, and it has a completely redesigned event message structure that provides a ton more information with every event. Our change log has all the details about bugs, enhancements, new features, and backward compatibility notices. We’ve also updated our tutorial.
Installing the MySQL connector
If you’ve already installed Zookeeper, Kafka, and Kafka Connect, then using Debezium’s MySQL connector is easy. Simply download the connector’s plugin archive, extract the JARs into your Kafka Connect environment, and add the directory with the JARs to Kafka Connect’s classpath. Restart your Kafka Connect process to pick up the new JARs.
If immutable containers are your thing, then check out Debezium’s Docker images for Zookeeper, Kafka, and Kafka Connect with the MySQL connector pre-installed and ready to go. Our tutorial walks you through using these images, and this is a great way to learn what Debezium is all about. You can even run Debezium on Kubernetes and OpenShift.
Using the MySQL connector
To use the connector to produce change events for a particular MySQL server or cluster, simply create a configuration file for the MySQL connector and use the Kafka Connect REST API to add that connector to your Kafka Connect cluster. When the connector starts, it will grab a consistent snapshot of the databases in your MySQL server and start reading the MySQL binlog, producing events for every inserted, updated, and deleted row. The connector can optionally produce events with the DDL statements that were applied, and you can even choose to produce events for a subset of the databases and tables. You can also ignore, mask, or truncate columns that are sensitive, too large, or not needed. See the MySQL connector’s documentation for all the details.
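To make the registration step a bit more concrete, here is a minimal Python sketch that wraps a connector configuration in the JSON document that Kafka Connect's REST API accepts on `POST /connectors`. Everything specific in it is an assumption for illustration: the Kafka Connect URL, the connector name, the hostnames, credentials, and topic names are hypothetical, and the configuration property names should be checked against the MySQL connector's documentation for the version you are running.

```python
import json
import urllib.request


def build_connector_request(connect_url, name, config):
    """Build (but do not send) the POST request that registers a connector."""
    payload = {"name": name, "config": config}
    return urllib.request.Request(
        url=connect_url.rstrip("/") + "/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def register_connector(request):
    """Send the request to a running Kafka Connect worker and return its reply."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")


# Hypothetical values throughout; property names are assumptions to verify
# against the MySQL connector documentation.
mysql_config = {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql.example.com",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "change-me",
    "database.server.id": "184054",
    "database.server.name": "dbserver1",
    # Assumed option for limiting events to a subset of databases.
    "database.whitelist": "inventory",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.inventory",
}

request = build_connector_request(
    "http://localhost:8083", "inventory-connector", mysql_config
)
```

Once a Kafka Connect worker is listening at that address, calling `register_connector(request)` submits the configuration. Building the request separately from sending it makes it easy to inspect the payload before it reaches your cluster.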
Using the libraries
Although Debezium is really intended to be used as turnkey services, all of Debezium’s JARs and other artifacts are available in Maven Central. You might want to use our MySQL DDL parser from our MySQL connector library to parse those DDL statements in your consumers.
We do provide a small library so applications can embed any Kafka Connect connector and consume data change events read directly from the source system. This provides a much lighter weight system (since Zookeeper, Kafka, and Kafka Connect services are not needed), but as a consequence is not as fault tolerant or reliable since the application must manage and maintain all state normally kept inside Kafka’s distributed and replicated logs. It’s perfect for use in tests, and with careful consideration it may be useful in some applications.
Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.
We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.