Debezium Blog

Debezium 0.3.4 Released

October 25, 2016 by Randall Hauch

We’re happy to announce that Debezium 0.3.4 is now available for use with Kafka Connect 0.10.0.1. This release contains several new features for the MySQL connector: support for MySQL’s JSON datatype, a new snapshot mode called schema_only, and JMX metrics. Also, the Debezium Docker images for Zookeeper, Kafka, and Kafka Connect have all been updated so that these services can optionally expose JMX metrics. And one backward-incompatible fix was made to the change event’s ts_sec field. See the release notes for specifics.

We’ve also updated the Debezium Docker images labelled 0.3 and latest, which we use in our tutorial.

Thanks to Akshath, Chris, Vitalii, Dennis, Prannoy, and others for their help with the release, issues, discussions, contributions, and questions!

Support for MySQL's JSON type coming soon

October 19, 2016 by Randall Hauch

mysql json

MySQL 5.7 introduced a new data type for storing and working with JSON data. Clients can define tables with columns using the new JSON datatype, and they can store and read JSON data using SQL statements and new built-in JSON functions to construct JSON data from other relational columns, introspect the structure of JSON values, and search within and manipulate JSON data. It is also possible to define generated columns on tables whose values are computed from the JSON value in another column of the same table, and then to define indexes on those generated columns. Overall, this is a very powerful feature in MySQL.

Debezium’s MySQL connector will support the JSON datatype starting with the upcoming 0.3.4 release. JSON document, array, and scalar values will appear in change events as strings with io.debezium.data.json for the schema name. This will make it natural for consumers to work with JSON data. BTW, this is the same semantic schema type used by the MongoDB connector to represent JSON data.
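To make the consumer side concrete, here is a minimal Python sketch of how a consumer might use that schema name to decide which string fields to parse as JSON. The layout of the field schemas and the row (a list of dicts with "field", "type", and "name" keys) is an assumption made for illustration, not the connector's documented event format; the schema name is taken as written above.

```python
import json

# Schema name the post says is attached to JSON-valued fields.
JSON_SCHEMA_NAME = "io.debezium.data.json"


def decode_json_fields(field_schemas, row):
    """Return a copy of `row` in which JSON-typed string fields are parsed.

    `field_schemas` is a hypothetical list of per-field schema dicts; a field
    is treated as JSON only when its schema name matches JSON_SCHEMA_NAME.
    """
    decoded = dict(row)
    for field in field_schemas:
        column = field["field"]
        value = row.get(column)
        if field.get("name") == JSON_SCHEMA_NAME and value is not None:
            # Documents, arrays, and scalars all arrive as JSON text.
            decoded[column] = json.loads(value)
    return decoded


# Hypothetical field schemas and row state, for illustration only.
example_fields = [
    {"field": "id", "type": "int32"},
    {"field": "profile", "type": "string", "name": JSON_SCHEMA_NAME},
    {"field": "tags", "type": "string", "name": JSON_SCHEMA_NAME},
    {"field": "note", "type": "string"},
]
example_row = {
    "id": 7,
    "profile": '{"name": "Ann", "age": 30}',
    "tags": '["a", "b"]',
    "note": '{"not": "decoded"}',
}

decoded_row = decode_json_fields(example_fields, example_row)
```

Keying the decision on the schema name rather than on the content of the string means an ordinary text column that happens to contain JSON-looking text, like "note" here, is left untouched.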

This sounds straightforward, and we hope it is. But implementing this required a fair amount of work. That’s because although MySQL exposes JSON data as strings to client applications, internally it stores all JSON data in a special binary form that allows the MySQL engine to efficiently access the JSON data in queries, JSON functions and generated columns. All JSON data appears in the binlog in this binary form as well, which meant that we had to parse the binary form ourselves if we wanted to extract the more useful string representation. Writing and testing this parser took a bit of time and effort, and ultimately we donated it to the excellent MySQL binlog client library that the connector uses internally to read the binlog events.

Debezium 0.3.3 Released

October 18, 2016 by Randall Hauch

releases mysql mongodb docker

We’re happy to announce that Debezium 0.3.3 is now available for use with Kafka Connect 0.10.0.1. This release contains a handful of bug fixes and minor improvements for the MySQL connector, including better handling of BIT(n) values, ENUM and SET values, and GTID sets. This release also improves the log messages output by the MySQL connector to better represent the ongoing activity when consuming the changes from the source database. See the release notes for specifics.

We’ve also updated the Debezium Docker images labelled 0.3 and latest, which we use in our tutorial. The tutorial itself now uses the latest Docker installations on Linux, Windows, and OS X.

Thanks to Akshath, Chris, Randy, Prannoy, Umang, Horia, and others for their help with the release, issues, discussions, contributions, and questions!

Debezium 0.3.2 Released

September 26, 2016 by Randall Hauch

releases mysql mongodb docker

We’re happy to announce that Debezium 0.3.2 is now available for use with Kafka Connect 0.10.0.1. This release contains a handful of bug fixes and minor improvements for the MySQL connector and MongoDB connector. The MySQL connector better handles BIT(n) values and zero-value date and timestamp values. This release also improves the log messages output by the MySQL and MongoDB connectors to better represent the ongoing activity when consuming the changes from the source database. See the release notes for specifics.

Thanks to Akshath, Colum, Emmanuel, Konstantin, Randy, RenZhu, Umang, and others for their help with the release, issues, discussions, contributions, and questions!

Serializing Debezium events with Avro

September 19, 2016 by Randall Hauch

kafka avro serialization

Although Debezium makes it easy to capture database changes and record them in Kafka, one of the more important decisions you have to make is how those change events will be serialized in Kafka. Every message in Kafka has a key and a value, and to Kafka these are opaque byte arrays. But when you set up Kafka Connect, you have to say how the Debezium event keys and values should be serialized to a binary form, and your consumers will also have to deserialize them back into a usable form.

Debezium event keys and values are both structured, so JSON is certainly a reasonable option: it’s flexible, ubiquitous, and language agnostic, but on the other hand it’s quite verbose. One alternative is Avro, which is also flexible and language agnostic, but is faster and produces smaller binary representations. Using Avro requires a bit more setup effort on your part and some additional software, but the advantages are often worth it.
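As a rough illustration of where this choice is made, the Python sketch below builds the converter settings for a Kafka Connect worker for either format. The property names key.converter and value.converter are standard Kafka Connect worker settings. The Avro converter class and the schema registry URL properties belong to Confluent's Avro converter, which is assumed here as the "additional software"; the registry address is a hypothetical placeholder.

```python
# The JSON converter ships with Kafka Connect. The Avro converter class is
# Confluent's and is an assumption here; the text above does not name one.
JSON_CONVERTER = "org.apache.kafka.connect.json.JsonConverter"
AVRO_CONVERTER = "io.confluent.connect.avro.AvroConverter"


def converter_properties(fmt, schema_registry_url=None):
    """Return the worker converter settings for 'json' or 'avro'."""
    if fmt == "json":
        return {
            "key.converter": JSON_CONVERTER,
            "value.converter": JSON_CONVERTER,
        }
    if fmt == "avro":
        if not schema_registry_url:
            raise ValueError("Avro converters need a schema registry URL")
        return {
            "key.converter": AVRO_CONVERTER,
            "key.converter.schema.registry.url": schema_registry_url,
            "value.converter": AVRO_CONVERTER,
            "value.converter.schema.registry.url": schema_registry_url,
        }
    raise ValueError(f"unknown serialization format: {fmt}")


def render_properties(props):
    """Render settings as lines suitable for a .properties file."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


# Hypothetical registry address, for illustration only.
avro_settings = converter_properties("avro", "http://localhost:8081")
avro_properties_text = render_properties(avro_settings)
```

Keys and values are configured separately, so both must be set, and consumers need the matching deserializers to read the events back.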