It’s my pleasure to announce the release of Debezium 0.9.0.Alpha2!

While the work on the connectors for SQL Server and Oracle continues, we decided to do another Alpha release, as lots of fixes and new features - many of them contributed by community members - have piled up, which we wanted to get into your hands as quickly as possible.

This release supports Apache Kafka 2.0, comes with support for Postgres' HSTORE column type, lets you rename and filter fields in change data messages for MongoDB, and contains multiple bug fixes and performance improvements. Overall, this release contains 55 fixes (note that a few of these have been merged back to 0.8.x and are contained in earlier 0.8 releases, too).

A big "Thank You" is in order to community members Andrey Pustovetov, Artiship Artiship, Cliff Wheadon, Deepak Barr, Ian Axelrod, Liu Hanlin, Maciej Bryński, Ori Popowski, Peng Lyu, Philip Sanetra, Sagar Rao and Syed Muhammad Sufyian for their contributions to this release. We salute you!

Kafka Upgrade

Debezium runs with and has been tested on top of the recently released Apache Kafka 2.0 (DBZ-858). The widely used version Kafka 1.x continues to be supported as well.

Note that Kafka 0.10.x is not supported due to Debezium’s usage of the admin client API, which is only available in later versions. It shouldn’t be too hard to work around this, so if someone is interested in helping out, that would be a great contribution (see DBZ-883).

Support for HSTORE columns in Postgres

Postgres is an amazingly powerful and flexible RDBMS, not least due to its wide range of column types, which go far beyond what’s defined by the SQL standard. One of these types is HSTORE, which is essentially a string-to-string map.

Debezium can now capture changes to columns of this type (DBZ-898). By default, the field values are represented using Kafka Connect’s map data type. As this may not be supported by all sink connectors, you can alternatively represent them as a stringified JSON document by setting the new hstore.handling.mode connector option to json. In this case, you’d see HSTORE columns represented as values in change messages like so: { "key1" : "val1", "key2" : "val2" }.
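On the consumer side, the two representations can be handled uniformly. The following is only a minimal sketch: the helper name hstore_to_dict, the sample record and its column name attributes are made up for illustration, and it assumes the change event value has already been deserialized into a Python dict.

```python
import json


def hstore_to_dict(value):
    """Return an HSTORE field value as a plain dict.

    Accepts both representations described above: a map (the default)
    or a stringified JSON document (hstore.handling.mode set to json).
    """
    if value is None:
        # The column was NULL in the captured row.
        return None
    if isinstance(value, dict):
        # Default mode: the value already arrived as a map.
        return dict(value)
    # json mode: the value is a JSON document encoded as a string.
    return json.loads(value)


# Hypothetical "after" state of a row with an HSTORE column named "attributes".
after = {"id": 1, "attributes": '{ "key1" : "val1", "key2" : "val2" }'}
attrs = hstore_to_dict(after["attributes"])
```

Normalizing early like this keeps the rest of the consumer independent of which handling mode the connector was configured with.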

Field filtering and renaming for MongoDB

Unlike the connectors for MySQL and Postgres, the Debezium MongoDB connector so far didn’t allow excluding single fields of captured collections from CDC messages. Renaming them, e.g. by means of Kafka’s ReplaceField SMT, wasn’t supported either. The reason is that MongoDB doesn’t mandate a fixed schema for the documents of a given collection, so documents are represented in change messages using a single stringified JSON field.

Thanks to the fantastic work of community member Andrey Pustovetov, this has finally changed: you can now remove given fields from the CDC messages of given collections (DBZ-633) or have them renamed (DBZ-881). Please refer to the description of the new connector options field.blacklist and field.renames in the MongoDB connector documentation to learn more.
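To illustrate why field-level changes have to happen inside the JSON string rather than on the message schema, here is a rough sketch of the effect of filtering and renaming. It is not the connector’s implementation and does not use the syntax of the connector options; the function filter_and_rename, the sample document and its field names are hypothetical, and only top-level fields are handled.

```python
import json


def filter_and_rename(document_json, excluded=(), renames=None):
    """Drop and rename top-level fields of a stringified JSON document."""
    renames = renames or {}
    document = json.loads(document_json)
    result = {}
    for name, value in document.items():
        if name in excluded:
            # Excluded fields never make it into the outgoing message.
            continue
        # Use the new name if one is configured, else keep the old one.
        result[renames.get(name, name)] = value
    return json.dumps(result)


# Hypothetical document as it would appear in a change message:
# one string field holding the whole document.
after = '{"_id": 1, "name": "Anne", "ssn": "123-45-6789"}'
cleaned = filter_and_rename(after, excluded={"ssn"}, renames={"name": "full_name"})
```

Because the whole document travels as one string, a transformation that works on individual message fields cannot see name or ssn at all, which is why the connector itself now takes care of this.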

Extended source info

Another contribution by Andrey is the new optional connector field within the source info block of CDC messages (DBZ-918). This tells the type of source connector that produced the messages ("mysql", "postgres" etc.), which can come in handy in cases where specific semantics need to be applied on the consumer side depending on the type of source database.
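A consumer could use this field to pick database-specific handling, for instance as in the sketch below. It assumes the event has already been deserialized into a dict with the source block at the top level; the handler names are hypothetical. As the field is optional, the code falls back to a default when it is missing.

```python
def source_connector(event, default="unknown"):
    """Return the connector type from the event's source info block."""
    source = event.get("source") or {}
    return source.get("connector", default)


def pick_handler(event):
    """Choose a (hypothetical) handler name based on the source database."""
    handlers = {
        "mysql": "handle_mysql_event",
        "postgres": "handle_postgres_event",
    }
    # Events without the optional field end up in the generic handler.
    return handlers.get(source_connector(event), "handle_generic_event")


mysql_event = {"source": {"connector": "mysql", "name": "dbserver1"}}
legacy_event = {"source": {"name": "dbserver1"}}
```

Here pick_handler(mysql_event) selects the MySQL handler, while legacy_event, which lacks the field, is routed to the generic one.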

Bug fixes and version upgrades

The new release contains a good number of bug fixes and other smaller improvements. Amongst them are

  • correct handling of invalid temporal default values with MySQL (DBZ-927),

  • support for table/collection names with special characters for MySQL (DBZ-878) and MongoDB (DBZ-865) and

  • fixed handling of blacklisted tables with the new Antlr-based DDL parser (DBZ-872).

Community member Ian Axelrod provided a fix for a potential performance issue, where changes to tables with TOAST columns in Postgres would cause repeated updates to the connector’s internal schema metadata, which can be a costly operation (DBZ-911). Please refer to the Postgres connector documentation for details on the new schema.refresh.mode option, which deals with this issue.

In terms of version upgrades we migrated to the latest releases of the MySQL (DBZ-763, DBZ-764) and Postgres drivers (DBZ-912). The former is part of a longer stream of work leading towards support of MySQL 8 which should be finished in one of the next Debezium releases. For Postgres we provide a Docker image with Debezium’s supported logical decoding plug-ins based on Alpine now, which might be interesting to those concerned about container size (DBZ-705).

Please see the change log for the complete list of fixed issues.

What’s next?

The work towards Debezium 0.9 continues, and we’ll focus mostly on improvements to the SQL Server and Oracle connectors. Other potential topics include support for MySQL 8 and native logical decoding as introduced with Postgres 10, which should greatly help with using the Debezium Postgres connectors in cloud environments such as Amazon RDS.

We’ll also be talking about Debezium at the following conferences:

Last week I already had the opportunity to present Debezium at JUG Saxony Day. If you are interested, you can find the (German) slide deck of that talk on Speaker Deck.

Gunnar Morling

Gunnar is a software engineer at Decodable and an open-source enthusiast by heart. He has been the project lead of Debezium over many years. Gunnar has created open-source projects like kcctl, JfrUnit, and MapStruct, and is the spec lead for Bean Validation 2.0 (JSR 380). He’s based in Hamburg, Germany.

   


About Debezium

Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.

Get involved

We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.