Debezium 1.5.0.Alpha1 Released

Improved LogMiner-based Capture Implementation

Since we’ve announced the LogMiner-based implementation for the Debezium Oracle connector in Debezium 1.3, we’ve seen a constantly growing interest in this connector by folks from our lively community, who tested it out, provided feedback, logged bug reports and feature requests, submitted pull requests with fixes, and more. Based on all this input, the connector is rapidly maturing, and we aim to move the LogMiner-based implementation from "Incubating" to "Stable" state in Debezium 1.5, or 1.6 the latest. This first Alpha release of Debezium 1.5 contains a number of related improvements:

java.sql.SQLException: ORA-01333: failed to establish Logminer Dictionary (DBZ-2939)
Capture and report LogMiner state when mining session fails to start (DBZ-3055)
Debezium Oracle Connector will appear stuck on large SCN jumps (DBZ-2982)
Improve logging for Logminer adapter (DBZ-2999)

Many thanks Martín Pérez, Milo van der Zee, Anton Kondratev, and all the others for their intensive testing, feedback, and contributions while working on this! One of the next steps in this area will be several performance-related improvements; stay tuned for the details.

Reworked MySQL Connector

In order to reduce the maintenance effort for all the different Debezium connectors, we’ve started work towards a common connector framework long time ago. This framework allows us to implement many features (and bug fixes) just once, and all the connectors based on this framework will be able to benefit from it. By now, almost all of the Debezium connectors have been ported to this framework, with the exception of the Cassandra and MySQL connectors.

As of this release, also the MySQL connector provides an implementation based on this framework. Since the MySQL connector has been the first one amongst the Debezium connectors, and it has quite a few specific characteristics and features, we have decided to not simply replace the existing implementation with a new one, but rather keep both, existing and new, side by side for some time.

This allows the new implementation to mature, also giving users the choice of which implementation to use. While the new connector implementation is the default one as of this release, you can go back to the earlier one by setting the internal.implementation option to legacy. We don’t have any immediate plans for removing the existing implementation, but focus for feature work and bug fixes will shift to the new implementation going forward. Please give the new connector implementation a try and let us know if you encounter any issues with it.

While the new implementation is largely on par feature-wise with with the earlier one, there’s one exception: the previous, experimental support for changing the filter configuration of a connector instance isn’t part of the new implementation. We’re planning to roll out a comparable feature for all the framework-based connectors in the near future. Now that there also is a framework-based implementation for the MySQL connector, we’re planning to provide a range of improvements to snapshotting for all the (relational) connectors: for instance the aforementioned capability to change filter configurations, means of parallelizing snapshot operations, and more.

Other Features

Besides these key features, there’s a range of other improvements, smaller new features, and bug fixes coming with this release, including the following:

Correct handling of lists of user types in the Cassandra connector (DBZ-2974)
Multiple DDL parser fixes for MySQL and MariaDB (DBZ-3018, DBZ-3020, DBZ-3023, DBZ-3039)
Better snapshotting performance for large Postgres schemas with many tables (DBZ-2575)
Ability to emit TRUNCATE events via the Postgres connector (DBZ-2382); note that, when enabled, this adds a new op type t for this connector’s change events, so please ensure your consumers can handle such events gracefully
Thanks to the work of Kewei Shang, there is now instructions for following the Debezium tutorial example using the incubating connector for Vitess (DBZ-2678), which was added in Debezium 1.4:

Altogether, 32 issues were fixed for this release. A big thank you goes out to all the community members who contributed: Bingqin Zhou, Dave Cramer Kewei Shang, Martín Pérez, Martin Sillence, Nick Murray, and Naveen Kumar.

For the upcoming 1.5 preview releases, we’re planning to focus on further improving and stabilizing the LogMiner-based connector implementation for Oracle, wrap up some loose ends around the MySQL connector migration, and begin to explore the aforementioned snapshotting improvements.

We’ve also made the decision to continue our efforts for creating a graphical Debezium user interface; this component is currently under active development, with support for more connectors, functionality for (re-)starting and stopping connectors, examining logs, and much more in the workings. If things go as planned, the UI will officially be part of the next Debezium 1.5 preview release!

Gunnar Morling

Gunnar is a software engineer and open-source enthusiast by heart, currently working as a Technologist at Confluent. Previously, he helped to build a realtime stream processing platform based on Apache Flink and led the Debezium project, a distributed platform for change data capture. He is a Java Champion and has founded multiple open source projects such as JfrUnit, kcctl, and MapStruct. Gunnar is an avid blogger (morling.dev) and has spoken at various conferences like QCon, Java One, and Devoxx. He lives in Hamburg, Germany.

About Debezium

Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.

Get involved

We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve ours existing connectors and add even more connectors. If you find problems or have ideas how we can improve Debezium, please let us know or log an issue.