It’s with great pleasure that I am announcing the release of Debezium 1.7.0.Final!
Key features of this release include substantial improvements to incremental snapshotting (as introduced in Debezium 1.6), a web-based Debezium user interface, NATS support in Debezium Server, and support for running Apache Kafka without ZooKeeper via the Debezium Kafka container image.
Some exciting things also happened in the wider Debezium community over the last few months. For instance, we saw a CDC connector for ScyllaDB based on the Debezium connector framework, and there’s work happening towards a Debezium Server connector for Apache Iceberg (details about this coming soon in a guest post on this blog).
Incremental Snapshotting Improvements
Introduced in Debezium 1.6 and based on a paper published by Netflix Engineering, incremental snapshotting addresses many long-standing feature requests around initial snapshots, such as the ability to re-snapshot specific tables, support for modifications to the include/exclude filter configuration, and resumability of snapshots after a connector restart.
For Debezium 1.7, incremental snapshotting has been further improved and stabilized. The Debezium MySQL connector now allows incremental snapshotting of databases to which the connector has no write access, which is very useful when pointing Debezium to read-only replicas. Ad-hoc snapshots can now be triggered not only via the signal table as before, but also by sending a message to a specific Kafka topic, again strengthening the support for read-only scenarios. A big thank you to Kate Galieva of Shopify Engineering for these contributions!
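To give a rough idea of what such a Kafka-based signal might look like, here is a minimal Python sketch that only builds the message key and value; it does not talk to Kafka. The payload shape (an `execute-snapshot` type with a `data-collections` list) follows the Debezium signaling format as we understand it, and the use of the connector's logical server name as the message key is an assumption to verify against the documentation for your version. The function name, the server name `dbserver1`, and the table names are hypothetical.

```python
import json


def build_snapshot_signal(server_name, tables):
    """Build (key, value) for an ad-hoc incremental snapshot signal message.

    Assumption: the message key is the connector's logical server name and
    the value is a JSON document of type "execute-snapshot".
    """
    if not tables:
        raise ValueError("at least one table is required")
    payload = {
        "type": "execute-snapshot",
        "data": {
            # Fully-qualified names of the tables to snapshot.
            "data-collections": list(tables),
            "type": "INCREMENTAL",
        },
    }
    return server_name, json.dumps(payload)


# Hypothetical server and table names, for illustration only.
key, value = build_snapshot_signal(
    "dbserver1", ["inventory.customers", "inventory.orders"]
)
print(key)
print(value)
```

The resulting key and value could then be sent to the connector's signal topic with any Kafka producer; since nothing is written to the source database, this approach also works against read-only replicas.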
Incremental snapshotting is now also supported by the Debezium connector for Oracle. Another snapshotting improvement relates to non-incremental snapshots: filtered columns are now excluded from snapshot select statements right away, which improves performance of the connector when excluding large BLOB columns for instance.
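The idea behind the column filtering can be illustrated with a small Python sketch. This is not Debezium's actual implementation, just an illustration of the principle: excluded columns are left out of the select statement itself, so their values (for example large BLOBs) are never fetched from the database in the first place. The function, table, and column names are hypothetical, and identifier quoting is omitted for brevity.

```python
def build_snapshot_select(table, columns, excluded):
    """Return a SELECT statement that omits the excluded columns."""
    excluded_set = set(excluded)
    # Keep the original column order, dropping anything that is excluded.
    selected = [c for c in columns if c not in excluded_set]
    if not selected:
        raise ValueError("all columns of %s are excluded" % table)
    return "SELECT {} FROM {}".format(", ".join(selected), table)


# Hypothetical table with a large BLOB column named "image".
print(build_snapshot_select(
    "inventory.products", ["id", "name", "image"], ["image"]
))
# prints: SELECT id, name FROM inventory.products
```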
We’ll follow up with a more detailed blog post around incremental snapshotting shortly.
Debezium UI
Debezium UI is part of our efforts to further simplify the experience of getting started with and operating Debezium. The UI lets you configure and start new connectors, examine the state of running connectors, and more.
The Debezium UI team has been working tirelessly to build out this web app, with support for setting up transformations (SMTs) and topic auto creation settings coming up shortly. In the meantime please take a look at the blog post initially announcing the UI to learn more about it.
Further Improvements
Other improvements in Debezium 1.7 include support for NATS Streaming in Debezium Server, as well as support for Apache Kafka 2.8 in the Debezium container images. You can even use the Debezium container image for Apache Kafka to get your feet wet with running Apache Kafka without ZooKeeper!
There’s support for MySQL INVISIBLE columns, an off-heap implementation of the transaction buffer of the Debezium connector for Oracle, which makes it possible to process large long-running transactions, and much more. There have also been several very nice performance improvements; a shout-out to Naveen Kumar for his continued help here, including the creation of several JMH benchmarks for measuring the impact of improvements to specific performance-sensitive areas of the code base.
Altogether, 206 issues have been fixed for the 1.7 final and preview releases. You can find out more in the original announcement posts for Debezium 1.7.0.Alpha1, 1.7.0.Beta1, 1.7.0.CR1, and 1.7.0.CR2. Please refer to the release notes of Debezium 1.7.0.Final for the list of issues resolved since CR2 as well as procedures for upgrading from earlier versions.
The Debezium project couldn’t exist without its amazing community of contributors from different countries all around the world! A big thank you to everyone contributing to this release in one way or another! Kudos to the following individuals from the community who contributed to the Debezium core repository in 1.7:
Alfusainey Jallow, Anisha Mohanty, Ashmeet Lamba, Bingqin Zhou, Bob Roldan, Blake Peno, Brennan Vincent, Camile Sing, Chris Baumbauer, Chris Cranford, Derek Moore, Dhrubajyoti G, Erik Malm, Gunnar Morling, Harvey Yue, Hussain Ansari, Hossein Torabi, Indra Shukla, Ismail Simsek, Jakub Cechacek, Jiabao Sun, Jiri Novotny, Jiri Pechanec, Jorn Argelo, Judah Rand, Katerina Galieva, Kyley Jex, Martín Pérez, Mark Drilling, Mike Kamornikov, Naveen Kumar, Patrick Chu, Pavel Strashkin, Raphael Auv, René Kerner, Sergei Morozov, Thiago Avancini, Thiago Dantas, Tin Nguyen, Tommy Karlsson, Vivek Wassan, WenChao Ke, yangsanity, Yossi Shirizli, Yuan Zhang, Xiao Fu, Zoran Regvart, 李宗文, and 민규 김.
Outlook
The next Debezium release, 1.8, is planned for the end of the year. The roadmap is still in flux, but some of the features we plan to address are support for MongoDB change streams (in order to support MongoDB 5.0), improved support for MariaDB, and the ability to compact large database history topics.
We’re also planning to further build out the Debezium UI, continue the work on the Debezium connector for Oracle, make the SQL Server connector capable of dealing with multiple databases at once, and much more. Please let us know about your feature requests via the mailing list!
Gunnar Morling
Gunnar is a software engineer at Decodable and an open-source enthusiast at heart. He has been the project lead of Debezium for many years. Gunnar has created open-source projects like kcctl, JfrUnit, and MapStruct, and is the spec lead for Bean Validation 2.0 (JSR 380). He’s based in Hamburg, Germany.
About Debezium
Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.
Get involved
We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas on how we can improve Debezium, please let us know or log an issue.