It’s my pleasure to announce the first release of the Debezium 1.6 series, 1.6.0.Alpha1!
This release introduces a brand-new feature, incremental snapshots, for the MySQL and PostgreSQL connectors, adds a Kafka sink for Debezium Server, and includes a wide range of bug fixes and smaller feature additions.
Incremental Snapshotting
Running Debezium has so far come with a few pain points:
- the need to execute a consistent snapshot before streaming begins when a new connector is started
- the inability to trigger a full or partial snapshot after the connector has been running for some time
Starting with this release, we are delivering a solution to both of these problems.
The simpler one, the ability to trigger a snapshot at runtime, is solved by ad-hoc snapshots. The user can trigger a snapshot at any time during the streaming phase by sending an execute-snapshot signal to Debezium with the list of tables to be snapshotted and the type of snapshot to use (only incremental is supported right now, see below). When Debezium receives the signal, it executes the snapshot of the requested tables.
The more complex part, which goes hand in hand with ad-hoc snapshotting, is incremental snapshots. This feature allows the user to execute a snapshot of a set of tables during the streaming phase without interrupting the streaming. Moreover, unlike the initial snapshot, an incremental snapshot resumes after a connector restart and does not need to start from scratch again.
The implementation of this feature is based on a novel approach to snapshotting originally introduced by the DBLog framework. The Debezium implementation is described in more detail in the design document.
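To give a rough feel for the idea, here is a minimal Python sketch of the watermark-based reconciliation that the DBLog approach is known for. It is an illustration under simplifying assumptions, not Debezium's actual code: the function name reconcile_chunk, the event tuples, and the "low"/"high" marker names are all hypothetical. The sketch assumes that a chunk of rows is read between two watermarks written to the log, and that any row in the chunk which also changed inside that window is dropped from the chunk, because the streamed change event already carries the newer state.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Row = Dict[str, object]
# Hypothetical log event shape: (kind, primary key).
# kind is "low" or "high" for watermarks (key is None) or "change" for a row change.
Event = Tuple[str, Optional[int]]


def reconcile_chunk(chunk: List[Row], log_events: Iterable[Event], key: str = "id") -> List[Row]:
    """Return the chunk rows that can still be emitted as snapshot reads."""
    # Buffer the chunk by primary key so colliding rows can be removed cheaply.
    buffered = {row[key]: row for row in chunk}
    in_window = False
    for kind, pk in log_events:
        if kind == "low":
            in_window = True   # the chunk query started after this point
        elif kind == "high":
            break              # the chunk query finished before this point
        elif in_window and kind == "change":
            # The row changed while the chunk was being read: the streamed
            # event wins, so the buffered snapshot row is discarded.
            buffered.pop(pk, None)
    return list(buffered.values())


chunk = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]
log = [("change", 1), ("low", None), ("change", 2), ("high", None), ("change", 3)]
print([row["id"] for row in reconcile_chunk(chunk, log)])
```

In this example only the row with id 2 is dropped, since it is the only one modified inside the watermark window. Because the table is processed chunk by chunk rather than in one long-running query, streaming does not have to pause while the snapshot is running.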
If you want to try the feature yourself, you need to:
- provide a signalling table
- trigger an ad-hoc incremental snapshot using a SQL command such as the following:
INSERT INTO myschema.debezium_signal VALUES ('ad-hoc-1', 'execute-snapshot', '{"data-collections": ["schema1.table1", "schema1.table2"]}');
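As a small, self-contained illustration of the signal shown above, the following Python sketch builds the same row and writes it to a table. Several things here are assumptions made for the sake of a runnable example: an in-memory SQLite database stands in for your MySQL or PostgreSQL instance (so the myschema prefix is dropped), the three-column layout with id, type, and data and the column sizes are assumed rather than taken from this post, and build_signal is a hypothetical helper. Please check the connector documentation for the exact table definition and connector configuration.

```python
import json
import sqlite3

# In-memory SQLite is only a stand-in for the real source database.
conn = sqlite3.connect(":memory:")

# Assumed layout: signal id, signal type, and a JSON payload as text.
conn.execute(
    "CREATE TABLE debezium_signal ("
    "id VARCHAR(42) PRIMARY KEY, "
    "type VARCHAR(32) NOT NULL, "
    "data VARCHAR(2048))"
)


def build_signal(signal_id, tables):
    """Build the (id, type, data) row for an ad-hoc snapshot signal (hypothetical helper)."""
    payload = json.dumps({"data-collections": list(tables)})
    return (signal_id, "execute-snapshot", payload)


signal = build_signal("ad-hoc-1", ["schema1.table1", "schema1.table2"])
# Parameter binding avoids hand-quoting the JSON payload inside the SQL text.
conn.execute("INSERT INTO debezium_signal VALUES (?, ?, ?)", signal)
conn.commit()

row = conn.execute("SELECT id, type, data FROM debezium_signal").fetchone()
print(row)
```

Generating the payload with json.dumps instead of writing it by hand keeps the data-collections list valid JSON even when table names contain characters that would need escaping.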
Kafka Sink for Debezium Server
Debezium connectors can run either in Kafka Connect or in Debezium Server, which provides a range of destination sinks. Starting with this release, when the destination is Apache Kafka, it is no longer necessary to use Kafka Connect: Debezium Server with the Apache Kafka sink can be used instead, which may simplify operational requirements for some deployments. In this case, the regular Apache Kafka client API is used.
Altogether, 47 issues were fixed for this release. A big thank you goes out to all the community members who contributed: Alfusainey Jallow, Bingqin Zhou, Hossein Torabi, Kyley Jex, Martín Pérez, Patrick Chu, Raphael Auv, Tommy Karlsson, WenChao Ke, and yangsanity.
For the upcoming 1.6 preview releases, we’re planning to complete the follow-up tasks for incremental snapshotting, add incremental snapshot support for the SQL Server and Db2 connectors, and further improve the LogMiner-based connector implementation for Oracle, mainly with regard to schema evolution and LOB support.
About Debezium
Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.
Get involved
We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.