As temperatures are cooling off, the Debezium team is getting into full swing again and we’re happy to announce the release of Debezium 0.8.3.Final!
This is a bugfix release to the current stable release line of Debezium, 0.8.x, while the work on Debezium 0.9 goes on in parallel. There are 14 fixes in this release. As in earlier 0.8.x releases, we’ve further improved the new Antlr-based DDL parser used by the MySQL connector (see DBZ-901, DBZ-903 and DBZ-910).
The Postgres connector saw a huge improvement to its start-up time for databases with lots of custom types (DBZ-899). The user reporting this issue had nearly 200K entries in pg_catalog.pg_type, and due to an N + 1 SELECT issue within the Postgres driver itself, this caused the connector to take 24 minutes to start. By using a custom query for obtaining the type metadata, we were able to cut down this time to 5 seconds! Right now we’re working with the maintainers of the Postgres driver to get this issue fixed upstream, too.
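To illustrate why an N + 1 SELECT pattern hurts so much with large catalogs, here is a minimal sketch. It is not the Postgres driver's or the connector's actual code: it uses an in-memory SQLite table named pg_type as a stand-in for the real catalog, and the helper names (make_catalog, CountingConnection, load_types_n_plus_one, load_types_single_query) are hypothetical.

```python
import sqlite3


def make_catalog(n_types):
    """Build an in-memory stand-in for pg_catalog.pg_type (assumption: SQLite, not Postgres)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pg_type (oid INTEGER PRIMARY KEY, typname TEXT)")
    conn.executemany(
        "INSERT INTO pg_type VALUES (?, ?)",
        [(i, f"type_{i}") for i in range(1, n_types + 1)],
    )
    conn.commit()
    return conn


class CountingConnection:
    """Hypothetical wrapper that counts how many queries are issued."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def load_types_n_plus_one(db):
    # One query for the list of OIDs, then one more query per type: N + 1 round trips.
    oids = [row[0] for row in db.query("SELECT oid FROM pg_type")]
    return {
        oid: db.query("SELECT typname FROM pg_type WHERE oid = ?", (oid,))[0][0]
        for oid in oids
    }


def load_types_single_query(db):
    # A single query fetches all type metadata at once.
    return dict(db.query("SELECT oid, typname FROM pg_type"))
```

With N types, the first function issues N + 1 queries and the second issues one. Against a real database each query is a network round trip, so with roughly 200K types the difference adds up quickly.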
More Flexible Propagation of DELETEs
Besides those bug fixes, we also decided to merge one new feature from the 0.9.x branch into the 0.8.3.Final release. It may be useful to those of you who use the SMT for extracting the "after" state from change events (DBZ-857).
This SMT can be employed to stream changes to sink connectors which expect just a "flat" row representation of data instead of Debezium’s complex event structure. Not all sink connectors support the handling of deletions, though. For example, some connectors will fail when encountering tombstone events. Therefore, the SMT can now optionally rewrite delete events into updates of a special "deleted" marker field.
For that, set the delete.handling.mode option of the SMT to "rewrite":
...
"transforms" : "unwrap",
"transforms.unwrap.type": "io.debezium.transforms.UnwrapFromEnvelope",
"transforms.unwrap.delete.handling.mode" : "rewrite",
...
When a DELETE event is propagated, the "__deleted" field of outgoing records will be set to true. For instance, when consuming the events with the JDBC sink connector, you’d see this reflected in a corresponding column of the sink tables:
 __deleted | last_name |  id  | first_name |         email
-----------+-----------+------+------------+-----------------------
 false     | Thomas    | 1001 | Sally      | sally.thomas@acme.com
 false     | Bailey    | 1002 | George     | gbailey@foobar.com
 false     | Kretchmar | 1004 | Anne       | annek@noanswer.org
 true      | Walker    | 1003 | Edward     | ed@walker.com
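To make the "rewrite" behavior more concrete, here is a simplified model of what happens to a change event. This is a sketch and not the SMT's actual implementation: the function name unwrap_with_rewrite is hypothetical, events are modeled as plain dictionaries with the envelope's "op", "before" and "after" fields, and "__deleted" is represented as a Python boolean, which may differ from the type the real SMT emits.

```python
def unwrap_with_rewrite(event):
    """Flatten a simplified change event, marking deletes instead of dropping them.

    Assumption: `event` is a dict with the envelope fields "op", "before" and "after".
    """
    is_delete = event["op"] == "d"
    # A delete event has no "after" state, so the last known row state ("before") is kept.
    state = event["before"] if is_delete else event["after"]
    row = dict(state)
    row["__deleted"] = is_delete
    return row


# Usage: a delete of customer 1003 becomes a flat row flagged as deleted.
delete_event = {
    "op": "d",
    "before": {"id": 1003, "first_name": "Edward", "last_name": "Walker", "email": "ed@walker.com"},
    "after": None,
}
flat_row = unwrap_with_rewrite(delete_event)
```

The sink then sees an ordinary row whose marker field is set, rather than a tombstone it might not be able to handle.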
You can then, for instance, use a batch job running on your sink to remove all records flagged as deleted.
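Such a clean-up job could look roughly like the following sketch. It makes several assumptions: SQLite stands in for the sink database, the function name purge_deleted and the table name "customers" are hypothetical, and the marker column is stored as 0/1. The actual column type in your sink depends on the sink connector and database.

```python
import sqlite3


def purge_deleted(conn, table):
    """Remove all rows flagged as deleted and return how many were removed.

    Assumption: `table` comes from trusted configuration, since it is
    interpolated into the statement as an identifier.
    """
    cursor = conn.execute(f'DELETE FROM "{table}" WHERE __deleted = 1')
    conn.commit()
    return cursor.rowcount


# Usage against an in-memory stand-in for the sink table.
sink = sqlite3.connect(":memory:")
sink.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, last_name TEXT, __deleted INTEGER)"
)
sink.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1001, "Thomas", 0), (1002, "Bailey", 0), (1004, "Kretchmar", 0), (1003, "Walker", 1)],
)
removed = purge_deleted(sink, "customers")
```

Running the purge on a schedule keeps the sink tables free of logically deleted rows while the connector itself only ever performs inserts and updates.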
What’s next?
We’re continuing the work on Debezium 0.9, which will mostly be about improvements to the SQL Server and Oracle connectors. The current plan is to do the next 0.9 release (either Alpha2 or Beta1) two weeks from now.
Also, it’s the beginning of the conference season, so we’ll spend some time preparing demos and presenting Debezium at multiple locations. There will be sessions on change data capture with Debezium at these conferences:
- JUG Saxony Day; Dresden, Germany; Sept. 28
- Kafka Summit; San Francisco, Cal.; Oct. 17
- VoxxedDays Microservices; Paris, France; Oct. 29 - 31
- Devoxx Morocco; Marrakesh, Morocco; Nov. 27 - 29
If you are at any of these conferences, come and say hi; we’d love to talk with you about your use cases, feature requests, feedback on our roadmap and any other ideas around Debezium.
Finally, a big "Thank You" goes to our fantastic community members Andrey Pustovetov, Maciej Bryński and Peng Lyu for their contributions to this release!
Gunnar Morling
Gunnar is a software engineer at Decodable and an open-source enthusiast by heart. He has been the project lead of Debezium over many years. Gunnar has created open-source projects like kcctl, JfrUnit, and MapStruct, and is the spec lead for Bean Validation 2.0 (JSR 380). He’s based in Hamburg, Germany.
About Debezium
Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.
Get involved
We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.