I’m very happy to share the news that Debezium 1.2.0.Beta2 has been released!
The core feature of this release is Debezium Server, a dedicated standalone runtime for Debezium, which opens up its open-source change data capture capabilities to messaging infrastructure like Amazon Kinesis.
Overall, the community has fixed 25 issues since the Beta1 release, some of which we’re going to explore in more depth in the remainder of this post.
Adding the Debezium Server runtime is a major milestone for the project. It is a ready-to-use standalone application for executing Debezium connectors. With Debezium Server, users can now choose from three different ways of operating Debezium, matching their individual needs: via Kafka Connect, via the embedded engine within a Java application, or via Debezium Server.
Which of these modes of execution you should use depends on your specific prerequisites, requirements and CDC use cases. Organizations that run Apache Kafka and want to set up no-code data integration pipelines leveraging a rich connector ecosystem should go for the Kafka Connect approach. In-application cache invalidation is one use case that benefits from the Debezium embedded engine. Debezium Server, finally, is meant for users who would like to take advantage of Debezium’s CDC functionality with messaging platforms other than Apache Kafka. While you could have done so before by means of the embedded engine and a bit of bespoke Java programming, Debezium Server greatly simplifies this scenario.
Powered by the popular Quarkus stack, Debezium Server is a ready-made configurable Java application which runs a Debezium connector and propagates the produced change events to consumers via a chosen sink adapter. It initially supports Amazon Kinesis, but the Debezium Server architecture is extensible, and other adapters (e.g. for Google Cloud Pub/Sub or Microsoft Azure Event Hubs) will follow soon. Through the Debezium Server extension API, you can also implement custom sink adapters for your preferred infrastructure for propagating change events to consumers.
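To make the idea of a sink adapter a bit more concrete, here is a minimal, self-contained sketch of the pattern: a connector produces batches of change events, and an adapter delivers them to some destination. Note that `ChangeRecord`, `SinkAdapter` and `InMemorySink` are hypothetical stand-ins invented for this illustration; they are not the actual Debezium Server extension API, whose types and method signatures you should take from the Debezium documentation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SinkAdapterSketch {

    /** Hypothetical stand-in for a change event produced by a connector. */
    static final class ChangeRecord {
        final String destination; // logical stream/topic the event belongs to
        final String key;
        final String value;

        ChangeRecord(String destination, String key, String value) {
            this.destination = destination;
            this.key = key;
            this.value = value;
        }
    }

    /** Hypothetical sink contract: deliver a batch, return the number of delivered records. */
    interface SinkAdapter {
        int handleBatch(List<ChangeRecord> batch);
    }

    /** Toy adapter that appends each event value to an in-memory stream named after its destination. */
    static final class InMemorySink implements SinkAdapter {
        final Map<String, List<String>> streams = new LinkedHashMap<>();

        @Override
        public int handleBatch(List<ChangeRecord> batch) {
            for (ChangeRecord record : batch) {
                streams.computeIfAbsent(record.destination, d -> new ArrayList<>())
                       .add(record.value);
            }
            return batch.size();
        }
    }
}
```

A real adapter would replace the in-memory map with calls to the client library of the target messaging system, and would only acknowledge records once the target has accepted them.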
Ultimately, Debezium Server is also a means of realizing our vision of CDC-as-a-Service, smoothly integrated with cloud-native infrastructure like Kubernetes and Knative. This release marks the first step of this endeavour, and we couldn’t be more excited about the prospect of working together with the Debezium community towards this goal.
Stay tuned for more sink adapters, a container image, support for Knative Eventing, an operator for running Debezium Server on Kubernetes, and more!
Other Features and Fixes
Besides Debezium Server, a few other improvements and fixes found their way into this release. A number of improvements were made to the different single message transforms (SMTs) coming with Debezium:
Record headers and topic name are exposed to script expressions configured for these SMTs, so they can be evaluated by the filtering and routing logic (DBZ-2074)
The logical topic routing SMT can optionally pass through message keys as-is, instead of enriching them with a source topic identifier (DBZ-2034); this is very helpful when uniqueness of keys is already ensured across the different re-routed topics, e.g. when routing change events from the partitions of a partitioned Postgres table into a single topic
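The difference between enriching keys and passing them through can be sketched in a few lines of plain Java. This is only an illustration of the idea, not the SMT's implementation: the class `LogicalRoutingSketch`, the `enrichKey` flag and the `sourceTopic` field name are all hypothetical, and message keys are modelled as simple maps.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogicalRoutingSketch {

    // Hypothetical name of the key field recording the topic an event originally came from.
    static final String SOURCE_FIELD = "sourceTopic";

    /** Result of routing: the target topic and the (possibly enriched) key. */
    static final class Routed {
        final String topic;
        final Map<String, Object> key;

        Routed(String topic, Map<String, Object> key) {
            this.topic = topic;
            this.key = key;
        }
    }

    private final Pattern topicRegex;
    private final String topicReplacement;
    private final boolean enrichKey;

    LogicalRoutingSketch(String topicRegex, String topicReplacement, boolean enrichKey) {
        this.topicRegex = Pattern.compile(topicRegex);
        this.topicReplacement = topicReplacement;
        this.enrichKey = enrichKey;
    }

    Routed route(String sourceTopic, Map<String, Object> key) {
        Matcher matcher = topicRegex.matcher(sourceTopic);
        if (!matcher.matches()) {
            // Topics that do not match the pattern are left untouched.
            return new Routed(sourceTopic, key);
        }
        String targetTopic = matcher.replaceFirst(topicReplacement);
        if (!enrichKey) {
            // Pass-through: the caller guarantees keys are unique across source topics.
            return new Routed(targetTopic, key);
        }
        // Enrichment: add the source topic so equal keys from different topics stay distinct.
        Map<String, Object> enriched = new LinkedHashMap<>(key);
        enriched.put(SOURCE_FIELD, sourceTopic);
        return new Routed(targetTopic, enriched);
    }
}
```

With a pattern such as `(.*)\.orders_p\d+` and the replacement `$1.orders`, events from `dbserver.public.orders_p1` and `dbserver.public.orders_p2` both land in `dbserver.public.orders`. Pass-through is safe in that case because the primary key of a partitioned table is unique across its partitions.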
Debezium’s Testcontainers integration now allows for the usage of custom container images for Kafka Connect (DBZ-2070), which comes in handy if you want to leverage custom connectors, converters or SMTs in your integration tests. For the SQL Server connector, it’s now optionally possible to skip the queries for obtaining LSN timestamps (DBZ-1988). This can help to significantly increase the throughput of the connector.
Several fixes relate to the MySQL DDL parser, e.g. due to additional DDL capabilities in MySQL 8.0.x (DBZ-2080, "Unable to parse MySQL ALTER statement with named primary key"; DBZ-2067, "Error and connector stops when DDL contains algorithm=instant") and when being used with MariaDB (DBZ-2062, "DDL statement throws error if compression keyword contains backticks (``)").
As always, you can find the complete list of all addressed issues and upgrading procedures in the release notes.
Gunnar is a software engineer at Decodable and an open-source enthusiast by heart. He has been the project lead of Debezium over many years. Gunnar has created open-source projects like kcctl, JfrUnit, and MapStruct, and is the spec lead for Bean Validation 2.0 (JSR 380). He’s based in Hamburg, Germany.
Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.
We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.