Debezium Blog

It’s my pleasure to announce the first release of the Debezium 1.6 series, 1.6.0.Alpha1!

This release brings a brand-new feature, incremental snapshots, for the MySQL and PostgreSQL connectors, a Kafka sink for Debezium Server, as well as a wide range of bug fixes and other small feature additions.

I’m thrilled to announce the release of Debezium 1.5.0.Final!

With Debezium 1.5, the LogMiner-based CDC implementation for Oracle moves from Incubating to Stable state, and there’s a brand-new implementation of the MySQL connector, which brings features like transaction metadata support. Other key features include support for a new "signalling table", which for instance can be used to implement schema changes with the Oracle connector, and support for TRUNCATE events with Postgres. There are also many improvements to the community-led connectors for Vitess and Apache Cassandra, as well as a wide range of bug fixes and other smaller improvements.

It’s my pleasure to announce the release of Debezium 1.5.0.CR1!

As we begin moving toward finalizing the Debezium 1.5 release stream, the Oracle connector has been promoted to stable and there were some TLS improvements for the Cassandra connector, as well as numerous bugfixes. Overall, 50 issues have been addressed for this release.

Kafka Streams is a library for developing stream processing applications based on Apache Kafka. Quoting its docs, "a Kafka Streams application processes record streams through a topology in real-time, processing data continuously, concurrently, and in a record-by-record manner". The Kafka Streams DSL provides a range of stream processing operations such as map, filter, join, and aggregate.

Non-Key Joins in Kafka Streams

Debezium’s CDC source connectors make it easy to capture data changes in databases and push them towards sink systems such as Elasticsearch in near real-time. By default, this results in a 1:1 relationship between tables in the source database, the corresponding Kafka topics, and a representation of the data at the sink side, such as a search index in Elasticsearch.

In the case of 1:n relationships, say between a table of customers and a table of addresses, consumers are often interested in a view of the data that is a single, nested data structure, e.g. a single Elasticsearch document representing a customer and all their addresses.
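To make the target shape concrete, here is a minimal plain-Java sketch that folds two flat tables into one nested structure per customer. It is an illustration only, not Debezium or Elasticsearch code: the `AddressRow` and `CustomerDocument` types, the field names, and the sample data are all assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NestedCustomerViewSketch {

    // Hypothetical row of the addresses table; customerId is the foreign key.
    static final class AddressRow {
        final long customerId;
        final String city;

        AddressRow(long customerId, String city) {
            this.customerId = customerId;
            this.city = city;
        }
    }

    // Hypothetical nested document: one customer plus all of their addresses.
    static final class CustomerDocument {
        final String name;
        final List<String> cities = new ArrayList<>();

        CustomerDocument(String name) {
            this.name = name;
        }
    }

    // Creates one document per customer, then attaches each address to its owner.
    static Map<Long, CustomerDocument> nest(Map<Long, String> customers,
                                            List<AddressRow> addresses) {
        Map<Long, CustomerDocument> documents = new LinkedHashMap<>();
        for (Map.Entry<Long, String> customer : customers.entrySet()) {
            documents.put(customer.getKey(), new CustomerDocument(customer.getValue()));
        }
        for (AddressRow row : addresses) {
            CustomerDocument document = documents.get(row.customerId);
            if (document != null) { // ignore addresses without a known customer
                document.cities.add(row.city);
            }
        }
        return documents;
    }

    // Sample data (made up): Alice has two addresses, Bob has none.
    static Map<Long, CustomerDocument> demo() {
        Map<Long, String> customers = new LinkedHashMap<>();
        customers.put(1L, "Alice");
        customers.put(2L, "Bob");
        List<AddressRow> addresses = List.of(
                new AddressRow(1L, "Hamburg"),
                new AddressRow(1L, "Berlin"));
        return nest(customers, addresses);
    }

    public static void main(String[] args) {
        for (Map.Entry<Long, CustomerDocument> e : demo().entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue().name + " " + e.getValue().cities);
        }
    }
}
```

In a streaming setup the two inputs would be continuously changing topics rather than in-memory collections, so the nested view has to be kept up to date as change events arrive.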

This is where KIP-213 ("Kafka Improvement Proposal") and its foreign key joining capabilities come in: it was introduced in Apache Kafka 2.4 "to close the gap between the semantics of KTables in streams and tables in relational databases". Before KIP-213, in order to join messages from two Debezium change event topics, you’d typically have to manually re-key at least one of the topics, so as to make sure the same key is used on both sides of the join.

Thanks to KIP-213, this isn’t needed any longer, as it allows joining two Kafka topics on fields extracted from the Kafka message value, taking care of the required re-keying automatically, in a fully transparent way. Compared to previous approaches, this drastically reduces the effort for creating aggregated events from Debezium’s CDC events.
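The idea of joining on a field extracted from the message value can be illustrated without a running Kafka cluster. The following is a minimal plain-Java sketch over in-memory maps, not Kafka Streams code: the `foreignKeyJoin` helper, the `Customer` and `Address` types, and the sample data are hypothetical. The helper's parameters (the other table, a foreign key extractor, and a joiner) are only loosely modeled on the shape of a foreign-key table join.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

final class ForeignKeyJoinSketch {

    // Hypothetical value types standing in for change event payloads.
    static final class Customer {
        final String name;

        Customer(String name) {
            this.name = name;
        }
    }

    static final class Address {
        final long customerId; // foreign key, carried in the value, not the key
        final String city;

        Address(long customerId, String city) {
            this.customerId = customerId;
            this.city = city;
        }
    }

    // Joins each left entry to the right entry whose key equals a field
    // extracted from the left value. The result keeps the left-side key,
    // so the caller never has to re-key either input.
    static <K, V, KO, VO, R> Map<K, R> foreignKeyJoin(
            Map<K, V> left,
            Map<KO, VO> right,
            Function<V, KO> foreignKeyExtractor,
            BiFunction<V, VO, R> joiner) {
        Map<K, R> result = new LinkedHashMap<>();
        for (Map.Entry<K, V> entry : left.entrySet()) {
            VO match = right.get(foreignKeyExtractor.apply(entry.getValue()));
            if (match != null) { // inner join: drop entries without a match
                result.put(entry.getKey(), joiner.apply(entry.getValue(), match));
            }
        }
        return result;
    }

    // Sample data (made up): address 102 points at a customer that does not exist.
    static Map<Long, String> demo() {
        Map<Long, Customer> customers = new LinkedHashMap<>();
        customers.put(1L, new Customer("Alice"));
        customers.put(2L, new Customer("Bob"));

        Map<Long, Address> addresses = new LinkedHashMap<>();
        addresses.put(100L, new Address(1L, "Hamburg"));
        addresses.put(101L, new Address(1L, "Berlin"));
        addresses.put(102L, new Address(3L, "Paris"));

        return foreignKeyJoin(
                addresses,
                customers,
                address -> address.customerId,
                (address, customer) -> customer.name + " @ " + address.city);
    }

    public static void main(String[] args) {
        // prints {100=Alice @ Hamburg, 101=Alice @ Berlin}
        System.out.println(demo());
    }
}
```

The addresses are keyed by their own id while the customers are keyed by customer id, which is exactly the mismatch that previously forced a manual re-keying step. A real stream processor additionally has to react to updates on either side; this sketch only joins two static snapshots.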

We are very happy to announce the release of Debezium 1.5.0.Beta2!

The main features of this release are the new Debezium Signaling Table support, Vitess SET type support, and a continued focus on minor improvements, bugfixes, and polish as we sprint to the finish line for the 1.5 release.

Overall, the community fixed 54 issues since the Beta1 release, some of which we’ll explore more in-depth below.