I am excited to announce the release of Debezium 1.4.0.Alpha1!

This first pass of the 1.4 release line provides a few useful new features:

  • New Vitess connector

  • Allow fine-grained selection of snapshotted tables

Overall, the community fixed 41 issues for this release. Let’s take a closer look at some of the highlights.

Vitess Connector

Vitess is a database solution for deploying, scaling, and managing large clusters of MySQL. We are very happy that the development team around Ruslan Gibaiev and Kewei Shang of Bolt Technology OÜ decided to build a CDC solution based on Debezium and to open-source it under the Debezium umbrella. This connector is released in incubating state in Debezium 1.4.

Ruslan and Kewei will follow up soon with a blog post covering this connector in more detail; in the meantime, please refer to the connector reference documentation to learn more.

Fine-grained Selection of Snapshotted Tables

One of the major focus points for Debezium 1.4 is to explore more flexible snapshot options, e.g. re-snapshotting chosen tables or parallelizing long-running snapshot operations.

A first improvement related to snapshotting is the new connector configuration option snapshot.include.collection.list, which lets you snapshot only a subset of the tables that the connector will capture later on during log reading. This comes in handy if, for instance, you’re interested in capturing changes to all your tables but only need an initial snapshot of the data for some of them.
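To make the effect of this option concrete, here is a minimal sketch in Python. It is an illustration only, not Debezium's implementation: the table names are made up, table.include.list is assumed to be the option that selects the captured tables, and both options are assumed to be comma-separated lists of regular expressions matched against fully-qualified table names. Please check the documentation of your connector for the exact semantics.

```python
import re

# Hypothetical connector configuration; all table names are invented.
connector_config = {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    # Assumption: capture changes for every table in the "inventory" database.
    "table.include.list": r"inventory\..*",
    # Only these two tables get an initial snapshot.
    "snapshot.include.collection.list": "inventory.customers,inventory.orders",
}


def split_patterns(value):
    """Compile a comma-separated list of regular expressions."""
    return [re.compile(p.strip()) for p in value.split(",") if p.strip()]


def matches_any(patterns, table):
    """True if the fully-qualified table name matches one of the patterns."""
    return any(p.fullmatch(table) for p in patterns)


def plan(config, tables):
    """Return (captured, snapshotted) table lists for the given config."""
    capture_patterns = split_patterns(config["table.include.list"])
    captured = [t for t in tables if matches_any(capture_patterns, t)]

    snapshot_value = config.get("snapshot.include.collection.list")
    if snapshot_value is None:
        # Without the option, every captured table is snapshotted.
        return captured, captured

    snapshot_patterns = split_patterns(snapshot_value)
    snapshotted = [t for t in captured if matches_any(snapshot_patterns, t)]
    return captured, snapshotted


tables = [
    "inventory.customers",
    "inventory.orders",
    "inventory.products",
    "sales.leads",
]
captured, snapshotted = plan(connector_config, tables)
print("captured during log reading:", captured)
print("included in initial snapshot:", snapshotted)
```

In this sketch, inventory.products is streamed from the log but never snapshotted, while sales.leads is ignored entirely because it is not captured in the first place.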

For the Postgres connector, this also allows for a selective re-snapshot of specific tables by creating a custom implementation of the Snapshotter SPI contract. After restarting the connector, such a Snapshotter would continue to read the log from the point where it left off previously until "now", then take a snapshot of the given tables, and finally continue to read the log for all captured tables.
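The order of these three phases can be modeled with a small Python simulation. This is a conceptual sketch only: it does not use the Snapshotter SPI, and the function name, the event tuples, and the table names are all hypothetical.

```python
def resnapshot_flow(log, stored_offset, table_rows, resnapshot_tables, new_events):
    """Yield (source, table, payload) tuples in the order described above.

    log:               change events already in the log at restart, as
                       (offset, table, payload) tuples ordered by offset
    stored_offset:     last offset processed before the connector stopped
    table_rows:        current table contents, {table: [row, ...]}
    resnapshot_tables: tables that should be snapshotted again
    new_events:        change events written after the snapshot
    """
    # "now" is the end of the log at the moment the connector restarts.
    now = log[-1][0] if log else stored_offset

    # Phase 1: catch up on the log from the stored offset until "now".
    for offset, table, payload in log:
        if stored_offset < offset <= now:
            yield ("log", table, payload)

    # Phase 2: snapshot only the selected tables.
    for table in resnapshot_tables:
        for row in table_rows.get(table, []):
            yield ("snapshot", table, row)

    # Phase 3: continue reading the log for all captured tables.
    for _offset, table, payload in new_events:
        yield ("log", table, payload)


events = list(
    resnapshot_flow(
        log=[
            (1, "inventory.orders", "o1"),
            (2, "inventory.customers", "c1"),
            (3, "inventory.orders", "o2"),
        ],
        stored_offset=1,
        table_rows={"inventory.customers": ["c1"]},
        resnapshot_tables=["inventory.customers"],
        new_events=[(4, "inventory.orders", "o3")],
    )
)
for event in events:
    print(event)
```

The event at offset 1 is skipped because it was already processed before the restart; everything else appears in the order catch-up, snapshot, streaming.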

For more information on this option, please see the connector-specific documentation.

Other Features

Besides these key features, there are a few other features coming with the 1.4.0.Alpha1 release:

  • Implement snapshot select override behavior for MongoDB DBZ-2496

  • SqlServer - Skip processing of LSNs not associated with change table entries DBZ-2582

Bugfixes

A number of bugs were also fixed, e.g.:

  • Cant override environment variables DBZ-2559

  • ConcurrentModificationException during exporting data for a mongodb collection in a sharded cluster DBZ-2597

  • Mysql connector didn’t pass the default db charset to the column definition DBZ-2604

  • [Doc] "registry.redhat.io/amq7/amq-streams-kafka-25: unknown: Not Found" error occurs DBZ-2609

  • [Doc] "Error: no context directory and no Containerfile specified" error occurs DBZ-2610

  • SqlExceptions using dbz with Oracle on RDS online logs and LogMiner DBZ-2624

  • Mining session stopped - task killed/SQL operation cancelled - Oracle LogMiner DBZ-2629

  • Unparseable DDL: Using 'trigger' as table alias in view creation DBZ-2639

  • Antlr DDL parser fails to interpret BLOB([size]) DBZ-2641

  • MySQL Connector keeps stale offset metadata after snapshot.new.tables is changed DBZ-2643

  • WAL logs are not flushed in Postgres Connector DBZ-2653

  • Debezium Server Event Hubs plugin support in v1.3 DBZ-2660

  • Cassandra Connector doesn’t use log4j for logging correctly DBZ-2661

  • Should Allow NonAsciiCharacter in SQL DBZ-2670

  • MariaDB nextval function is not supported in grammar DBZ-2671

  • Sanitize field name do not sanitize sub struct field DBZ-2680

  • Debezium fails if a non-existing view with the same name as existing table is dropped DBZ-2688

A big thank you to all the contributors from the community who worked on this release: Faizan, Sergei Morozov, Kewei Shang, Michael Wang, Arik Cohen, James Gormley, jinguangyang, Kaushik Iyer, John Martin, Travis Elnicky, Yiming Liu, and Bingqin Zhou!

Chris Cranford

Chris is a software engineer at Red Hat. He previously was a member of the Hibernate ORM team and now works on Debezium. He lives in North Carolina just a few hours from Red Hat towers.

   


About Debezium

Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.

Get involved

We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.