

Debezium 0.9.1.Final Released

Quickly following up on last week’s release of Debezium 0.9, it’s my pleasure today to announce the release of Debezium 0.9.1.Final!

This release fixes a couple of bugs that were reported after the 0.9 release. Most importantly, there are two fixes to the new Debezium connector for SQL Server, which deal with the correct handling of LSNs after connector restarts (DBZ-1128, DBZ-1131). The connector also uses more reasonable defaults for the selectMethod and fetchSize options of the SQL Server JDBC driver (DBZ-1065), which can help to significantly increase throughput and reduce memory consumption of the connector.

The MySQL connector now supports GENERATED columns with the new Antlr-based DDL parser (DBZ-1123), and the Postgres connector handles changes to primary key column definitions better (DBZ-997).

In terms of new features, there is a new container image provided on Docker Hub now: the debezium/tooling image contains a couple of open-source CLI tools (currently kafkacat, httpie, jq, mycli and pgcli) which greatly help when working with Debezium connectors, Apache Kafka and Kafka Connect on the command line (DBZ-1125). A big thank you to the respective authors of these fantastic tools!

CLI tools for working with Debezium

Altogether, 12 issues were resolved in this release. Please refer to the release notes to learn more about all fixed bugs, update procedures etc.

Thanks a lot to community members Ivan Lorenz and Tomaz Lemos Fernandes for their contributions to this release!

About Debezium

Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.

Get involved

We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Gitter, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.


Debezium 0.9.0.Final Released

I’m delighted to announce the release of Debezium 0.9 Final!

This release only adds a small number of changes since last week’s CR1 release; most prominently there are some more metrics for the SQL Server connector (lag behind master, number of transactions etc.) and two bug fixes related to the handling of partitioned tables in MySQL (DBZ-1113) and Postgres (DBZ-1118).

Having been in the works for six months after the initial Alpha release, Debezium 0.9 comes with a brand new connector for SQL Server, lots of new features and improvements for the existing connectors, updates to the latest versions of Apache Kafka and the supported databases as well as a wide range of bug fixes.

Some key features of the release besides the aforementioned CDC connector for SQL Server are:

  • Initial snapshotting for the Oracle connector (which remains a "tech preview" at this point)

  • Brand-new metrics for the SQL Server and Oracle connectors and extended metrics for the MySQL connector

  • Field filtering and renaming for MongoDB

  • A new handler interface for the embedded engine

  • Lots of improvements around the "event flattening" SMT for MongoDB

  • More detailed source info in CDC events and optional metadata such as a column’s source type

  • Option to delay snapshots for a given time

  • Support for HSTORE columns in Postgres

  • Incubating support for picking up changes to the whitelist/blacklist configuration of the MySQL connector

As a teaser on the connector metrics support, here’s a screenshot of Java Mission Control displaying the SQL Server connector metrics:

Monitoring the Debezium SQL Server connector

The list above is far from exhaustive; please take a look at the preview release announcements (Alpha1, Alpha2, Beta1, Beta2 and CR1) as well as the full list of a whopping 176 fixed issues in JIRA.

It’s hard to say which of the changes and new features I’m most excited about, but one thing that surely stands out is the tremendous amount of community work on this release. No fewer than 34 different members of Debezium’s outstanding community have contributed to this release. A huge "Thank You!" to all of you!

When upgrading from earlier Debezium releases, please make sure to read the information regarding update procedures and breaking changes in the release notes. One change relevant to users of the Debezium connector for MySQL is that our new Antlr-based DDL parser is now used by default; after lots of honing, we felt the time was right. While the existing parser can still be used as a fallback as of Debezium 0.9, it will be phased out in 0.10.

Next Steps

After some drinks to celebrate this release, the plan is to do a 0.9.1 release rather quickly (probably two weeks from now), providing improvements and potential bug fixes to the features and changes done in 0.9. We’ll also begin the work on Debezium 0.10; stay tuned for the details on that!

For further plans beyond that, take a look at our road map. Any suggestions and ideas are very welcome on the mailing list or in the comments below.

If you’re just about to begin using Debezium for streaming changes out of your database, you might be interested in joining us for the upcoming webinar on February 7th. After a quick overview, you’ll see Debezium in action, as it streams changes to a browser-based dashboard and more. You can also find lots of resources around Debezium and change data capture such as blog posts and presentations in our curated list of online resources.



Debezium 0.9.0.CR1 Released

Reaching the home stretch towards Debezium 0.9, it’s with great pleasure that I’m announcing the first release of Debezium in 2019, 0.9.0.CR1!

For this release we’ve mainly focused on sorting out remaining issues in the Debezium connector for SQL Server; the connector comes with greatly improved performance and has received a fair number of bug fixes.

Other changes include a new interface for event handlers of Debezium’s embedded engine, which allows for bulk handling of change events, an option to export the scale of numeric columns as schema parameter, as well as a wide range of bug fixes for the Debezium connectors for MySQL, Postgres and Oracle.

SQL Server Connector Improvements

The SQL Server connector now supports blacklisting of specific columns (DBZ-1067). That’s useful in cases where you’d like to exclude specific columns from emitted change data messages, e.g. due to data protection considerations.
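As a rough illustration, a connector registration that excludes two columns might look like the sketch below. The `column.blacklist` key and its value format, the connection settings, and the table and column names are all assumptions made for this example; please check the connector documentation for the exact option name and syntax.

```python
import json

# Hypothetical SQL Server connector registration; host, credentials and
# table/column names are placeholders, not values from the release.
connector = {
    "name": "inventory-connector",
    "config": {
        "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
        "database.hostname": "sqlserver",
        "database.port": "1433",
        "database.user": "sa",
        "database.password": "secret",
        "database.dbname": "testDB",
        "database.server.name": "server1",
        # Assumed option name and format: comma-separated list of
        # fully-qualified columns to leave out of change event messages.
        "column.blacklist": "dbo.customers.email,dbo.customers.phone",
    },
}

# The resulting JSON document is what would be sent to Kafka Connect.
payload = json.dumps(connector, indent=2)
print(payload)
```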

The "snapshot locking mode" option has been reworked (DBZ-947) and is named "snapshot isolation mode" now, better reflecting its semantics. A new mode "repeatable_read" has been added, and "none" has been renamed to "read_uncommitted". Please see the connector documentation and the migration notes for more details.
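If you maintain connector configurations in scripts, the rename could be applied with a small helper along these lines. This is only a sketch: the property keys `snapshot.locking.mode` and `snapshot.isolation.mode` are assumed from the option names above, and passing other values through unchanged is an assumption too; the migration notes are authoritative.

```python
# Assumed property keys; the text above only gives the display names
# "snapshot locking mode" and "snapshot isolation mode".
OLD_KEY = "snapshot.locking.mode"
NEW_KEY = "snapshot.isolation.mode"

# Value rename stated in the release announcement.
RENAMED_VALUES = {"none": "read_uncommitted"}


def migrate_snapshot_mode(config):
    """Return a copy of config with the old option renamed to the new one."""
    migrated = dict(config)
    if OLD_KEY in migrated:
        old_value = migrated.pop(OLD_KEY)
        # An explicitly configured new key wins over the migrated old one.
        migrated.setdefault(NEW_KEY, RENAMED_VALUES.get(old_value, old_value))
    return migrated


print(migrate_snapshot_mode({"snapshot.locking.mode": "none"}))
```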

The connector now allows for a much higher throughput, thanks to caching of timestamps for the same LSN (DBZ-1078). Please refer to the change log for details on bugs fixed in this connector. A massive "Thank You" is in order to Grzegorz Kołakowski, for his tireless work on and testing of this connector!

New Embedded Engine Handler Interface

Debezium’s embedded engine now comes with a new interface, ChangeConsumer, which event handlers can implement if they’d like to process change events in bulk (DBZ-1080). That can result in substantial performance improvements when pushing change events to APIs that apply batch semantics themselves, such as the Kinesis Producer Library. You can learn more in the embedded engine docs.
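To see why bulk handling helps, consider the following language-neutral sketch. It is not the Debezium API (the actual interface is Java, and its signature is not reproduced here); `BatchingSink` and both handler functions are hypothetical names that only illustrate the difference in round trips to a batch-oriented API.

```python
from typing import List


class BatchingSink:
    """Hypothetical stand-in for an API with batch semantics; counts calls."""

    def __init__(self) -> None:
        self.calls = 0
        self.received: List[str] = []

    def send(self, events: List[str]) -> None:
        self.calls += 1
        self.received.extend(events)


def handle_one_by_one(events: List[str], sink: BatchingSink) -> None:
    """Per-event handling: one call to the sink for every change event."""
    for event in events:
        sink.send([event])


def handle_batch(events: List[str], sink: BatchingSink) -> None:
    """Bulk handling: the whole batch is handed over in a single call."""
    sink.send(list(events))


events = ["insert:1", "update:1", "delete:1"]

single_sink = BatchingSink()
handle_one_by_one(events, single_sink)

bulk_sink = BatchingSink()
handle_batch(events, bulk_sink)

# Both sinks receive the same events, but with a different number of calls.
print(single_sink.calls, bulk_sink.calls)
```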

Misc. Changes and Bug Fixes

All the relational connectors now allow the scale of numeric columns to be propagated as a schema parameter (DBZ-1073). This is controlled via the column.propagate.source.type option and builds on the exposure of type name and width added in Debezium 0.8. All these schema parameters can be used when creating the schema of corresponding tables in sink databases.
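A sink-side consumer could, for instance, turn these parameters back into a column definition. The sketch below assumes the parameters arrive under the names `__debezium.source.column.type`, `__debezium.source.column.length` and `__debezium.source.column.scale`; treat these names, and the helper itself, as assumptions to verify against the connector documentation.

```python
# Assumed schema parameter names; verify them against the connector docs.
TYPE = "__debezium.source.column.type"
LENGTH = "__debezium.source.column.length"
SCALE = "__debezium.source.column.scale"


def sink_column_type(parameters):
    """Build a column type such as NUMERIC(10,2) from schema parameters."""
    type_name = parameters[TYPE]
    length = parameters.get(LENGTH)
    scale = parameters.get(SCALE)
    if length is None:
        return type_name
    if scale is None:
        return f"{type_name}({length})"
    return f"{type_name}({length},{scale})"


params = {TYPE: "NUMERIC", LENGTH: "10", SCALE: "2"}
print(sink_column_type(params))  # prints NUMERIC(10,2)
```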

Debezium’s container image for Apache Kafka can now be used to create and watch topics (DBZ-1057). You can also specify a clean-up policy when creating a topic (DBZ-1038).

The Debezium MySQL connector now handles unsigned SMALLINT columns as expected (DBZ-1063). For nullable columns with a default value, NULL values are correctly exported (DBZ-1064; previously, the default value would have been exported in that case).

The Postgres connector now handles tables without a primary key correctly (DBZ-1029). We’ve also applied a fix to make sure that the connector works with Postgres on Amazon RDS, which was recently broken by an update of wal2json in RDS (DBZ-1083). Going forward, we’re planning to set up CI jobs to test against Postgres on RDS in all the versions supported by the Debezium connector. This will help us to spot similar issues early on and react quickly.

Please see the change log for the complete list of all addressed issues.

This release wouldn’t have been possible without all the contributions by the following members of the Debezium community: Addison Higham, Amit Sela, Gagan Agrawal, Grzegorz Kołakowski, Ilia Bogdanov, Ivan Kovbas, Moira Tagle, Renato Mefi and Tony Rizko.

Thanks a lot!

Next Steps

The CR1 release took us a bit longer than anticipated. The release of Debezium 0.9.0.Final will therefore be moved to early February. Rather quickly thereafter we’re planning to release Debezium 0.9.1, which will provide improvements and potential bugfixes to the features added in 0.9.

For further plans beyond that, check out our road map. If you have any feedback or suggestions for future additions, please get in touch via the mailing list or in the comments below.



Debezium 0.9.0.Beta2 Released

With only a few days left for the year, it’s about time for another Debezium release; so it’s with great pleasure that I’m announcing Debezium 0.9.0.Beta2!

This release comes with support for MySQL 8 and Oracle 11g; it includes a first cut of metrics for monitoring the SQL Server and Oracle connectors, several improvements to the MongoDB event flattening SMT as well as a wide range of bug fixes. Overall, no fewer than 42 issues were addressed; very clearly, there has to be some deeper sense in that ;)

A big shout-out goes to the following members of Debezium’s amazing community, who contributed to this release: Eero Koplimets, Grzegorz Kołakowski, Hanlin Liu, Lao Mei, Renato Mefi, Tautvydas Januskevicius, Wout Scheepers and Zheng Wang!

In the following, let’s take a closer look at some of the changes coming with the 0.9 Beta2 release.

Monitoring and Metrics for the SQL Server and Oracle Connectors

Following the example of the MySQL connector, the connectors for SQL Server and Oracle now expose a range of metrics for monitoring purposes via JMX (DBZ-978). This includes values like the time since the last CDC event, offset of the last event, the total number of events, remaining and already scanned tables while doing a snapshot and much more. Please see the monitoring documentation for details on how to enable JMX. The following image shows an example of displaying the values in OpenJDK’s Mission Control tool:

Monitoring the Debezium SQL Server connector

We’re planning to expand the set of exposed metrics in future versions and also make them available for Postgres and MongoDB. Please let us know about the metrics you’d like to see by commenting on JIRA issue DBZ-1040.

As a bonus, we’ve also created a Grafana dashboard for visualizing all the relevant metrics:

Connector metrics in Grafana

We’ll blog about monitoring and the dashboard in more detail soon; but if you are interested, you already can take a look at this demo in our examples repository.

Misc. Features

The "snapshot.delay.ms" option already known from the Debezium MySQL connector is now available for all other Debezium connectors, too (DBZ-966). This comes in handy when deploying multiple connectors to a Kafka Connect cluster, which may cause the connectors in the cluster to be rebalanced, interrupting and restarting running snapshots of already deployed connector instances. This can be avoided by specifying a delay, which postpones snapshotting until the rebalancing phase is completed.
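Since the option takes milliseconds, a deployment script might set it with a small helper like the one below. The helper name and the two-minute delay are purely illustrative; a suitable delay depends on how long rebalancing takes in your cluster.

```python
def with_snapshot_delay(config, delay_seconds):
    """Return a copy of a connector config with snapshot.delay.ms set."""
    updated = dict(config)
    # Connector properties are strings; the option expects milliseconds.
    updated["snapshot.delay.ms"] = str(int(delay_seconds * 1000))
    return updated


# Hypothetical base configuration; only the delay matters for this example.
base_config = {"name": "inventory-connector"}
delayed = with_snapshot_delay(base_config, 120)
print(delayed["snapshot.delay.ms"])  # prints 120000
```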

The MongoDB CDC Event Flattening transformation received a number of improvements:

  • Support for MongoDB’s $unset operator (DBZ-612)

  • Support for full document updates (DBZ-987)

  • New option for dropping delete and tombstone messages (DBZ-563)

  • Option to convey the original type of operation as a header parameter (DBZ-971); that option is also available for the Flattening SMT for the relational connectors and can be useful in case sink connectors need to differentiate between inserts and updates

Bug fixes

As always, we’ve also fixed a good number of bugs reported by Debezium users. The set of fixed issues includes:

  • Several bugs related to streaming changes from MySQL in GTID mode (DBZ-923, DBZ-1005, DBZ-1008)

  • Handling of tables with reserved names in the SQL Server connector (DBZ-1031)

  • Potential event loss after MySQL connector restart (DBZ-1033)

  • Unchanged values of TOASTed columns caused the Postgres connector to fail (DBZ-842)

Please see the change log for the complete list of addressed issues.

Next Steps

We’re planning to do a candidate release of Debezium 0.9 in early January. Provided no critical issues show up, Debezium 0.9.0.Final should be out by the end of January. For the CR we’ve mostly scheduled a number of further bug fixes, improvements to the SQL Server connector and the addition of further metrics.

In parallel, we’ll focus our attention on the Oracle connector again, finally getting back to the long-awaited LogMiner-based capture implementation (DBZ-137). This will be a primary feature of Debezium 0.10.

In addition, we’ll spend some cycles on the blogging and demo side of things; namely we’re thinking about writing on and demoing the new monitoring and metrics support, HA architectures including failover with MySQL, HAProxy and Debezium, as well as enriching CDC events with contextual information such as the current user or use case identifiers. Stay tuned!

Going beyond 0.10, we also have some great plans for Debezium in the coming year. If you’d like to bring in your ideas, too, please let us know on the mailing list or in the comments below; we’re looking forward to hearing from you.

And with that, all that remains to be said is "Happy Festivus for the rest of us!"

Happy change data streaming and see you in 2019!



Debezium 0.9.0.Beta1 Released

It’s my pleasure to announce the release of Debezium 0.9.0.Beta1! Oh, and to those of you who are celebrating it — Happy Thanksgiving!

This new Debezium release comes with several great improvements to our work-in-progress SQL Server connector:

  • Initial snapshots can be done using the snapshot isolation level if enabled in the DB (DBZ-941)

  • Changes to the structures of captured tables after the connector has been set up are supported now (DBZ-812)

  • New connector option decimal.handling.mode (DBZ-953) and pass-through of any database.* option to the JDBC driver (DBZ-964)
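The last item above can be sketched as a configuration fragment. The value "double" for decimal.handling.mode and the pass-through of the JDBC driver’s `applicationName` property are assumptions picked for illustration; consult the connector and driver documentation for the supported values and properties.

```python
# Hypothetical base configuration of a SQL Server connector.
base_config = {
    "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
    "database.hostname": "sqlserver",
}

sqlserver_options = {
    # Option named in the release announcement; "double" is an assumed value.
    "decimal.handling.mode": "double",
    # database.* entries are handed to the JDBC driver; the property and
    # value chosen here are illustrative assumptions.
    "database.applicationName": "debezium",
}

# Later keys override earlier ones when merging the dictionaries.
config = {**base_config, **sqlserver_options}
for key in sorted(config):
    print(f"{key}={config[key]}")
```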

Besides that, we spent some time on supporting the latest versions of the different databases. The Debezium connectors now support Postgres 11 (DBZ-955) and MongoDB 4.0 (DBZ-974). We are also working on supporting MySQL 8.0, which should be completed in the next 0.9.x release. The Debezium container images have been updated to Kafka 2.0.1 (DBZ-979) and the Kafka Connect image now supports the STATUS_STORAGE_TOPIC environment variable, bringing consistency with CONFIG_STORAGE_TOPIC and OFFSET_STORAGE_TOPIC that already were supported before (DBZ-893).
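For example, the three storage topics could be passed to the Kafka Connect container as follows. This sketch only assembles and prints a `docker run` command line; the broker address, group id, topic names and image tag are placeholders chosen for illustration.

```python
import shlex

# Environment for the Kafka Connect container; all values are placeholders.
env = {
    "BOOTSTRAP_SERVERS": "kafka:9092",
    "GROUP_ID": "1",
    "CONFIG_STORAGE_TOPIC": "my_connect_configs",
    "OFFSET_STORAGE_TOPIC": "my_connect_offsets",
    "STATUS_STORAGE_TOPIC": "my_connect_statuses",
}

command = ["docker", "run", "--rm", "-p", "8083:8083"]
for name, value in env.items():
    command += ["-e", f"{name}={value}"]
command.append("debezium/connect:0.9")  # assumed image tag

# Quote each argument so the line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in command))
```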

As usual, several bugs were fixed, too. Several of them dealt with the new Antlr-based DDL parser for the MySQL connector. By now we feel confident about its implementation, so it’s the default DDL parser as of this release (DBZ-757). If you would like to continue to use the legacy parser for some reason, you can do so by setting the ddl.parser.mode connector option to "legacy". This implementation will remain available in the lifetime of Debezium 0.9.x and is scheduled for removal after that. So please make sure to log issues in JIRA should you run into any problems with the Antlr parser.
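Switching back to the legacy parser is a one-line change in the connector configuration, sketched here as a small helper. The helper name and the base configuration are hypothetical; the option name and value are the ones given above.

```python
def use_legacy_ddl_parser(config):
    """Return a copy of a MySQL connector config using the legacy parser."""
    updated = dict(config)
    updated["ddl.parser.mode"] = "legacy"
    return updated


# Hypothetical MySQL connector configuration.
mysql_config = {"connector.class": "io.debezium.connector.mysql.MySqlConnector"}
print(use_legacy_ddl_parser(mysql_config))
```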

Overall, this release contains 21 fixes. Thanks a lot to all the community members who helped with making this happen: Anton Martynov, Deepak Barr, Grzegorz Kołakowski, Olavi Mustanoja, Renato Mefi, Sagar Rao and Shivam Sharma!

What else?

While the work towards Debezium 0.9 continues, we’ve lately been quite busy with presenting Debezium at multiple conferences. You can find the slides and recordings from Kafka Summit San Francisco and Voxxed Days Microservices on our list of online resources around Debezium.

There you can also find the links to the slides of the great talk "The Why’s and How’s of Database Streaming" by Joy Gao of WePay, one of Debezium’s earliest users, as well as the link to a blog post by Hans-Peter Grahsl about setting up a CDC pipeline from MySQL into Cosmos DB running on Azure. If you know about other great articles, session recordings or similar on Debezium and change data capture which should be added there, please let us know.


