Welcome to the newest edition of the Debezium community newsletter, in which we share all things CDC-related, including blog posts, group discussions, and Stack Overflow questions relevant to our user community.

It’s been a long time since our last edition, but we are back again! In case you missed it, you can check it out here.

Upcoming Events

Due to the ongoing global pandemic, conferences and meetups have all gone virtual. On the bright side, this means you get to attend some nice events from the comfort of your couch:

If you’d like to have a session on Debezium at your virtual meetup or conference, please get in touch!

Articles

There have been several blog posts about Debezium lately; here are some of the latest ones that you should not miss:

And if watching a talk is more your kind of thing, here’s the recording of the session Change Data Streaming Patterns in Distributed Systems from this year’s Berlin Buzzwords, by Gunnar Morling and Hans-Peter Grahsl:

Please also check out our compiled list of resources around Debezium for even more related posts, articles, podcasts and presentations.

Integrations

A few cool integrations and usages of Debezium have appeared over the last few weeks and months. Here are several that we found especially fascinating:

Examples

If you are getting started with Debezium, the examples and demos in our examples repository offer hands-on learning and a better understanding of how things work. We have introduced several new examples and updated the existing ones. Here are some new additions we’d like to highlight:

If you are interested in showcasing a new demo or an example, please send us a GitHub pull request or reach out to us directly through our community channels found here.

Time to Upgrade

Debezium version 1.6.0.Final was released last week. Apart from Debezium Server sinks for Apache Kafka and Pravega, the 1.6 release brought a brand-new feature for incremental and ad-hoc snapshots. It provides long-awaited capabilities such as resuming long-running snapshots after a connector restart, re-snapshotting selected tables during streaming, and snapshotting tables newly added to the list of captured tables after a change to the filter configuration. A big shout-out to Netflix engineers Andreas Andreakis and Ioannis Papapanagiotou for their paper DBLog: A Watermark Based Change-Data-Capture Framework, upon which incremental snapshotting is based.
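To give a rough idea of how a watermark-based approach can reconcile snapshot rows with changes that happen while the snapshot is running, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not Debezium's actual implementation: the event tuples, the `LOW`/`HIGH` marker names, and the function `merge_chunk_with_log` are all hypothetical. The idea is that a chunk of rows is read between two watermark events written to the log; any row in the chunk whose key also shows up in the log between those watermarks is dropped, because the log event is the more recent version.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical marker names for the two watermark events (illustration only).
LOW = "LOW_WATERMARK"
HIGH = "HIGH_WATERMARK"

# A log event is modelled as (kind, primary_key, row); all names are assumptions.
Event = Tuple[str, Optional[int], Any]


def merge_chunk_with_log(chunk: Dict[int, Any], log_events: List[Event]) -> List[Event]:
    """Interleave one snapshot chunk with the change log.

    `chunk` maps primary key -> row as read by the snapshot query, which is
    assumed to have run between the LOW and HIGH watermark events.
    """
    buffer = dict(chunk)          # snapshot rows waiting to be emitted
    output: List[Event] = []
    in_window = False

    for kind, pk, row in log_events:
        if kind == LOW:
            in_window = True      # chunk window opens
            continue
        if kind == HIGH:
            in_window = False     # chunk window closes: flush surviving rows
            output.extend(("read", k, v) for k, v in buffer.items())
            buffer.clear()
            continue
        if in_window:
            # The log holds a newer version of this key, so the snapshot
            # row for it is stale and must not be emitted.
            buffer.pop(pk, None)
        output.append((kind, pk, row))

    return output


example_chunk = {1: "a", 2: "b", 3: "c"}
example_log = [
    ("update", 9, "z"),
    (LOW, None, None),
    ("update", 2, "b2"),
    ("delete", 3, None),
    (HIGH, None, None),
    ("insert", 4, "d"),
]
merged = merge_chunk_with_log(example_chunk, example_log)
```

In this example only the snapshot row for key 1 survives as a `read` event, since keys 2 and 3 were changed inside the window and their log events already carry the latest state.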

Given the long time since the last community newsletter, it’s also worth mentioning some of the new features added in Debezium 1.5, released in April this year: the MySQL connector saw a substantial rewrite and now also supports transaction marker events; Debezium’s LogMiner-based CDC implementation for Oracle was declared stable; and we’ve added support for Redis Streams to Debezium Server.

If you are using an older version, we urge you to check out the latest major release. For details on all the bug fixes, enhancements, and improvements, check out the release notes.

The Debezium team has also begun active development on the next version, 1.7. The major focus in 1.7 is implementing incremental snapshotting for more connectors (MongoDB, Oracle), reworking the transaction buffer for the Oracle connector, and expanding the Debezium UI. For more details on upcoming releases, check out the Debezium roadmap.

You can keep track of bug fixes, enhancements, and changes that will be coming up in the 1.7 release by visiting our releases page.

Getting Involved

Getting started with a large, existing code base can be intimidating, but we want to make that process as easy and smooth for you as possible. We are now a vibrant community with 270+ contributors overall, and we welcome all kinds of community contributions, discussions, and enhancements. If you are a beginner and want to dive in quickly, you can grab one of the issues labeled easy-starter. Below is a list of issues that are open to grab:

  • Document "schema.include.list"/"schema.exclude.list" for SQL Server connector (DBZ-2793)

  • Limit log output for "Streaming requested from LSN" warnings (DBZ-3007)

  • Create smoke test to make sure Debezium Server container image works (DBZ-3226)

  • Add signal table automatically to include list (DBZ-3293)

  • Implement support for JSON_TABLE in MySQL parser (DBZ-3575)

  • Implement window function in MySQL parser (DBZ-3576)

  • Standardize "snapshot.fetch.size" default values across connectors (DBZ-3694)

If you are new to open source, please check out our contributing guidelines to get started!

Call to Action

Our community users page includes a variety of organizations that are currently using Debezium. If you are a Debezium user and would like to be included, please send us a GitHub pull request or reach out to us directly through our community channels found here.

And if you haven’t yet done so, please consider adding a ⭐ for the GitHub repo; keep them coming, we’re almost at 5,000 stars!

Also, we’d like to learn about your requirements for future Debezium versions. In particular, we’d be very curious about your feedback on the CDC-based Sagas approach mentioned above. Is it something you’d like to see supported in our Quarkus extension, for instance? Please let us know about this, as well as any other feedback you may have, via the Debezium mailing list.

Lastly, we’re planning to continue our interview series Debezium Community Stories With…; so if you have exciting stories to tell about your usage of Debezium, please reach out!

And as always, stay safe and healthy. We wish you and your loved ones good health and strength.

Anisha Mohanty

Anisha is a Software Engineer at Red Hat, currently working with the Debezium team. She lives in Bangalore, India.

   


About Debezium

Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.

Get involved

We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.