A Happy New Year to the Debezium Community!
May all your endeavours be successful, your data be consistent, and most importantly, everyone stay safe and healthy. With 2020 in the books, I thought it’d be nice to take a look back and do a quick recap of what has happened around Debezium over the last year.
First, some facts and numbers for you stats lovers out there:
- After the release of Debezium 1.0 in December 2019, we successfully released a stable Debezium version at the end of each quarter, with preview releases roughly every three weeks[1]
- About 1,400 commits in the core repo (plus many more in the other ones), 36 blog posts and release announcements, 166 threads on the mailing list (if the query in my Google inbox is to be trusted)
- About 100 new contributors, bringing the overall number of people contributing to the Debezium core repo to 245, plus additional people contributing to the other repositories of the Debezium GitHub organization
- The first GA release of the commercially supported Debezium offering by Red Hat, as part of Red Hat Integration
- Two new members on the core engineering team; the more, the merrier!
- About 1,600 additional GitHub ⭐s for the Debezium core repo, bringing the total number of stargazers to more than 4,100
While those figures give a nice impression of the overall activity of Debezium, they don’t really tell what has been happening exactly. What’s behind the numbers? Here are some of my personal Debezium highlights from the last year:
- Two new, community-led Debezium connectors for Db2 and Vitess; a big shout-out to the engineers of IBM and Bolt, respectively, for stepping up and taking the lead on these connectors!
- Besides these new connectors, each of the releases brought a wide range of new features; some of the things I’m most excited about are Debezium Server for integrating Debezium with messaging infrastructure like Apache Pulsar, AWS Kinesis, Google Cloud Pub/Sub, and Azure Event Hubs, the Quarkus extension for implementing the outbox pattern, the new LogMiner-based connector implementation for ingesting change events from Oracle, transaction markers, support for CloudEvents, and so much more!
- Integration of Debezium by multiple open-source projects, e.g. Apache Flink, Spring Cloud Stream, Hazelcast Jet, and Apache Camel. Further integrators of Debezium include Materialize, Google Cloud DataFlow, and Heroku’s streaming data connectors. Here on this blog, we also discussed how to integrate and use Debezium with technologies such as Testcontainers, the Apicurio API and schema registry, and OpenTracing.
- Debezium being listed at "Trial" level on the ThoughtWorks Tech Radar
- A proof-of-concept for a graphical user interface for configuring and operating Debezium; stay tuned for more details here, as this is currently in the process of being built out for other connectors
The year also brought a large number of blog posts and presentations from the community about their experiences with Debezium. You can find our full list of Debezium-related resources here (please send a PR to add anything you think should be listed there). Some pieces I particularly enjoyed include:
- "Managing Data Consistency Among Microservices with Debezium" by Justin Chao
- "Change Data Capture with Flink SQL and Debezium" by Marta Paes
- "Microservices & Data: Implementing the Outbox Pattern with Debezium" by Thorben Janssen
- "ASAP! – The Storified Demo of Introduction to Debezium and Kafka on Kubernetes" by Aykut Bulgu
- "Setting up PostgreSQL for Debezium" by Michał Mackiewicz
- "A year and a half with Debezium: CDC With MySQL" by Midhun Sukumaran
- "Debezium on OpenShift Cheat Sheet" by Abdellatif Bouchama
- "Implementing the Transactional Outbox pattern with Debezium in Quarkus" by Iain Porter
- "Analysing Changes with Debezium and Kafka Streams" by Mike Fowler
- "(De)coupling yourself" by Dina Bogdan
- "Kafka Connect: How to create a real time data pipeline using Change Data Capture (CDC)" by Francisco Lima
- "Tutorial: Set up a Change Data Capture architecture on Azure using Debezium, Postgres and Kafka" by Abhishek Gupta
It is just so amazing to see how engaged and helpful this community is. A big thank you to everyone for writing and talking about your experiences with Debezium and change data capture!
I think 2020 has been a great year for the Debezium community, and I couldn’t be happier about all the things we’ve achieved together. Again, a huge thank you to each and every one of you in the community contributing to the project, be it by implementing features and bug fixes, reporting issues, engaging in discussions, answering questions on Stack Overflow, helping to spread the word in blog posts and conference talks, or otherwise!
What’s on the roadmap for this year? It’s fair to say: "A lot" :) For example, we’d like to rework the way snapshots are done: they should be parallelizable, updates to the include/exclude filters should be possible, and more. The Debezium UI will see substantial expansion and improvements. We’re planning to conduct systematic performance profiling and to address the bottlenecks it identifies. There may be official support for MariaDB, as well as an operator for running Debezium Server on Kubernetes. Plus some super-cool things I cannot talk about at this point yet :)
Onwards and Upwards!
Gunnar Morling
Gunnar is a software engineer at Decodable and an open-source enthusiast at heart. He has been the project lead of Debezium for many years. Gunnar has created open-source projects like kcctl, JfrUnit, and MapStruct, and is the spec lead for Bean Validation 2.0 (JSR 380). He’s based in Hamburg, Germany.
About Debezium
Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.
Get involved
We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.