Debezium Blog

In this blog post, we are going to discuss how Delhivery, the leading supply chain services company in India, is using Debezium to power a lot of different business use-cases ranging from driving event driven microservices, providing data integration and moving operational data to a data warehouse for real-time analytics and reporting. We will also take a look at the early mistakes we made when integrating Debezium and how we solved them so that any future users can avoid them, discuss one of the more challenging production incidents we faced and how Debezium helped ensure we could recover without any data loss. In closing, we discuss what value Debezium has provided us, areas where we believe there is a scope for improvement and how Debezium fits into our future goals.

One of the typical Debezium uses cases is to use change data capture to integrate a legacy system with other systems in the organization. There are multiple ways how to achieve this goal

  • Write data to Kafka using Debezium and follow with a combination of Kafka Streams pipelines and Kafka Connect connectors to deliver the changes to other systems

  • Use Debezium Embedded engine in a Java standalone application and write the integration code using plain Java; that’s often used to send change events to alternative messaging infrastructure such as Amazon Kinesis, Google Pub/Sub etc.

  • Use an existing integration framework or service bus to express the pipeline logic

This article is focusing on the third option - a dedicated integration framework.

Release early, release often! After the 1.1 Beta1 and 1.0.1 Final releases earlier this week, I’m today happy to share the news about the release of Debezium 1.1.0.Beta2!

The main addition in Beta2 is support for integration tests of your change data capture (CDC) set-up using Testcontainers. In addition, the Quarkus extension for implementing the outbox pattern as well as the SMT for extracting the after state of change events have been re-worked and offer more configuration flexibility now.

It’s my pleasure to announce the release of Debezium 1.1.0.Beta1!

This release adds support for transaction marker events, an incubating connector for the IBM Db2 database as well as a wide range of bug fixes. As the 1.1 release still is under active development, we’ve backported an asorted set of bug fixes to the 1.0 branch and released Debezium 1.0.1.Final, too.

At the time of writing this, not all connector archives have been synched to Maven Central yet; this should be the case within the next few others.

This article is a dive into the realms of Event Sourcing, Command Query Responsibility Segregation (CQRS), Change Data Capture (CDC), and the Outbox Pattern. Much needed clarity on the value of these solutions will be presented. Additionally, two differing designs will be explained in detail with the pros/cons of each.

So why do all these solutions even matter? They matter because many teams are building microservices and distributing data across multiple data stores. One system of microservices might involve relational databases, object stores, in-memory caches, and even searchable indexes of data. Data can quickly become lost, out of sync, or even corrupted therefore resulting in disastrous consequences for mission critical systems.

Solutions that help avoid these serious problems are of paramount importance for many organizations. Unfortunately, many vital solutions are somewhat difficult to understand; Event Sourcing, CQRS, CDC, and Outbox are no exception. Please look at these solutions as an opportunity to learn and understand how they could apply to your specific use cases.

As you will find out at the end of this article, I will propose that three of these four solutions have high value, while the other should be discouraged except for the rarest of circumstances. The advice given in this article should be evaluated against your specific needs, because, in some cases, none of these four solutions would be a good fit.