Debezium Blog

The modern data landscape bears little resemblance to the centralized databases and simple ETL processes of the past. Today’s organizations operate in environments characterized by diverse data sources, real-time streaming, microservices architectures, and multi-cloud deployments. What began as straightforward data flows from operational systems to reporting databases has evolved into complex networks of interconnected pipelines, transformations, and dependencies. The shift from ETL to ELT patterns, the adoption of data lakes, and the proliferation of streaming platforms like Apache Kafka have created unprecedented flexibility in data processing. However, this flexibility comes at a cost: understanding how data moves, transforms, and evolves through these systems has become increasingly challenging.

Understanding data lineage

Data lineage is the process of tracking the flow and transformations of data from its origin to its final destination. It essentially maps the "life cycle" of data, showing where it comes from, how it’s changed, and where it ends up within a data pipeline. This includes documenting all transformations, joins, splits, and other manipulations the data undergoes during its journey.

At its core, data lineage answers critical questions: Where did this data originate? What transformations has it undergone? Which downstream systems depend on it? When issues arise, where should teams focus their investigation?
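These questions can be made concrete with a small sketch. The following is a minimal illustration, not a description of any particular lineage tool: it models lineage as a directed graph in which an edge from one dataset to another means the second is derived from the first. The `LineageGraph` class, its method names, and all dataset names are hypothetical and chosen only for the example.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Directed graph: an edge source -> target means target is derived from source."""

    def __init__(self):
        self._downstream = defaultdict(dict)  # source -> {target: transformation}
        self._upstream = defaultdict(dict)    # target -> {source: transformation}

    def add_step(self, source, target, transformation):
        """Record that `target` is produced from `source` by `transformation`."""
        self._downstream[source][target] = transformation
        self._upstream[target][source] = transformation

    def _reachable(self, start, edges):
        # Breadth-first traversal over the given edge map, excluding `start` itself.
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in edges.get(node, {}):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        return seen

    def origins(self, dataset):
        """Where did this data originate? Ancestors that have no upstream of their own."""
        ancestors = self._reachable(dataset, self._upstream)
        return {d for d in ancestors if not self._upstream.get(d)}

    def dependents(self, dataset):
        """Which downstream datasets depend, directly or indirectly, on this one?"""
        return self._reachable(dataset, self._downstream)

    def transformations_into(self, dataset):
        """What transformations directly produce this dataset, keyed by source?"""
        return dict(self._upstream.get(dataset, {}))


# Hypothetical pipeline used only for illustration.
graph = LineageGraph()
graph.add_step("orders_db.orders", "kafka.orders", "change data capture")
graph.add_step("kafka.orders", "warehouse.orders_clean", "deduplicate")
graph.add_step("warehouse.orders_clean", "warehouse.orders_enriched", "join on customer_id")
graph.add_step("crm.customers", "warehouse.orders_enriched", "join on customer_id")
graph.add_step("warehouse.orders_enriched", "dashboard.revenue", "aggregate by day")

print(sorted(graph.origins("dashboard.revenue")))
print(sorted(graph.dependents("kafka.orders")))
```

When a problem shows up in `dashboard.revenue`, walking upstream narrows the investigation to the two origin datasets and the steps between them, while walking downstream from `kafka.orders` shows what a change to that topic could affect.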

Welcome to the third installment of our series on Debezium signaling and notifications. In this article, we look at how to enable and manage these features using the JMX channel.

We will also explore how to send signals and get notifications through the REST API leveraging Jolokia.

Welcome to this series of articles dedicated to signaling and notifications in Debezium! This post serves as the first installment in the series, where we will introduce the signaling and notification features offered by Debezium and discuss the available channels for interacting with the platform.

In the subsequent parts of this series, we will delve deeper into customizing signaling channels and explore additional topics such as JMX signaling and notifications.

One typical Debezium use case is using change data capture to integrate a legacy system with other systems in the organization. There are multiple ways to achieve this goal:

  • Write data to Kafka using Debezium and follow with a combination of Kafka Streams pipelines and Kafka Connect connectors to deliver the changes to other systems

  • Use the Debezium embedded engine in a standalone Java application and write the integration code in plain Java; this approach is often used to send change events to alternative messaging infrastructure such as Amazon Kinesis or Google Pub/Sub

  • Use an existing integration framework or service bus to express the pipeline logic

This article focuses on the third option: a dedicated integration framework.