Debezium Blog

SQL Server offers change data capture (CDC) through change tables - a special set of system tables that record modifications in selected “ordinary” tables. If you want to monitor changes in real time, you query these change tables periodically. That’s exactly how Debezium works today: it polls SQL Server’s change tables at configured intervals and turns the results into a continuous stream of CDC records. This approach works fine, but could we do better?
Change tables are filled by the SQL Server Agent, which reads the transaction log, extracts the changes, and stores them there. In theory, we could skip the middleman and parse the transaction log directly. That’s how tools like OpenLogReplicator handle CDC for Oracle databases. Let’s peek inside the SQL Server internals and explore a little how the transaction log works and how it stores its records.
In this post, we’ll:
- Prepare a local SQL Server instance for experimentation
- Explore the internal structure of the SQL Server transaction log
- Understand how the records are stored on the disk
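
To make the polling approach described above more concrete, here is a minimal sketch in Python. It is only an illustration, not Debezium's actual code: the `ChangeRow` type, the `poll_changes` function, and the in-memory list standing in for a change table are all hypothetical, and the integer `lsn` is a simplified stand-in for SQL Server's binary log sequence numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRow:
    """Hypothetical, simplified row of a change table."""
    lsn: int          # stand-in for a log sequence number
    operation: str    # "insert", "update" or "delete"
    data: dict


def poll_changes(change_table, last_lsn):
    """Return the rows newer than last_lsn, in LSN order, plus the new high-water mark."""
    new_rows = sorted(
        (row for row in change_table if row.lsn > last_lsn),
        key=lambda row: row.lsn,
    )
    new_last_lsn = new_rows[-1].lsn if new_rows else last_lsn
    return new_rows, new_last_lsn


# Simulated change table; in a real setup this would be a query against the database.
change_table = [
    ChangeRow(1, "insert", {"id": 1, "name": "Alice"}),
    ChangeRow(2, "update", {"id": 1, "name": "Alicia"}),
]

# First poll: everything after position 0 is new.
first_batch, last_lsn = poll_changes(change_table, 0)

# A new change arrives between polls.
change_table.append(ChangeRow(3, "delete", {"id": 1}))

# Second poll: only the row added since the previous poll is returned.
second_batch, last_lsn = poll_changes(change_table, last_lsn)
```

Remembering the last position seen is what turns repeated polls into a continuous stream without duplicates. The cost is latency: a change becomes visible only at the next poll.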

Although the end of summer is near, the Debezium team has a hot-off-the-press preview release available with a fresh batch of improvements and enhancements. Debezium 3.3.0.Beta1 brings a variety of stability fixes, performance optimizations, and user experience improvements across the connector ecosystem. Let’s take a look at what those are.

Debezium 3.2.2.Final delivers critical stability improvements, including a fix for potential data loss during failed ad-hoc blocking snapshots, resolution of confusing connector startup errors, and enhanced JMX throughput metrics for Oracle LogMiner.

Debezium 3.3.0.Alpha2 is out, bringing key fixes and powerful enhancements!
Highlights include heartbeat handling fixes, the ability to start MongoDB streaming from a precise oplog position, faster PostgreSQL TOAST performance, extended TSVECTOR support in the JDBC sink, and improved publication DDL handling in PostgreSQL. The Debezium Platform also gets major usability boosts with clearer error messages, fine-grained UI logging, and better source/destination definitions.

Most engineers working in data streaming are not SQL specialists. So you might be asking yourself: What is a CTE? More importantly, what are CTE queries, why are they useful, and how do they help you in the context of Debezium?
In this post, we’ll answer those questions, explore how the Debezium Oracle connector leverages CTE queries, and discuss the benefits and trade-offs involved.
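
As a quick illustration of the term: a CTE (common table expression) is a named subquery introduced with `WITH` that the rest of the statement can refer to like a table. The sketch below uses Python's built-in `sqlite3` module purely for convenience. The `log_events` table and its contents are made up for this example, and the query is not the one the Debezium Oracle connector runs.

```python
import sqlite3

# In-memory database with a hypothetical table of log events.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log_events (txn_id INTEGER, operation TEXT)")
conn.executemany(
    "INSERT INTO log_events VALUES (?, ?)",
    [
        (1, "INSERT"),
        (1, "COMMIT"),
        (2, "UPDATE"),   # transaction 2 never commits
        (3, "DELETE"),
        (3, "COMMIT"),
    ],
)

# The CTE "committed" names the set of committed transaction ids once;
# the main query then joins against it like an ordinary table.
query = """
WITH committed AS (
    SELECT txn_id FROM log_events WHERE operation = 'COMMIT'
)
SELECT e.txn_id, e.operation
FROM log_events AS e
JOIN committed AS c ON c.txn_id = e.txn_id
WHERE e.operation <> 'COMMIT'
ORDER BY e.txn_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Here the query returns only the changes that belong to committed transactions, and the intermediate result has a readable name instead of being buried in a nested subquery.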