Welcome to the first edition of the Debezium community newsletter in which we share blog posts, group discussions, as well as StackOverflow questions that are relevant to our user community.
Gunnar Morling recently attended Kafka Summit in London where he gave a talk on Change Data Streaming Patterns for Microservices With Debezium. You can watch the full presentation here.
Strimzi provides an easy way to run Apache Kafka on Kubernetes or OpenShift. This article by Sincy Sebastian shows just how simple it is to replicate change events from MySQL to Elasticsearch using Debezium.
Debezium allows replicating data between heterogeneous data stores with ease. This article by Matthew Groves explains how you can replicate data from MySQL to Couchbase.
As the size of data that systems maintain continues to grow, this begins to impact how we capture, compute, and report real-time analytics. This article by Maria Patterson explains how you can use Debezium to stream data from Postgres, perform analytical calculations using KSQL, and then stream those results back to Postgres for consumption.
In a recent article published in Portuguese, Paulo Singaretti illustrates how they use Debezium and Kafka to stream changes from their relational database and then store the change stream results in Google Cloud Services.
This recent blog by Jia Zhai provides a complete tutorial showing how to use Debezium connectors with Apache Pulsar.
Debezium version 0.9.5 was just released. If you are using the 0.9 branch you should definitely check out 0.9.5. For details on the bug fixes as well as the enhancements this version includes, check out the release notes.
We intend to publish new editions of this newsletter periodically. If you have suggestions for changes or for what could be highlighted here, we welcome that feedback. You can reach out to us via any of our community channels found here.
Hello everyone, my name is Chris Cranford and I recently joined the Debezium team.
My journey at Red Hat began just over three years ago; however, I have been in this line of work for nearly twenty years. Throughout my career, I have advocated and supported open source software. Many of my initial software endeavors were based on open source software, several of which are still heavily used today, such as Hibernate ORM.
When I first joined Red Hat, I had the pleasure to work on the Hibernate ORM team. I had been an end user of the project since 2.0, so it was an excellent fit to be able to contribute full time to a project that had served me well in the corporate world n-times over.
It wasn’t long ago that @gunnarmorling and I had a brief exchange about Debezium. I had not heard of the project, and I was super stoked because I immediately saw parallels between its goals and those of Hibernate Envers, a change data capture solution based on Hibernate’s event framework that I was maintaining at the time.
I believe one of my first "wow" moments was when I realized how well Debezium fits into the micro-services world. The idea of being able to share data between micro-services in a very decoupled way is a massive win for building reusable components and minimizing technical debt.
Debezium just felt like the next logical step. There are so many new and exciting things to come, and the team and I cannot wait to share them.
So let’s get started!
When I first learned about the Debezium project last year, I was very excited about it right away.
I could see how this project would be very useful for many people out there and I was very impressed by the professional way it was set up: a solid architecture for change data capture based on Apache Kafka, a strong focus on robustness and correctness also in the case of failures, the overall idea of creating a diverse eco-system of CDC connectors. All that based on the principles of open source, combined with extensive documentation from day one, a friendly and welcoming web site and a great getting-started experience.
So you can imagine that I was more than enthusiastic about the opportunity to take over the role of Debezium’s project lead. Debezium and CDC have close links to some data-centric projects I’ve previously worked on and also tie in with ideas I’ve been pursuing around CQRS, event sourcing and denormalization. As a core member of the Hibernate team at Red Hat, I’ve implemented the initial Elasticsearch support for Hibernate Search (which deals with full-text index updates via JPA/Hibernate). I’ve also contributed to Hibernate OGM - a project which connects JPA and the world of NoSQL. One of the plans for OGM is to create a declarative denormalization engine for creating read models optimized for specific use cases. It will be very interesting to see how this plays together with the capabilities provided by Debezium.
Currently I am serving as the lead of the Bean Validation 2.0 specification (JSR 380) as well as its reference implementation Hibernate Validator. Two other projects close to my heart are MapStruct - a code generator for bean-to-bean mappings - and ModiTect, which is tooling for Java 9 modules and their descriptors. In general, I’m a strong believer in the idea of open source and I just love working with folks from all over the world to create useful tools and libraries.
Joining the Debezium community and working on change data capture is a great next step. There are so many things to do: connectors for Oracle, SQL Server and Cassandra, but also things like an entity join processor which would make it possible to step from row-level events to more aggregated business-level events (e.g. for updating a combined search index for an order and its order lines) or tooling for managing and visualizing histories of event schema changes.
One thing I’d like to emphasize is that the project’s direction generally isn’t going to change very much. Red Hat is fully committed to maintaining and evolving the project together with you, the Debezium community. The ride really has just begun!
Finally, let me say a huge thank you to Randall for his excellent work! You’ve been a true role model for taking an idea from pitching it - within Red Hat as well as within the wider community - to building a steadily growing and evolving project. It’s stating the obvious, but there would be no Debezium without you. Thanks for everything, and I’m looking forward very much to working with you and the community on this great project!
Just before I started the Debezium project in early 2016, Martin Kleppmann gave several presentations about turning the database inside out and how his Bottled Water project demonstrated the important role that change data capture can play in using Kafka for stream processing. Then Kafka Connect was announced, and at that point it seemed obvious to me that Kafka Connect was the foundation upon which practical and reusable change data capture could be built. As these techniques and technologies were becoming more important to Red Hat, I was given the opportunity to start a new open source project and community around building great CDC connectors for a variety of database management systems.
Over the past few years, we have created Kafka Connect connectors for MySQL, then MongoDB, and most recently PostgreSQL. Each was initially limited and had a number of problems and issues, but over time more and more people have tried the connectors, asked questions, answered questions, mentioned Debezium on Twitter, tested connectors in their own environments, reported problems, fixed bugs, discussed limitations and potential new features, implemented enhancements and new features, improved the documentation, and written blog posts. Simply put, people with similar needs and interests have worked together and have formed a community. Additional connectors for Oracle and SQL Server are in the works, but could use some help to move things along more quickly.
It’s really exciting to see how far we’ve come and how the Debezium community continues to evolve and grow. And it’s perhaps as good a time as any to hand the reins over to someone else. In fact, after nearly 10 wonderful years at Red Hat, I’m making a bigger change and as of today am part of Confluent’s engineering team, where I expect to play a more active role in the broader Kafka community and to work more directly with Kafka Connect and Kafka Streams. I definitely plan to stay involved in the Debezium community, but will no longer be leading the project. That role will instead be filled by Gunnar Morling, who recently joined the Debezium community but has extensive experience in open source, the Hibernate community, and the Bean Validation specification effort. Gunnar is a great guy and an excellent developer, and will be an excellent lead for the Debezium community.
Will the Debezium project change? To some degree it will always continue to evolve just as it has from the very beginning, and that’s a healthy thing. But a lot is staying the same. Red Hat remains committed to the Debezium project, and will continue its sponsorship and community-oriented governance that has worked so well from the beginning. And just as importantly, we the community are still here and will continue building the best open source CDC connectors.