This post is the final part of a three-part series exploring the use of Debezium to ingest changes from an Oracle database using Oracle LogMiner. In case you missed it, the first installment of this series is found here and the second installment is found here.

In this third and final installment, we are going to build on what we have done in the previous two posts, focusing on performance tuning and on debugging common problems.

What to know before we start

This installment is going to focus on a number of highly technical Oracle database features, which often fall within an Oracle database administrator’s area of expertise. Depending on your permissions and roles within your environment, you may not necessarily have access to all the tools and commands discussed herein, so you may need to coordinate with your database administrator if you are working in a non-local environment.

Additionally, this post picks up where we left off in Part 2. We had previously deployed a full Kafka ecosystem with Zookeeper, a Kafka broker, and a Kafka Connect environment. These are not strictly a requirement for this series, but if you would like to follow along manually, I encourage you to follow the steps in the earlier parts of the series to quickly get started.

What is Oracle and why is it complex?

I touched on this briefly in Part 1, but I believe it bears a much deeper answer to support some of the technical concepts we’re going to discuss today.

A common question from newcomers or those unfamiliar with Oracle is, "Why is everything with Oracle always complex?". Most often, these users either have little Oracle database experience or have exposure to other open source database solutions that, in broad, general terms, are easier and simpler to use, particularly directly out of the box. So why is that?

Oracle was first released to the public over four decades ago, in 1979, making it both the oldest and the most popular database in the world. Among the top five most popular databases, Microsoft SQL Server is the next oldest, released three decades ago in 1989, and the others are all half the age of Oracle or less.

What has helped drive Oracle’s market share is its ability to innovate quickly, retain compatibility for existing database platform users, and be flexible enough to provide features now that you may not need until later. This has allowed Oracle to thrive unlike its competition, but we all know that flexibility often comes at the cost of other things, and traditionally that cost has been ease of use. Oracle puts a broad and extensive arsenal of tools at your disposal, but these are often tightly coupled, leading to complex installations and configurations. The advantage, beyond its flexibility, is that Oracle is the best at what it does, so the trade-off is often worth it.

As we go through the following parts of this installment, I hope this context provides a newfound perspective. While it may seem commonplace to think of many of Oracle’s oddities as pain points when compared to its competition, in reality they’re strengths that have kept Oracle at the forefront of a critical space in the age of big data.

Performance

When choosing to use change data capture (CDC) pipelines, low latency is very often a driving factor. Whether you are using CDC to provide event-driven communication between microservices or complex data replication, it’s important that events arrive as soon as possible, so throughput and performance are usually at the forefront when evaluating a solution’s merit, immediately after reliability.

In this section, we’re going to cover several configuration properties of the Debezium Oracle connector that can allow you to optimize the connector’s performance based on your environment and needs.

Disk IO

A fundamental aspect of the Debezium Oracle connector’s implementation is that it uses Oracle APIs to work with the LogMiner process to read the redo and archive logs. These logs are managed by two Oracle processes, the Oracle LogWriter (LGWR) and the Oracle Archiver (ARCH). The details around these two processes are less important to the discussion other than the fact they’re responsible for management of both the current redo logs used by the Oracle database and the creation of the archive logs that contain the past historical changes made to Oracle.

The Debezium Oracle connector uses the LogMiner API to read these redo and archive logs off disk and generate change events. Unfortunately, redo and archive logs cannot be retained on the database server indefinitely. Often, logs can be several gigabytes in size, and when combined with supplemental log configurations, an Oracle database can generate many logs in short periods of time, thus disk space gets consumed quickly. When these redo or archive logs exist on a networked drive or a high-latency device, this has a direct impact on LogMiner’s ability to read and provide the change event data to the connector as quickly as possible.

One possible way to improve performance is to coordinate with your Oracle database administrator to see whether you can retain archive logs for longer, even if they’re kept on a separately mounted disk that is local to the database machine. Oracle provides the ability to define what are called log archive destinations, and it supports up to a maximum of 31 different paths where archive logs can be written by the Archiver process.

It’s not uncommon for your database administrator to have already configured several log archive destinations for other processes such as GoldenGate, DataGuard, etc. If such a path has already been defined and it has a log retention policy that aligns with your ingestion rate, you can safely use Debezium with that destination. If no suitable path has been defined, you can create a new one by following the aforementioned Oracle documentation.

To have Debezium use a specific log archive destination, the log.mining.archive.destination.name connector property must be provided.

An example using LOG_ARCHIVE_DEST_5
{
  "log.mining.archive.destination.name": "LOG_ARCHIVE_DEST_5",
  ...
}

The physical path to the logs will be obtained from Oracle directly; you only need to specify the destination name that the database administrator configured.

This configuration is only applicable for use with the Oracle LogMiner adapter and has no impact when ingesting changes using Oracle XStream.

Redo Log Size

Oracle’s redo logs are used primarily for recovery from instance failures. When an instance is first created, the administrator provides a starting size for the redo logs. If the redo logs are sized too small or even too large, this can have a direct impact on the performance of your instance.

For example, the size of the redo logs has a direct impact on how frequently the Oracle Archiver (ARCH) process transitions redo logs to archive logs, which is referred to as a log switch. Generally, Oracle recommends that administrators minimize the number of log switches within small windows, but this can vary depending on a number of factors like volume or logging configurations.

A log switch is a fairly expensive operation because it’s the moment in time when a redo log is copied by the Archiver process to an archive log and a new redo log is allocated. If there is ever a period when the Archiver process falls behind and all redo logs have filled, Oracle’s database can degrade or even halt if a checkpoint cannot occur because all current redo logs are full and awaiting archival.

If you are using an image of Oracle based on Oracle Docker Images, you will have noticed that by default the redo logs created are extremely small, several megabytes each. For development purposes, this is fine out of the box, but when using such an instance for any type of serious integration such as Debezium, this simply doesn’t work well, especially with the default mining strategy, which we’ll discuss in more detail in the next section.

However, small redo log sizes aren’t the only problem. If the redo log files are sized too large, this can have an adverse impact on the read time from disk, making the gap while the connector waits for changes even longer, as more disk IO is needed to read the larger files.

Resizing Oracle’s redo logs requires existing knowledge of the database server paths and where it is safe to store those files; therefore, since that information is environment dependent, we aren’t going to cover directly how to do this here. Oracle provides excellent documentation on how to perform this task.

Unfortunately, there isn’t a simple answer for what size you should use. This requires a bit of finesse, science, and heuristics of your environment to gauge what is the best choice, but this is something in your arsenal that could be adjusted if necessary.

Log Mining Strategy

In Part 2, we covered two log mining strategies for the Debezium Oracle connector. These strategies control how the connector interacts with Oracle LogMiner and how specific entries from the redo logs are ingested for both schema and table changes.

Redo logs store redo entries, and not all redo entries explicitly store every piece of data needed to reconstruct the change that occurred. For example, DML operations (inserts, updates, deletes) do not refer to table or column names but rather to object identifiers. These object identifier and version details change in the data dictionary as column or table modifications (DDL changes) occur. This means that the identifier and/or its version will differ between redo entries for the same table before and after a schema change.

The log mining strategy controls precisely how redo entries are interpreted by Oracle LogMiner, primarily by either writing the data dictionary to the end of the redo logs or omitting this step. There are benefits to using either strategy and we’re going to dive into what those are and why you may use one strategy over another.

Default Mining Strategy

The default mining strategy is the safest choice, but it is also the most expensive. This strategy will append a copy of the data dictionary to the redo logs when a log switch is observed.

This strategy’s main benefit is that schema and data changes are ingested seamlessly by Oracle LogMiner. In other words, if an INSERT is followed by an ALTER TABLE and that is followed by an UPDATE, Oracle LogMiner will safely deduce the right table and column names from the old and the new object ids and versions. This means that Debezium will be able to safely ingest that change event as one might expect.

The unfortunate pain point of this strategy is that it’s an expensive step at each log switch.

First, it requires that the connector periodically append a copy of the data dictionary to the redo logs, and Oracle performs a full log switch (all log groups perform a switch) after writing the dictionary. This means that more archive logs will be generated than when using the online catalog strategy we’ll discuss momentarily.

Secondly, when a LogMiner process begins to mine the redo logs, it must first read and prepare a section of the SGA with all the dictionary metadata so that table and column names can be resolved properly. Depending on the size of the redo logs, and more specifically the dictionary segment of the logs, this can take upwards of several minutes to prepare. So you can probably guess that when you combine this strategy with poorly sized redo logs, this can easily create a performance bottleneck.

It is not recommended to deploy multiple Oracle connectors using this strategy; use a single Oracle connector instead.

Online Catalog Strategy

The online catalog mining strategy is used when specifying the log.mining.strategy connector configuration property with the value online_catalog. Unlike the default mining strategy, this strategy does not write any additional data to the redo logs, but instead, relies on the current data dictionary to resolve table and column names.
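
For reference, the relevant connector configuration fragment would look like the following (all other required connector properties are omitted here):
{
  "log.mining.strategy": "online_catalog",
  ...
}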

The benefit of this strategy is that since we are not writing any dictionary details to the redo logs, redo logs only transition to archive logs based on existing database activity. In short, Debezium will not influence this frequency beyond the required additional supplemental logging configuration, making it easier to manage the volume of archive logs created. Secondly, because no dictionary details are written to the logs and the number of log switches remains consistent with existing behavior, a mining session starts nearly instantaneously; there is no need for LogMiner to prepare any dictionary metadata, as the existing data dictionary satisfies that requirement as-is.

Unfortunately, this strategy does have a single restriction: schema changes are not handled seamlessly. In other words, if a redo entry refers to an object id/version that does not match the object id/version in the online data dictionary, Oracle LogMiner is incapable of reconstructing the SQL for that operation.

However, schema changes can still be handled with this strategy, but they must be performed in a lock-step fashion. In other words, you would halt changes to the table, wait until the last change for the table has been captured by Debezium, apply the schema change, wait for the schema change to be emitted by Debezium, and finally resume allowing changes to the table’s data.

This strategy provides the optimal performance gain both for Oracle and the connector.

In short, if a table’s schema isn’t static and changes to it happen periodically, you can still safely use this strategy as long as you perform the schema changes in lock-step as described above; otherwise, schema changes should be avoided on the table(s) being captured.

Finally, this strategy should be used if deploying multiple Oracle connectors on the same Oracle database.

In conclusion, the mining strategy chosen can have significant impacts on the performance of the database as well as the ingestion rate of the Debezium Oracle connector. It’s important to weigh the benefits and consequences of this choice based on what is possible given your environment.

There is an effort underway to bridge these two strategies and deliver a solution that provides all the performance benefits of the online catalog strategy and the seamless schema management provided by the default mining strategy. The progress for this effort can be found in DBZ-3401.

Batch size

The Debezium Oracle connector uses an adaptive batch size algorithm to determine the number of rows that will be fetched per database call. The algorithm is controlled by the following configuration properties:

log.mining.batch.size.default

This specifies the default number of rows that will be fetched per database call.

log.mining.batch.size.min

This specifies the minimum number of database rows that will be fetched per database call.

log.mining.batch.size.max

This specifies the maximum number of database rows that will be fetched per database call.

These settings give the connector the ability to read more data and reduce network latency when the connector has fallen behind or has observed a large transaction in the logs, at the expense of temporarily consuming more SGA and JVM memory; conversely, the connector uses less SGA and JVM memory when it has caught up to near real-time changes.

The connector defaults for these are great starting points, but depending on your change event volume, it may be wise to increase or even shrink these settings based on your environment to improve performance.
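
As an illustration, the fragment below overrides all three batch size properties; the values shown are purely illustrative and should be derived from your own environment’s volume rather than copied verbatim:
{
  "log.mining.batch.size.min": 1000,
  "log.mining.batch.size.default": 20000,
  "log.mining.batch.size.max": 100000,
  ...
}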

Query filter mode

Any developer who has ever worked on a SQL-based application will tell you that just because a query performs well in one environment or at one point in time doesn’t mean that the same query will be as efficient in another environment or in the future as the data set changes. That’s why in Debezium 2.3 we added a new feature called log.mining.query.filter.mode. Through discussions with Oracle community users across a variety of installations, volume sizes, and integrations, we concluded that the LogMiner query used by the Oracle connector simply cannot be a one-size-fits-all solution.

In order to be the most efficient, we needed to provide the user community with a way to tune the LogMiner query so that it best satisfies their configuration and their environment. There are currently three options for how the LogMiner query is constructed, and each influences how the query’s where-clause is generated.

none

Specifies that no additional predicates are added to the LogMiner query.

Instead, all filtering is delegated to the Oracle connector’s Java runtime rather than the database query. This has the highest network bandwidth usage of all the options but can also have the highest throughput, depending on the volume and make-up of the redo entries. For lower volume installations, this can easily be the fastest option, but it does not scale well as the volume of redo entries increases or if the data set of interest is only a small subset of the total data set.

in

Specifies that the schema and table include/exclude filters are applied using a SQL in-clause.

By default, the include/exclude configuration options support comma-separated lists of regular expressions; however, if you elect to avoid regular expressions, you can apply database-level filters to the LogMiner query more efficiently by using this query filter mode. An in-clause is much more efficient than disjunctions or Oracle’s REGEXP_LIKE operator, which we’ll discuss with the next option. This mode also performs extremely well if you have a lot of schema or table include/exclude list options defined in your configuration. Finally, because this choice performs database-level filtering, it reduces network latency and returns only the necessary rows to the connector.

regex

Specifies that the schema and table include/exclude filters are applied using the SQL operator REGEXP_LIKE.

Since the include/exclude configuration options support comma-separated lists of regular expressions, this mode must be used instead of in when your filters contain regular expressions. While this option performs database-level filtering much like the in-clause choice, the use of regular expressions degrades in performance as more include/exclude options are specified in the connector configuration. Therefore, in order to maximize performance, it’s generally best when using regular expressions to write as few expressions as possible that match the most tables or schemas, reducing the number of predicates appended to the query.

As of Debezium 2.3, the default is none, so you can gain additional performance by specifically configuring log.mining.query.filter.mode to use in or regex, depending on the values provided in your include/exclude list configuration properties.
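
For example, if your filters list explicit table names rather than regular expressions, a configuration sketch using the in mode might look like this; the schema and table names shown are purely illustrative:
{
  "log.mining.query.filter.mode": "in",
  "table.include.list": "INVENTORY.CUSTOMERS,INVENTORY.ORDERS",
  ...
}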

Debugging

As much as I would like to think software engineering is all butterflies and flowers, it’s far from the truth, and managing an environment that runs software is no different. When problems occur, it’s important to have the knowledge to self-diagnose and get back to a running state as quickly as possible. So let’s go down the rabbit hole and discuss a number of common errors that we’ve seen, how to debug those errors, and what the potential fixes might be.

None of the log files contains offset SCN, re-snapshot is required

I’m fairly certain that at some point most Oracle connector users have seen this error in the logs, whether during PoC design or testing, but hopefully not production. The error message itself is relatively clear, but what is often not clear is "why did this happen".

For other databases, the transaction logs contain only committed changes, which are then consumed by Debezium. Unfortunately, Oracle does not do this and instead writes every single change to the transaction logs, even if the change is later rolled back due to a constraint violation or an explicit user or system rollback. This means that reading the changes from the Oracle redo and archive logs isn’t as simple as reading from position X until the end of the file and then repeating with the next log in sequence. Instead, the connector must maintain what we call a low and high watermark SCN; if you’re familiar with the connector’s offsets, these are represented as scn and commit_scn.

The low watermark or scn represents the safe resume point in the redo logs. Generally this points to the position in the logs where the oldest in-progress transaction started. The high watermark or commit_scn represents the position in the logs where we last emitted a transaction batch for a given redo thread. This means that the changes in between these two values are a mixture of uncommitted changes, committed changes, or rolled back changes.

When the connector starts, the low watermark or scn read from the offsets is compared to the oldest available archive log in Oracle. If the archive log begins with a system change number that comes after this scn value, this error will occur.

Long-running transactions directly impact the low watermark or scn position. If a transaction remains active for longer than your archive log retention policy and the connector is restarted due to a re-balance or failure, this error can occur. If you suspect long-running transactions, you can configure the log.mining.transaction.retention.ms property in order to discard a transaction that lives longer than the specified value. While this does cause data loss as that transaction’s changes are discarded, it does allow the low watermark to safely advance forward at a reasonable pace even when long-running transactions occur. You should set the transaction retention period to a value less than your archive log retention period.
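
For example, if your archive logs are retained for 24 hours, you might cap transaction retention slightly below that window; the value below represents 20 hours expressed in milliseconds and is purely illustrative:
{
  "log.mining.transaction.retention.ms": 72000000,
  ...
}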

Another use case that can raise this error is capturing changes from an Oracle database with a low volume of changes. In particular, if you are using an older version of Debezium where the LogMiner query applied database-level filters, or you’ve configured the new query filter mode to apply database-level filters, it’s possible that the connector may go extended periods of time without observing a single change event. Since offset data only synchronizes when the connector sends an event to Kafka, a low volume of changes over a large window of time can mean those Kafka offsets become stale, and if a restart occurs, this error could happen. In this use case, configuring both the heartbeat.interval.ms and heartbeat.action.query connector properties is a great way to guarantee that there is some activity flowing to Kafka to keep those offsets from becoming stale.
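
A sketch of such a heartbeat configuration is shown below; the table DEBEZIUM.HEARTBEAT is an assumed helper table that the connector user can update, so substitute one appropriate for your environment:
{
  "heartbeat.interval.ms": 300000,
  "heartbeat.action.query": "UPDATE DEBEZIUM.HEARTBEAT SET TS = SYSTIMESTAMP",
  ...
}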

ORA-01555: snapshot too old

This specific error is most commonly observed during the connector’s initial snapshot. The Oracle connector relies on what is called flashback queries during the initial snapshot phase.

A flashback query is a standard SELECT statement that uses a system change number in order to generate a result-set based on the state of the data at that given point in the database’s lifetime. This can be useful for a variety of reasons, including being able to restore objects without the need for media recovery, because Oracle is capable of retaining that previous state for a certain period of time. The data returned by these queries uses the Automatic Undo Management (AUM) subsystem and relies on the undo data area, where transactions are recorded and retained for a given period of time, configurable via the database parameter UNDO_RETENTION.

If the SCN used for the flashback query becomes too old and the undo retention no longer maintains historical data for that system change number, Oracle will report an ORA-01555 error that the snapshot is now too old. When this happens during your initial snapshot, the snapshot will need to be retaken from the beginning and unless you reconfigure Oracle’s undo retention period to allow for a longer retention time, rerunning the snapshot on the same data set will result in the same outcome.

So either (a) have your DBA temporarily increase the UNDO_RETENTION database parameter, or (b) use a schema-only snapshot and then rely on incremental snapshots to generate the initial data set from your existing table data.
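
As a sketch of option (b), assuming a connector version where the schema_only snapshot mode is available, the relevant fragment might look like the following; note that incremental snapshots additionally require a signaling table to be configured, which is covered in the Debezium documentation:
{
  "snapshot.mode": "schema_only",
  ...
}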

Streaming changes takes several minutes to appear

Occasionally, users will notice latency when the connector first starts or at specific periods during the connector’s lifetime. One of the best ways to identify what is going on is to coordinate with your DBA and take a close look at your database’s alert log, which records all the LogMiner and XStream interactions that Debezium performs with your database. Most often, though, this latency affects users who use the default log mining strategy.

The default mining strategy that we covered earlier performs what is called a data dictionary build step, and depending on your database, this can take some time to be written to your redo logs and then parsed by the LogMiner process. It’s not uncommon for this to take anywhere from 30 seconds up to several minutes to complete, and when using the default mining strategy, this process occurs on each log switch.

So if you experience this latency frequently, we normally suggest that you check the frequency of your log switches. If your database is performing excessive log switches within a small window that does not adhere to Oracle’s guidelines, your DBA may need to tune the database accordingly. Reducing the frequency of log switches increases the time that Debezium can reuse the same log file for mining and therefore reduces the need to build and parse the data dictionary.

If your table schema does not change often or won’t change at all, you can reconfigure the connector to use the online_catalog mining strategy as an alternative. This avoids writing the data dictionary to the redo logs and the parse phase performed by LogMiner, greatly increasing the speed at which a mining session begins, both at connector start-up and at each log switch.

How do you know if an archive log can be deleted?

Database administrators typically keep archive logs on the database server for a short period of time before they’re removed. This interval varies and depends on a number of factors, including the frequency with which the logs are created, their respective sizes, and the physical space available on the server. What is most important is that if Debezium requires a specific archive log, it remains available until it’s no longer needed for CDC.

The easiest way to determine which logs are needed is via JMX metrics, looking at the field OffsetScn. This field references the system change number from which the connector will resume in case of a restart, so any archive or redo log that contains this system change number, or that comes after it, must remain available.

Debezium 2.4 will add another JMX metric that provides the cut-off timestamp for archive logs based on this OffsetScn. This means that you will be able to use this timestamp directly in shell scripts to compare the filesystem’s timestamp with the one from the JMX metrics, safely knowing which logs must be retained and which can safely be removed via RMAN.

Memory footprint, how do you manage it efficiently?

Due to the nature of how transaction data is written to the Oracle archive and redo logs, a buffer of the transaction state must be managed by the connector. Under ideal circumstances, this buffer maintains short-lived data: a transaction starts, we buffer its relevant changes, we observe the rollback or commit, and then the buffered data is handled and the buffer is cleared.

Because the connector buffers transactions, it’s extremely important that you have some prior knowledge of your environment’s transaction patterns. If this information varies and cannot be predicted, you may want to consider using an alternative buffer type to the default heap (memory) based buffer, as the heap buffer can easily lead to OutOfMemory exceptions when the configured memory is too low.

Please refer to the documentation about Event Buffering. The Oracle connector offers two Infinispan-based solutions that allow the connector to store the buffer off-heap, reducing the connector’s memory footprint while remaining capable of dealing with very large transactions seamlessly.
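
For example, switching from the default heap buffer to the embedded Infinispan buffer might look like the fragment below; consult the Event Buffering documentation for the additional cache configuration properties your connector version requires:
{
  "log.mining.buffer.type": "infinispan_embedded",
  ...
}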

Lastly, we have composed a collection of frequently asked questions in the documentation. We generally do our best to keep the most commonly asked questions there as a reference. Please take a moment and read through those and if you suspect anything may be missing, please open a Jira issue.

Wrapping up

I really hope this series on the Oracle connector has been helpful and informative. We’ve covered topics ranging from installation, configuration, and deployment of the connector to performance optimizations and how to debug or evaluate common use cases we hear about from the community.

As I mentioned earlier, Oracle is unlike most other database platforms and requires a bit more care and precision to maximize its potential. When evaluating the Oracle connector, it is important that you coordinate with an Oracle database administrator to make sure that you’re maximizing the connector’s potential, particularly if performance is a critical metric in your evaluation.

As always, if you have questions about anything related to the content in this post or about something you observe in your environment, the team will do their best to provide you with the answers you need either by using our mailing list or reaching out to us on our chat.

Chris Cranford

Chris is a software engineer at Red Hat. He previously was a member of the Hibernate ORM team and now works on Debezium. He lives in North Carolina just a few hours from Red Hat towers.

   


About Debezium

Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.

Get involved

We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.