Managing Change Data Capture T-SQL Tuesday #74

t-sql-tuesdayT-SQL Tuesday is here again. This month the host is Robert Davis (b | t)

The topic for T-SQL Tuesday #74 is “Be the Change”. This covers data changes, ETLs, data sanitisation, etc. Basically anything that changes your data.

I want to discuss the management of the SQL Server feature “Change Data Capture” (CDC). This is commonly used by ETL structures to incrementally load data warehouses. Typically, there would be an initial load of all the data into the DW. This is then followed by incrementally loading the changes to the live data to keep the DW up-to-date. CDC is the system that captures these changes.

CDC was introduced in SQL Server 2008, it was designed to provide a standardised and easy to roll system for capturing changes to your data. Almost all actions that happen in SQL Server are logged to the transaction log. This includes all INSERT, UPDATE, DELETE and MERGE statements. CDC reads the changes to your data from the transaction log asynchronously. This helps it to be as lightweight as possible as it is not locking any source tables to check the data.

The changes read from the transaction log are stored in change tables. These change tables can be found in a cdc schema that is added when CDC is enabled on the database. One change table is created for each table that is enabled for CDC. These tables can be queried by the ETL process to retrieve the changes to the data that it needs to apply to the DW.

These change tables can get very large very quickly if they are storing changes for tables whose data is altered frequently.

Luckily, a cleanup SQL Server Agent job is created when you enable CDC at the database level. The job will be named cdc.DATABASENAME_cleanup. This means that the data in the change tables has a retention period. By default the clean up job is scheduled to run every day at 2am and the retention period is 3 days.

3 days may sound like a lot but what if our ETL failed during a weekend or a holiday? Could you guarantee that you could get it back up and running before the changes you need to send to the DW meet their retention period? If you couldn’t, then you may need to start over and load all of your data again.

You could set a very long retention period which may be OK for small rarely changing tables, but the bigger the change tables get the longer it can take to load new changes and cleanup the oldest changes. Index maintenace can also be difficult to manage on these tables.

In some cases it may make sense to copy the data from the change tables to another database where it can be managed separatley from the live database. The ETL can pickup the changes from this “Change Database” and initiate the cleanup job to empty the change tables in the source database.

This method can allow you to have sensible retention periods and to persist the data long enough for your ETL to recover from an outage. The process of moving the data from the change tables to this “Change Database” needs to be very robust and heavily tested because if it fails you may find yourself back in the same place you started.

If the source database is in an Availability Group it is not possible to read the change data from a secondary. One very nice bonus to using this method is that the change data loaded to the “Change Database” can be read from a secondary if it is in an Availability Group. This allows the reads from the ETL to be offloaded to a secondary node.

The cleanup job needs to be configured to suit your environment. The cleanup job removes the expired change data from a change table in batches. Once it has removed all of the expired data, it moves onto the next change table. All the while the tables it hasn’t got to (and the ones it has completed) are receiving more changes.

If you have a lot of change tables (or even just 1 change table) with lots of data changing very regularly, then the defaults of the cleanup job may soon start to cause you pain.

I have seen cleanup jobs take over a couple of days to catchup after it was found that they had not been running or a huge amount of data was loaded.

Increasing the threshold setting on the cleanup job can dramatically improve the time it takes to delete expired changes but setting this too high may cause excessive locks on the change tables. Test, test and test!

The current seetings of your CDC jobs can be found with the following query:

SELECT * FROM dbo.cdc_jobs

The stored procedure sys.sp_cdc_change_job can be used to change these settings. I’m setting the threshold to 20,000 in the script below. Experiment with different values to find what’s right for you. See BOL for details.

EXECUTE sys.sp_cdc_change_job
@job_type = N'cleanup',
@threshold = 20000;

For more information read through this post by Kendra Little where she discusses some more of the challenges with using CDC.

Thanks for reading!

About the author

James Anderson

I'm a freelance SQL Server DBA in Southampton, UK. Performance tuning code and improving development processes are what I love to do for work. Outside of work, I live to travel to as many places as possible with my wife Sarah.

2 thoughts on “Managing Change Data Capture T-SQL Tuesday #74”

  1. Thanks for participating James. Nice viewpoint on CDC. What would be your thoughts on setting the cleanup jobs to only execute Tuesday through Friday? That way, you get a couple of days to deal with issues that happen over the weekend.

    1. Thanks Robert that means a lot. Also congratulations, you are the first non spam commenter on my blog!

      As for disabling the cleanup job Saturday through Monday, I would say in a 24-7 environment this could be very dangerous. It would solve the problem of potentially losing data in the change tables if your ETL were to fail but it would also mean that your change tables (depending on server activity) would potentially have to hold much more data (possibly 6 days worth if your retention is 3 days during the week and it fails on the Saturday). The writes to these larger tables could start causing you pain. You would also have a big cleanup job to do Tuesday morning / Monday night.

      I would say if you are willing to hold 6 days worth of data over the weekend when you are not there then why not have that as your retention period all week and run the cleanup each night. With that said it may make sense if your weekends are very quiet.

Leave a Reply

Your email address will not be published. Required fields are marked *

Sign Up

James Anderson SQL Server DBA

Hi, I'm James Anderson. When I'm not blogging about SQL Server, I do this.

The SQL People

DBA Stackexchange Profile