When an Availability Group fails over, the databases in the Availability Group start processing writes on a different replica. System databases can’t be part of an Availability Group because each replica has its own master, model and msdb database. SQL Agent jobs are stored in the msdb database so we have to deploy our agent jobs to each node in the Availability Group.
Imagine you deploy a job to each replica, that pulls a lot of data and then writes it to a file. When the scheduled time for the job rolls around all of the jobs will fire and start reading from their own replica.
To solve this issue you have to decide which replica you want to execute the job and make sure that the others don’t. So you decide you want the job to run against your asynchronous read-only secondary because you don’t want to inflict any extra IO on your primary. You could manually disable the job on the primary, but if there’s a failover you will need to disable the job on the new primary and enable the job on the old primary. This will be unmanageable as soon as you have multiple jobs or one that executes very frequently.
The answer is to make your Agent jobs Availability Group aware
The code below acts as a wrapper for the code in your Agent job. It ensures that your job will only execute against an asynchronous replica.
DECLARE @Role NVARCHAR(120), @Mode NVARCHAR(120); SELECT @Role = ars.role_desc, @Mode = ar.availability_mode_desc FROM sys.availability_replicas ar JOIN sys.dm_hadr_availability_replica_states ars ON ar.replica_id = ars.replica_id JOIN sys.availability_groups ag ON ars.group_id = ag.group_id WHERE ag.name - 'TheDatabaseAvenger_AG_001' AND ars.is_local = 1 -- Run only if the node allows direct read-only connections AND ar.secondary_role_allow_connections = 2; IF @Role = 'SECONDARY' AND @Mode = 'ASYNCHRONOUS_COMMIT' BEGIN; -- Insert your code to execute against the asynch replicas only END;
This makes tasks like patching easier as the jobs will automatically adjust as you failover the Availability Group while patching each replica in turn.
It would be easy to alter the logic to only ever run a job on the primary or a synchronous replica. With further IF statements the same job could perform different tasks on different replicas depending on the role of the replica.
Something to keep in mind is that if the job is writing files to a local folder the folder needs to exist on all replica nodes. A better solution may be for the jobs to write to a central file share accessible to all nodes.
While the code above works great for Availability Groups with only one asynchronous replica, it should be noted that it will run against all asynchronous replicas that it is configured against. Making this more granular to only run against certain asynchronous replicas is out of scope for this post but I may look into building a more granular system for a future blog post