Recently I found myself in the position of needing to performance tune some stored procedures because of an estimated 600% increase in our transactional traffic. As the deadline approached it became clear that I wasn’t going to be able to get any code or index changes through QA and UAT before the flood of data hit us. So I reluctantly started to experiment with plan guides. Plan guides are a controversial feature as they can change the behaviour of a query by injecting query hints without changing any code. See Kendra Little’s article here for all the headaches they can cause.
I found that if I used a plan guide to inject the OPTION(MERGE JOIN) hint into the worst performing statement in my stored procedure I achieved much better performance. I was able to consider using a query hint as the I knew the statement was performing badly due to bad estimates caused by the use of table variables. I also knew that the query was always going to be dealing with a set amount of records on each execution so the cost estimates would be very stable. Now I am not a fan of query hints in code at all, let alone hints that are injected in via a plan guide, but in the situation I had little choice. So my plan was:
- Create the plan guide to achieve the required performance before the deadline
- Raise a second release to remove the plan guide and place the hint into the code
- Raise a further release to remove the table variables so that the hint is no longer needed
In an ideal world I would only have considered step 3 and everything else would have had to wait until I got the code perfect and it had gone through all the stages of testing. In the real world unexpected things happen and you have to move fast but that is not an excuse for leaving the quick ‘n’ dirty fixes in place forever. Step 2 will take longer to get through test (as it is a code change) but it is a vital step between 1 and 3 as the quicker I remove those plan guides the less likely it is that someone will come along and start tearing their hair out when they can’t figure out why that statement is executing in a certain way no matter what changes they make.
So once I had finished testing the plan guide and it had given the required performance in all situations, I read about a built in function named sys.fn_validate_plan_guide. This function validates a plan guide against the statement it will be affecting. If the function returns no results then the plan guide is deemed as valid and ready for use, but if a row is returned the plan guide is invalid and the first error causing it to be invalid is displayed. My testing had shown that the plan guide was altering the execution plan and that the performance was improved when enabled. If I disabled the plan guide I got the old execution plan with the poor performance. After the tests proved the plan guide was working I decided to run the function against my plan guide to see if there was something I had missed. It returned the following error:
Invalid object name ‘@MyTableVariable’
This error is caused by the table variable being declared in a previous statement in the stored procedure which is understandable (Kendra mentions in the article linked to above that she has never managed to get plan guides to work with statements that use table variables). Now while this is an annoying limitation I can understand that the plan guide is limited to the scope of 1 statement and the declaration of a table variable and the query that uses it are two statements, but does this limitation actually exist?
Well no it doesn’t (at least not in 2012 or 2014). I already knew this as my testing had already proved that the plan guide was working. The function I used to validate the plan guide obviously has an issue and is not playing by the same rules as the SQL Server engine. This confused me for a while and I was afraid that it was alerting me to an issue that I had not thought of in my testing but after further testing I couldn’t find a way to make the plan guide fail.
In the end I trusted my testing and the plan guide made it into production in time for the increase of traffic. The plan guide has been live for a week now and performance is good. The lesson learnt here is that no validation function is a replacement for good old testing as iterated here by Aaron Bertrand in the comments of a question I posted to DBA Stack-exchange on this topic.
I have raised a connect item for this as it could cause confusion for others in the future. If you have time please give it a vote.