Hints in SQL Server

I used to be really suspicious of using hints in SQL Server, and now I can’t imagine working without them.

My opinion on this topic changed over the last few years due to a number of the performance issues I’ve worked on. I spoke at SQLSaturday 1000 (Oregon 2020) last weekend, and my talk was primarily about things I learned optimizing garbage collection and similar incremental processes. During that work I ran into a number of issues with queries like this example from the WideWorldImporters database:

	DELETE inv
	FROM @OrdersGC gc
	JOIN Sales.Invoices inv
		ON inv.OrderID = gc.OrderID;

Order Matters

The logic here is simple enough. Earlier in the process, we found orders we wanted to delete per retention policy, and put the OrderID values in a memory optimized table variable. We then use the motv to delete from all related tables, and finally the Orders table.

This query doesn’t have a WHERE clause. It’s plain to see how we want this to function though. We have 100 rows in our motv, and we want to delete the related rows in Invoices. But I’ve seen issues caused by execution plans that flip the order:

Table variables have no statistics, so the optimizer doesn’t know how many rows to expect from that operation (though table variable deferred compilation in SQL Server 2019 can resolve this) . Occasionally, I would see a plan with a join order that is the opposite of my expectation. The lack of a WHERE clause hurts here, but there’s no clause I can apply that will filter better than the items I already have in my table variable.

Consistency

I work on hundreds of databases with the same schema. They have different data sets and distributions, different sizes, and their statistics are going to update at different times. But if one of them chooses a bad plan, I have to push aside whatever other work to research the high CPU on database xyz.

Consistency is really valuable to me. And in this case, the answer is simple. Yes, I want to scan the fast, small memory-optimized table variable first, and use it to filter the larger, slower table. Adding a join hint or a force order to this query should keeps its plan and performance consistent.

	DELETE inv
	FROM @OrderList gc
	INNER LOOP JOIN Sales.Invoices inv
		ON inv.OrderID = gc.OrderID;

	DELETE inv
	FROM @OrderList gc
	JOIN Sales.Invoices inv
		ON inv.OrderID = gc.OrderID
	WITH OPTION(FORCE ORDER);

Both approaches force the join order. The INNER LOOP JOIN hint has the added benefit of ensuring the plan uses a nested loops join. A hash match wouldn’t be efficient with a batch size of a few hundred or a few thousand. A merge join would likely require a sort of one of the inputs, which defeats the purpose.

Index hints

I had to use index hints in an example I was using in my session for SQLSaturday 1000 (Oregon 2020).

DELETE TOP (@BatchSize) vt
FROM Warehouse.VehicleTemperatures vt
WHERE vt.RecordedWhen < DATEADD(DAY, -180, GETUTCDATE());

This was an example of a garbage collection process. The plan didn’t appear to be a problem, but we should be suspicious of the scan here:

The table scan only read 100 rows, but that’s because there is a TOP operator. The first 100 rows met our filter, so the query ended at that point. If no rows (or less than 100) matched, we would have scanned the entire table.

An index exists on the RecordedWhen column; it just wasn’t used. This is another place where a hint seems obvious. Maybe updating statistics would also resolve the issue, but this gives me more certainty.

DELETE TOP (@BatchSize) vt
FROM Warehouse.VehicleTemperatures vt WITH (INDEX(IX_VehicleTemperatures_RecordedWhen))
WHERE
	vt.RecordedWhen < DATEADD(DAY, -180, GETUTCDATE());

With Great Power

By using hints we are taking some of the responsibility away from the SQL Server, and we can cause entirely new problems. Here are some considerations before you try adding a hint.

  1. Relationships. Make sure you understand the cardinality and relationship between tables. This will inform your expectations about how many rows will be returned where.
  2. Indexes. Understand what options you have on each table in your query. A table may use one index based on the WHERE clause, or another based on the ON clause. The join order and indexes used are related. An index hint may push SQL Server to a specific join order; vice versa with join\order hints.
  3. Index hints can break your code! If you use an index hint in a procedure and later drop the index, SQL Server will not politely ignore your suggestion and move on. The procedure will fail until you remove the hint or recreate the index. So, if you use index hints, be aware of this and always check if any hints reference an index before you drop it.
  4. The most effective filter. If the logic of your statement filters across several tables, consider which one should reduce your result set the most. You probably want that table first in your execution plan.
  5. Test and test again. The new plans may be completely different from what we imagine, so we really must test our hinted queries and procedures with gusto. Test it for a variety of cases to make sure your code works on realistic data sets. In my case, I will sometimes test against large and small restored databases to make sure it performs as expected.

I’ve heard other engineers speak dismissively of hints, but I would encourage you to not discard a useful tool. Just realize you can cut yourself with it.

One of my coworkers recently resolved a performance issue by changing the join order and forcing it with a hint, or “doing a Jared Poche” in his words. Which shows you how often I’ve used hints, and how often they’ve worked.

Hopefully you learned something from this post. Please follow me on twitter (@sqljared) or contact me if you have questions.

Another Example: 

I was reviewing the performance of a procedure recently and stumbled over another pumpkin. 

My last blog post was about the Halloween Problem, and we saw its effects on an UPDATE statement. In this case, it was the same issue but with an INSERT statement. Here’s the code being executed:

INSERT INTO Schema1.Object1 (
		Column1,
		Column2,
		Column3,
		Column4 )
	SELECT
		Object2.Column1,
		Object2.Column2,
		Object2.Column3,
		Object2.Column4 
	FROM Object3 Object2
	LEFT LOOP JOIN Schema1.Object1 Object4 WITH(INDEX(Column5))
		ON Object4.Column3 = Object2.Column3
		AND Object4.Column1 = Object2.Column1
		AND Object4.Column2 = Object2.Column2
		AND Object4.Column4 = Object2.Column4
	WHERE
		Object4.Column6 IS NULL

The gist is, we’re trying to insert a record into Object1, assuming said record doesn’t already exist. We’re querying the data we want to insert from a temp table, but joining to the base table to make sure a record doesn’t already exist with the same values.

In the case of an UPDATE statement, if we update fields that are in our query’s search criteria, we could update the same row multiple times. SQL Server’s Halloween protections prevent this, but result in extra work that affect our performance.

The INSERT statement here is similar, trying to insert a record while querying to see if the same record exists. So, again SQL Server adds Halloween protections to our plan:

Plan Analysis

I would have expected us to scan the temp table, then have a LEFT JOIN to the base table. The Table Spool is the red flag that we have an issue with the plan, and is frequently seen with Halloween protections.

The index scan on the base table seems to be overkill since we’re joining on the primary key columns (the key lookup isn’t much of a concern). But we’re likely doing the scan because of the spool; it’s SQL Server’s way of getting all relevant records in one place at one time, breaking the normal flow of row mode operation, to make sure we don’t look up the same record multiple times.

Easy Fix

The data we are trying to insert is being passed into the procedure using a memory-optimized table valued parameter. We’ve queried that into the temp table as another step before our final INSERT SELECT query, because SQL Server will sometimes make poor optimizations when TVP’s are involved (because they have no statistics).

The solution then is an easy one. We move our LEFT JOIN out of the final INSERT, and we make that check as we query the TVP’s data into the temp table. We separate the SELECT against that table from the INSERT; they are now in separate operations, and the Halloween protections are no longer necessary.

If you liked this post, please follow me on twitter or contact me if you have questions.

I encountered something recently I’d never encountered, so I had to share. I was making another change to a procedure I’ve been tuning recently. The idea was to alter the UPDATE statements to skip rows unless they are making real changes. The activity here is being driven by customer activity, and that sometimes leads to them setting the same value repeatedly. Difficult to know how often we update a row to the same value, but we think it could be significant. So, we added a clause to the UPDATE so we’ll only update if ‘old_value <> new_value’. The actual update operator is the most expensive part of the statement, but Simple enough so far. The scan is against a memory optimized table variable, and the filter to the left our our seeks and scans check for a change to our value. Nothing left but to update the index and…

Curve Ball

Wait, what’s all this? We have a Split operator after our Clustered Index Update. SQL Server does sometime turn an UPDATE statement into effectively a DELETE and INSERT if the row needs to move, but this seems a bit much. We have a total of 4 index update/delete operators now, and they aren’t cheap. My very simple addition to the WHERE clause actually caused a small increase in duration, and a big jump in CPU. So what’s going on?

UPDATE Object1
SET
	Column1 = CASE 
		WHEN Variable1 = ? THEN Object2.Column1 
		WHEN Variable1 = ? THEN Object1.Column1 + Object2.Column1 
		END,
	Column4 = GETUTCDATE()
FULL OUTER JOIN Variable5 dcqt
	ON Object1.Column9 = Variable2
	AND Object1.Column10 = Variable3
	AND Object1.Column6 = CASE WHEN Variable6 = ? AND Object2.Column2 >= ? THEN ? ELSE Object2.Column2 END
LEFT JOIN Variable7 oq
	ON Object1.Column6 = Object3.Column6
WHERE
	Object1.Column10 = Variable3
	AND Object1.Column9 = Variable2
	AND Object1.Column2 >= ?
	AND Variable8 < ?
	AND Object1.Column1 <> CASE 
		WHEN Variable1 = ? THEN Object2.Column1 
		WHEN Variable1 = ? THEN Object1.Column1 + Object2.Column1 
		END

So, we’re updating Column1 to the result of a CASE statement, and the last part of our WHERE clause compares Column1 to the same CASE. And the CPU for this statement just doubled? I happened to jog this by the superlative Kevin Feasel, who suggested this was the Halloween Problem.

The Halloween Problem

The Halloween Problem is a well documented issue, and it affects other database systems, not just SQL Server. The issue was originally seen by IBM engineers using an UPDATE to set a value that was also in their WHERE clause. So, database systems include protections for the Halloween Problem where necessary in DML statements, and SQL Server decided it needed to protect this query. And our query matches the pattern for this issue; we’re filtering on a field while we are updating it. All DML statements can run afoul of this issue, and there are examples for all in this really excellent series of posts by Paul White.

An Unlikely Ally

The protection SQL Server employs ultimately comes down to interrupting the normal flow of rows from operators up the plan. We actually need a blocking operator! A blocking operator would cause all the rows coming from the query against our primary table to pool in that operator. We’ll have a list of all relevant rows in one place, and they can be passed on to operators above without continuing the index seek against our table; possibly seeing the same row a second time. Eager spools\table spools are frequently used for this purpose, and SQL Server used an eager spool to provide Halloween protection in my case. A sort would also do, and we could design this query to sort the results of querying the table. If our query already employed a blocking operation to interrupt the flow, SQL Server would not need to introduce more operations to protect against the Halloween Problem. In my case, I definitely don’t want it spooling and creating three more expensive index operations. My idea for rewriting this goes a step farther.

A Two Table Solution

If we queried the data from our base table into a temp table, we could then update the base table without querying that same table on the same column, tripping over a pumpkin in the process. My procedure is already using memory optimized table variables, because this proc runs so frequently that temp tables cause contention in tempdb (described here). So in this instance, I’ll actually query this data into another motv. I can also use the data I stash in my motv to decide if any rows I intended to update don’t exist yet. I’ll INSERT those later, only if needed.

Bonus

Now, the whole point of the original change was to reduce work when we update a field to be equal to its current contents. In my motv, I’m going to have the old and new value for the field. It would be simplicity itself to put my UPDATE in a conditional, so that we only run the UPDATE if at least one row in my motv is making an actual change to the field. So instead of just reducing the cost of our update operators, we’ll skip the entire UPDATE statement frequently, for only the cost of one SELECT and a write to our motv.

Results

Stunning. The new logic causes this proc to skip the UPDATE statements entirely over 96% of the time. Even given that we are running a query to populate the memory optimized table variable (which is taking <100 microseconds) and running an IF EXISTS query against that motv (which takes 10-20 microseconds), we’re spending 98% less time doing the new logic than the original UPDATE statement. When I started reviewing the procedure several months ago, it took 3.1 milliseconds on average. I’ve tried several other changes in isolation, some effective, some not. The procedure is now down to 320 microseconds; an almost 90% reduction overall after a 71% drop from this change. I have some other ideas for tweaks to this proc, but honestly, there’s very little left to gain with this process. Time to find a new target. If you liked this post, please follow me on twitter or contact me if you have questions.