Using T-SQL Functions in the WHERE clause of query July 16, 2012Posted by fofo in Sql Server, Sql Server 2005, SQL Server 2008, SQL Server 2012.
In this post Ι would like to demonstrate with a hands-on example the negative performance implications when using built-in or user defined functions in a WHERE statement of a SQL query.
Obviously we can use the functions in the SELECT part of the query. In that case the performance impact is not that great.
You can use any edition of SQL Server of 2012 you want for this example. SQL Server Express edition will do just fine.Download the Express edition, here .
In my case I will use Developer edition of 2012 for these examples.
I connect to the SQL Server instance.
I need to create a sample database (SAMPLEDEMO) and a table (SALES) in it.I create a non-clustered index on the OrderDate column.
Type (copy and paste but make sure you review it) and execute the code below
USE [master] GO CREATE DATABASE [SAMPLEDEMO] GO USE [SAMPLEDEMO] GO CREATE TABLE [dbo].[SaleS]( [SalesOrderID] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL, [OrderDate] [datetime] NOT NULL, [DueDate] [datetime] NOT NULL, [ShipDate] [datetime] NULL, [Status] [tinyint] NOT NULL, [SubTotal] [money] NOT NULL, [TaxAmt] [money] NOT NULL, [Freight] [money] NOT NULL, [Comment] [nvarchar](128) NULL, [rowguid] [uniqueidentifier] ROWGUIDCOL NOT NULL, [ModifiedDate] [datetime] NOT NULL, CONSTRAINT [PK_SalesOrderHeader_SalesOrderID] PRIMARY KEY CLUSTERED ( [SalesOrderID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] CREATE NONCLUSTERED INDEX [IX_Sales_OrderDate] ON [dbo].[SaleS] ( [OrderDate] ASC ) GO
Then I will populate the table with 10.000 rows of data using a third party tool (Data Generator from RedGate).
Now we will write 2 queries (we were asked to produce a report from our e.g sales manager) that will bring back only the SalesOrderID and OrderDate values for the orders that were placed on July 2008.
Type and execute the code below.Make sure that you have clicked on the “Include Actual Execution Plan” button.We need the metrics from the execution plan.
USE [SAMPLEDEMO] GO SELECT [SalesOrderID],[OrderDate] FROM [SAMPLEDEMO].[dbo].[SaleS] WHERE year(orderdate)= '2008' and MONTH(OrderDate) = '7' SELECT [SalesOrderID],[OrderDate] FROM [SAMPLEDEMO].[dbo].[SaleS] WHERE orderdate between '2008/07/01' and '2008/08/01'
Have a look at the picture below to see my findings
In both cases I get back 102 rows of data. In the first sql query using the built-in functions the cost is 91% relative to the batch and I see that I have an Index scan. By rewriting the query (omitting the built-in functions) I see that the cost is 9% relative to the batch and the optimiser uses an Index Seek which is a far better option.The non-clustered index we created is used in the second case because the optimiser can see now that the values in the OrderDate column are organised according to this index.This is a “range” query and because there is an index that is available on this column it is used efficiently to retrieve the requested data.
Hope it helps!!!!