SQL Server 2005 has got built-in partitions. As result, I have been given a task to port a database from SQL Server 2000 to 2005, and replace old style partitions with new one. It seems reasonable, but before modifying a production database, which is about 5TB in size, I've tested a small one.
Switch the data - it's an easy part. I need also to test all related stored procedures. At this point I've found shortcomings, which tightly related to a nature of the partitions.
In select statement SQL Server 2005 iterates over partitions, in contrast SQL Server 2000 rolls out partition view and embeds partition tables into an execution plan. The performance difference can be dramatic (the case I'm dealing with).
Suppose you are to get 'top N' rows of ordered set of data from several partitions. SQL Server 2000 can perform operations on partitions (to get ordered result per partition), and then merge them, and return 'top N' rows. However, if execution plan just iterates partitions and applies the same operations to each partition in sequential manner the result will be semiordered. To get 'top N' rows the sort operator is required. This is the case of SQL Server 2005.
The problem is that the SQL Server 2005 never uses merge operator to combine results!
To illustrate the problem let's define two partitioned tables:
create partition function [test](smalldatetime) as range left for values (N'2007-01-01', N'2007-02-01')gocreate partition scheme [testScheme] as partition [test] to [primary], [primary], [primary])goCREATE TABLE [dbo].[Test2000_12]( [A] [smalldatetime] NOT NULL, [B] [int] NOT NULL, [C] [nvarchar](50) NULL,CONSTRAINT [PK_Test2000_12] PRIMARY KEY CLUSTERED( [A] ASC, [B] ASC))GOCREATE NONCLUSTERED INDEX [IX_Test2000_12] ON [dbo].[Test2000_12] ( [B] ASC, [A] ASC)GOCREATE TABLE [dbo].[Test2000_01]( [A] [smalldatetime] NOT NULL, [B] [int] NOT NULL, [C] [nvarchar](50) NULL,CONSTRAINT [PK_Test2000_01] PRIMARY KEY CLUSTERED ( [A] ASC, [B] ASC))GOCREATE NONCLUSTERED INDEX [IX_Test2000_01] ON [dbo].[Test2000_01] ( [B] ASC, [A] ASC)GOCREATE TABLE [dbo].[Test2000_02]( [A] [smalldatetime] NOT NULL, [B] [int] NOT NULL, [C] [nvarchar](50) NULL,CONSTRAINT [PK_Test2000_02] PRIMARY KEY CLUSTERED ( [A] ASC, [B] ASC))GOCREATE NONCLUSTERED INDEX [IX_Test2000_02] ON [dbo].[Test2000_02] ( [B] ASC, [A] ASC)GOCREATE TABLE [dbo].[Test2005]( [A] [smalldatetime] NOT NULL, [B] [int] NOT NULL, [C] [nvarchar](50) NULL,CONSTRAINT [PK_Test2005] PRIMARY KEY CLUSTERED ( [A] ASC, [B] ASC)) ON [testScheme]([A])GOCREATE NONCLUSTERED INDEX [IX_Test2005] ON [dbo].[Test2005] ( [B] ASC, [A] ASC) ON [testScheme]([A])GOALTER TABLE [dbo].[Test2000_01] WITH CHECK ADD CONSTRAINT [CK_Test2000_01] CHECK (([A]>='2007-01-01' AND [A]<'2007-02-01'))GOALTER TABLE [dbo].[Test2000_01] CHECK CONSTRAINT [CK_Test2000_01]GOALTER TABLE [dbo].[Test2000_02] WITH CHECK ADD CONSTRAINT [CK_Test2000_02] CHECK (([A]>='2007-02-01'))GOALTER TABLE [dbo].[Test2000_02] CHECK CONSTRAINT [CK_Test2000_02]GOALTER TABLE [dbo].[Test2000_12] WITH CHECK ADD CONSTRAINT [CK_Test2000_12] CHECK (([A]<'2007-01-01'))GOALTER TABLE [dbo].[Test2000_12] CHECK CONSTRAINT [CK_Test2000_12]GOcreate view [dbo].[test2000] asselect * from dbo.test2000_12union allselect * from dbo.test2000_01union allselect * from dbo.test2000_02go/*Returns numbers table.Table has a following structure: table(value int not null);value is an integer number that contains numbers from 0 to a specified value.*/create FUNCTION dbo.[Numbers]( /* Number of rows to return. */@count int)RETURNS TABLE ASRETURN with numbers(value) as (select 0union allselect value * 2 + 1 from numbers where value < @count / 2union allselect value * 2 + 2 from numbers where value < (@count - 1) / 2)selectrow_number() over(order by U.v) valuefrom numbers cross apply (select 0 v) UPupulate tables:
insert into dbo.Test2005selectcast(N'2006-01-01' as smalldatetime) + 0.001 * N.Value,N.Value,N'Value' + cast(N.Value as nvarchar(16))fromdbo.Numbers(500000) Ngoinsert into dbo.Test2000selectcast(N'2006-01-01' as smalldatetime) + 0.001 * N.Value,N.Value,N'Value' + cast(N.Value as nvarchar(16))fromdbo.Numbers(500000) Ngo
Perform a test:
select top 20A, Bfrom dbo.Test2005--where--(A between '2006-01-10' and '2007-01-10')order byBselect top 20A, Bfromdbo.Test2000--where--(A between '2006-01-10' and '2007-01-10')order byB--option(merge union)
The difference is obvious if you will open execution plan. In the first case estimated subtree cost is: 17.4099; in the second: 0.0455385.
SQL server cannot efficiently use index on columns (B, A). The problem presented here can appear in any select that occasionally accesses two partitions, but regulary uses only one, provided it uses a secondary index. In fact this covers about 30% of all selects in my database.
Next day
I've meditated a little bit more and devised a centaur: I can define a partition view over partition table. Thus I can use either this view or table depending on what I'm trying to achieve either iterate partitions or roll them out.
create view [dbo].[Test2005_View] asselect * from dbo.Test2005 where $partition.test(A) = 1union allselect * from dbo.Test2005 where $partition.test(A) = 2union allselect * from dbo.Test2005 where $partition.test(A) = 3
The following select is running the same way as SQL Server 2000 partitions:
select top 20A, Bfrom dbo.Test2005_View-- dbo.Test2005order byB
Remember Me
a@href@title, b, blockquote@cite, em, i, strike, strong, sub, super, u