SQL 2016 – Partitioning An Existing Table With A Columnstore Index

By Lori Brown | ColumnStore

Feb 24

— by Lori Brown @SQLSupahStah

I recently ran across a data warehouse that did have the largest table partitioned. We were migrating up to SQL 2016 and we wanted to make sure that the partitions were still working and in place. Since I felt I needed to refresh my table partitioning skills, I decided to conduct a small scale test of partitioning up an existing table by year and to make it more fun, I wanted to have a columnstore index present to see how interesting things could be.

I am using the Fact.Purchase table in the new WideWorldImportersDW database. WideWorldImporters replaces AdventureWorks and can be researched and downloaded from here: https://blogs.technet.microsoft.com/dataplatforminsider/2016/06/09/wideworldimporters-the-new-sql-server-sample-database/

I did a little querying in the Fact.Purchase table and found that the Date Key column is great to use to partition it by year. And, I found that the dates in the table are generally from 2013 through 2016.

First you have to add the filegroups that will be used for the table once it is partitioned. I have added 4 filegroups for years 2013 – 2016.

blog_20170224_1

Once that is done, I have to add an .NDF file for each filegroup and map the file to the correct filegroup.

blog_20170224_2

 

Next I need a partition function and a partition scheme.

USE [WideWorldImportersDW]

GO

CREATE PARTITION FUNCTION [PF_Purch_Year](date) AS RANGE RIGHT FOR VALUES (‘2014-01-01’, ‘2015-01-01’, ‘2016-01-01’)

GO

CREATE PARTITION SCHEME [PS_Purch_Year] AS PARTITION [PF_Purch_Year] TO (Purch2013, Purch2014, Purch2015, Purch2016)

GO

What I am basically doing is setting parameters or boundaries for objects that use the PF_PURCH_YEAR function to place date values older than 01/01/2014 into the Purch2013 filegroup, values between 01/01/2014 and 01/01/2015 into the Purch2014 filegroup, values between 01/01/2015 and 01/01/2016 into the Purch2015 filegroup and finally everything newer than or equal to 01/01/2016 into the Purch2016 filegroup.

blog_20170224_3

Now that I have everything set up, I have to move the data to it. However, I cannot do that with an existing clustered index. I have to drop the clustered index and rebuild it with the partition scheme. This does get a bit confusing if you are trying to move existing things as you will see.

I first dropped all my indexes. Of course I had scripted them all out for rebuild but ran into trouble when I tried to build them again.

USE [WideWorldImportersDW]

GO

DROP INDEX [CCX_Fact_Purchase] ON [Fact].[Purchase]

GO

ALTER TABLE [Fact].[Purchase] DROP CONSTRAINT [PK_Fact_Purchase]

GO

DROP INDEX [FK_Fact_Purchase_Date_Key] ON [Fact].[Purchase]

GO

DROP INDEX [FK_Fact_Purchase_Stock_Item_Key] ON [Fact].[Purchase]

GO

DROP INDEX [FK_Fact_Purchase_Supplier_Key] ON [Fact].[Purchase]

GO

 

Here we go trying to rebuild….this is where it gets interesting…

USE [WideWorldImportersDW]

GO

CREATE CLUSTERED COLUMNSTORE INDEX [CCX_Fact_Purchase] ON [Fact].[Purchase]

WITH (DROP_EXISTING = OFF)

ON PS_Purch_Year ([Date Key])

GO

I received an error:

Msg 35316, Level 16, State 1, Line 3

The statement failed because a columnstore index must be partition-aligned with the base table. Create the columnstore index using the same partition function and same (or equivalent) partition scheme as the base table. If the base table is not partitioned, create a nonpartitioned columnstore index.

 

Uh oh! This meant that the table had already been part of a different partition scheme. I did not think to look for that before I started. Ugh! I thought that it must be easy to change the base table partition alignment and it is but it is not super intuitive. Since I had already dropped all my indexes, I figured that I needed to recreate the original clustered index on the old partition since that would be how the base table is associated with anything. I rebuilt it using the old partition. So far so good. When I tried rebuilding it with DROP_EXISTING = ON I received the same error as before. After thinking about it for a bit, I dropped everything again but this time created a regular clustered index on the new partition to align the base table. This worked!

ALTER TABLE [Fact].[Purchase] ADD CONSTRAINT [PK_Fact_Purchase] PRIMARY KEY CLUSTERED

(

[Purchase Key] ASC,

[Date Key] ASC

)ON PS_Purch_Year ([Date Key])

GO

 

Now the base table is aligned with the partition scheme that I wanted it on but I need the clustered index to be the columnstore index. We have to drop and create again and everything falls into place.

ALTER TABLE [Fact].[Purchase] DROP CONSTRAINT [PK_Fact_Purchase]

GO

USE [WideWorldImportersDW]

GO

CREATE CLUSTERED COLUMNSTORE INDEX [CCX_Fact_Purchase] ON [Fact].[Purchase]

WITH (DROP_EXISTING = OFF)

ON PS_Purch_Year ([Date Key])

GO

ALTER TABLE [Fact].[Purchase] ADD CONSTRAINT [PK_Fact_Purchase] PRIMARY KEY NONCLUSTERED

(

[Purchase Key] ASC,

[Date Key] ASC

)ON PS_Purch_Year ([Date Key])

GO

CREATE NONCLUSTERED INDEX [FK_Fact_Purchase_Date_Key] ON [Fact].[Purchase]

(

[Date Key] ASC

)ON PS_Purch_Year ([Date Key])

GO

CREATE NONCLUSTERED INDEX [FK_Fact_Purchase_Stock_Item_Key] ON [Fact].[Purchase]

(

[Stock Item Key] ASC

)ON PS_Purch_Year ([Date Key])

GO

CREATE NONCLUSTERED INDEX [FK_Fact_Purchase_Supplier_Key] ON [Fact].[Purchase]

(

[Supplier Key] ASC

)ON PS_Purch_Year ([Date Key])

GO

 

Success!!!! Now I just needed to make sure that things are where I expect them to be. I pilfered the following queries from MSDN…

SELECT *

FROM sys.tables AS t

JOIN sys.indexes AS i

ON t.[object_id] = i.[object_id]

JOIN sys.partition_schemes ps

ON i.data_space_id = ps.data_space_id

WHERE t.name = ‘Purchase’;

GO

blog_20170224_4

blog_20170224_5

As you scroll across the above results I can see that all my indexes are in the PS_Purch_Year partition scheme.

SELECT t.name AS TableName, i.name AS IndexName, p.partition_number, p.partition_id, i.data_space_id,

f.function_id, f.type_desc, r.boundary_id, r.value AS BoundaryValue

FROM sys.tables AS t

JOIN sys.indexes AS i

ON t.object_id = i.object_id

JOIN sys.partitions AS p

ON i.object_id = p.object_id AND i.index_id = p.index_id

JOIN sys.partition_schemes AS s

ON i.data_space_id = s.data_space_id

JOIN sys.partition_functions AS f

ON s.function_id = f.function_id

LEFT JOIN sys.partition_range_values AS r

ON f.function_id = r.function_id and r.boundary_id = p.partition_number

WHERE t.name = ‘Purchase’

ORDER BY p.partition_number;

 

This query shows the boundaries.

blog_20170224_6

I also found a really handy query created by David Peter Hansen that gives you info on partitions on a table much more concisely. https://davidpeterhansen.com/view-partitions-in-sql-server/  As they say….go read the whole thing. J

blog_20170224_7

I hope this helps someone who wants to partition an existing table. I did this on my own system with a small table and not a production system so please be sure to test your process against a test system before you accidentally cause an issue in production.

For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRx@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. Visit us at www.sqlrx.com!

 

Follow

About the Author

Lori is an avid runner, cross fitter and SQL enthusiast. She has been working for SQLRX for 15 years and has been working with SQL in general for 25 years. Yup...she is an old hand at this stuff.

>