Anthony Martin's Blog

SQL Server and Microsoft BI

Posts Tagged ‘Partitioning

SQL Server Partition Management Utility: Part 2

leave a comment »

In Part 1 of this series of the free SQL Server Partition Management Utility we covered why and how to use the utility, and walked through a simple example where we executed the utility from the command line.  In this post I want to show you how to execute the utility from SSIS.

Utilizing in SSIS

Let’s go through the same example as before where we want to load a new partition to the partition range to the SalesOrderHeader table.  You probably now have a good idea how the SSIS package design needs to flow:

  1. Create staging table with no indexes
  2. Load newly created staging table
  3. Add indexes and constraints to staging table to match destination table partition
  4. Call the SWITCH command to move staging table to a new partition in the destination table

not a fan of vs 2012 - And below is a screenshot of the corresponding package in SSIS (I’m firmly in the not a fan group of the VS2012 color theme or lack thereof I suppose you could argue!).

The Execute Process Tasks with the task naming convention prefix “EPT” will handle calling the utility’s executable.

image_thumb22

Figure 1: Control flow view of package

go project parameters - However, I am a huge fan of project parameters in SSIS 2012, so let’s create a few to make our lives easier.  For clarity, I’ve prefixed all the parameters with “partmgmt”.  The first five parameters contain the values for the five different commands that the executable handles.  I specifically stored them in parameters because I kept forgetting what they were and had to keep jumping back and forth between the solution and the documentation.  The final parameter partmgmt_Executable stores the file path of the ManagePartition executable.

image_thumb8

Figure 2: Project parameters

One last configuration step, the creation of variables to store the arguments to pass to the ManagePartition executable.  We’ll need to pass the database, schema, and table names of the destination partitioned table.  Also, we’ll need the name of the staging table, and the partition range value which will be used to create the staging table on the correct filegroup and to create the check constraint.

Homework:  You’ll need to figure out the most appropriate way to set the partmgmt_PartitionRangeValue variable for your solution.

Capture14

Figure 3: Variables

In order to call the ManagePartition executable from SSIS, we’ll use the Execute Process Task.  On the Expressions page I’ve created two expressions, one for the executable to be called and one for the arguments to be passed to the executable (shown in Figure 4).  The Executable expression simply uses the partmgmt_Executable project parameter.  Figure 5 shows the expression used for the arguments which relies purely on the project parameters and variables that we’ve already defined.

Short recap of the arguments:

  • /C switch is the command to be executed
  • /d switch is the database name of the partitioned table
  • /s switch is the schema of the partitioned table
  • /t switch is the table name of the partitioned table
  • /A switch is the name of the staging table
  • /v switch is the value of the partition range to be loaded into the partitioned table

Capture_thumb2

Figure 4: Expressions for Execute Process Task

Capture16

Figure 5: Full detail of Arguments expression for CreateStagingNoIndex

Now that we’ve created staging table Test1 for the partition range value of 07/31/2009, the next step would be to load the staging table with a data flow task.  I’ll assume you already understand how that works so I’ll just show you the source query used in the data flow task to load our dummy record which uses the DATEADD function to add a year to the OrderDate, DueDate, and ShipDate for one record in the SalesOrderHeader table that has the highest OrderDate.

Capture15

Now we are ready to add the necessary objects to the staging table so that it meets the requirements of the SWITCH operator.  Because we spent the time to properly configure our project parameters and variables, we can use the previous Execute Process Task as a template and just modify the Arguments expression to use a different project parameter (partmgmt_Command_IndexStaging) for the /C or command switch.

Capture17

Figure 6: Full Detail of Arguments expression for IndexStaging

We are now ready for the final step of the process, switching the staging table into the destination partitioned table.  The syntax for the T-SQL statement is:

ALTER TABLE <StagingTableName> SWITCH TO <DestinationPartitionedTable> PARTITION <PartitionNbr>

We already have variables for <StagingTableName> and <DestinationPartitionedTable>, but are missing <PartitionNbr>.  Let’s add a new Execute SQL Task to the beginning of our package that queries the $PARTITION system function and supplies the partition range as a parameter to return the partition number like so:

Capture4_thumb2

Figure 7: SQL Statement to get partition number

Add our partition range value parameter like so:

Capture6_thumb3

Figure 8: Pass in partition range value variable to parameter

Store the result of the query to our partition number variable like so:

Capture5_thumb4

Figure 9: Set result of query to partition number variable

Ok, now we have all the elements we need to construct our ALTER TABLE SWITCH statement!  This time I decided to use an expression on a variable to construct the SQL statement that is executed by the Execute SQL task named “SQL SWITCH Partition”.  Here is a look at the expression:

Capture18

Recap

Now you have a pattern that you can use to incorporate the SQL Server Partition Management Utility into your SSIS ETL processes.  Have fun designing!

Written by Anthony Martin

July 2, 2013 at 4:18 pm

Posted in Uncategorized

Tagged with , ,

SQL Server Partition Management Utility: Part 1

with 2 comments

3uzpzj

In all of the SQL Server environments I’ve worked in over the last 8 years since SQL Server 2005 was released I’ve seen first hand that table partitioning isn’t used frequently enough.  Now, don’t get me wrong I think most SQL Server professionals are very aware of the feature and its benefits.  However, I believe database developers are actually paralyzed by everything they need to consider to design an effective solution utilizing partitioning.

One of the benefits of table partitioning is that it allows you insert data into a very large table as fast as possible.  This is accomplished by first loading data into a staging table that has the same structure as the destination table.  Assuming a list of requirements on the staging table is met, you can then use ALTER TABLE with the SWITCH option to move the data in the staging table into a partition in the destination table.  The switch happens in the blink of an eye as it is a metadata only operation.

So database developers need to know what requirements must be met on the staging table before figuring out how to automate loading data into a partition table using the SWITCH pattern.

  • The staging table must have the same column structure as the partitioned table
  • The staging table must have the same indexes
  • The staging table must have the same constraints
  • The staging table must have the same filegroup as the target partition of the partitioned table
  • The staging table must have a check constraint to ensure the data fits into the target partition of the partitioned table

Meeting these requirements isn’t the hardest thing to accomplish, but it isn’t all that trivial either, especially if have an environment where table structures are changing due to new development or index tuning.  One approach to handle this is to store the DDL for each partitioned tables’ table structure, indexes, and constraints and utilize them in your ETL packages.  With this approach, for every DDL change that you are aware of you have to change your ETL to accommodate the change.  And for changes that you don’t know about (how could that ever happen)?  Well, you’ll find out soon enough from a failed ETL run.

But don’t worry, as the subject of this blog post suggests there is a free utility to help you with the dirty work!

The Utility

The SQL Server Partition Management project on CodePlex is that utility.  It provides several commands that allow you to control the automation process to create the staging table while ensuring that it meets all of the requirements of the SWITCH operator that we previously outlined.  The utility determines the database structures that need to be built based on the arguments passed to the utility.

On the Downloads page of the CodePlex project the ManagePartition.exe is the executable that does all the work.  It is executed from either the command line or through SSIS.  In part two of this series I’ll cover in detail how to use this utility in SSIS.

The ReadMe.htm file is also available on the download tab and contains very good documentation for the executable (confusingly the CodePlex documentation tab is empty).

The third and final download available is named PartitionManagement3.0.zip.  This contains all of the source code should you choose to extend the solution as you see fit as well as SQL test scripts.

Pre-reqs

Make sure that you have the required pre-requisites called out in the readme file with download links.  You will need the .NET Framework 3.5 or higher, SQL Server 2012 System CLR Types, AND SQL Server 2012 Management Objects (SMO).

Commands

The ManagePartition executable accepts five different commands that assist you with creating your staging table to meet the requirements for the SWITCH operator.  The two commands that I typically use, and will focus on for the rest of this post are CreateStagingNoIndex and IndexStagingCreateStagingNoIndex pretty much does what it sounds like, it creates a staging table with the same structure as the specified destination table without any indexes (clustered or nonclustered) so that you can load the staging table as fast as possible.  You can probably also guess what the IndexStaging command does, it creates any indexes or indexed views, and the check constraints for the partition number or range value specified.  The three other commands available are outlined below:

  • ClearPartition– Switches out a partition in a partitioned table to a staging table.
  • CreateStagingFull– Creates a staging table that matches a partition of a partitioned table in terms of structure, indexes, constraints, and filegroup.
  • CreateStagingClusteredIndex– Creates a staging table that matches a partition of a partitioned table in terms of structure, the clustered index, and filegroup.  Nonclustered indexes will not be created.

Arguments

I’m not going to cover all of the arguments available but the table below outlines the more important arguments.  However, be aware that there is also functionality to connect without using integrated security by providing a SQL Server login and password and to script the objects to be created without executing them on the server.

Argument Short Form Description
/Command /C The utility command to be executed
/Database /d The name of the database
/Schema /s The schema name of the partitioned table
/PartitionTable /t The name of the partitioned table
/PartitionNumber /p The number of the target partition in the partitioned table
/PartitionRangeValue /v The range value corresponding to the target partition in the partitioned table
/StagingTable /A The name of the staging table to be created

Testing

The provided testing script PartitionMgmtTest_2012.sql in the PartitionManagement3.0.zip creates a database named PartitionTest by copying some objects and data from the AdventureWorks2012 database.  Note, that the script does attempt to drop views before they exist so don’t be surprised when you see errors when running the script.  Also, the entire script isn’t meant to be executed in one batch.  Do a CTRL+F and search for the string ‘ManagePartition tests’.  Execute everything above this line first to create the database.  Everything below the line is the test cases.

Hands On with the Utility

Why should I use it, check.  Background info, check.  Let’s see it in action!

Our scenario is that we want to load the partition corresponding to the partition range value of 01/01/2007 for the SalesOrderHeader table.  This table has a primary key, foreign keys, indexes, check constraints, and default constraints that are staging table must exactly match before using the SWITCH operator.

The first thing we’ll do is run the ManagePartition executable with the command CreateStagingNoIndex with the appropriate arguments as shown below to create our staging table named Test1.

image

The command executes in just a few seconds and we no have a staging table with no indexes or constraints that we can bulk insert data into very fast.  We’ll pretend we accomplished that task and are now ready to make sure the table meets all the requirements of the SWITCH operator.  We’ll do that by using the IndexStaging command as shown below:

image

Notice the only difference between the two commands we just executed was the /C argument which specifies the command to be run, everything else stays the same…pretty simple!  So how do we verify that it worked as advertised?  Let’s first browse the two tables in SSMS object explorer and eyeball the keys, constraints, and indexes.

Capture9

BOOM!  Exact match, even the Columnstore index!  What is the extra constraint chk_Test1_partition_4  on our staging table you ask?  No worries, this is check constraint to meet our last requirement that enforces the data in our table to match the partition range for our specified partition range value.  And the definition of the check constraint:

Capture10

One last test, lets make sure we can successfully execute the SWITCH operator.  Since we cheated before and never actually loaded any data into the staging table and the fact that our partitioned table already has data for the partition range specified what we’ll do instead is first SWITCH the partition out of the partitioned table and into the staging table and the SWITCH the partition back into the partitioned table from the staging table.

First, the screenshot below shows the SWITCH out of the partitioned table and into the staging table.  Also, notice that querying the staging table now returns data!

Capture11

And finally, we SWITCH the staging table into the partitioned table.  Querying the staging table now returns no data as expected.

Capture12

Recap

Now that I showed you how you can use the SQL Server Partition Management Utility to ease the pain of creating staging tables that will meet the requirements of the SWITCH operator you are running out of excuses for implementing partitioning in your environment…try it out!

In part 2 of this series I’ll show you how to use the utility within SSIS, stay tuned Smile

Written by Anthony Martin

July 1, 2013 at 3:17 am

Posted in Uncategorized

Tagged with ,