Copying Data Should Be Easy

May 20, 2009

Typically, moving data isn’t that big of a deal. You can either fire up SSMS and write a quick script to copy data from one server to another. Or, if you need to move data from production into development, you can just do a back up and restore. “Typically” is the key word in the previous paragraph. In my situation, we have a production master that replicates to two separate servers. The data on each subscriber is different and the determination process is driven by business criteria (actually, it’s driven by a number in a TINYINT column). The publisher is mainly a central target for our ETL processes. As such, it doesn’t have a lot of indexing on it. The subscribers, however, do. Refreshing the development environment is a bit trickier than just restoring over the dev database.

[caption id=“attachment_604” align=“alignright” width=“251”] Yeah, sure, destroy. Sounds good.[/caption]

Initially, I tried using the Import Data task to copy data across from the production server to the dev server. The trick here is that there were still foreign keys on the table. In case you’re unfamiliar with this feature of the Import and Export Wizard, it gives you the option to “Delete” rows in the destination table. In this case, the word “Delete” actually means “Use TRUNCATE TABLE instead and don’t tell anyone about it”. Well, the fun part is that TRUNCATE TABLE will fail noisily if you have foreign keys in place. Being an enterprising young man, I thought to myself, I know, I’ll just write a script to write a script that will drop and re-create all of my foreign keys. So, instead of writing the script myself, I twittered and Ted Krueger send back a tweet that led me to this little gem. Don’t mind the sloppy syntax, that script works great. So, after modifying it to give me both the drop and creates, I was ready to go, right? Not so fast! As we know, every row written to a table also has to be written to all of the non-clustered indexes as well: one table plus four indexes = five writes. This is a large database (just large, not a VLDB). As such, there are a lot of indexes. They are also large. Inserting a large number of rows into a large number of indexes will take, as they say, some time. Being impatient, and enterprising, I decided to write a script to drop and re-create all of the indexes as well. But, rather than write a script, I just went ahead and found one and modified it to suit my purposes. Actually, I tweeted and Adam Machanic and Mladen Prajdic ended up pointing me toward Tibor Karaszi’s sp_indexinfo – enhanced index information procedure which I was able to easily modify to give me the index drop and create info I needed. Protip: SQL Server Google hint: Add “vyas” or “tibor” to your searches for DDL scripts and you’ll almost always get a good hit… (Thanks, Adam!) At this point, everything should be done and our story should be over. You, dear reader, would have falsely assumed that this story has a happy ending. When I fired up the data import task it would eventually and inevitably fail catastrophically with an error along the lines of “TCP Provider: An existing connection was forcibly closed by the remote host.” We started looking into the actual networking issued before we realized that I was running the export data wizard from my local computer, which shares a hub with several other networked devices. Once I remoted into the development server and attempted the process, I got the following error: “TCP Provider: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full.” This error comes from an issue when you are using the /3GB switch on a 32-bit server that might be copying large quantities of data from one place to another. The solution is do make sure that you still have AWE and PAE enabled, remove the /3GB switch, and then reboot! I’m still waiting for my reboot. I think I should have just done a backup and restore and copied the indexes back in from production. So much for trying to be clever and automate the process, eh? Oh, here’s the script that drops and re-creates all of the indexes in the current database. It should be, at the very least, a curiosity to someone.``` DECLARE @newline AS NVARCHAR(2);

SET @newline = NCHAR(13) + NCHAR(10);

IF OBJECT_ID(N’tempdb..#indexes’) IS NOT NULL DROP TABLE #indexes;

WITH key_columns AS ( SELECT c.[object_id], c.name AS column_name, ic.key_ordinal, ic.is_included_column, ic.index_id, ic.is_descending_key FROM sys.columns AS c INNER JOIN sys.index_columns AS ic ON c.[object_id] = ic.[object_id] AND ic.column_id = c.column_id ), physical_info AS ( SELECT p.[object_id], p.index_id, ds.name AS location, SUM(p.rows) AS rows, SUM(a.total_pages) AS pages FROM sys.partitions AS p INNER JOIN sys.allocation_units AS a ON p.hobt_id = a.container_id INNER JOIN sys.data_spaces AS ds ON a.data_space_id = ds.data_space_id GROUP BY [object_id], index_id, ds.name ) SELECT OBJECT_SCHEMA_NAME(i.[object_id]) AS schema_name, OBJECT_NAME(i.[object_id]) AS table_name, i.name AS index_name, CASE i.type WHEN 0 THEN N’heap' WHEN 1 THEN N’CLUSTERED' WHEN 2 THEN N’nc' WHEN 3 THEN N’xml' ELSE CAST(i.type AS VARCHAR(2)) END AS [type], i.is_unique, CASE WHEN is_primary_key = 0 AND is_unique_constraint = 0 THEN N’IX' WHEN is_primary_key = 1 AND is_unique_constraint = 0 THEN N’PK' WHEN is_primary_key = 0 AND is_unique_constraint = 1 THEN N’UX' END AS cnstr, STUFF((SELECT CAST(’, ’ + kc.column_name + CASE kc.is_descending_key WHEN 0 THEN ’' ELSE ’ DESC’ END AS VARCHAR(MAX) ) AS [text()] FROM key_columns AS kc WHERE i.[object_id] = kc.[object_id] AND i.index_id = kc.index_id AND kc.is_included_column = 0 ORDER BY key_ordinal FOR XML PATH ( ’’ )), 1, 2, ‘’) AS key_columns, STUFF((SELECT CAST(’, ’ + column_name AS VARCHAR(MAX)) AS [text()] FROM key_columns AS kc WHERE i.[object_id] = kc.[object_id] AND i.index_id = kc.index_id AND kc.is_included_column = 1 ORDER BY key_ordinal FOR XML PATH ( ’’ )), 1, 2, ‘’) AS included_columns, p.location, p.rows, p.pages, CAST((p.pages * 8.00) / 1024 AS DECIMAL(9, 2)) AS MB, s.user_seeks, s.user_scans, s.user_lookups, s.user_updates INTO #indexes FROM sys.indexes AS i LEFT OUTER JOIN physical_info AS p ON i.[object_id] = p.[object_id] AND i.index_id = p.index_id LEFT OUTER JOIN sys.dm_db_index_usage_stats AS s ON s.[object_id] = i.[object_id] AND s.index_id = i.index_id AND s.database_id = DB_ID() WHERE OBJECTPROPERTY(i.[object_id], N’IsMsShipped’) = 0 AND OBJECTPROPERTY(i.[object_id], N’IsTableFunction’) = 0 AND OBJECT_NAME(i.[object_id]) LIKE N’%’ AND i.[type] >< 0 ORDER BY table_name, i.[type], index_name;

SELECT N’ DROP INDEX ' + QUOTENAME(i.index_name) + N’ ON ’ + QUOTENAME(i.[schema_name]) + N’.’ + QUOTENAME(i.table_name) FROM #indexes AS i WHERE i.cnstr >< ‘PK’;

SELECT N’ALTER TABLE ’ + QUOTENAME(i.[schema_name]) + N’.' + QUOTENAME(i.table_name) + N’ ADD CONSTRAINT ' + QUOTENAME(i.index_name) + N’ PRIMARY KEY CLUSTERED (' + i.key_columns + N’) ON ' + QUOTENAME(location) FROM #indexes AS i WHERE i.cnstr = ‘PK’;

SELECT N’CREATE ' + CASE WHEN is_unique = 1 THEN N’ UNIQUE ’ ELSE ’’ END + CASE WHEN type = ‘CLUSTERED’ THEN N’ CLUSTERED INDEX ' ELSE N’ NONCLUSTERED INDEX ’ END + QUOTENAME(i.index_name) + N’ ON ' + QUOTENAME(i.[schema_name]) + N’.’ + QUOTENAME(i.table_name) + N’ (' + key_columns + N’)' + CASE WHEN included_columns IS NOT NULL THEN N’ INCLUDE (’ + included_columns + N’)' ELSE N’’ END + N’ ON ' + QUOTENAME(i.location) FROM #indexes AS i WHERE i.cnstr >< ‘PK’ AND i.location IS NOT NULL;