I need to change a INT to a BIGINT dataype in a large table with little downtime.
-
Monday, November 23, 2009 2:00 PM
I have a column that is a INT data type that needs to be changed to a BIGINT data type. The column is a identity column that is reaching upper limits and part of the PK. The table has 1.5 billion records and very little downtime. I was thinking of replicating the table to reduce the needed downtime to change the data type. I have a two hour window for maintenance each month to make changes.
I need help or suggestion on how to accomplish this in a two hour window.- Moved by Tom PhillipsModerator Monday, November 23, 2009 5:43 PM Possibly better answer from TSQL forum (From:SQL Server Database Engine)
All Replies
-
Monday, November 23, 2009 4:55 PMhttp://sql-server-performance.com/Community/forums/t/25772.aspx
HTH
Vinay
Vinay Thakur http://vinay-thakur.spaces.live.com/ http://twitter.com/ThakurVinay- Proposed As Answer by Naomi NMicrosoft Community Contributor, Moderator Monday, November 23, 2009 5:59 PM
- Unproposed As Answer by Arnie RowlandMVP, Moderator Saturday, November 28, 2009 3:58 AM
-
Saturday, November 28, 2009 3:58 AMModeratorWhile providing this link does provide good information, this post doesn't really qualify as an attemtp to help the user. It would have been useful to provide some explanation, and then perhaps suggest the link.
Since most questions have been asked and answered somewhere, this would become an impersonal and lifeless forum if we all followed this model.
You may be only one person in the world, but you may also be the world to one person. -
Saturday, November 28, 2009 8:15 PM
INT can take 4 billion different unique values.
Why can't you just issue a
DBCC CHECKIDENT('MySchema.MyTableName', RESEED, -2147483648)
To have the table "start" over at negative 2.1 billion. That will give you 2.1 billion more records. And has NO downtime!
And when value is increasing and reaching near zero, issue a
DBCC CHECKIDENT('MySchema.MyTableName', RESEED, 1550000000)
To continue at your old value (and have 600 million more records). How long time has it taken to fill up your present 1.5 billion records?
Do remember that if the identity column also is clustered, you will get performance issues! But if you have a clustered index on another column(s), you will not notice that the identity value is starting over from negative 2.1 billion.
//Peter -
Sunday, November 29, 2009 1:02 AMModerator
To have the table "start" over at negative 2.1 billion. That will give you 2.1 billion more records. And has NO downtime!
I would not recommend that. The INT identity(1,1) surrogate primary key is commonly accepted in the positive range. You may get unusual issues extending to negative range. Even business issues: some people may think that id -1000 same as id 1000. When you run out of the negative range, you have to go BIGINT anyhow.
I vote for BIGINT.
You can minimize down time by testing out the fastest conversion in advance.
Kalman Toth SQL SERVER 2012 & BI TRAINING
New Book: Beginner Database Design & SQL Programming Using Microsoft SQL Server 2012- Edited by Kalman TothMicrosoft Community Contributor, Moderator Monday, October 01, 2012 1:06 AM
-
Sunday, November 29, 2009 6:51 AMI am sorry, but I don't see how my response was inaccurate nor disrespectful, to the extent that you thought it was a "hijack"?
I told OP of another method that he not might have considered. Maybe there is no need to change to BIGINT at all? If all other approaches has been explored, that's fine and change the column to BIGINT.
Changing to BIGINT column, IN my opinion, has more impact than DBCC approach. Changing the column to BIGINT need the architect and developers fully test the application(s) again, to see that all changes are good. And that e is AFTER the changing to BIGINT has been done. Or maybe has a test environment where he has restored a backup, changed the column type there, and also made the necessary modifications to the application(s)?
We don't know because OP hasn't told us.
The previous answer is split here for continuation
http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/3b7215ae-e65b-4e7d-a262-75cc3fe5ad1a- Edited by SwePesoMVP Sunday, November 29, 2009 6:53 AM
-
Sunday, November 29, 2009 8:08 AMModerator
Peso,
Did you ever try it? Starting out at: -2147483648?
Anyone has any experience withint identity( -2147483648,1) Primary Key
?
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Sunday, November 29, 2009 8:16 AMYes, I have. I wouldn't be responsible if I gave advice I have not testest myself, would I?
-
Sunday, November 29, 2009 8:22 AMModerator
I need help or suggestion on how to accomplish this in a two hour window.
Build a new table like:
SELECT ID = CONVERT(BIGINT, ID), .... INTO newTableAlpha FROM TableAlpha
( You have to start this step before the two hour maintenance window.. like 10 hours before..depending on your test run results)
Apply indexes/constraints
( two hour window starts)
Rename TableAlpha to zzzTableAlpha
Rename newTableAlpha to TableAlpha
Compare TableAlpha to zzzTableAlpha
Apply updates/inserts to TableAlpha based on the delta
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Sunday, November 29, 2009 8:37 AMModerator
Yes, I have. I wouldn't be responsible if I gave advice I have not testest myself, would I?
That is still different though from the current case whereby the positive int range is already allocated.
In your experience, any pros & cons?
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Sunday, November 29, 2009 9:02 AMOnly good experiences, "pros".
-
Sunday, November 29, 2009 9:04 AM
Build a new table like:
SELECT ID = CONVERT(BIGINT, ID), .... INTO newTableAlpha FROM TableAlpha
( You have to start this step before the two hour maintenance window.. like 10 hours before..depending on your test run results)
I am interested to find out which table locks are applied with the suggestion above. A "SELECT * INTO" does put a shared lock, or a table lock?
And if the suggestion puts a table lock, how is business affected? -
Sunday, November 29, 2009 12:01 PMModeratorPeso,
I am just recommending an approach. You can build the stage table ( newTableAlpha ) from a different database (last night's backup restore for example).
The point is, once it is built, you just have to compare it to live table to apply the delta insert/deletes/updates. Hopefully 2 hour maintenance window is sufficient for that, but that is only a hope.
If the HW platform is very fast, it may even be possible to do the entire conversion within the 2 hour window.
TheBrick,
Can you take measurements in a copy of the database? How long SELECT INTO takes for example? Can you share the results with us?
Thanks.
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Monday, November 30, 2009 3:02 PM
The select into took over 10 hours to complete for the one table. I don't have a good way to capture the delta, because the table was not designed with the appropriate change columns like data time. The other issue is data can be permanently deleted from this table.
I am also concerned about the overhead and locking on this table during the copy.
Do you guys have a good way to track delta? -
Monday, November 30, 2009 3:06 PMThe destination table for the copy had no PK, NC Indexes, or FK's. I did it this way for speed. I don't believe changing the recovery model will make a differences, becuase select *** into **** is treated as one transaction, correct.
-
Monday, November 30, 2009 3:23 PM
The destination table for the copy had no PK, NC Indexes, or FK's. I did it this way for speed. I don't believe changing the recovery model will make a differences, becuase select *** into **** is treated as one transaction, correct.
Yes, and the source table(s) are locked with table lock too, as you noticed.
The delta would need change tracking enabled on SQL Server 2008. If you don't have 2008 with change tracking enabled, tracking the delta would also take 10 hours because of updates/deletes and inserts. I think this approach is a dead end for you, if you want this to happen within the 2 hours maintenance window.
Have you considered the other approach? It takes about 0.01 seconds to complete.
Which version and edition of SQL Server are you using? -
Monday, November 30, 2009 3:52 PM
There is another way and that is to implement a trigger to track the delta.
I have used this only once in the past.
1) "SELECT TOP 0 * INTO DupTable FROM SourceTable"
2) Edit the DupTable and a) Remove identity property and b) change column to bigint.
3) Create a delete trigger to remove matching record in DupTable
4) Create an insert trigger to insert new records into new DupTable
5) Create an update trigger to update matching records in DupTable
6) Issue
DECLARE @ID INT SELECT @ID = 0 WHILE @ID <= 1550000000 BEGIN INSERT DupTable SELECT * FROM SourceTable WHERE IDCOL >= @ID AND IDCOL < @ID + 1000000 SET @IDCOL = @IDCOL + 1000000 END
to "transfer" 1 million records at a time from the old SourceTable to the new DupTable with minimal logging. This must be done "off-hours" so that the triggers doesn't hit a records which isn't transfered yet.
7) Change all code to handle BIGINT
8) Here is the tricky part:
9) Drop all indexes and all constraints against the SourceTable.
10) Rename SourceTable to OldTable
11) Rename DupTable to SourceTable
12) Apply all constraints and indexes in the table.
But still, I don't believe it is possible to make this happen (the tricky part) within the 2 hour maintenance window.
I have dealt with VLDB for many years now, and that's why I suggested the alternative approach because I know things takes time, and it is near impossible to accomplish this in the 2 hour maintenance window. I believe any experienced DBA will agree.
One last resort might be this:
BCP all records out (with a batchsize of 1 million records), truncate the source table, drop the contraints and indexes over the ID column and change the column to bigint. Now, BCP in all records in again with identity insert enabled. Last, recreate the indexes and constraints used over id column.
Try to BCP all records out and report back here the time it took to export them. If it takes less than 30 minutes, you might have a chance with this approach, because BCP in will take some more time. And recreating indexes and constraints will also take some time. -
Monday, November 30, 2009 3:52 PMModerator
I think this approach is a dead end for you, if you want this to happen within the 2 hours maintenance window.
Have you considered the other approach? It takes about 0.01 seconds to complete.
Which version and edition of SQL Server are you using?
I agree with Peso, let's forget about the 2 hour maintenance window given the 10 hours SELECT INTO.
You can go the negative range INT IDENTITY as Peso suggested if that is a good fit.
If you go BIGINT, here are some thoughts:
1. On SAT 1AM you backup production db and restore to a copy of production db
2. In the next 30 hours, you build StagingTableAlpha in production using the production db copy as source
3. Rename TableAlpha to zzzTableAlpha
4. Rename StagingTableAlpha to TableAlpha
5. During the next 24 hours you apply the deltas (which happened since the backup) in small batches on a business priority basis
6. By Monday morning the new table should be synched and fully functional
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Monday, November 30, 2009 4:06 PM
If you go BIGINT, here are some thoughts:
I assume you also mean to include (for completeness)
1. On SAT 1AM you backup production db and restore to a copy of production db
2. In the next 30 hours, you build StagingTableAlpha in production using the production db copy as source
3. Rename TableAlpha to zzzTableAlpha
4. Rename StagingTableAlpha to TableAlpha
5. During the next 24 hours you apply the deltas (which happened since the backup) on a business priority basis
6. By Monday morning the new table should be synched and fully functional
Kalman Toth
2a) No constraints nor indexes are present in StagingTableAlpha. You can't change a datatype to BIGINT otherwise.
2b) Change datatype to BIGINT
4a) Reapply constraints
4b) Reapply indexes (which can be be done online if using Enterprise Edition)
The hard part with this approach is to find out what is the delta. And without change tracking that would take another 10 hours.
Meanwhile, users will have access to data which isn't clean, ie records are still present which should have been deleted.
Records which has been updated for any reason, any column and any new value. -
Monday, November 30, 2009 4:31 PM
The only reason I haven't considered the other approach is it takes two maintenance windows. One 2 hour window to do the transfer and another window to swap the tables, change the column to identity and create FK's. I am also concerned about the overhead of the triggers on the source table.
I do like your approach and I am going to do a test to gather data points of time needed to complete each. It maybe worth to overhear. -
Monday, November 30, 2009 4:35 PMModerator
When we are moving entire data centers, we often have a very short 'switchover' window. Perhaps using a similar process could be used for this problem.
1. Create a new, duplicate table, with the bigint identity column.
2. Create a duplicate 'holding table' for changes that will occur during the change over. Add 'Action' column.
3. Create INSERT/UPDATE/DELETE triggers for the existing table to copy all changes (including action) to the holding table.
4. Copy the data from the existing table to the new table (filter out the holding table data). Use reasonable batches to reduce TLog pressure. Make frequent TLog backups to reduce TLog size pressure.
5. Create the appropriate indexes on the new table
6. Using Merge, move 'holding table' data to the new table, deleting the moved set from the holding table.
7. Force Users out of the db, lock the db
8. Drop all foreign keys referencing the old table
9. Drop (or rename) the old table
10. Rename the new table to the old table name
11. Create the foreign keys referencing the new table
12. Make the final move from the holding table to the 'new' and now renamed table
13. Open the db to users.
14. Backup the db
In this scenario, it really doens't matter how long it takes to do 99% of the task. What you are concerned about is minimizing user impact.
This 'should' be accomplished with less than 15 minutes total downtime. By carefully scripting all of the steps, I have accomplished similar actions on much larger databases with less than 5 minutes downtime.
"You cannot do a kindness too soon because you never know how soon it will be too late." -Ralph Waldo Emerson- Marked As Answer by TheBrick Monday, November 30, 2009 4:50 PM
-
Monday, November 30, 2009 4:51 PMI like this approach and will give it a try in test to see how it goes.
-
Monday, November 30, 2009 5:01 PMModeratorBe sure to set the IDENTITY_INSERT ON option for the new table, and then to reset the new table max IDENTITY value using DBCC CHECKIDENT RESEED after the last move from the holding table.
"You cannot do a kindness too soon because you never know how soon it will be too late." -Ralph Waldo Emerson -
Monday, November 30, 2009 5:24 PM
The only reason I haven't considered the other approach is it takes two maintenance windows. One 2 hour window to do the transfer and another window to swap the tables, change the column to identity and create FK's. I am also concerned about the overhead of the triggers on the source table.
I do like your approach and I am going to do a test to gather data points of time needed to complete each. It maybe worth to overhear.
Oh, with my "other approach" I meant the simple DBCC CHECKIDENT approach.
Maybe it can give you some time until a longer maintenance windows, such as a long holiday as 4th of July or something, where you can get longer windows (24+ hours) to complete a full switch. -
Monday, November 30, 2009 8:40 PMModeratorSuperb switchover plan Arnie!
Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com -
Tuesday, December 01, 2009 1:02 AMModeratorI just found a blog post on this topic http://blogs.lessthandot.com/index.php/DataMgmt/DBAdmin/MSSQLServerAdmin/what-to-do-when-your-identity-column-max
Premature optimization is the root of all evil in programming. (c) by Donald Knuth
Naomi Nosonovsky, Sr. Programmer-Analyst
My blog -
Thursday, June 02, 2011 4:25 PM
I also had the same issue this morning. As a quick fix we went with
DBCC CHECKIDENT('MySchema.MyTableName', RESEED, -2147483648)For a long run, BIGINT looks promising because we get around 50 million records per day.
- Marked As Answer by Kalman TothMicrosoft Community Contributor, Moderator Thursday, June 02, 2011 9:16 PM

