Moving ntext to nvarchar(max)
-
Wednesday, May 24, 2006 8:24 AM
I just move our SQL server to version 2005. In new version ntext field is deprecated and documentation says that ntext(max) should be used.
If I have table Table1 and ntext column Column1. When I execute following SQL statements:
alter table Table1 alter column
Column1 nvarchar(max)
go
1.) Are out of row data automatically move to in row?
2.) Or should I also execute something like this ?
update Table1 set Column1 = Column1+'' where Column1 is not null
3.) Is there way to check if data is stored out or in row?
Best regards
edvin
Answers
-
Wednesday, May 24, 2006 12:03 PMModerator
Hi,
see this article for more information:
http://msdn2.microsoft.com/en-US/library/ms189087.aspx
Find out the option on your table using sp_tableoption
HTH, Jens Suessmeyer.
---
http://www.sqlserver2005.de
--- -
Wednesday, May 24, 2006 10:54 PM
>> 1.) Are out of row data automatically move to in row?No. ALTER TABLE operation is typically a metadata only operation. There are exceptions however where we have to rewrite every single row. But in this case, there is no reason to touch every single row and text page.>> 2.) Or should I also execute something like this ?>> update Table1 set Column1 = Column1+'' where Column1 is not nullYou could, but I would recommend against doing this due to the logging requirements but it depends on the number of rows being updated and the percentage of data in the table that can be moved in-row. Future inserts will automatically be in-row depending on their length. Updates to values that are already stored out-of-row will be moved in-row depending on their length. You could recreate the table using SELECT INTO operation which will be more efficient than UPDATE but requires more logic. And you may also have to defrag the table if you update due to migration of the rows or fragmentation of pages.>> 3.) Is there way to check if data is stored out or in row?There is no easy way to check this. But you can use the query below to see the allocations happening in the table based on the data insertions. Below is some sample code that will demonstrate the whole process.use tempdbgocreate table dbo.lobtest ( i int not null identity, t ntext not null );insert into dbo.lobtest values(replicate(cast(N'x' as nvarchar(max)), 1024*100));insert into dbo.lobtest values(replicate(cast(N'x' as nvarchar(max)), 1024));-- look at allocation entries (generic query to alloc units for a table):
select au.*
, coalesce(p1.object_id, p2.object_id) as object_id
, coalesce(p1.index_id, p2.index_id) as index_id
from sys.allocation_units as au
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.hobt_id = au.container_id and au.type in (1, 3)
) as p1
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.partition_id = au.container_id and au.type = 2
) as p2
where coalesce(p1.object_id, p2.object_id) = object_id('dbo.lobtest');-- convert ntext column:
alter table dbo.lobtest alter column t nvarchar(max) not null;-- look at allocation entries again:
select au.*
, coalesce(p1.object_id, p2.object_id) as object_id
, coalesce(p1.index_id, p2.index_id) as index_id
from sys.allocation_units as au
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.hobt_id = au.container_id and au.type in (1, 3)
) as p1
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.partition_id = au.container_id and au.type = 2
) as p2
where coalesce(p1.object_id, p2.object_id) = object_id('dbo.lobtest');-- add more rows that should be inline
insert into dbo.lobtest values(replicate(cast(N'x' as nvarchar(max)), 1024));insert into dbo.lobtest values(replicate(cast(N'x' as nvarchar(max)), 1024));insert into dbo.lobtest values(replicate(cast(N'x' as nvarchar(max)), 1024));-- check allocation entries again (only IN_ROW_DATA pages increased):
select au.*
, coalesce(p1.object_id, p2.object_id) as object_id
, coalesce(p1.index_id, p2.index_id) as index_id
from sys.allocation_units as au
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.hobt_id = au.container_id and au.type in (1, 3)
) as p1
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.partition_id = au.container_id and au.type = 2
) as p2
where coalesce(p1.object_id, p2.object_id) = object_id('dbo.lobtest');drop table dbo.lobtest;
go
All Replies
-
Wednesday, May 24, 2006 12:03 PMModerator
Hi,
see this article for more information:
http://msdn2.microsoft.com/en-US/library/ms189087.aspx
Find out the option on your table using sp_tableoption
HTH, Jens Suessmeyer.
---
http://www.sqlserver2005.de
--- -
Wednesday, May 24, 2006 10:54 PM
>> 1.) Are out of row data automatically move to in row?No. ALTER TABLE operation is typically a metadata only operation. There are exceptions however where we have to rewrite every single row. But in this case, there is no reason to touch every single row and text page.>> 2.) Or should I also execute something like this ?>> update Table1 set Column1 = Column1+'' where Column1 is not nullYou could, but I would recommend against doing this due to the logging requirements but it depends on the number of rows being updated and the percentage of data in the table that can be moved in-row. Future inserts will automatically be in-row depending on their length. Updates to values that are already stored out-of-row will be moved in-row depending on their length. You could recreate the table using SELECT INTO operation which will be more efficient than UPDATE but requires more logic. And you may also have to defrag the table if you update due to migration of the rows or fragmentation of pages.>> 3.) Is there way to check if data is stored out or in row?There is no easy way to check this. But you can use the query below to see the allocations happening in the table based on the data insertions. Below is some sample code that will demonstrate the whole process.use tempdbgocreate table dbo.lobtest ( i int not null identity, t ntext not null );insert into dbo.lobtest values(replicate(cast(N'x' as nvarchar(max)), 1024*100));insert into dbo.lobtest values(replicate(cast(N'x' as nvarchar(max)), 1024));-- look at allocation entries (generic query to alloc units for a table):
select au.*
, coalesce(p1.object_id, p2.object_id) as object_id
, coalesce(p1.index_id, p2.index_id) as index_id
from sys.allocation_units as au
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.hobt_id = au.container_id and au.type in (1, 3)
) as p1
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.partition_id = au.container_id and au.type = 2
) as p2
where coalesce(p1.object_id, p2.object_id) = object_id('dbo.lobtest');-- convert ntext column:
alter table dbo.lobtest alter column t nvarchar(max) not null;-- look at allocation entries again:
select au.*
, coalesce(p1.object_id, p2.object_id) as object_id
, coalesce(p1.index_id, p2.index_id) as index_id
from sys.allocation_units as au
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.hobt_id = au.container_id and au.type in (1, 3)
) as p1
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.partition_id = au.container_id and au.type = 2
) as p2
where coalesce(p1.object_id, p2.object_id) = object_id('dbo.lobtest');-- add more rows that should be inline
insert into dbo.lobtest values(replicate(cast(N'x' as nvarchar(max)), 1024));insert into dbo.lobtest values(replicate(cast(N'x' as nvarchar(max)), 1024));insert into dbo.lobtest values(replicate(cast(N'x' as nvarchar(max)), 1024));-- check allocation entries again (only IN_ROW_DATA pages increased):
select au.*
, coalesce(p1.object_id, p2.object_id) as object_id
, coalesce(p1.index_id, p2.index_id) as index_id
from sys.allocation_units as au
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.hobt_id = au.container_id and au.type in (1, 3)
) as p1
outer apply (
select top(1) p.object_id, p.index_id
from sys.partitions as p
where p.partition_id = au.container_id and au.type = 2
) as p2
where coalesce(p1.object_id, p2.object_id) = object_id('dbo.lobtest');drop table dbo.lobtest;
go -
Friday, May 26, 2006 10:46 AM
Hi,
thanks for your replies. But are you really sure that simple updating all rows will not move date from out row to in row. Because in SQL documentation (http://msdn2.microsoft.com/en-US/library/ms189087.aspx) states that:
---------------
When the large value types out of row option value is changed, existing varchar(max), nvarchar(max), varbinary(max), and xml values are not immediately converted. The storage of the strings is changed as they are subsequently updated. Any new values inserted into a table are stored according to the table option in effect.
----------From this I understand that updating row data will also change the storage. Recreating tables demands also recreating constraints, indexes..etc..
Best regards
edvin -
Friday, May 26, 2006 7:15 PMThe default for large value types is to store in-row. And when you convert from ntext to nvarchar(max) the default is to store in-row. I just tested and it seems like if you update the data it moves from out-of-row to in-row if the value fits in-row. If you want to see immediate benefits you will have to recreate the table or update the rows (this is more expensive due to logging requirements than using say SELECT...INTO but it depends on the number of rows you want to update in the table). Additionally, you may have to defragment the table if the rows are updated due to data movement.
-
Thursday, June 12, 2008 12:17 AM
Hi, Jayachandran,
I found your post for moving to NVARCHAR(MAX) very useful. I know it's been two years since you've posted your answer here. But I have a question about doing the same thing. I am converting TEXT columns to VARCHAR(MAX). I understand that only new rows will be stored in-row of the table and no change will occur to the old rows. However, if I create a new table with columns of the VARCHAR(MAX) data type and insert into it from the old table, all the VARCHAR(MAX) columns (8K bytes) should be in-row, and the rest will have a pointer stored in the row.
Now the question is, will the new table size be larger or smaller and why? In a sense it should be smaller, because VARCHAR(Max) is of a variable length. However, because we're now storing data in-row, I'm not sure what it should be.
I did the test on one of the tables. And when looking at Disk Usage by Table Report in the Management Studio the table with the VARCHAR(MAX) columns has larger Data size, but smaller Unused size. Its total size almost the same as the old table's size
Thanks for your help in advance!
Narine
-
Friday, June 13, 2008 11:17 AM
-
Friday, July 30, 2010 11:13 AM
Thanks Guru. Kathavate for providing nice article link. That helps understand why the dummy updates are required. However I have seen a different post in the net, indicating that the updates should include an empty string concatenated to it for making the data in-row.
Any idea whether empty string required ?
Ex: update <table_name> set col1 = col1 + N'';
Thanks in advance!

