Many of us know that Sparse columns always save space because NULL uses 0 bytes but we may wrong Sparse columns save space only when NULL density is high. If null density is low medium, then it uses more space compare to normal nullable column.
Sparse columns can use more space
than normal NULL columns because every non-NULL sparse value carries a 4-byte
overhead, which outweighs the ~2-byte cost of a normal NULL column when the
NULL percentage is low.
If Sparse
column is not null the it will store Data + 4 bytes overhead and if value is
null then it will store zero. That 4-byte overhead is 2 bytes for column ID and 2 bytes for
sparse vector metadata.
Best choice
to use sparse column
|
NULL % |
Sparse Result |
|
< 40% |
Wastes space |
|
~50% |
Break-even |
|
> 60% |
Saves space |
|
> 90% |
Massive savings |
Let’s see the
demo
Creating two
table with same structure normal and with sparse column.
|
CREATE TABLE Sparse_Demo_Normal ( ID INT
IDENTITY(1,1) not null PRIMARY KEY, columnone TINYINT
null, columntwo SMALLINT
null , columnthree INT null, columnfour BIGINT null, columnfive DECIMAL(30,10)
null , columnsix MONEY
null, columnseven SMALLMONEY
null, columneight char(10)
null, columnnine varchar(max)null, columnten nvarchar(max)null, columneleven DATE null, columntwelve TIME null, columnthirteen DATETIME null, columnfourteen SMALLDATETIME null, columnfifteen DATETIME2 null, columnxixteen DATETIMEOFFSET null, columnseventeen UNIQUEIDENTIFIER null, columneighteen NVARCHAR(MAX) null, columnninteen BIT null, columntwenty VARBINARY(MAX) null ); CREATE TABLE Sparse_Demo ( ID INT
IDENTITY(1,1) not null PRIMARY KEY, columnone TINYINT
SPARSE null, columntwo SMALLINT
SPARSE null , columnthree INT SPARSE null, columnfour BIGINT SPARSE null, columnfive DECIMAL(30,10)
SPARSE null , columnsix MONEY
SPARSE null, columnseven SMALLMONEY
SPARSE null, columneight char(10)
SPARSE null, columnnine varchar(max)SPARSE
null, columnten nvarchar(max)SPARSE
null, columneleven DATE SPARSE null, columntwelve TIME SPARSE null, columnthirteen DATETIME SPARSE
null, columnfourteen SMALLDATETIME SPARSE
null, columnfifteen DATETIME2 SPARSE null, columnxixteen DATETIMEOFFSET SPARSE null, columnseventeen UNIQUEIDENTIFIER SPARSE
null, columneighteen NVARCHAR(MAX) SPARSE
null, columnninteen BIT SPARSE null, columntwenty VARBINARY(MAX) SPARSE null ); --inserting 1M records into this table declare @i int=1; while (@i<=100000) begin insert into Sparse_Demo_Normal
(columnone,columntwo,columnthree,columnfour,columnfive,columnsix,columnseven,columneight,columnnine ,columnten,columneleven,columntwelve,columnthirteen,columnfourteen,columnfifteen,columnxixteen,columnseventeen ,columneighteen,columnninteen,columntwenty ) values
(null,null,null,null,null,null,null,null,null,null,null,null,null,null,null, null,null,null,null,null) insert into
Sparse_Demo(columnone,columntwo,columnthree,columnfour,columnfive,columnsix,columnseven,columneight,columnnine ,columnten,columneleven,columntwelve,columnthirteen,columnfourteen,columnfifteen,columnxixteen,columnseventeen ,columneighteen,columnninteen,columntwenty ) values
(null,null,null,null,null,null,null,null,null,null,null,null,null,null,null, null,null,null,null,null); set @i=@i+1; end; go |
Both table created and inserting 100K records in both table.
As of now all data null on both
table. See the Spaces of both tables.
|
EXEC
sp_spaceused 'Sparse_Demo_Normal'; EXEC sp_spaceused 'Sparse_Demo'; |
Here we are seeing the Massive savings of space.
Now we are updating 20% data with not
null value and see the space.
|
declare @i int=1; while (@i<=20000) begin update Sparse_Demo_Normal set columnone =1 ,columntwo=1 ,columnthree=1 ,columnfour=100 ,columnfive=10.10 ,columnsix=100.00 ,columnseven=5.50 ,columneight='Bagi' ,columnnine='Bagesh' ,columnten='Bagesh kumar Singh' ,columneleven=getdate() ,columntwelve=getdate() ,columnthirteen=getdate() ,columnfourteen=getdate() ,columnfifteen=getdate() ,columnxixteen=getdate() ,columnseventeen=NEWID() ,columneighteen='Singh' ,columnninteen=1 where id=@i ; update Sparse_Demo set columnone =1 ,columntwo=1 ,columnthree=1 ,columnfour=100 ,columnfive=10.10 ,columnsix=100.00 ,columnseven=5.50 ,columneight='Bagi' ,columnnine='Bagesh' ,columnten='Bagesh kumar Singh' ,columneleven=getdate() ,columntwelve=getdate() ,columnthirteen=getdate() ,columnfourteen=getdate() ,columnfifteen=getdate() ,columnxixteen=getdate() ,columnseventeen=NEWID() ,columneighteen='Singh' ,columnninteen=1 where id=@i set @i=@i+1; set @i=@i+1; end; go |
Now see the space for both table
Good!
Now updating 40% records.
Still ok.
Now we will update 60% data in both
tables.
Upto 60% its look good. Now updating
more than 90% records.
See the deference. We have taken the
small sample size due to that we are not seeing much effect but if the data is
big then we can easily we can see.
Let’s update 100% records and see.
Size is increase.
No comments:
Post a Comment
If you have any doubt, please let me know.