Showing posts with label Index. Show all posts
Showing posts with label Index. Show all posts

Sunday, 13 October 2024

Column store fragmentation in SQL server

Columnstore index fragmentation in SQL Server can affect query performance, especially when working with large datasets in data warehouses or analytical workloads. Columnstore indexes store data in a compressed format in columnar storage, and fragmentation can occur due to the way data is organized into segments and compressed.

Types of Fragmentation in Columnstore Indexes:

Ø  Fragmentation of Row groups: Columnstore indexes store data in units called rowgroups. Each rowgroups contains approximately 1 million rows. Fragmentation can occur when rowgroups contain fewer than 1 million rows, leading to inefficiencies in storage and query processing. This is typically referred to as underpopulated rowgroups or fragmented rowgroups.

Ø  Deleted Rows: Another type of fragmentation occurs when rows are marked as deleted but are not physically removed. These deleted rows increase the size of the index and degrade performance because they occupy space and are processed during queries.

Why it’s Happen

Ø  Frequent Inserts/Updates: Columnstore indexes work best in a batch load scenario where large datasets are inserted in one go. Frequent, small inserts or updates can cause rowgroups to contain fewer rows, leading to fragmentation.

Ø  Row Deletes: When rows are deleted in a columnstore index, they are not immediately removed but are marked as deleted. This increases fragmentation as those rows still take up space.

Ø  Data Loads in Small Batches: Loading data in small batches (fewer than 102,400 rows) can cause fragmentation because the rowgroups are not fully populated.

Rebuilding the Columnstore Index

Rebuilding a columnstore index will reorganize the data, compress rowgroups, and remove deleted rows. This is a heavy operation but is necessary when there is significant fragmentation.

ALTER INDEX [index_name] ON [table_name] REBUILD;

 Reorganize the Columnstore Index

Syntax of reorganize the columnstore index.

ALTER INDEX [index_name] ON [table_name] REORGANIZE;

Difference between Rebuilding and reorganizing indexes in SQL server

 Both rebuild index and reorganize index are used to avoid the index fragmentation. Below are difference between them.

Rebuild index

Reorganize index

Rebuild index if avg_page_space_used_in_percent >=30%

Reorganize index if avg_page_space_used_in_percent >5 and <=30%

Index rebuilding involves dropping and recreating an index entirely. This process rebuilds the index from scratch, organizing the data pages in a contiguous manner.

Index reorganizing is a more lightweight approach that defragments the index by physically reordering the index pages without dropping and recreating the index itself.

When an index is rebuilt, fragmentation is eliminated, and index statistics are updated. This can result in better query performance, especially for heavily fragmented indexes.

Reorganizing reduces fragmentation by compacting the pages and the data within them, but it does not update index statistics.

Index rebuilding requires more system resources and can cause downtime, particularly on large tables or during peak usage hours. It also resets the fill factor, impacting future index maintenance.

Reorganizing is typically faster and less resource-intensive than rebuilding, making it suitable for routine maintenance tasks and scenarios where minimal downtime is essential.

Rebuilding is a drastic measure, but it does its job well. Fragmentation is completely eliminated after it.

Reorganizing may not completely eliminate fragmentation in heavily fragmented indexes, and it does not reset the fill factor.

 

Reorganize Index in SQL server

Reorganizing an index in SQL Server is a lighter operation compared to rebuilding it. When we reorganize an index, SQL Server defragments the index leaf level pages and reorders them to match the logical order of the index keys without creating a new index structure. The index reorganization process physically reorganizes the leaf nodes of the index. Reorganizing an index uses minimal system resources. It defragments the leaf level of clustered and non-clustered indexes on tables and views by physically reordering the leaf-level pages to match the logical, left to right, and order of the leaf nodes. Reorganizing also compacts the index pages.

Unlike rebuilding indexes, reorganizing indexes does not require any additional disk space but requires log space. Because reorganizing indexes is a fully logged transaction, the log can potentially grow substantially. Log usage for a reorganize could be way more than the size of the index, which depends on how much reorganizing and compaction is done, as well as the level of fragmentation distributed through the index.

Reorganizing an index does not lock the table, so it can be done online, even on editions that do not support online rebuilds. It is much faster and less resource-intensive than rebuilding but may not fully resolve severe fragmentation.

Reorganize a Specific Index

ALTER INDEX [index_name] ON [table_name] REORGANIZE;

 Reorganize All Indexes on a Table

ALTER INDEX ALL ON [table_name] REORGANIZE;

 Reorganize All Indexes in a Database

DECLARE @TableName NVARCHAR(256);
DECLARE @SQL NVARCHAR(max);
DECLARE table_cursor CURSOR FOR
  
SELECT Quotename(Schema_name(schema_id)) + '.' + Quotename(NAME)
  
FROM   sys.tables;

OPEN table_cursor;
FETCH next FROM table_cursor INTO @TableName;
WHILE @@FETCH_STATUS = 0
  
BEGIN
      
SET @SQL = 'ALTER INDEX ALL ON ' + @TableName   + ' REORGANIZE;';
      
EXEC Sp_executesql    @SQL;

      
FETCH next FROM table_cursor INTO @TableName;
  
END; 

 Reorganize with LOB_COMPACTION Option

If our index contains large objects (LOBs) such as varchar (max) or varbinary (max), we need to use the LOB_COMPACTION option to compact these objects during reorganization.

ALTER INDEX [index_name] ON [table_name] REORGANIZE WITH (LOB_COMPACTION = ON);

Rebuild index in SQL server

Rebuilding indexes completely drops and re-creates the database indexes. Technically, rebuilding consists of copying, renaming, and dropping the old one, internally. This is one of the operations we will perform to defragment our indexes since we are rebuilding them and realigning the physical and logical order of the index row in contiguous pages from scratch. Fragmentation is eliminated, pages are compacted together, and disk space is reclaimed.

Syntax

ALTER INDEX [IndexName] ON [SchemaName].[TableName]

REBUILD WITH (FILLFACTOR = 80, ONLINE = ON);

Ø  FILLFACTOR: Specifies the percentage of space on each leaf-level page to be filled with data, reserving the rest for future growth.

Ø  ONLINE: Allows the index to be rebuilt without significant locking, enabling concurrent access to the table.

Rebuild a Specific Index

ALTER INDEX [index_name] ON [table_name] REBUILD;

 Rebuild All Indexes on a Table

ALTER INDEX ALL ON [table_name]REBUILD;

 Rebuild All Indexes in a Database

DECLARE @TableName NVARCHAR(256);
DECLARE @SQL NVARCHAR(max);
DECLARE table_cursor CURSOR FOR
  
SELECT Quotename(Schema_name(schema_id)) + '.'
         
+ Quotename(NAME)
  
FROM   sys.tables;
OPEN table_cursor;
FETCH next FROM table_cursor INTO @TableName;
WHILE @@FETCH_STATUS = 0
  
BEGIN
      
SET @SQL = 'ALTER INDEX ALL ON ' + @TableName   + ' REBUILD;';
      
EXEC Sp_executesql        @SQL;

      
FETCH next FROM table_cursor INTO @TableName;
  
END;
CLOSE table_cursor;
DEALLOCATE table_cursor; 

 

Index-Level Defragmentation in SQL Server

Index-level defragmentation is the process of reorganizing or rebuilding SQL Server indexes to eliminate fragmentation, thereby improving query performance and ensuring efficient data retrieval. Fragmentation can lead to increased I/O operations, slower query responses, and overall degraded performance. Defragmentation helps maintain the health and efficiency of your SQL Server databases.

Why we need for defragmentation

Ø  Improved Performance

Ø  Efficient Storage

Ø  Reduced I/O Operations

Ø  Optimized Resource Utilization

Methods for Index Defragmentation

Below are the two primary methods to defragment indexes:

Ø  Reorganize

Ø  Rebuild

Best Practices for Index Defragmentation

Ø  Regular Maintenance

o   Schedule regular index maintenance during off-peak hours to minimize impact on users.
o   Use SQL Server Agent jobs or Maintenance Plans for automation.

Ø  Monitor Fragmentation Levels:

o   Continuously monitor fragmentation to determine the optimal frequency and method of defragmentation.

Ø  Use Appropriate Methods:

o   Reorganize for lower fragmentation (5-30%).
o   Rebuild for higher fragmentation (>30%).

Ø  Consider Fill Factor Settings:

o   Adjust FILLFACTOR based on the rate of data modifications.

o   A lower FILLFACTOR leaves more free space, reducing the need for page splits but increasing storage requirements.

Ø  Leverage Online Operations:

o   Use the ONLINE = ON option for rebuilds to minimize downtime, especially in high-availability environments.

Ø  Update Statistics:

o   Ensure that statistics are updated after defragmentation to provide the query optimizer with accurate information.

Ø  Assess Resource Availability:

o   Rebuilding indexes can be resource-intensive; ensure sufficient CPU, memory, and I/O capacity during maintenance.

Ø  Test in Non-Production Environments:

o   Validate maintenance scripts and procedures in staging environments before applying them to production systems. 

Index Fragmentation in SQL server

We know data in SQL server data are stored on 8kb pages. When we are UPDATE or INSERT data on a page that is already full, SQL Server creates a new page. The information from the original page will be split 50/50 with half being added to the new page, along with the new record. Doing this causes empty space at the end of the page(s). This empty space is known as Internal Fragmentation, it also occurs when we delete data from a page.

 


This newly created page won’t sit in sequence with the original page(s) meaning the related data is now stored in a different area of the disk. This separated data is called External Fragmentation. External Fragmentation is caused by pages that are out of order. 

Check Index Fragmentation

Below query is used to get the list of index and fragmentation percentage.

SELECT s.NAME + '.' + t.NAME                  AS table_name,
       i.NAME                                 AS index_name,
       index_type_desc,
       Round(avg_fragmentation_in_percent, 2) AS avg_fragmentation_in_percent,
       ips.fragment_count,
       ips.avg_fragment_size_in_pages,
       record_count                           AS table_record_count
FROM   sys.Dm_db_index_physical_stats(Db_id(), NULL, NULL, NULL, 'SAMPLED') ips
       INNER JOIN sys.tables t
               ON t.[object_id] = ips.[object_id]
       INNER JOIN sys.schemas s
               ON t.[schema_id] = s.[schema_id]
       INNER JOIN sys.indexes i
               ON ( ips.object_id = i.object_id )
                  AND ( ips.index_id = i.index_id )
ORDER  BY avg_fragmentation_in_percent DESC 

 


avg_page_space_used_in_percent shows the average percentage of the data storage space used on the page. This value allows us to see the internal index fragmentation.

avg_fragmentation_in_percent provides us information about external index fragmentation. For tables with clustered indexes, it indicates the percent of out-of-order pages when the next physical page allocated in the index is different from the page referenced by the next-page pointer of the current page. For heap tables, it indicates the percent of out-of-order extents, when extents are not residing continuously in data files.

fragment_count indicates how many continuous data fragments the index has. Every fragment constitutes the group of extents adjacent to each other. Adjacent data increases the chances that SQL Server will use sequential I/O and Read-Ahead while accessing the data.

After the degree of fragmentation is known, use the following table to determine the best method to correct the fragmentation.

avg_fragmentation_in_percent value

Corrective statement

> 5% and < = 30%

ALTER INDEX REORGANIZE

> 30%

ALTER INDEX REBUILD WITH (ONLINE = ON)*

* Rebuilding an index can be executed online or offline. Reorganizing an index is always executed online. To achieve availability similar to the reorganize option, you should rebuild indexes online.

Popular Posts