locked
Regarding cluster and Non cluster index RRS feed

  • Question

  • User264732274 posted

    here i like to highlight few points regarding cluster and Non cluster index.

    1) we know that we can create one cluster index in table and many non cluster index is possible for a table. so tell me on which fields we should create cluster index. most of the time people create cluster index on PK field but why ? does it increase performance ?

    2) how data is stored for cluster index and Non cluster index. please add a example regarding data storage for cluster index and Non cluster index

    3) when we store alpha numeric value in a PK field then which index would be good choice and why ?

    4) how sql server store and read data for non cluster index ?

    5) GUID is good for cluster index ?

    please discuss my each points in details with example. thanks

    Wednesday, October 26, 2016 9:46 AM

Answers

  • User-2057865890 posted

    Hi sudip_inn,

    The physical order of the data in the table is determined by the clustered index. The clustered index is similar to a telephone directory, which is arranged in the order of the last name. A table can only contain a clustered index because the aggregation index provides physical storage order in the table. But the index can contain multiple columns (combined index).
    Clustered indexes are particularly effective for those columns that often have to search for a range of values. When you use a clustered index to find a row that contains the first value, you can make sure that the rows that contain the subsequent index values are physically contiguous. For example, if a query application execution often retrieve an date within the scope of the record, the clustered index can be found including the start date is used for quickly, then all the adjacent rows in the table search, until you reach the end date. This helps to improve the performance of such queries. Similarly, if you sort the retrieved from the data in the table is usually used for a column, the table can be gathered in the column (physical ordering), avoid each query the column are sorted, thereby saving cost.

    The following table summarizes when to use clustered index or non clustered index

    The action description                                   uses a clustered index                            use a non clustered index
    Columns are often grouped                                       Yes                                                            Yes
    Returns a range of data                                              Yes                                                            No
    One or very few different values                                 No                                                            No
    Small number of different values                                Yes                                                           No
    Large number of different values                                No                                                           Yes
    Frequently updated columns                                      No                                                           Yes
    The foreign key column                                              Yes                                                           Yes
    Primary key column                                                    Yes                                                           Yes
    Frequent revision index columns                                No                                                           Yes

    reference: https://msdn.microsoft.com/en-us/library/ms190457.aspx 

    Best Regards,

    Chris

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Thursday, October 27, 2016 4:05 PM
  • User-654786183 posted

    1) we know that we can create one cluster index in table and many non cluster index is possible for a table. so tell me on which fields we should create cluster index. most of the time people create cluster index on PK field but why ? does it increase performance ?

    Clustered index determines physical sorting order of rows in a table similar to entries on yellow pages which are sorted in alphabetical order.  Suppose you have a table Employee, which contains emp_id as primary key than clustered index which is created on a primary key will sort the Employee table as per emp_id.

    2) how data is stored for cluster index and Non cluster index. please add a example regarding data storage for cluster index and Non cluster index

    Clustered indexes sort and store the data rows in the table or view based on their key values.  If a table has no clustered index, its data rows are stored in an unordered structure called a heap.  Nonclustered indexes have a structure separate from the data rows. A nonclustered index contains the nonclustered index key values and each key value entry has a pointer to the data row that contains the key value.  The pointer from an index row in a nonclustered index to a data row is called a row locator. The structure of the row locator depends on whether the data pages are stored in a heap or a clustered table. For a heap, a row locator is a pointer to the row. For a clustered table, the row locator is the clustered index key.

    3) when we store alpha numeric value in a PK field then which index would be good choice and why ?

    If it is numeric it must be integer or long, if it alphanumeric it must be varchar/text. The latter is often more "logical" in the sense of identifying a record by a meaningful feature -- an email address or a city name -- but these natural primary keys can be a great inconvenience if wrongly chosen. A user can never change their email if they are identified by their email.  The other advantage with numeric key is you can enable auto increment and dont need to worry about passing it to the DB or uniqueness

    4) how sql server store and read data for non cluster index ?

    I have already mentioned about the storage part in your 2nd question.  As I mentioned earlier for the table Employee, which contains emp_id as primary key than clustered index which is created on a primary key will sort the Employee table as per emp_id.  Nonclustered index involves one extra step which points to the physical location of the record.

    5) GUID is good for cluster index ?

    You can use GUID and you don't need to worry about the uniqueness.  Its going to take larger space compare to integer.  Please refer this MSDN post on a debate for GUID vs INT

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Thursday, October 27, 2016 4:33 PM

All replies

  • User-2057865890 posted

    Hi sudip_inn,

    The physical order of the data in the table is determined by the clustered index. The clustered index is similar to a telephone directory, which is arranged in the order of the last name. A table can only contain a clustered index because the aggregation index provides physical storage order in the table. But the index can contain multiple columns (combined index).
    Clustered indexes are particularly effective for those columns that often have to search for a range of values. When you use a clustered index to find a row that contains the first value, you can make sure that the rows that contain the subsequent index values are physically contiguous. For example, if a query application execution often retrieve an date within the scope of the record, the clustered index can be found including the start date is used for quickly, then all the adjacent rows in the table search, until you reach the end date. This helps to improve the performance of such queries. Similarly, if you sort the retrieved from the data in the table is usually used for a column, the table can be gathered in the column (physical ordering), avoid each query the column are sorted, thereby saving cost.

    The following table summarizes when to use clustered index or non clustered index

    The action description                                   uses a clustered index                            use a non clustered index
    Columns are often grouped                                       Yes                                                            Yes
    Returns a range of data                                              Yes                                                            No
    One or very few different values                                 No                                                            No
    Small number of different values                                Yes                                                           No
    Large number of different values                                No                                                           Yes
    Frequently updated columns                                      No                                                           Yes
    The foreign key column                                              Yes                                                           Yes
    Primary key column                                                    Yes                                                           Yes
    Frequent revision index columns                                No                                                           Yes

    reference: https://msdn.microsoft.com/en-us/library/ms190457.aspx 

    Best Regards,

    Chris

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Thursday, October 27, 2016 4:05 PM
  • User-654786183 posted

    1) we know that we can create one cluster index in table and many non cluster index is possible for a table. so tell me on which fields we should create cluster index. most of the time people create cluster index on PK field but why ? does it increase performance ?

    Clustered index determines physical sorting order of rows in a table similar to entries on yellow pages which are sorted in alphabetical order.  Suppose you have a table Employee, which contains emp_id as primary key than clustered index which is created on a primary key will sort the Employee table as per emp_id.

    2) how data is stored for cluster index and Non cluster index. please add a example regarding data storage for cluster index and Non cluster index

    Clustered indexes sort and store the data rows in the table or view based on their key values.  If a table has no clustered index, its data rows are stored in an unordered structure called a heap.  Nonclustered indexes have a structure separate from the data rows. A nonclustered index contains the nonclustered index key values and each key value entry has a pointer to the data row that contains the key value.  The pointer from an index row in a nonclustered index to a data row is called a row locator. The structure of the row locator depends on whether the data pages are stored in a heap or a clustered table. For a heap, a row locator is a pointer to the row. For a clustered table, the row locator is the clustered index key.

    3) when we store alpha numeric value in a PK field then which index would be good choice and why ?

    If it is numeric it must be integer or long, if it alphanumeric it must be varchar/text. The latter is often more "logical" in the sense of identifying a record by a meaningful feature -- an email address or a city name -- but these natural primary keys can be a great inconvenience if wrongly chosen. A user can never change their email if they are identified by their email.  The other advantage with numeric key is you can enable auto increment and dont need to worry about passing it to the DB or uniqueness

    4) how sql server store and read data for non cluster index ?

    I have already mentioned about the storage part in your 2nd question.  As I mentioned earlier for the table Employee, which contains emp_id as primary key than clustered index which is created on a primary key will sort the Employee table as per emp_id.  Nonclustered index involves one extra step which points to the physical location of the record.

    5) GUID is good for cluster index ?

    You can use GUID and you don't need to worry about the uniqueness.  Its going to take larger space compare to integer.  Please refer this MSDN post on a debate for GUID vs INT

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Thursday, October 27, 2016 4:33 PM