locked
Need solution RRS feed

  • Question

  • Hello I have posted this question earlier in the last week but I didn't get any solution so posting here with all facts

    Existing Table : Bills

    Id int Nuot null (Primary Key)

    Year int  not null
    week int  not null
    type int  not null
    code nchar(5) not null
    exchange  nvarchar(3)  null
    c1    nvarchar(10) null
    c2    nvarchar(10) null
    c3    nvarchar(10) null
    c4    nvarchar(10) null
    c5    nvarchar(10) null
    c6    nvarchar(10) null
    c7    nvarchar(10) null
    c8    nvarchar(10) null

    Current Cluster Unique Index : 

    Year,Week,Type,Code,Exchange,c1,c2,c3,c4,c5,c6,c7,c8

    Most of the queries where Clause  having this condition

    Alter Table Bills with new Schema:

    Id int Nuot null (Primary Key)

    Year int  not null
    week int  not null
    type int  not null
    code nchar(5) not null
    exchange  nvarchar(3)  null
    c1    nvarchar(10) null
    c2    nvarchar(10) null
    c3    nvarchar(10) null
    c4    nvarchar(10) null
    c5    nvarchar(10) null
    c6    nvarchar(10) null
    c7    nvarchar(10) null
    c8    nvarchar(10) null

    c9    nvarchar(10) null
    c10    nvarchar(10) null
    c11    nvarchar(10) null
    c12    nvarchar(10) null
    c13    nvarchar(10) null
    c14    nvarchar(10) null
    c15    nvarchar(10) null
    ......c20 nvarchar(10) null

    So Added new column from c9 to c20

    Now As per Cluster Index We cannot have more than 16 columns so  people suggest to create persistent Binary computed  column and create index on Year,Week,Type,Code,Exchange,computedcolumn

    Note: C1 to C20 is Nullable so binary of c1 to c20 is not always unique

    Problem :

    (1) In all where condition I need to use Computed Column as my index in on those column right  ? Please confirm

    (2) I am getting Set Quoted Identifier,Set  Ansi On error  so i need to alter more than procedures

    So please advice me for the same

    My Thoughts:

    (1) I am able to manage uniquness using Trigger with new alter structure so Unique is maintain with it and create non cluster index with Year,Week,Type,Code,Exchange include c1 to c20

    Is there any good solution to achieve uniqueness along with Indexing so performance will not affect

    Sunday, May 3, 2015 6:21 AM

Answers

  • Well, you have gotten plentiful of answers, but you don't seem to have appreciated them.

    However, there is some new pieces of information. You say previously said that the conditions are mainly of the type:

      c1 = @c1 AND c2 = @c2 etc

    You also say that the columns are nullable. Furthermore, you get errors about ANSI_NULL settings. This last thing is important. It seems that you rely on that when ANSI_NULLs is OFF, the condition NULL = NULL evaluates to true. ANSI_NULLS OFF is a legacy setting and there is several piece of functionality in SQL Server that does not work when this setting is OFF. One such functionality is index on computed columns.

    So it seems to me that your system is a dire need of a major overhaul. These queries need to be rewritten as:

          (exchange = @exchange OR exchange IS NULL AND @exchange IS NOT NULL)
      AND (ci = @c1 OR c1 IS NULL AND @c1 IS NULL)
      AND (c2 = @c2 OR c2 IS NULL AND @c2 IS NULL)

    This could however make any on these columns useless. A better alternative may be to make the columns non-nullable, and replace NULL with the empty string. This presumes that '' does not already have a specific meaning.

    Once you have done this change, you could add a computed column like this:

      hash AS hashbytes('SHA1', convert(char(10), c1) + convert(char(10), c2) ...

    And then all these queries need to be rewritten as

      hash = hashbytes('SHA1', convert(char(10), @c2) + convert(char(10), @c2)  ...

    You can reduced the code bloat by putting this in a user-defined function. Note that the function must have the clause WITH SCHEMABINDING to be able to use in the index.

    However, this still begs the question why you would do this in the first place. Why not just index (Year, week, type, code), the non-nullable columns? This is a question I have asked more than once, but you have not answered. Exactly what do you expect to achieve with this impossible 25-column index?

    Yes, with a non-unique index, SQL Server will have to scan a part of the index range. And? Is that really so horrible? What does this query return?

       SELECT MAX(cnt)
       FROM  (SELECT COUNT(*)
              FROM   tbl
              GROUP  BY Year, week, type, code) AS c

    Beside the ANSI_NULLS thing, you also need to address QUOTED_IDENTIFIER. For a single procedure this is a not big deal, but if you have many stored procedures, and developers have liberally been using "" as a string delimiter, it is certainly painful.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    • Proposed as answer by Anuj Tripathi Monday, May 4, 2015 11:44 AM
    • Marked as answer by Eric__Zhang Monday, May 18, 2015 3:12 AM
    Sunday, May 3, 2015 8:49 AM

All replies

  • Why do you want to create a clustered index (primary key)? Use a non-clustered unique index instead.

    Olaf Helper

    [ Blog] [ Xing] [ MVP]

    • Proposed as answer by Anuj Tripathi Monday, May 4, 2015 11:44 AM
    Sunday, May 3, 2015 6:41 AM
  • Well, you have gotten plentiful of answers, but you don't seem to have appreciated them.

    However, there is some new pieces of information. You say previously said that the conditions are mainly of the type:

      c1 = @c1 AND c2 = @c2 etc

    You also say that the columns are nullable. Furthermore, you get errors about ANSI_NULL settings. This last thing is important. It seems that you rely on that when ANSI_NULLs is OFF, the condition NULL = NULL evaluates to true. ANSI_NULLS OFF is a legacy setting and there is several piece of functionality in SQL Server that does not work when this setting is OFF. One such functionality is index on computed columns.

    So it seems to me that your system is a dire need of a major overhaul. These queries need to be rewritten as:

          (exchange = @exchange OR exchange IS NULL AND @exchange IS NOT NULL)
      AND (ci = @c1 OR c1 IS NULL AND @c1 IS NULL)
      AND (c2 = @c2 OR c2 IS NULL AND @c2 IS NULL)

    This could however make any on these columns useless. A better alternative may be to make the columns non-nullable, and replace NULL with the empty string. This presumes that '' does not already have a specific meaning.

    Once you have done this change, you could add a computed column like this:

      hash AS hashbytes('SHA1', convert(char(10), c1) + convert(char(10), c2) ...

    And then all these queries need to be rewritten as

      hash = hashbytes('SHA1', convert(char(10), @c2) + convert(char(10), @c2)  ...

    You can reduced the code bloat by putting this in a user-defined function. Note that the function must have the clause WITH SCHEMABINDING to be able to use in the index.

    However, this still begs the question why you would do this in the first place. Why not just index (Year, week, type, code), the non-nullable columns? This is a question I have asked more than once, but you have not answered. Exactly what do you expect to achieve with this impossible 25-column index?

    Yes, with a non-unique index, SQL Server will have to scan a part of the index range. And? Is that really so horrible? What does this query return?

       SELECT MAX(cnt)
       FROM  (SELECT COUNT(*)
              FROM   tbl
              GROUP  BY Year, week, type, code) AS c

    Beside the ANSI_NULLS thing, you also need to address QUOTED_IDENTIFIER. For a single procedure this is a not big deal, but if you have many stored procedures, and developers have liberally been using "" as a string delimiter, it is certainly painful.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    • Proposed as answer by Anuj Tripathi Monday, May 4, 2015 11:44 AM
    • Marked as answer by Eric__Zhang Monday, May 18, 2015 3:12 AM
    Sunday, May 3, 2015 8:49 AM
  • Thanks Erland j One question Year week type code cindex will wors same as original index ? Or is there. Any performace problem So what i understand is Make all nullBle to not nullable Create computed colum for unique Create cluster index on year week type code In worst case they will not agree to convert it for not null then What r other workaround Min counts 18 Maax counts 286 distinct
    Sunday, May 3, 2015 9:38 AM
  • Look, I know very little about your system, and I cannot magically see if there will be any performance problem if you do this and that.

    But if there are is maximum 286 rows for a single combo of (Year, week, type, code) you would need to have a fairly high query frequency to suffer from a performance problem which only has these four columns. If the current index already have exchange and c1 to c4, I find it very difficult to defend to make changes to this index.

    If you want to enforce uniqueness over all those 25 columns, then adding a computed hash column and a unique nonclustered index over ((Year, week, type, code, exchange, hash) makes sense.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Sunday, May 3, 2015 11:40 AM
  • Your design is wrong. You do not know ISO-11179 or ISO-8601 standards. 

    There is no generic "Id", "code" and "type" in RDBMS. YEAR is a data type, not a column name and week is a unit of temporal measurement. It exists within a year. In fact, your two (year, week) columns are an example of the "attribute splitting" design flaw.

    Google the ISO week format: yyyyWww-d and put it in your Calendat table (you do have a calendar?).

    Your "c1" thru "c8" looks like a repeated, a classic violation of First Normal Form (1NF).

    Can you throw this out and do it right? 


    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    • Proposed as answer by Anuj Tripathi Monday, May 4, 2015 11:44 AM
    Sunday, May 3, 2015 2:00 PM
  • If c1-c20 is nullable then they should not qualify to be part of a clustered index. Besides nullable or not nullable, I do not see any gain in performance maybe after first 3 columns so get your column orders right for first three and so forth. 








    Sunday, May 3, 2015 3:53 PM
  • If c1-c20 is nullable then they should not qualify to be part of a clustered index.

    There is nothing with having nullable columns in a clustered index per se.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Sunday, May 3, 2015 4:30 PM
  • My client is not agree on converting Null to Not nullable then How do i maintain uniqueness?

    Can I use Trigger for it using If exist?

    Monday, May 4, 2015 4:28 AM
  • You could use isnull in the hash function. Obviously, this presumes that there is a magic value you can use in place of NULL, for instance all blanks.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Monday, May 4, 2015 7:11 AM