db design question. . . RRS feed

  • Question

  • My boss and I are having a disagreement.

    In our schema (SQL 2000), we have SSN nullable indexed.

    A year ago, we took a job that allowed one of our clients to do some batch processing that updated their HR database personnel into our database. The driver on this was an employee number. If the employee number is found, data is updated, if not it is inserted.

    Everything worked fine when it was one branch office.

    Now our client merged two branches. The old branch had different employee numbers after the merge. Unfortunately, our client merged the branch's version of our database into the main database BEFORE they imported the merge from HR. Therefore, we had duplicates in our database for the merged personnel.

    Well we fixed the problem with a little utility, but now we have ticked off our client and lost a little money in the process.

    Here is the contention. . . My boss blames the client, "They said update on employee number, and they did things out of order!"  I say, It is primarily our fault as our schema is messed up. We should not have a schema that allows duplicate SSN's in our database (or employee numbers for that matter!)  Furthermore, we are the "experts", our clients are never to blame!

    His response is we can't because SSN is not required in our app (nullable) and therefore can't be uniquely indexed.

    I contend further, because of our design, we have instilled risk when it comes to developing custom utilities for our clients. As it stands, our app has to do the check that if they enter an SSN that it is unique, before updating the SSN. This logic must be incoporated into every utility we create and we increase our chances for screwing things up. Not just for SSN's but also employee numbers, but for every entity that should be unique except for duplicate nulls.

    Here's how I say it should be addressed and I need some ammo to back me up (or shoot me down if you think its wrong. . . )

    Create a table - SSN, with an autonumber key and unique on the SSN, and insert into this table one SSN value that is null. That record is referenced by all personnel for which we don't have an SSN. I also say we should do it for employee number, Contract number and all the other entities that follow the same pattern.

    His response is that its ridiculous to have a table just for SSN's, or employee numbers, or contract numbers.

    I say its ridiculous to lose money and piss off clients.

    Comments? Suggestions? Ways to address this with my boss?

    Wednesday, May 10, 2006 6:36 AM

All replies

  • There is quite an easy way to create a unique index on a nullable column.

    Create a view on that table and select only rows where ssn is not null

    Creata a unique index on that view.


    CREATE TABLE MyTable (
    PrimaryKeyField int not null,
    field1 int null,
    field2 varchar (200) not null ) ON 'PRIMARY'

     PrimaryKeyField) ON 'PRIMARY' 

    SELECT PrimaryKeyField,Field1 FROM dbo.MyTable WHERE Field1 IS NOT NULL

     PrimaryKeyField) ON 'PRIMARY' 

    CREATE UNIQUE INDEX MyView_Idx1 ON MyView ( Field1) ON 'PRIMARY'



    Tapio Kulmala




    Wednesday, May 10, 2006 10:39 AM
  • I can't say who is at fault. Your company should have checked out the data before the merge to make sure things would work. The customer may have been at fault. At this point, it doesn't really matter. However, you MUST allow duplicate SSNs as the federal government has issued duplicates.
    Wednesday, May 10, 2006 2:39 PM
  •  It is not really a DB design question  it is a requirements question. I would follow Craig's advice and allow duplicates (in which case there's no need for the separate table).

    Anyway, one thing you would want to do is make sure your process for merging will include a backup of the "to be merged" data first :)


    Wednesday, May 10, 2006 10:18 PM
  • yes, this is true about the allowing ssn's, but there are othere places where the unique yet nullable comes into play.

    Still it comes down to some design issues. I dont like checking for uniqueness as part of the application logic.

    The indexed view is definitely a possibility. My concern would be performance.  

    Anyway, one thing you would want to do is make sure your process for merging will include a backup of the "to be merged" data first :)

    Oh yes. .  that was in the usage guidelines of the utility. Unfortunately, with database, things appear to be fine. . . until a month down the road when you find that needle in the haystack that breaks the camels back (how's that for a mixed metaphor???)

    Thursday, May 11, 2006 5:36 AM