locked
Use Regular Expressions To Check SQL Server 2012 Table Fields RRS feed

  • Question

  • Hi,

    I'm using Microsoft SQL Server 2012, and I need to create a database table in which the values given to some fields must be checked against some regular expressions. Sorrowfully I can't find anywhere a description of regular expression supported by this database (more accurately, I found such a description, but when I used it to create tables, the regular expressions seemed not to do their job).

    For example, I want to define an ID field that's made of exactly 9 decimal digits. I'd expect SQL Server 2012 to accept something like '^[0-9]{9}$'. But the only way I could do that was using the following primitive definition:

    CREATE TABLE [dbo].[Person]
    (
    [Id] CHAR (9) NOT NULL CHECK (Id LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'),
    [LastName] VARCHAR (255) NOT NULL,
    [FirstName] VARCHAR (255) NOT NULL,
    [Address] VARCHAR (255) NULL,
    [BirthDate] DATE NULL,
    [HeightMeters] FLOAT (53) DEFAULT ((1.70)) NULL,
    PRIMARY KEY CLUSTERED ([Id] ASC)
    )

    Could anybody direct me to a source that describes what more intelligent regular expressions does SQL Server 2012 support?

    Thanks in advance!


    Ofer

    Sunday, August 25, 2013 12:54 PM

Answers

All replies

  • What you are looking for is the SIMILAR TO predicate in ANSI/ISO Standard SQL. This feature is based on the POSIX version of  grep. And T-SQL does not have it. Sorry. You can get it in DB2 and many other SQL products, tho.

    Side note: Your example is full of ISO-11179 and basic data modeling violation. Why do you need a FLOAT for a decimal value with a limited range (ever see anyone over 3m)? Why do you violate ISO-11179 and basic data modeling by including the unit of measurement meta data? There is no such thing as a generic "id", etc. 




    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    Sunday, August 25, 2013 1:21 PM
  • You can define a UDF CHECK constraint:

    http://www.sqlusa.com/bestpractices/udf-check-constraint/

    There is a performance overhead associated with a UDF CHECK constraint.

    Also consider the front-end to do more data checking and send clean data to the RDBMS server.


    Kalman Toth Database & OLAP Architect sqlusa.com
    New Book / Kindle: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012


    Sunday, August 25, 2013 1:22 PM
  • Hi Celko,

    1) So how do you explain the fact that CHECK (Id LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]') does the job?

    2) My example is contains the SQL code automatically generated by Microsoft's SQL Server Data Tools for Visual Studio 2012. I originally used the following code:

    CREATE TABLE Person
    (
    Id char(9) PRIMARY KEY CHECK (Id LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'),
    LastName varchar(255) NOT NULL,
    FirstName varchar(255) NOT NULL,
    Address varchar(255),
    BirthDate date,
    HeightCm float DEFAULT 1.70
    )


    Ofer


    Sunday, August 25, 2013 1:48 PM
  • Hi Kalman,

    Thanks for the link, but the demo functions at the page you mentioned (and at MSDN Library: CHECK Constraints that's mentioned there) don't include one that explains how to check against regular expressions.

    Can you direct me to such a demo?


    Ofer

    Sunday, August 25, 2013 2:08 PM
  • You can use any UDF in a CHECK CONSTRAINT.

    You would need to get or develop a regular expression UDF, then use it in a check constraint:

    http://msdn.microsoft.com/en-us/magazine/cc163473.aspx


    Kalman Toth Database & OLAP Architect sqlusa.com
    New Book / Kindle: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012



    • Edited by Kalman Toth Sunday, August 25, 2013 8:07 PM
    Sunday, August 25, 2013 8:07 PM
  • >> So how do you explain the fact that CHECK (generic_id LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]') does the job? <<

    1970's Sybase SQL Server was written on UNIX and in C. They kept a lot of C in their product. Look at != instead of <>, the % instead of MOD(), etc. They left the simplest part of grep() in their T-SQL. This does not port t other SQLs. 

    Did you ever wonder why only SQL uses % and _ in the LIKE predicate? IBM's System R was written for EBCDIC on punch cards! I always thought this was a serious flaw. It is a bitch to read multiple underscores on a good modern printer. And nobody else uses % for a wildcard. 

    >> 2) My example contains the SQL code automatically generated by Microsoft's SQL Server Data Tools for Visual Studio 2012. <<

    Microsoft sample code is a mess. Remember Hungarian notation? 

    But look at your DDL! How many people have you ever met with a name that is 255 letters long?? But since you have allowed it by sloppy programming, you will get one. It will be obvious garbage, if you are lucky. But more likely, you will pass data to a presentation layer. The developer will have to allow for a display of that length! 

    The USPS uses VARCHAR(35) for US postal addresses lines. This follows the CASS standards and several other conventions that all go back to a 3.5 inch, five line label used for mailings.

    A FLOAT is absurd for the height of this single person in your data model. How many people do you know that are ~167772 meters tall? Even sillier, what is a person with negative height? You allowed it; it must be valid. It will happen when you have no data integrity. 

    A “person” is both singular and too generic to be a table name. Like “thing” or “object” or “action” etc. Remember “The Law of Identity” from freshman logic? It is usually summarized as “A is A”; this fundamental law of logic says that “to be is to be something in particular. To be nothing in particular, or anything in general, is to be nothing at all. It is to have no identity! Without an identity, no logic is possible.” Let's build a table for developers, a very specific kind of person. 

    CREATE TABLE Developers
    (developer_id CHAR(9) NOT NULL PRIMARY KEY 
    CHECK (developer_id LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'),
    last_name VARCHAR(20) NOT NULL,
    first_name VARCHAR(20) NOT NULL,
    street_address VARCHAR(255), -- wrong! We need city, state, postal columns
    birth_date DATE CHECK (birth_date > '2000-01-01" )   ,

    developer_hgt DECIMAL ((3,2) DEFAULT 1.70 NOT NULL

      CHECK (developer_hgt BETWEEN 0.50 AND 3.00));



    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    Sunday, August 25, 2013 8:13 PM
  • Hi Celko,

    I enjoyed to read your criticism, and I agree with almost every word. The point is that this is just a test SQL command, the types (e.g., real for height) were used just in order to intruduce into the code several different types, and it isn't meant in any way to create a real database. The selection of Person, which is single, is in accord with the naming convention used by SQL Server itself, which I personally prefer for some reasons (believe me, there are good reasons for this convention too, what makes it a matter of taste). The 255-char long strings were taken from another demo, one given by w3schools.

    That said, I thank you for the improved command, and with your permission will use it in my demo.

    Thanks!


    Ofer

    Sunday, August 25, 2013 8:37 PM
  • You can use Regular Expressions in a CLR UDF.

    Here's three different articles walking you through it.  

    Regular Expressions Make Pattern Matching And Data Extraction Easier

    Use RegEx in SQL with CLR Procs

    Regular Expression Replace in SQL 2005 (via the CLR)

    David

    David http://blogs.msdn.com/b/dbrowne/

    • Proposed as answer by Candy_Zhou Monday, August 26, 2013 6:43 AM
    • Marked as answer by Ofer Elboher Monday, August 26, 2013 7:02 AM
    Sunday, August 25, 2013 9:48 PM