Check constraint with UDF problem with lots of data?

    Question

  • I have a check constraint based on a UDF. The UDF compares the data being inserted with data already in the table, and the record is inserted/updated only if the check constraint evaluates to TRUE...

    If I insert just a few records, the constraint catches the error, but when the INSERT ... INTO statement carries many thousands of records, SQL Server seems to get "overwhelmed" and the constraint doesn't work...

    Has anyone experienced this behavior? I'm considering using triggers instead, but that would have an impact on the performance of our system...

    Forgot to mention that the bad rows are actually being inserted: the UDF compares the first incoming record against the existing records, but once that record is in, a second one (the one causing the problem) is inserted as well... this is what the constraint is not catching. I know I can clean the data before it goes in, but that is not happening, and the DBAs expect (rightly so) the constraints to catch these errors...
    Thursday, February 14, 2013 7:32 PM

Answers

  • It's a bad idea to use a UDF-based constraint that uses table data to reach a result.

    If you run the code below, you will see that both rows with value "2" are already in the table before the first UDF is executed.

    Apart from that, with big inserts or updates you won't be very happy with the row-by-row processing this method incurs, because it is bound to be slow.

    create table dbo.testtable (id int)
    go

    create function dbo.testfunction (@id int)
    returns tinyint
    with schemabinding
    as
    begin
        return (select count(*) from dbo.testtable where id = @id)
    end
    go

    alter table dbo.testtable
        add constraint testconstraint check (dbo.testfunction(id) = 1)
    go

    insert into dbo.testtable values (1)

    insert into dbo.testtable
    select id
    from ( select 2 as id
           union all
           select 2 as id ) t

    -- cleanup:
    -- alter table dbo.testtable drop constraint testconstraint
    -- drop function dbo.testfunction
    -- drop table dbo.testtable

    You'd be much better off with triggers.
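    For the toy schema above, a trigger-based check might look like the sketch below. The trigger name and error message are mine, and THROW assumes SQL Server 2012 or later (use RAISERROR on older versions):

    ```sql
    -- Sketch: an AFTER trigger sees the complete inserted set at once,
    -- so a multi-row insert is validated as a whole rather than row by row
    -- after each row is already in the table.
    create trigger dbo.testtrigger
    on dbo.testtable
    after insert, update
    as
    begin
        if exists (select t.id
                   from dbo.testtable t
                   join inserted i on i.id = t.id
                   group by t.id
                   having count(*) > 1)
        begin
            rollback transaction;
            throw 50000, 'Duplicate id values are not allowed.', 1;
        end
    end
    ```

    Because the trigger runs once per statement rather than once per row, it also avoids the per-row UDF calls that make the constraint approach slow on large inserts.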


    Gert-Jan

    Thursday, February 14, 2013 8:38 PM

All replies

  • Can you share UDF code ?

    Serg

    Thursday, February 14, 2013 7:42 PM
  • In general, UDF-based constraints are a bad idea; even triggers are better. So I suggest changing this architecture.

    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog

    Thursday, February 14, 2013 7:47 PM
    Moderator
  • Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. Learn how to follow ISO-11179 data element naming conventions and formatting rules. Temporal data should use ISO-8601 formats. Code should be in Standard SQL as much as possible and not local dialect. 

    This is minimal polite behavior on SQL forums. You can read the rules at the front of this website.

    >> I have a Constraint that is based on a UDF, the UDF checks data that is already in the table and compares it with data being inserted, and if the check constraint is TRUE then the record [sic] [sic: rows are nothing like records [sic]] is inserted/updated... <<

    UDFs cannot be optimized or ported. SQL programmers do not use them; VB and C# programmers who do not know how to do declarative programming yet, do use them. It makes their code look familiar to them and locks them into MS software. We SQL programmers hate them. 

    >> If I try to insert just some of the records [sic] the constraint captures the error, but if what I'm inserting has multiple thousands of records [sic] seems like SQL Server gets "overwhelmed" and the constraints don't work... <<

    Yep! You went from a set-oriented, parallel SQL model of data back to a 1950's record [sic] oriented model of data. Google up the term RBAR (“Row By Agonizing Row” -- Jeff Moden) and get a feel for how wrong this is. 

    Do you know the old joke about the Polish lumberjack who got a chainsaw? The punch line is that he tried to use it like a regular saw, without any gasoline, and did not understand why it took so long to cut a tree. Push, pull, push, pull, ..

    >> I'm considering on using Triggers instead but this will have an impact on the performance of our system...<<

    TRIGGERs suck, too. Any procedural code in a declarative language will take orders of magnitude longer to run and orders of magnitude more storage. I tell people that they should only write five of them in their entire career. 

    >> Forgot to mention that the errors are actually being inserted, meaning that the UDF compares one record [sic] inserted to the existing records [sic], then once the record [sic] is inserted a second one (causing the problem) is inserted... this is what the Constraint is not capturing<<

    Nor should it. Constraints are declarative, so they guarantee a “state of being” in the tables and the schema. They do not guarantee a process, a sequence of steps. You put that into the insertion procedure. Unfortunately, we have no specs, no DDL and no idea about this UDF. 
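    One way to put the check "into the insertion procedure" is to validate the whole incoming batch before it touches the base table. This is only a sketch: the table type, procedure name, and error number are hypothetical, reusing the toy table from the earlier reply:

    ```sql
    -- Hypothetical table type to carry the incoming batch.
    create type dbo.IdList as table (id int not null);
    go

    -- Sketch: reject the batch if it repeats an id internally
    -- or collides with an id already in the table, before inserting.
    create procedure dbo.InsertValidated
        @NewRows dbo.IdList readonly
    as
    begin
        if exists (select n.id
                   from @NewRows n
                   join dbo.testtable t on t.id = n.id)
           or exists (select id
                      from @NewRows
                      group by id
                      having count(*) > 1)
        begin
            throw 50001, 'Batch contains ids that already exist or repeat.', 1;
        end

        insert into dbo.testtable (id)
        select id from @NewRows;
    end
    ```

    This keeps the check set-based (one pass over the batch) instead of firing a scalar UDF once per row.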

    >> I know I can clean the data before it goes in, but this is not happening and DBAs are expecting (rightly so) to have the constraints capture the errors... <<

    No, they are wrong. The “presentation layer guys” also have obligations to the tiered architecture. Their big one is “Do not throw garbage data over the wall to the database guys!” That is why we do not see '2013-02-30' in a date coming at us in a good system. 

    If you can explain the specs better, we can probably do this with a MERGE, CHECK (<CASE expression from hell>), DRI actions or other declarative code. 
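    For what the earlier toy example actually enforces (each id appearing at most once), the declarative equivalent is simply a UNIQUE constraint; whether something similar fits the real rule depends on the specs we do not have:

    ```sql
    -- If the real rule is "no id may appear more than once", no UDF,
    -- trigger, or procedure is needed -- a UNIQUE constraint is enough,
    -- and it is enforced correctly for multi-row inserts:
    alter table dbo.testtable
        add constraint uq_testtable_id unique (id);
    ```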



    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice / Data, Measurements and Standards in SQL / SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    Friday, February 15, 2013 1:48 AM