locked
lookup transformation using partial cache mode RRS feed

  • Question

  • This question concerns SQL Server 2008 and SSIS 2008 ...

    I am processing a CSV file that has UPC information and I have a lookup transformation that is used to determine if a UPC code exists in the table dim_upc.  If a UPC exists, the 'lookup match' stream goes down an update path.  If not, the 'lookup no match' stream goes down an insert path where a record is inserted into the dim_upc table.

    The lookup transformation has partial cache mode selected and the 'enable cache for no match rows' is disabled.  I would expect this configuration to allow the lookup to find the rows that were inserted.

    In other words...

    1. UPC xyz comes into the lookup transformation.
    2. UPC xyz doesn't exist and is inserted into the dim_upc table.
    3. UPC xyz comes to the lookup transformation again ( the source CSV file has multiple entries for UPC xyz )
    4. The lookup transformation should determine that UPC xyz exists and choose the 'lookup match' stream.

    Unfortunately, this is not the case.  The 'lookup no match' stream is chosen and results in an error for a duplicate UPC in the dim_upc table.

    What am I doing wrong?  Are there specific options I need to select on the lookup transfomation or the downstream OLE DB Destination used to do the insert to make this work?

    I have tried the 'no cache' mode with similar results.

    Thanks for your help.

    Tuesday, March 17, 2009 4:12 PM

Answers

  • Your difficulty may be arising because the rows sent down the "no match" output are not immediately added to your lookup table.  It has to do with buffers.  The rows that "don't match" are not passed on immediately to the components you have ready to add those rows to your lookup table.  They accumulate until the buffer is "full", or until all rows have passed through the Lookup component.  This behaviour makes the SSIS Data Flow very efficient, but doesn't make your design work - or work reliably.

    You'll need to use a Cascading Lookup pattern, as Matt Masson describes on his blog.  However, even given that pattern, I'd expect similar behaviour to what you've seen so far... but I'd read that over and try it.  At the very worst, you'd have to do a "double-check" in the Script that's presented there to make sure the row wasn't already added to the lookup table.
    Todd McDermid's Blog
    Wednesday, August 5, 2009 12:07 AM