locked
SSIS Lookup not matching on equal values of datetime if cache mode is partial or no cache RRS feed

  • Question

  • We have a Lookup to match incoming records from temp table and filter out duplications before entering the destination table. The temp and destination tables have exactly the same definition. The matching compares 3 columns with type of text, numeric, and datetime.

    It works correctly under full cache mode, but will run out of memory when data grows to realistically massive size. The Lookup will load the entire destination table into memory and triggers too much swapping.

    For performance tried to change it to partial cache and no cache, and the changes breaks correctness because the equal values on datetime column fails to match and repeats are escaping the filter into destination data table.

    The other columns are in text and numeric types, and they seem to be OK. Just for test, if exclude the datetime column the Lookup matches as expected with the rest columns.

    I suspect there might be something wrong in collation or other stuff related, I am not sure how to confirm and do not have a solution.

    The environment is SQL Server 2016, Visual Studio 2015, and Windows Server 2016. And the matching failure happens in Visual Studio debugging. More information available if needed, and thanks a lot in advance.



    • Edited by Mike_YZ Monday, November 19, 2018 4:08 AM Typo, datetype -> datetime
    Saturday, November 17, 2018 12:29 AM

Answers

  • We have a Lookup to match incoming records from temp table and filter out duplications before entering the destination table. The temp and destination tables have exactly the same definition. The matching compares 3 columns with type of text, numeric, and datetime.

    It works correctly under full cache mode, but will run out of memory when data grows to realistically massive size. The Lookup will load the entire destination table into memory and triggers too much swapping.

    For performance tried to change it to partial cache and no cache, and the changes breaks correctness because the equal values on datetime column fails to match and repeats are escaping the filter into destination data table.

    The other columns are in text and numeric types, and they seem to be OK. Just for test, if exclude the datetime column the Lookup matches as expected with the rest columns.

    I suspect there might be something wrong in collation or other stuff related, I am not sure how to confirm and do not have a solution.

    The environment is SQL Server 2016, Visual Studio 2015, and Windows Server 2016. And the matching failure happens in Visual Studio debugging. More information available if needed, and thanks a lot in advance.



    Hi Mike_YZ, 

    According to my test, it seems that the datetime column is converted to different format in OLE DB Source and Lookup(Partial/No Cache Mode). So I convert the column to DT_STR both in the two components. 

    Right click on OLE EB Source, select Advanced Editor, and change the output column data type from DT_DBTIMESTAMP to DT_STR. 

    In Lookup transformation->Connection, select Use results of an SQL query.

    And input(Col3 is the datetime data type column): 

    SELECT Col1, Col2, Convert(varchar,Col3,121) Col3 FROM Table_1

    And in Columns, map the columns to be matched. 


    Related links: 

    Lookup cache modes

    SQL Server CONVERT() Function


    MSDN Community Support Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com

    • Marked as answer by Mike_YZ Tuesday, November 20, 2018 4:42 PM
    Monday, November 19, 2018 9:10 AM

All replies

  • We have a Lookup to match incoming records from temp table and filter out duplications before entering the destination table. The temp and destination tables have exactly the same definition. The matching compares 3 columns with type of text, numeric, and datetime.

    It works correctly under full cache mode, but will run out of memory when data grows to realistically massive size. The Lookup will load the entire destination table into memory and triggers too much swapping.

    For performance tried to change it to partial cache and no cache, and the changes breaks correctness because the equal values on datetime column fails to match and repeats are escaping the filter into destination data table.

    The other columns are in text and numeric types, and they seem to be OK. Just for test, if exclude the datetime column the Lookup matches as expected with the rest columns.

    I suspect there might be something wrong in collation or other stuff related, I am not sure how to confirm and do not have a solution.

    The environment is SQL Server 2016, Visual Studio 2015, and Windows Server 2016. And the matching failure happens in Visual Studio debugging. More information available if needed, and thanks a lot in advance.



    Hi Mike_YZ, 

    According to my test, it seems that the datetime column is converted to different format in OLE DB Source and Lookup(Partial/No Cache Mode). So I convert the column to DT_STR both in the two components. 

    Right click on OLE EB Source, select Advanced Editor, and change the output column data type from DT_DBTIMESTAMP to DT_STR. 

    In Lookup transformation->Connection, select Use results of an SQL query.

    And input(Col3 is the datetime data type column): 

    SELECT Col1, Col2, Convert(varchar,Col3,121) Col3 FROM Table_1

    And in Columns, map the columns to be matched. 


    Related links: 

    Lookup cache modes

    SQL Server CONVERT() Function


    MSDN Community Support Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com

    • Marked as answer by Mike_YZ Tuesday, November 20, 2018 4:42 PM
    Monday, November 19, 2018 9:10 AM
  • Hi, Yang.Z,

    Thank you for your prompt help. A tentative fix seems to be working now following your hint.

    The lookup matches incoming data flow with a reference database table. When not in full cache mode, it seems that the referenced database table skips the millisecond portion of timestamp, therefore causing mismatch as the data flow still has milliseconds.

    I modified the SQL query to add a column of string in style 121 ("yyyy-MM-dd hh:mm:ss.fff") including the milliseconds, then use a string compare. Just not sure if this is the best practice.



    • Edited by Mike_YZ Tuesday, November 20, 2018 4:34 PM
    Tuesday, November 20, 2018 4:33 PM