locked
Automatic Updating Table from CSV file RRS feed

  • Question

  • I have a CSV file AllLabore.csv that I like to import to an existing table of a database every time the file renews. But only the records that don’t exist in the table. What is the best approach I can take? All recommendations are welcome.  

    SQL Server 2012

    Sample AllLabore.csv:

    0.00,0.62,"7:00","8:00","06/18/12"," Monday","Sheridan","107006182012"

    0.00,4.68,"8:00","9:00","06/18/12"," Monday","Sheridan","108006182012"

    6.42,3.82,"9:00","10:00","06/18/12"," Monday","Sheridan","109006182012"

    44.28,4.68,"10:00","11:00","06/18/12"," Monday","Sheridan","1010006182012"

    461.82,8.69,"11:00","12:00","06/18/12"," Monday","Sheridan","1011006182012"

    Sample table:

    Sales                      money 

    Hours                    numeric(18, 0)  

    StartTime            time(7)

    EndTime              time(7)

    Date                      date      

    Day                        text       

    Store                     text       

    ID                            float      

    ID field is unique

    Tuesday, July 23, 2013 11:33 PM

Answers

  • I would recommend to create an and schedule a SQL Agent to run when the file is refreshed.

    The job can have 2 steps;

    1. use BCP to load the csv file into a staging table, who's schema is exactly the same as the file structure - follow this link for information on bcp - http://msdn.microsoft.com/en-us/library/ms162802.aspx

    2. use TSQL MERGE statement to insert only the newly added rows into the main table that has data from the last time this file was loaded - follow this link for information on using MERGE - http://msdn.microsoft.com/en-us/library/bb522522(v=sql.105).aspx

    May i ask why do you need to ONLY add the new rows from the spreadsheet into the sql server table ? why cant you just truncate (delete all data) from the table and load it with entire csv file each time ? It will be easier and faster.


    Sanil Mhatre | Database Developer | MCTS | If you find my reply useful in any way, please vote it as helpful. If it has helped answer your question, please mark it as Answer. http://sqlwithsanil.com

    • Proposed as answer by Heidi-Duan Tuesday, July 30, 2013 11:34 AM
    • Unproposed as answer by Heidi-Duan Tuesday, July 30, 2013 11:34 AM
    • Proposed as answer by Heidi-Duan Wednesday, July 31, 2013 1:52 AM
    • Marked as answer by Elvis Long Monday, August 5, 2013 8:47 AM
    Wednesday, July 24, 2013 2:51 AM

All replies

  • Suggestion,

    1. Create a table "UPDATE TIME" where File Update Datetime will be stored or inserted. Store the latest datetime only if there is variation. We can use script task to read the file last update datetime. Schedule this package to run for every half hour interval as a saperate package. We can also achieve this without using any ssis package i.e by writing a script file and then call from window scheduler. The script will read the datetime of the file.

    2.Create another package to Check "UPDATE TIME" table for any variation, if there is any variation import into a temporary table and remove all rows that are duplicate compare to the original table. Try using FOR LOOP or schedule this package for every interval or in script task we can use timer to control repetatively. Finally import the remaining rows.


    Regards, RSingh

    Wednesday, July 24, 2013 2:19 AM
  • I would recommend to create an and schedule a SQL Agent to run when the file is refreshed.

    The job can have 2 steps;

    1. use BCP to load the csv file into a staging table, who's schema is exactly the same as the file structure - follow this link for information on bcp - http://msdn.microsoft.com/en-us/library/ms162802.aspx

    2. use TSQL MERGE statement to insert only the newly added rows into the main table that has data from the last time this file was loaded - follow this link for information on using MERGE - http://msdn.microsoft.com/en-us/library/bb522522(v=sql.105).aspx

    May i ask why do you need to ONLY add the new rows from the spreadsheet into the sql server table ? why cant you just truncate (delete all data) from the table and load it with entire csv file each time ? It will be easier and faster.


    Sanil Mhatre | Database Developer | MCTS | If you find my reply useful in any way, please vote it as helpful. If it has helped answer your question, please mark it as Answer. http://sqlwithsanil.com

    • Proposed as answer by Heidi-Duan Tuesday, July 30, 2013 11:34 AM
    • Unproposed as answer by Heidi-Duan Tuesday, July 30, 2013 11:34 AM
    • Proposed as answer by Heidi-Duan Wednesday, July 31, 2013 1:52 AM
    • Marked as answer by Elvis Long Monday, August 5, 2013 8:47 AM
    Wednesday, July 24, 2013 2:51 AM