none
How to read limited rows from CSV file in SSIS RRS feed

  • Question

  • Hi,

    I have 2 years amount of data in my csv file . I only want to read last one month of data from csv file. I don't want to use conditional split for this. Please suggest any other ways in SSIS. Can I achieve using this using script task? any suggestions?

    Thanks,

    Suvani

    Saturday, September 14, 2019 9:41 PM

All replies

  • Hi 

    I can think of two possible approaches 1 - Using Conditional Split (Which you don't like) and 2 - Having a script task in Place.

    We can embed a script task with the filter logic to achieve this functionality in SSIS.

    Below shared links holds few blogs explaining the similar functionality which expects few tweaks to the code to work as desired.

    http://www.sqlis.com/sqlis/post/Using-the-Script-Component-as-a-Conditional-Split.aspx

    http://www.techbrothersit.com/2014/09/ssis-how-to-delete-top-n-rows-from-csv.html

    Hope this is helpful !!

    Thank you


    If this post helps to resolve your issue, please click the "Mark as Answer" of that post and/or click Answered "Vote as helpful" button of that post. By marking a post as Answered and/or Helpful, you help others find the answer faster.

    Sunday, September 15, 2019 9:16 AM
  • Just as an alternative, I thought I'd mention that most web servers such as this one allow "ranged" partial requests. Look in response header  for something like "Accept-Ranges: bytes", an absence of this may also mean default "yes". This is because big files are served to us in chunks and the browser reassembles them. If one chunk fails, the client can re-request just that part of the file, by start and end byte index range. That means you can get the server to do the hard work and just get the bit you want returned.

     

    public class RangedHttpClient : WebClient
    {
        private readonly long from;
        private readonly long to;
    
        public RangedHttpClient(long from, long to)
        {
            this.from = from;
            this.to = to;
        }
    
        protected override WebRequest GetWebRequest(Uri address)
        {
            var request = (HttpWebRequest)base.GetWebRequest(address);
            request.AddRange(this.from, this.to);
            return request;
        }
    }
     

    Just thought I'd throw it into your bag of tricks.

    Assuming you're comfortable with scripting tasks in your SSIS.

    If the rows are variable length, the script could just work backwards requesting previous chunks and prepending until it has X number of carriage returns = number of lines worth of data you wanted from the end. If you're scripting, you could then be parsing each row, until it finds a date before your desired range.

    Also, a web request returns the file size (content-length) in the response headers, during a request handshake before transfer begins. So you could send a sample request to get the file size, then cancel the request before the content is transferred. Then you know where to work back from :) 

      

    https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests

     

    All the best,
    Pete

     

       


    #PEJL
    Got any nice code? If you invest time in coding an elegant, novel or impressive answer on MSDN forums, why not copy it over to TechNet Wiki, for future generations to benefit from! You'll never get archived again, and you could win weekly awards!

    Have you got what it takes to become this month's TechNet Technical Guru? Join a long list of well known community big hitters, show your knowledge and prowess in your favoured technologies!


    Sunday, September 15, 2019 8:02 PM
    Moderator
  • Hi Suvanichikati,

    Here is an elegant way to handle your scenario via single SSIS Execute SQL Task.

    First, check it directly in SSMS.

    SQL:

    ;WITH rs AS
    (
    	SELECT *
    	FROM  OPENROWSET(BULK 'e:\Temp\DataCow.csv'
    		, FORMATFILE = 'e:\Temp\DataCow.xml'  
    		, ERRORFILE = 'e:\Temp\DataCow.err'
    		, FIRSTROW = 2 -- real data starts on the 2nd row
    		, MAXERRORS = 100
    		) AS t1
    )
    -- INSERT INTO destinationTable -- uncomment it when you are ready
    SELECT * FROM rs
    WHERE LogDate IS NOT NULL; -- or any other filter condition
    

    'e:\Temp\DataCow.csv':

    "ID","Name","Colour","LogDate"
    41,"Orange","Orange",2018-09-09 16:41:02.000
    42,"Cherry, Banana","Red,Yellow",
    43,"Apple","Yellow",2017-09-09 16:41:02.000
    

    FORMATFILE = 'e:\Temp\DataCow.xml':

    <?xml version="1.0"?>
    <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
       <RECORD>
          <FIELD ID="1" xsi:type="CharTerm" TERMINATOR=',"' MAX_LENGTH="70"/>
          <FIELD ID="2" xsi:type="CharTerm" TERMINATOR='","' MAX_LENGTH="70" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
          <FIELD ID="3" xsi:type="CharTerm" TERMINATOR='",' MAX_LENGTH="70" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
          <FIELD ID="4" xsi:type="CharTerm" TERMINATOR='\r\n' MAX_LENGTH="70" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
       </RECORD>
       <ROW>
          <COLUMN SOURCE="1" NAME="ID" xsi:type="SQLVARYCHAR"/>
          <COLUMN SOURCE="2" NAME="Name" xsi:type="SQLVARYCHAR"/>
          <COLUMN SOURCE="3" NAME="Colour" xsi:type="SQLVARYCHAR"/>
          <COLUMN SOURCE="4" NAME="LogDate" xsi:type="SQLDATETIME"/>
       </ROW>
    </BCPFORMAT>

    Monday, September 16, 2019 2:30 AM
  • Hi Suvanichikati,

    Check if this tutorial video is helpful.

    Please refer to How to Read Excel Sheet Data after Skipping Rows in SSIS Package by using Script Task.

    Regards,

    Zoe


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com

    Monday, September 16, 2019 3:06 AM
  • Hi

    Just checking in to see if my initial answer helped or you still facing any issues.

     

    If my initial reply answers your query, do click “Mark as Answer” and Up-Vote for the same which might be beneficial to other community members reading this thread .

    And, if you have any further query do let us know.

    Thank you


    If this post helps to resolve your issue, please click the "Mark as Answer" of that post and/or click Answered "Vote as helpful" button of that post. By marking a post as Answered and/or Helpful, you help others find the answer faster.

    Wednesday, September 18, 2019 3:04 AM
  • Hi Suvanichikati,

    What's the latest?
    Did the proposed solution help you to resolve the issue?

    Wednesday, September 18, 2019 8:24 PM