How to split a text file into multiple text files?

    Question

  • Hi experts

    I'm looking for help splitting a large text file into multiple text files based on the size of the file.

    Can anybody help me with this?

     

     

    Friday, February 23, 2007 3:55 AM

All replies

  • Some more information about your requirements would help. What exactly do you mean when you say "based on size of the File"? If you wanted each file to be a maximum of a certain size you could do something like:

    // Requires: using System; using System.IO;
    string sourceFileName = @"C:\VS2005 SP1.exe";
    string destFileLocation = @"C:\";
    int index = 0;
    long maxFileSize = 52428800; // 50 MB per part
    byte[] buffer = new byte[65536];

    using (Stream source = File.OpenRead(sourceFileName))
    {
        while (source.Position < source.Length)
        {
            index++;

            // Create a new sub-file to read into
            string newFileName = Path.Combine(destFileLocation, Path.GetFileNameWithoutExtension(sourceFileName));
            newFileName += index.ToString() + Path.GetExtension(sourceFileName);

            // File.Create truncates an existing file; File.OpenWrite would leave stale trailing bytes
            using (Stream destination = File.Create(newFileName))
            {
                while (destination.Position < maxFileSize)
                {
                    // Work out how many bytes to read without overshooting maxFileSize
                    int toRead = (int) Math.Min(maxFileSize - destination.Position, buffer.Length);
                    int bytes = source.Read(buffer, 0, toRead);

                    // Read returns 0 only at the end of the source file
                    if (bytes == 0)
                    {
                        break;
                    }

                    destination.Write(buffer, 0, bytes);
                }
            }
        }
    }

    • Proposed as answer by ryguy72 Wednesday, December 05, 2012 9:37 PM
    Friday, February 23, 2007 5:16 AM
  • Thanks for the reply, Sean Hederman.

    Suppose I have a text (.txt) file larger than 500KB; I want to split it into multiple files.

    Example:

    temp.txt is a file of 1209KB.

    The result should be:

    temp1.txt

    temp2.txt

    temp3.txt

    Friday, February 23, 2007 5:21 AM
  • But what size should each file be? Whatever that size is, set the maxFileSize to that, and run my code, and it will automatically split the file for you into the directory specified in destFileLocation.

    Friday, February 23, 2007 5:37 AM
  • Thank you, I'll try that.
    Friday, February 23, 2007 8:14 AM
  • Can you tell me what value the maxFileSize field should have here?

    I want each file to be 500KB.

    If the main file exceeds 500KB, I want to split it.

    Friday, February 23, 2007 8:55 AM
  • Well, 500KB is 500x1024 bytes which means the maximum file size should be 512000.
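
    As a rough sketch of that arithmetic (the SplitMath class and its method names are hypothetical, not part of the code above), the byte limit and the number of resulting files could be computed like this:

```csharp
using System;

static class SplitMath
{
    // Converts a size in KB (1 KB = 1024 bytes) to bytes.
    public static long KilobytesToBytes(long kilobytes)
    {
        return kilobytes * 1024;
    }

    // Number of part files needed: the ceiling of totalBytes / maxPartBytes.
    public static long PartCount(long totalBytes, long maxPartBytes)
    {
        if (maxPartBytes <= 0) throw new ArgumentOutOfRangeException("maxPartBytes");
        return (totalBytes + maxPartBytes - 1) / maxPartBytes;
    }
}
```

    With these helpers, KilobytesToBytes(500) gives 512000, and PartCount(KilobytesToBytes(1209), KilobytesToBytes(500)) gives 3, matching temp1.txt to temp3.txt in the example.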

    Friday, February 23, 2007 10:19 AM
  • Thank you, got it.

    But I'm reading lines from the text file.

    This method splits a line in the middle and puts the truncated remainder in the next file.

     

     

    Friday, February 23, 2007 10:56 AM
  • It would have been useful to know that you needed intact lines upfront.

    Basically, you'd rewrite it to use StreamReader and ReadLine. For each line, you'd decide whether writing it would take the current file beyond its max size: if not, write it to the current file; if so, start a new file and write it there.
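
    A minimal sketch of that line-preserving approach might look like the following. The LineSplitter class and its Split method are hypothetical names, and the code assumes UTF-8 text; it is one way to do it, not a tested production implementation.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class LineSplitter
{
    // Splits sourcePath into numbered parts in destDir. No part exceeds maxBytes
    // unless a single line is itself longer than maxBytes.
    public static List<string> Split(string sourcePath, string destDir, long maxBytes)
    {
        var parts = new List<string>();
        var encoding = new UTF8Encoding(false); // assumption: UTF-8 text, no BOM written
        string baseName = Path.GetFileNameWithoutExtension(sourcePath);
        string extension = Path.GetExtension(sourcePath);
        int newlineBytes = encoding.GetByteCount(Environment.NewLine);

        StreamWriter writer = null;
        long written = 0;
        try
        {
            using (var reader = new StreamReader(sourcePath, encoding))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    long lineBytes = encoding.GetByteCount(line) + newlineBytes;

                    // Start a new part if this line would push the current one past the limit.
                    if (writer == null || (written > 0 && written + lineBytes > maxBytes))
                    {
                        if (writer != null) writer.Dispose();
                        string partPath = Path.Combine(destDir, baseName + (parts.Count + 1) + extension);
                        writer = new StreamWriter(partPath, false, encoding);
                        parts.Add(partPath);
                        written = 0;
                    }

                    writer.WriteLine(line);
                    written += lineBytes;
                }
            }
        }
        finally
        {
            if (writer != null) writer.Dispose();
        }
        return parts;
    }
}
```

    For the 500KB case this would be called as LineSplitter.Split(@"C:\temp.txt", @"C:\", 500 * 1024). Note that ReadLine strips the original line endings and WriteLine writes Environment.NewLine, so the parts may differ from the source in line-ending style.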

    Friday, February 23, 2007 11:12 AM
  • I have already split the file, but it takes time. My file is more than 2-3 GB, so even the split alone is slow, and that is before reading the data and inserting it into the database. Any suggestions for reading a CSV file and storing it in a MySQL database in the most efficient way? Please help. Thank you.

    Tuesday, July 12, 2011 4:02 AM
  • Hi,

    Steps to achieve the goal:

    1. Find the size of temp.txt in KB and divide it by 500 to decide how many files you need. For example, 1209/500 ≈ 2.42, so you will need 3 files.
    2. Create a StringBuilder and start reading line by line using ReadLine of StreamReader.
    3. After reading each line, calculate the size of the StringBuilder contents in bytes. If it is below 500 KB, continue reading and storing. If it has gone above 500 KB, remove the last line from the StringBuilder and carry it over to the next file.
    4. Write the StringBuilder contents to the next numbered file.
    5. Repeat steps 3 and 4, reading lines and saving files, until you reach EOF.

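    Step 3 can also be done another way, by checking the size before appending (a fragment, assuming UTF-8 text and an open streamReader):

    StringBuilder sb = new StringBuilder();
    long sbBytes = 0;                      // byte size of sb so far
    const long maxBytes = 500 * 1024;      // 500 KB

    string line = streamReader.ReadLine();

    // Byte size of the line plus its line break
    long lineBytes = Encoding.UTF8.GetByteCount(line + Environment.NewLine);

    if (sbBytes + lineBytes <= maxBytes)
    {
        sb.AppendLine(line);
        sbBytes += lineBytes;
    }
    else
    {
        // Write sb to the current file, then start the next file with this line
    }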

     

     

    Hope this helps. If you have any concerns, feel free to ask.

     


    Thanks
    Tuesday, July 12, 2011 8:44 AM
  • Hi!

    Can you please test this code:

     

    Public Function SplitFile(ByVal Filename As String, ByVal RecordsToRead As Integer, ByVal Parts As Integer) As Boolean
      ' Note: ReadAllLines loads the whole file into memory
      Dim data() As String = IO.File.ReadAllLines(Filename)
      If (Parts * RecordsToRead <= data.Length) Then
       Dim portion(RecordsToRead - 1) As String
       Dim baseName As String = IO.Path.Combine(IO.Path.GetDirectoryName(Filename), IO.Path.GetFileNameWithoutExtension(Filename))
       For i As Integer = 0 To Parts - 1
        Array.ConstrainedCopy(data, RecordsToRead * i, portion, 0, RecordsToRead)
        ' Put the part number before the extension only, e.g. temp.txt -> temp1.txt
        IO.File.WriteAllText(baseName & (i + 1).ToString() & IO.Path.GetExtension(Filename), String.Join(vbCrLf, portion))
       Next
      Else
       Return False
      End If
      Return True
    End Function
    

     

    Here 'Filename' is the name of the file, 'RecordsToRead' is the number of records (lines) to read from the file and put into each new file, and 'Parts' is the number of files you want to create. Any lines beyond Parts x RecordsToRead are not written. This may shed some light on how to resolve your issue.
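
    For very large files, the same record-count idea can be sketched without loading everything into memory. The C# version below is an illustration only: RecordSplitter and SplitByRecords are hypothetical names, and unlike the function above it takes no 'Parts' argument and writes any leftover lines to a final, shorter file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class RecordSplitter
{
    // Writes every recordsPerFile lines of sourcePath to a new numbered file next to it.
    public static List<string> SplitByRecords(string sourcePath, int recordsPerFile)
    {
        if (recordsPerFile <= 0) throw new ArgumentOutOfRangeException("recordsPerFile");

        string fullPath = Path.GetFullPath(sourcePath);
        string prefix = Path.Combine(Path.GetDirectoryName(fullPath), Path.GetFileNameWithoutExtension(fullPath));
        string extension = Path.GetExtension(fullPath);

        var parts = new List<string>();
        StreamWriter writer = null;
        int recordsInPart = 0;
        try
        {
            // File.ReadLines streams the file lazily instead of reading every line up front.
            foreach (string line in File.ReadLines(fullPath))
            {
                // Open the next numbered part when the current one is full.
                if (writer == null || recordsInPart == recordsPerFile)
                {
                    if (writer != null) writer.Dispose();
                    string partPath = prefix + (parts.Count + 1) + extension;
                    writer = new StreamWriter(partPath);
                    parts.Add(partPath);
                    recordsInPart = 0;
                }
                writer.WriteLine(line);
                recordsInPart++;
            }
        }
        finally
        {
            if (writer != null) writer.Dispose();
        }
        return parts;
    }
}
```

    Calling RecordSplitter.SplitByRecords(@"C:\temp.txt", 1000) would produce temp1.txt, temp2.txt and so on in the same folder, each holding up to 1000 lines.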

     

    regards,

    Shahan

     


    • Edited by NeverHopeless Tuesday, July 12, 2011 10:21 AM typo error
    Tuesday, July 12, 2011 10:20 AM
  • Thank you so much! With your code and some others, I came up with the following solution. I have added a link at the bottom to some code I wrote that uses some of the logic from this page. I figured I'd give credit where credit was due. Thanks!

    Below is an explanation of what I needed:

    I wrote this because I have some very large '|'-delimited files that have \r\n inside some of the columns, and I needed to use \r\n as the end-of-line delimiter. I was trying to import the files using SSIS packages, but because of some corrupted data in the files I was unable to. The file was over 5 GB, so it was too large to open and fix manually. After reading lots of forums to understand how streams work, I came up with a solution that reads each character in a file and emits each line based on the definitions I added to it. It is meant for use in a command-line application, complete with help :). I hope this helps some other people out. I haven't found a solution quite like it anywhere else, although the ideas were inspired by this forum and others.

    http://stackoverflow.com/a/12640862/1582188

    Friday, September 28, 2012 1:22 PM