Question about carriage returns and line feeds

  • Question

  • I have a good-sized text file composed of, let's say, many paragraphs.  Each paragraph can have maybe 20 to 100 lines.  What I would like to do is make sure that there is only 1 CRLF between each paragraph.  If there is more than one CRLF between any of the paragraphs, I would like to delete the extras until the entire file has only 1 CRLF between each paragraph.  I am sure that with a lot of coding I can probably come up with something, but knowing you guys out here, there always seems to be a trick of the trade.

    Appreciate any help,

    Les

    Tuesday, December 19, 2017 6:38 AM

Answers

  • Try something as in the following.
    Dim CRLF As String = Chr(13) & Chr(10)
    If vbCrLf = CRLF Then
        TextBox2.Text = "Yes"
    Else
        TextBox2.Text = "No"
    End If

    You will see that vbCrLf = Chr(13) & Chr(10)


    Sam Hobbs
    SimpleSamples.Info

    • Marked as answer by Les2011 Wednesday, December 27, 2017 10:42 PM
    Thursday, December 21, 2017 6:50 AM

All replies

  • What I would like to do is to make sure that there is only 1 crlf between each paragraph.  If there is more than one crlf between any of the paragraphs I would like to delete however many are there until the entire file has only 1 crlf between each paragraph.

    You would first need to confirm exactly what separates a paragraph - a CR, a LF, or both.

    One option is to create a loop that replaces all occurrences of two paragraph separators with one occurrence of the separator, using the String.Replace function.   You loop around and repeat the replace until the length of the string before and after the replace is unchanged.
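
    The replace-until-unchanged loop described above could be sketched roughly as follows. This is only a sketch: the module name, the function name CollapseSeparators and the sample text are made up for illustration, and it assumes the whole text fits in memory and that the separator is not an empty string (String.Replace rejects an empty search string).

```vb
Imports System
Imports Microsoft.VisualBasic

Module CollapseByReplace
    ' Replaces every doubled separator with a single one, and repeats
    ' until the length of the string no longer changes.
    Function CollapseSeparators(text As String, separator As String) As String
        Dim doubled As String = separator & separator
        Dim previousLength As Integer
        Do
            previousLength = text.Length
            text = text.Replace(doubled, separator)
        Loop While text.Length <> previousLength
        Return text
    End Function

    Sub Main()
        ' Hypothetical sample: three CR LF pairs between two paragraphs.
        Dim sample As String = "para 1" & vbCrLf & vbCrLf & vbCrLf & "para 2" & vbCrLf
        Console.Write(CollapseSeparators(sample, vbCrLf))
    End Sub
End Module
```

    Each pass roughly halves a run of separators, so even a long run disappears after a few passes, but every pass rescans the whole string.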

    Tuesday, December 19, 2017 7:10 AM
  • Hi Acamar,

    I understand your logic.  I will play with that idea tomorrow.  Thank you for always being so helpful.  I will keep you posted as to what I find.

    Les

    Tuesday, December 19, 2017 7:45 AM
  • What Acamar suggests is good. You could also read the entire file at once into a string. Then use the String.IndexOf Method to find the first occurrence of the separator, and append the text before it to a StringBuilder. Then go into a loop and, while more occurrences follow immediately, increment the pointer past them. Then use the String.IndexOf Method (String, Int32) to find the next occurrence. So you would have two loops. The outer loop would find the first of a sequence of occurrences and the inner loop would eat the subsequent occurrences in that sequence.
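
    A rough sketch of the two-loop idea, in case it helps. The names CollapseRuns, position and found are just for illustration, and the whole text is assumed to be in memory already. The inner loop uses String.CompareOrdinal to test whether another separator follows immediately; that is one possible way to "eat" the rest of a run, not the only one.

```vb
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module CollapseByScan
    ' Makes a single pass over the text and keeps one separator per run.
    Function CollapseRuns(text As String, separator As String) As String
        If String.IsNullOrEmpty(separator) Then
            Throw New ArgumentException("separator must not be empty")
        End If

        Dim result As New StringBuilder(text.Length)
        Dim position As Integer = 0

        ' Outer loop: find the first separator of each run.
        Dim found As Integer = text.IndexOf(separator, position, StringComparison.Ordinal)
        Do While found >= 0
            ' Copy the text before the separator, then one separator.
            result.Append(text, position, found - position)
            result.Append(separator)
            position = found + separator.Length

            ' Inner loop: skip the remaining separators of the same run.
            Do While String.CompareOrdinal(text, position, separator, 0, separator.Length) = 0
                position += separator.Length
            Loop

            found = text.IndexOf(separator, position, StringComparison.Ordinal)
        Loop

        ' Copy whatever follows the last separator.
        result.Append(text, position, text.Length - position)
        Return result.ToString()
    End Function

    Sub Main()
        Dim sample As String = "a" & vbCrLf & vbCrLf & vbCrLf & "b" & vbCrLf
        Console.Write(CollapseRuns(sample, vbCrLf))
    End Sub
End Module
```

    Unlike a repeated Replace, this never goes back to the start of the string, which matters more as the text gets larger.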

    Or you could use a regular expression. I am not an expert on that, I seldom use them.
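
    For the regular expression route, something along these lines might do. Treat it as a sketch: the function name CollapseCrLfRuns is made up, and the pattern assumes the paragraphs are separated by CR LF pairs specifically (not a bare LF or a bare CR).

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module CollapseByRegex
    Function CollapseCrLfRuns(text As String) As String
        ' (?:\r\n){2,} matches two or more consecutive CR LF pairs;
        ' each such run is replaced by a single CR LF.
        Return Regex.Replace(text, "(?:\r\n){2,}", vbCrLf)
    End Function

    Sub Main()
        Dim sample As String = "a" & vbCrLf & vbCrLf & vbCrLf & "b" & vbCrLf
        Console.Write(CollapseCrLfRuns(sample))
    End Sub
End Module
```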



    Sam Hobbs
    SimpleSamples.Info

    Tuesday, December 19, 2017 6:37 PM
  • Read the lines in and filter blank lines.  Then write the file back out.

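            ' path holds the full name of the text file to clean up.
            ' ReadAllLines strips the line terminators, so a blank line arrives as "".
            Dim lns() As String

            lns = IO.File.ReadAllLines(path).Where(Function(l) l <> "").ToArray()

            ' WriteAllLines writes one line terminator after every remaining line.
            IO.File.WriteAllLines(path, lns)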
    


    "Those who use Application.DoEvents() have no idea what it does and those who know what it does never use it" - MSDN User JohnWein

    Tuesday, December 19, 2017 6:52 PM
  • Hi db,

    Thx for the code snippet.  I plan on trying Acamar's and your methods out either tonight or tomorrow night.  Let me ask a question about your snippet: how large a file can I store in the string variable?  I seem to remember that a while back, when I had large files, I decided to look at them in parts since there was some kind of limit as to what I could store in a variable.  I think some of these files can be larger than a GB.  What I finally did was work on a paragraph at a time, which actually worked out great.  Now if I could just manipulate the entire file with your code or some type of regular expression, that would be great.  BTW, if at times I just need to make sure that the file ends in a single CRLF and not more than that, is there a simple way of just checking the end of the file?  I will keep you posted.
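
    On the last point, one possible way to look only at the end of a file is to open it as a FileStream and seek to the last few bytes. The sketch below is an illustration only: the name EndsWithSingleCrLf is made up, and it assumes a single-byte or UTF-8 encoding in which CR and LF are stored as the bytes 13 and 10.

```vb
Imports System
Imports System.IO

Module TailCheck
    ' Returns True when the file ends with exactly one CR LF pair.
    ' Only the last four bytes (or fewer, for a tiny file) are read.
    Function EndsWithSingleCrLf(fileName As String) As Boolean
        Using stream As New FileStream(fileName, FileMode.Open, FileAccess.Read)
            Dim count As Integer = CInt(Math.Min(4L, stream.Length))
            Dim tail(count - 1) As Byte
            stream.Seek(-count, SeekOrigin.End)

            ' Read may return fewer bytes than requested, so loop until done.
            Dim read As Integer = 0
            Do While read < count
                Dim n As Integer = stream.Read(tail, read, count - read)
                If n = 0 Then Exit Do
                read += n
            Loop

            ' The last two bytes must be CR LF ...
            If read < 2 OrElse tail(read - 2) <> 13 OrElse tail(read - 1) <> 10 Then Return False
            ' ... and the two bytes before them must not be another CR LF.
            If read = 4 AndAlso tail(0) = 13 AndAlso tail(1) = 10 Then Return False
            Return True
        End Using
    End Function

    Sub Main(args As String())
        If args.Length > 0 Then Console.WriteLine(EndsWithSingleCrLf(args(0)))
    End Sub
End Module
```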

    Thx

    Les


    • Edited by Les2011 Wednesday, December 20, 2017 1:59 AM
    Wednesday, December 20, 2017 12:58 AM
  • Hi Acamar,

    Ok here is what I have done so far.

    Let's say I have the following:

    Les and testing this sentence VBCRLF

    Les and testing this sentence VBCRLF

    Les and testing this sentence VBCRLF

    Les and testing this sentence VBCRLF

    VBCRLF

    VBCRLF

    Les and testing this sentence VBCRLF

    Les and testing this sentence VBCRLF

    Les and testing this sentence VBCRLF

    Les and testing this sentence VBCRLF

    VBCRLF

    Les and testing this sentence VBCRLF

    Les and testing this sentence VBCRLF

    Les and testing this sentence VBCRLF

    Les and testing this sentence VBCRLF

    VBCRLF

    VBCRLF

    Ok, notice that after the first paragraph I have 2 VBCRLF, and I only want one.

    Then after the next paragraph there is one, which is what I want.

    Then after the next paragraph I have 2 VBCRLF and only want one.

    Now the following code snippet is what I tried, but it doesn't even seem to see the VBCRLF.

            Do
                If InStr(epd, vbCrLf & vbCrLf) > 0 Then
                    epd = epd.Replace(vbCrLf & vbCrLf, vbCrLf)
                Else
                    Exit Do
                End If
            Loop

    What I am doing is this: first I read the lines into a string variable, in this case epd.  Then I look for any place where there are 2 VBCRLF together, replace them with one VBCRLF, and loop through again until there are no more double VBCRLF remaining.  Maybe I am missing something.

    Is there maybe a way of just going through an entire file so that, when I see two VBCRLF together, I can remove one of them and continue on through the file?  I am working with some large files and prefer not to read the entire file at once.

    Thx,

    Les



    • Edited by Les2011 Wednesday, December 20, 2017 4:37 AM
    Wednesday, December 20, 2017 4:33 AM
  • Now the following code snippet is what I tried to do but it doesn't even seem to see the VBCRLF??

    Do you mean that the If is always returning false?   That suggests that the lines are not separated by a VBCrLf.  For instance, have you looked at the file with a binary editor?  That's the only sure way to determine what line separator is actually being used.  Once you know that, you can create the two search strings from the binary values, so you are not relying on whatever it is that VB uses as the VBCrLf, which eliminates any possible discrepancy.

    /Edit.  Depending on the source of your data, it is possible that the double-spaces you are seeing are actually being created by something like <10><13><10>.  It may be worth looking at the source of the file to see if something in the way it is created gives a reliable guide to why there are intermittent additional line breaks. But looking at the file with a binary viewer is the only 100% reliable method.
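
    If a binary editor is not at hand, a few lines of code can show the same thing. This is just a sketch with made-up names (ReadHead, ToHex); it reads the first bytes of a file and prints them as hex so that 0D (CR) and 0A (LF) can be spotted.

```vb
Imports System
Imports System.IO

Module HexPeek
    ' Formats bytes as space-separated hex pairs.
    Function ToHex(data As Byte()) As String
        Return BitConverter.ToString(data).Replace("-", " ")
    End Function

    ' Reads at most maxBytes from the start of the file.
    Function ReadHead(fileName As String, maxBytes As Integer) As Byte()
        Using stream As New FileStream(fileName, FileMode.Open, FileAccess.Read)
            Dim bytes(maxBytes - 1) As Byte
            Dim read As Integer = stream.Read(bytes, 0, maxBytes)
            ' Trim the array to the number of bytes actually read.
            Array.Resize(bytes, read)
            Return bytes
        End Using
    End Function

    Sub Main(args As String())
        ' Pass the file name on the command line; 256 bytes is an arbitrary choice.
        If args.Length > 0 Then Console.WriteLine(ToHex(ReadHead(args(0), 256)))
    End Sub
End Module
```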

    • Edited by Acamar Wednesday, December 20, 2017 4:58 AM add
    Wednesday, December 20, 2017 4:50 AM
  • Your code works for me. So it is possible that "vbCrLf & vbCrLf" does not match the data you have.

    You say you have a large file. Your code will start from the beginning every time so that would be inefficient for a large file. As for "when I see two VBCRLF together that I can change one of them and continue to go through the file" my post describes a way of doing that. It could be adapted to reading a file line-by-line and writing the new data to a different file. If you are going to ignore me then I won't bother making some code.



    Sam Hobbs
    SimpleSamples.Info

    Wednesday, December 20, 2017 5:17 AM
  • Sam,

    I never intended to avoid you.  I appreciate everyone who has helped me over the years.  I am gathering info from all of you so that I can try to figure it out.  I am reading up and playing with a lot of thoughts and suggestions, so please don't take it personally.  I have mentioned that the files can be very large, in the GBs, so I would prefer not to read and rewrite whole files, although that is something I can do.  I also have read that there is pretty much no way, other than reading through the entire file, to see the end of it.  I have some other ideas and will keep you posted, Sam.

    Thx so much for your help,

    Les

    Wednesday, December 20, 2017 7:38 AM
  • Hi Acamar,

    You know, you might be right about that!  All this time I thought I was doing something wrong, and it may be that we are not dealing with VBCRLF at all.  I will have to check out your suggestion with an editor.

    Good thinking I will get back to you.

    Les

    Wednesday, December 20, 2017 7:40 AM
  • Hi Acamar,

    Ok, for my text editor I used UltraLight.  This is what I found in hex mode:

    124"]

    1. d4 (

    1   2   4   "   ]                   1   .       d   4       (

    31  32  34  22  5D  0D  0A  0D  0A  31  2E  20  64  34  20  7B

    It looks like the 0D  0A  0D  0A has something to do with this, but I have no idea what it means.  It seems as though those 4 hex codes are at the end of every line.  BTW, the symbol I see outside of hex mode is like a backwards letter P, and the vertical line in the P is doubled.  I think I have an idea to resolve my problem, but it requires some different logic to rewrite.  I will let you know how it works out.  As mentioned before, I only want to read the file once, and only in blocks at a time.

    Les


    • Edited by Les2011 Thursday, December 21, 2017 4:33 AM
    Thursday, December 21, 2017 4:32 AM
  • 0D0A is carriage return (decimal 13) and line feed (decimal 10). So that would be VbCrLf. So your program should be able to find vbCrLf & vbCrLf.


    Sam Hobbs
    SimpleSamples.Info

    Thursday, December 21, 2017 5:29 AM
  • What I am doing is first I read the lines into a string variable in this case epd.  Then I look for anytime there are 2 VBCRLF together and I replace it with one VBCRLF and loop through again until there are no more two VBCRLF remaining.

    Depending on how you read the data it might be replacing the CRLF before it gives you the data.

    Since you are concerned about the size of the data, you could read and write the data a record at a time, ignoring any record that is a zero-length string. When you read a record, the input function reads up to a CRLF, and most input functions then remove the CRLF. In C, some functions will convert the CRLF to a LF only, but .Net would not do that. So if you get a record with a zero-length string, then it was just a CRLF.
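
    A sketch of that record-at-a-time approach follows. The names (CopyWithoutBlankLines, sourceFile, targetFile) are hypothetical, the output goes to a second file rather than back over the input, and the default UTF-8 handling of StreamReader and StreamWriter is assumed; a file in another encoding would need the encoding passed explicitly.

```vb
Imports System
Imports System.IO
Imports Microsoft.VisualBasic

Module StreamFilter
    ' Copies sourceFile to targetFile one line at a time, skipping
    ' zero-length lines, so only one line is held in memory at once.
    Sub CopyWithoutBlankLines(sourceFile As String, targetFile As String)
        Using reader As New StreamReader(sourceFile)
            Using writer As New StreamWriter(targetFile)
                ' Always terminate lines with CR LF, whatever the platform default is.
                writer.NewLine = vbCrLf

                Dim line As String = reader.ReadLine()
                Do While line IsNot Nothing
                    ' ReadLine has already removed the line terminator,
                    ' so a line that was only CR LF arrives as "".
                    If line.Length > 0 Then writer.WriteLine(line)
                    line = reader.ReadLine()
                Loop
            End Using
        End Using
    End Sub

    Sub Main(args As String())
        If args.Length >= 2 Then CopyWithoutBlankLines(args(0), args(1))
    End Sub
End Module
```

    Note that this removes every blank line, and that the output always ends with a CR LF even if the input did not.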



    Sam Hobbs
    SimpleSamples.Info

    Thursday, December 21, 2017 5:38 AM
  • Ok for my text editor I used UltraLight.  This is what I found in Hex mode:

     0D  0A  0D  0A  is CR LF CR LF.  You would code it in bytes as 13 10 13 10.  It will be displayed as two newlines.  So that seems to be what you are looking for.   If you replace the VBCrLf in your original code with a string made of Chr(13) and Chr(10), does the If return true?
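
    As a quick check of that, the 16 bytes from the hex dump can be turned back into a string and searched. This is only an illustrative sketch; the function name DumpContainsDoubleCrLf is made up, and ASCII decoding is assumed because every byte in the dump is below 128.

```vb
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module DumpCheck
    Function DumpContainsDoubleCrLf() As Boolean
        ' The 16 bytes exactly as they appear in the hex dump.
        Dim dump As Byte() = {&H31, &H32, &H34, &H22, &H5D, &HD, &HA, &HD, &HA,
                              &H31, &H2E, &H20, &H64, &H34, &H20, &H7B}
        Dim text As String = Encoding.ASCII.GetString(dump)

        ' Build the separator from the character codes instead of vbCrLf.
        Dim separator As String = Chr(13) & Chr(10)
        Return text.Contains(separator & separator)
    End Function

    Sub Main()
        Console.WriteLine(DumpContainsDoubleCrLf())   ' prints True
    End Sub
End Module
```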

    Thursday, December 21, 2017 6:31 AM
  • Hi Sam,

    Ok, I see what you mean.  Let me give that some thought, and as always I will let you know how things are going.

    Thx

    Les

    Thursday, December 21, 2017 7:49 PM
  • Hi Acamar,

    I will give that a try a bit later.

    Thx

    Les

    Thursday, December 21, 2017 7:50 PM
  • I will try this later thx

    Les

    Thursday, December 21, 2017 7:51 PM
  • Hi Sam,

    Yep, that's it. Geez, it was staring me right in the face. Thank you and Acamar for your help.  Have a great holiday, guys. Sorry it took a while to respond, but I am having a different kind of problem, totally unrelated to this; I will be asking a question about installing VB6.

    Les

    Wednesday, December 27, 2017 10:41 PM