none
A macro that splits lines after 80 characters RRS feed

  • Question

  • having a hard time with this, I need a macro for reformatting a Word  document by making all lines no longer than 80 characters per line, all lines should end with Paragraph Mark. Any ideas ?

     

    Thanks

    Wednesday, August 10, 2011 8:34 AM

Answers

  • I assume you want to split the lines at SPACE's, and not randomly.  You can try this VBA Macro.  It retains paragraph breaks; and also trims multiple consecutive spaces within the paragraph to just a single space (but that can be changed if necessary).

    Note in the Macro that you must set a reference to Microsoft VBScript Regular Expressions 5.5, as we use that engine to do the parsing.  If you are going to distribute this, we can change that to late binding, but setting the reference allows you to see the arguments for the functions, so might be simpler at this time.

    Also note that we start at the last paragraph and work I way to the first.  Otherwise we would wind up in an infinite loop.

    Also, the macro replaces the original, so be sure to do this on a copy.

     

    =============================================

    Option Explicit
    Sub SetLineLength()
    'Requires setting reference to Microsoft VBScript Regular Expressions 5.5
    'Will split at a space UNLESS a single word is longer than LineLength, in which
    'case it will split at LineLength characters
        Const LineLength As Long = 80
        Dim RE As RegExp, MC As MatchCollection, M As Match
        Dim Ps As Paragraphs, P As Paragraph
        Dim i As Long
        Dim Doc As Document
        Dim sIn As String, sOut As String
    Set RE = New RegExp
        RE.Global = True
    Set Doc = ActiveDocument

    'Replace multiple spaces with one
    'Leave paragraphs intact
    'Trim to line length
    Set Ps = Doc.Paragraphs
    For i = Ps.Count To 1 Step -1
        Set P = Ps(i)
            RE.Pattern = "\s{2,}"
        sIn = RE.Replace(P.Range.Text, " ")
            RE.Pattern = "\S.{0," & LineLength - 1 & "}(?=\s|$)|\S{" & LineLength & "}"
        If RE.Test(sIn) = True Then
            Set MC = RE.Execute(sIn)
            sOut = ""
            For Each M In MC
                sOut = sOut & M & vbNewLine
            Next M
            P.Range.Text = sOut
        End If
       
    'Uncomment for debugging
    '    Stop

    Next i
       
    End Sub
    =================================


    Ron
    • Marked as answer by steelponey Wednesday, August 31, 2011 6:21 AM
    Sunday, August 21, 2011 10:49 AM
  • Hi,

    Here is how I would approach it. I'd create a collection that I'd use to store arrays with a set size of 80 characters. I'd scan through a paragraph in the document character by character filling up all 80 characters of a new array instance. Once filled I'd add to the array to the collection and repeat until all characters of the paragraph were used up. That then becomes a formatted paragraph. I'd repeat for all paragraphs.

    The thing is though whitespace characters, for example a tab is one character but it represents maybe 4 space characters. So a tab would fill 4 array spaces. There might be other characters like this. It's not so easy.

     


    "The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by exertion of the imagination." - Fred Brooks
    • Marked as answer by steelponey Wednesday, August 31, 2011 6:22 AM
    Wednesday, August 10, 2011 11:53 AM
  • Would this help? Though it is not a very efficient way of doing it :)

     

    Sub Sample()
     Dim doc As Document
     Set doc = ActiveDocument
     
     For i = doc.Range.Characters.Count To 1 Step -1
     If i Mod 80 = 0 Then
     doc.Range.Characters(i) = doc.Range.Characters(i) & vbNewLine
     End If
     Next
    End Sub
    

     

    The reason why I say it is not efficient is because it will split up the words or ignore existing paragraphs etc. If you have text with no spaces and is something like the image below then it will work :)


    Sid (A good exercise for the Heart is to bend down and help another up)

    Please do not email me your questions. I do not answer questions by email unless I get paid for it :) If you want, create a thread in Excel forum and email me the link and I will help you if I can.




    • Marked as answer by steelponey Wednesday, August 31, 2011 6:22 AM
    Thursday, August 18, 2011 7:22 PM
    Moderator
  • Beers are good.  But I may sip a nice single malt today as I watch our hurricane go by.  Actually, by the time it gets here, it'll probably just be a tropical storm.

    If you don't want to set the reference to VBScript Regular Expressions 5.5 explicitly, you can create an object within the macro.  You need to change two lines.  The disadvantage is that you won't have the prompting for arguments when you are writing the macro, but other than that it should work about the same.  Note the commented out lines that begin with Dim RE and Set RE.

    Also, you never wrote back about what you wanted to do with regard to the reducing of multiple included spaces.

    =====================

    Option Explicit
    Sub SetLineLength()
    'May Require setting reference to Microsoft VBScript Regular Expressions 5.5 if you
    '   use the commented out lines instead of the line below that start with Dim RE or Set RE
    'Will split at a space UNLESS a single word is longer than LineLength, in which
    'case it will split at LineLength characters
       
        Const LineLength As Long = 15
       
        'Dim RE As RegExp, MC As MatchCollection, M As Match
        Dim RE As Object, MC As Object, M As Object
       
        Dim Ps As Paragraphs, P As Paragraph
        Dim i As Long
        Dim Doc As Document
        Dim sIn As String, sOut As String

    'Set RE = New RegExp
    Set RE = CreateObject("vbscript.regexp")

        RE.Global = True
    Set Doc = ActiveDocument

    'Replace multiple spaces with one
    'Leave paragraphs intact
    'Trim to line length
    Set Ps = Doc.Paragraphs
    For i = Ps.Count To 1 Step -1
        Set P = Ps(i)
            RE.Pattern = "\s{2,}"
        sIn = RE.Replace(P.Range.Text, " ")
            RE.Pattern = "\S.{0," & LineLength - 1 & "}(?=\s|$)|\S{" & LineLength & "}"
        If RE.Test(sIn) = True Then
            Set MC = RE.Execute(sIn)
            sOut = ""
            For Each M In MC
                sOut = sOut & M & vbNewLine
            Next M
            P.Range.Text = sOut
        End If
      
    'Uncomment for debugging
    '    Stop

    Next i
      
    End Sub
    =========================


    Ron
    • Marked as answer by steelponey Wednesday, August 31, 2011 6:21 AM
    Sunday, August 28, 2011 10:37 AM
  • By the way, I don't know why your macro to add a reference to VBScript Regular Expressions 5.5 is not working.

    Do you get an error message?  Perhaps "programmatic access to visual basic project is not trusted"?

    In Office 2007 and later to have to specifically allow this access in the Trust Center.

    If that is not the issue, see what happens if you run it without the On Error statement.

     


    Ron
    • Marked as answer by steelponey Wednesday, August 31, 2011 6:23 AM
    Sunday, August 28, 2011 7:54 PM

All replies

  • Hi,

    Here is how I would approach it. I'd create a collection that I'd use to store arrays with a set size of 80 characters. I'd scan through a paragraph in the document character by character filling up all 80 characters of a new array instance. Once filled I'd add to the array to the collection and repeat until all characters of the paragraph were used up. That then becomes a formatted paragraph. I'd repeat for all paragraphs.

    The thing is though whitespace characters, for example a tab is one character but it represents maybe 4 space characters. So a tab would fill 4 array spaces. There might be other characters like this. It's not so easy.

     


    "The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by exertion of the imagination." - Fred Brooks
    • Marked as answer by steelponey Wednesday, August 31, 2011 6:22 AM
    Wednesday, August 10, 2011 11:53 AM
  • Would this help? Though it is not a very efficient way of doing it :)

     

    Sub Sample()
     Dim doc As Document
     Set doc = ActiveDocument
     
     For i = doc.Range.Characters.Count To 1 Step -1
     If i Mod 80 = 0 Then
     doc.Range.Characters(i) = doc.Range.Characters(i) & vbNewLine
     End If
     Next
    End Sub
    

     

    The reason why I say it is not efficient is because it will split up the words or ignore existing paragraphs etc. If you have text with no spaces and is something like the image below then it will work :)


    Sid (A good exercise for the Heart is to bend down and help another up)

    Please do not email me your questions. I do not answer questions by email unless I get paid for it :) If you want, create a thread in Excel forum and email me the link and I will help you if I can.




    • Marked as answer by steelponey Wednesday, August 31, 2011 6:22 AM
    Thursday, August 18, 2011 7:22 PM
    Moderator
  • Thanks for help, I didn't get to work on this project much though. The first answer seems more sophisticated than my knowledge of VBA. But the second answer, I wonder if by changing conditions to that " if statement" it wouldn't work well. Like instead of i Mod 80 =0, to check for ( i>=80 ) and character at i  is space than insert vbNewLine. Downside here is that the lines would be split after the last word reaches 80 characters.
    Sunday, August 21, 2011 9:43 AM
  • I assume you want to split the lines at SPACE's, and not randomly.  You can try this VBA Macro.  It retains paragraph breaks; and also trims multiple consecutive spaces within the paragraph to just a single space (but that can be changed if necessary).

    Note in the Macro that you must set a reference to Microsoft VBScript Regular Expressions 5.5, as we use that engine to do the parsing.  If you are going to distribute this, we can change that to late binding, but setting the reference allows you to see the arguments for the functions, so might be simpler at this time.

    Also note that we start at the last paragraph and work I way to the first.  Otherwise we would wind up in an infinite loop.

    Also, the macro replaces the original, so be sure to do this on a copy.

     

    =============================================

    Option Explicit
    Sub SetLineLength()
    'Requires setting reference to Microsoft VBScript Regular Expressions 5.5
    'Will split at a space UNLESS a single word is longer than LineLength, in which
    'case it will split at LineLength characters
        Const LineLength As Long = 80
        Dim RE As RegExp, MC As MatchCollection, M As Match
        Dim Ps As Paragraphs, P As Paragraph
        Dim i As Long
        Dim Doc As Document
        Dim sIn As String, sOut As String
    Set RE = New RegExp
        RE.Global = True
    Set Doc = ActiveDocument

    'Replace multiple spaces with one
    'Leave paragraphs intact
    'Trim to line length
    Set Ps = Doc.Paragraphs
    For i = Ps.Count To 1 Step -1
        Set P = Ps(i)
            RE.Pattern = "\s{2,}"
        sIn = RE.Replace(P.Range.Text, " ")
            RE.Pattern = "\S.{0," & LineLength - 1 & "}(?=\s|$)|\S{" & LineLength & "}"
        If RE.Test(sIn) = True Then
            Set MC = RE.Execute(sIn)
            sOut = ""
            For Each M In MC
                sOut = sOut & M & vbNewLine
            Next M
            P.Range.Text = sOut
        End If
       
    'Uncomment for debugging
    '    Stop

    Next i
       
    End Sub
    =================================


    Ron
    • Marked as answer by steelponey Wednesday, August 31, 2011 6:21 AM
    Sunday, August 21, 2011 10:49 AM
  • Thanks Ron, that macro does work, how do you change the trimming of multiple consecutive spaces ?
    Friday, August 26, 2011 8:37 AM
  • How do you want to "change the trimming ..."?

    You do it by changing these lines:

    =========================
    RE.Pattern = "\s{2,}"
        sIn = RE.Replace(P.Range.Text, " ")
    ==========================

    If you disallow multiple consecutive spaces, then lines will always start and end with a non-space.  If you allow multiple consecutive spaces, then lines could start or end with one or more spaces; or possibly have less than the maximum number of characters, so you will need to be very specific about what you want.


    Ron
    Friday, August 26, 2011 10:58 AM
  • Thanks, it works well with setting the reference for early binding. I wish I could buy you a beer.  I tried to change to late binding with this which I found online, didn't work. Sorry, beginner here...


    Sub Make_VBS_Ref() 
    
    On Error Resume Next
      'in case ref exists
      ActiveDocument.VBProject.References.AddFromFile "C:\WINDOWS\SYSTEM32\vbscript.DLL\3"
      End Sub
    
    
    and
    Sub SetLineLength()
    'Requires setting reference to Microsoft VBScript Regular Expressions 5.5
    'Will split at a space UNLESS a single word is longer than LineLength, in which
    'case it will split at LineLength characters
    
    Make_VBS_Ref
    
    Const LineLength As Long = 80
    Dim RE As RegExp, MC As MatchCollection, M As Match
    (...)
    

    but that doesn't work. How do you get the late binding working ?

     


    steelponey
    Sunday, August 28, 2011 4:06 AM
  • Beers are good.  But I may sip a nice single malt today as I watch our hurricane go by.  Actually, by the time it gets here, it'll probably just be a tropical storm.

    If you don't want to set the reference to VBScript Regular Expressions 5.5 explicitly, you can create an object within the macro.  You need to change two lines.  The disadvantage is that you won't have the prompting for arguments when you are writing the macro, but other than that it should work about the same.  Note the commented out lines that begin with Dim RE and Set RE.

    Also, you never wrote back about what you wanted to do with regard to the reducing of multiple included spaces.

    =====================

    Option Explicit
    Sub SetLineLength()
    'May Require setting reference to Microsoft VBScript Regular Expressions 5.5 if you
    '   use the commented out lines instead of the line below that start with Dim RE or Set RE
    'Will split at a space UNLESS a single word is longer than LineLength, in which
    'case it will split at LineLength characters
       
        Const LineLength As Long = 15
       
        'Dim RE As RegExp, MC As MatchCollection, M As Match
        Dim RE As Object, MC As Object, M As Object
       
        Dim Ps As Paragraphs, P As Paragraph
        Dim i As Long
        Dim Doc As Document
        Dim sIn As String, sOut As String

    'Set RE = New RegExp
    Set RE = CreateObject("vbscript.regexp")

        RE.Global = True
    Set Doc = ActiveDocument

    'Replace multiple spaces with one
    'Leave paragraphs intact
    'Trim to line length
    Set Ps = Doc.Paragraphs
    For i = Ps.Count To 1 Step -1
        Set P = Ps(i)
            RE.Pattern = "\s{2,}"
        sIn = RE.Replace(P.Range.Text, " ")
            RE.Pattern = "\S.{0," & LineLength - 1 & "}(?=\s|$)|\S{" & LineLength & "}"
        If RE.Test(sIn) = True Then
            Set MC = RE.Execute(sIn)
            sOut = ""
            For Each M In MC
                sOut = sOut & M & vbNewLine
            Next M
            P.Range.Text = sOut
        End If
      
    'Uncomment for debugging
    '    Stop

    Next i
      
    End Sub
    =========================


    Ron
    • Marked as answer by steelponey Wednesday, August 31, 2011 6:21 AM
    Sunday, August 28, 2011 10:37 AM
  • By the way, I don't know why your macro to add a reference to VBScript Regular Expressions 5.5 is not working.

    Do you get an error message?  Perhaps "programmatic access to visual basic project is not trusted"?

    In Office 2007 and later to have to specifically allow this access in the Trust Center.

    If that is not the issue, see what happens if you run it without the On Error statement.

     


    Ron
    • Marked as answer by steelponey Wednesday, August 31, 2011 6:23 AM
    Sunday, August 28, 2011 7:54 PM
    • I hope the hurricane didn't affect you, I was pretty far away from it.
    • The "object" version  works well on at least one computer with the references set to off. Thank You. I want to try it on several different computers, win xp, office 2007-2010.
    • The spacing issue, I just changed to "replace 1 space with 1 space", because I might actually want both versions (one that leaves spacing as it is and just splits the lines) so I want to be able to easy change the code if I need.
    • The macro to add a reference to VBScript Regular Expressions 5.5 is not working by saying "Compile error. User defined type not defined !" and shows "RE As RegExp highlighted".

    steelponey
    Wednesday, August 31, 2011 6:18 AM
  • The hurricane was no worse than a typical nor'easter, except for having rain instead of snow.

    The version I supplied with the late binding (createobject method within the body of the macro) should work on the various versions of Office.  I don't think there's anything I used that was introduced after XP.

    If you don't remove the "multiple spaces", you might consider also changing this line:

    sOut = sOut & M & vbNewLine
      to
    sOut = sOut & Trim(M) & vbNewLine

    The VBA Trim function (unlike the worksheet TRIM function), only removes leading and trailing spaces.

    The error message you are getting indicates that the reference is not being added by that macro.  You can confirm this by looking at the list of references.  If that is the case, commenting out the "on error resume next" line might help you in troubleshooting.  Or you can just use the last version I provided with the createobject line.


    Ron
    Wednesday, August 31, 2011 10:26 AM