Types of Breaks

  • Question

  • Hi,

    Quick one this, I don't think it is possible but I thought it would be worth asking. 

    I have a Word document that has different types of section and page breaks in it: continuous section break, next page section break, page break, etc. Is it possible to distinguish between these types of breaks when iterating through the paragraphs?

    For example, I can tell that a break is present because it is normally denoted by the character '\f'; however, this could be any of the break types.

    Any ideas?  

    Wednesday, July 25, 2012 2:36 PM

Answers

  • Hi Will

    As I'm not intimately familiar with your code, I didn't know what kind of Range would be appropriate. It could be anything at all. I used Range in the very general sense of "take any Range object"; Sections(1) will be available on it, no matter what the origin of the Range.

    There will always be a character after a break; at the very least, it will be a lone paragraph mark at the end of the document. The paragraph mark - that occurs for every paragraph in the document - is ANSI character 13 (in C# "\r").

    To find out what type of section each paragraph is in, building on your code snippet, you'd use:

      Word.WdSectionStart start = para.Range.Sections[1].PageSetup.SectionStart;

    A Range could contain more than one section; the above gives you the first section in that Range. If you're concerned that there could be more than one section within a paragraph, you could precede this with para.Range.Sections.Count. If it returns > 1, there is more than one section in the paragraph. In that case, you could loop through the Range.Sections collection to determine what types of section breaks there are (any that are IN the paragraph will be continuous, by definition).
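
    The loop over the Sections collection described above could be sketched as follows. This is a minimal sketch, not tested against a real document: it assumes a project reference to Microsoft.Office.Interop.Word, and the class and method names (SectionStarts, ForParagraph) are made up for illustration.

```csharp
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

public static class SectionStarts
{
    // Hypothetical helper: returns the SectionStart value of every
    // section that the paragraph's Range touches, in document order.
    public static List<Word.WdSectionStart> ForParagraph(Word.Paragraph para)
    {
        // Fetch the Range once instead of calling para.Range repeatedly.
        Word.Range rngPara = para.Range;
        var starts = new List<Word.WdSectionStart>();

        // Word collections are 1-based.
        for (int i = 1; i <= rngPara.Sections.Count; i++)
        {
            starts.Add(rngPara.Sections[i].PageSetup.SectionStart);
        }
        return starts;
    }
}
```

    A returned list with more than one entry means the paragraph spans more than one section.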

    As you "walk" the paragraphs you can determine whether a section has changed using:

      int secIndex = para.Range.Sections[1].Index;

    Index will be the number of the section counted from the beginning of the document.

    Note: If you need all the properties I've outlined above, then it would be worthwhile using a Range object (Word.Range rngPara = para.Range;), rather than always referring to para.Range.

    Going back to moving one character and checking whether the section changed: if you have rngPara, you can collapse it to its end-point, then extend it by one character:

      object collapseEnd = Word.WdCollapseDirection.wdCollapseEnd;
      object count1 = 1;
      object unitChar = Word.WdUnit.wdCharacter;
      rngPara.Collapse(ref collapseEnd);
      rngPara.MoveEnd(ref unitChar, ref count1);

    Now rngPara.Text should be the character following what was previously in rngPara.
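
    Putting the pieces together, walking the paragraphs and comparing section indexes might look like the sketch below. It is untested and rests on several assumptions: a reference to Microsoft.Office.Interop.Word, at most one break per paragraph, and the break sitting at the end of the paragraph. BreakWalker and ReportBreaks are hypothetical names. The sketch extends a copy of the range with MoveEnd so that the copy covers exactly the one character after the paragraph.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

public static class BreakWalker
{
    // Hypothetical helper: prints one line for each paragraph whose text
    // contains character 12 ('\f').
    public static void ReportBreaks(Word.Document doc)
    {
        object collapseEnd = Word.WdCollapseDirection.wdCollapseEnd;
        object unitChar = Word.WdUnit.wdCharacter;
        object count1 = 1;

        foreach (Word.Paragraph para in doc.Paragraphs)
        {
            Word.Range rngPara = para.Range;
            string text = rngPara.Text ?? string.Empty;
            if (text.IndexOf('\f') < 0)
                continue; // no break character in this paragraph

            int indexHere = rngPara.Sections[1].Index;

            // Work on a copy so rngPara itself is left untouched.
            Word.Range rngNext = rngPara.Duplicate;
            rngNext.Collapse(ref collapseEnd);
            int moved = rngNext.MoveEnd(ref unitChar, ref count1);
            if (moved == 0)
            {
                Console.WriteLine("Break at the end of the document; nothing follows it.");
                continue;
            }

            Word.Section next = rngNext.Sections[1];
            if (next.Index == indexHere)
                Console.WriteLine("Page break (still in section " + indexHere + ")");
            else
                Console.WriteLine("Section break; next section starts as " + next.PageSetup.SectionStart);
        }
    }
}
```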


    Cindy Meister, VSTO/Word MVP

    Thursday, July 26, 2012 1:38 PM
    Moderator

All replies

  • Hi Smithys

    You can query the Range.Sections(1).PageSetup.SectionStart property for section breaks. That will return a member of the WdSectionStart enumeration that tells you what kind of section break precedes the given Range object.

    Page breaks are a different animal, entirely, and can't be detected in this manner.

    As you say, the various break types use the same character. (Personally, I prefer checking the character code using the Asc function; it is 12 for these breaks.) If you move one character beyond a break character and the Section.Index value does not change, then it's a page break. Otherwise it's a section break.
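
    The rule above can be written down as a small piece of plain C# that does not touch Word at all. This is only a sketch: BreakKind, BreakRule and Classify are illustrative names, and the two section indexes are assumed to have been read from Word beforehand (for the break character itself and for the character after it). In C#, casting the char to int plays the role of Asc.

```csharp
using System;

public enum BreakKind { NotABreak, PageBreak, SectionBreak }

public static class BreakRule
{
    // Character code shared by page breaks and section breaks ('\f').
    public const int BreakCharCode = 12;

    // sectionIndexAtChar:    Section.Index of the range holding the character.
    // sectionIndexAfterChar: Section.Index one character further on.
    public static BreakKind Classify(char ch, int sectionIndexAtChar, int sectionIndexAfterChar)
    {
        if ((int)ch != BreakCharCode)
            return BreakKind.NotABreak;

        // Same section on both sides: a page break. Otherwise a section break.
        return sectionIndexAtChar == sectionIndexAfterChar
            ? BreakKind.PageBreak
            : BreakKind.SectionBreak;
    }
}
```

    For example, BreakRule.Classify('\f', 1, 2) reports a section break, while BreakRule.Classify('\f', 1, 1) reports a page break.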


    Cindy Meister, VSTO/Word MVP

    Wednesday, July 25, 2012 3:52 PM
    Moderator
  • Hi Cindy,

    Thanks for your post. 

    Can you elaborate on your answer a bit more, as I am still unsure how I would code this up?

    "Range.Sections(1).PageSetup.SectionStart": is this the range of the Paragraph?

    When you say "If you move one character beyond a break character....", I can't see how this would work, as there may not be a character after the break. For example, I am currently iterating over paragraphs, getting the text and looking for the '\f' character; if I find one, then I know there is a break (I don't know what type, just that there is one). It looks something like this in code:

    string text = para.Range.Text;
    string[] pageBreaks = text.Split('\f');

    // Splitting on '\f' yields two or more pieces when a break character is present.
    if (pageBreaks.Length >= 2)
    {
        // tell the user: "You used page breaks"
    }

    Now if the text was something like "Hello, this is some text\f", well then there is no character after the page break character, so I am not sure what you mean by checking the index value of one character after the break.

    Thanks,

    Will

    Thursday, July 26, 2012 9:28 AM