Word Paragraph.ID property - what is it for, and is it safe to use it for something else?

  • Question

  • During a VSTO Word 2010 add-in session, I'd like to "mark" Word paragraphs in some way so that my .NET add-in can remember whether it has processed a paragraph before (in the same session). I have a simple prototype working fine by setting and getting the Paragraph.ID property with my own ID value.

    Can anyone expand on the MSDN documentation? I have no idea what chain of events causes the ID property to be used in "real life". It seems to have something to do with saving a web page, according to:

    http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word.paragraph.id.aspx

    Unlike the Paragraph.ParaID property, which is reserved for internal use (otherwise I would use it), the ID property seems fair game to be used for what I am using it for. But can someone tell me if it is safe to do so, or if there is a better way to remember Paragraphs you have seen before during a VSTO Word 2010 session?

    Thursday, March 8, 2012 12:49 AM

All replies

  • Hi George,

    The Paragraph.ID property referred to at the link in your
    post is a way of putting labels into a Word document saved as a web page, so
    that a hyperlink in another web page, or elsewhere in the same page, can use
    a URL to jump to the labeled paragraph.

    If you select your entire document and add paragraph
    numbering to it you can use that to distinguish between the paragraphs you have
    seen and those yet to see. In your add-in build a hash table or other list for
    record keeping purposes. The paragraph numbers will be visible to the end-user.

    A better approach is to use the Paragraphs collection,
    examining paragraphs one by one by index, Paragraphs(index), and keeping a table or hash of
    the index numbers you want ‘marked’.

    You can see more by following this content link:
    Paragraph Interface (Microsoft.Office.Interop.Word)
    http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word.paragraph.aspx

    Please click the “Mark as answer” button if this helps you resolve your questions.
    Regards,
    Chris Jensen
    Senior Technical Support Lead

    Thursday, March 8, 2012 10:06 PM
  • Thanks for your reply Chris.

    I am already using Paragraph collections, either from the ActiveDocument, or the current Selection. In the following I make assertions that are my best guess from observation - I will be delighted to be disabused of ignorance by better authority!

    The problem is that the Paragraph items that you retrieve from traversing a Paragraphs collection are only valid for that traversal, e.g., they are not suitable to be used as the keys of a HashSet, since you will get different items every time you perform the traversal of the collection. E.g., running the following in ThisAddIn_Startup:

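    // Requires: using System.Diagnostics;
    //           using Microsoft.Office.Interop.Word;   (for Paragraph)
    Microsoft.Office.Tools.Word.Document d =
        Globals.Factory.GetVstoObject(Globals.ThisAddIn.Application.ActiveDocument);

    // First traversal: print the hash code of each Paragraph item.
    foreach (Paragraph p in d.Content.Paragraphs)
        Debug.WriteLine(string.Format("0x{0:X}", p.GetHashCode()));

    // Second traversal over the same paragraphs, for comparison with the first.
    foreach (Paragraph p in d.Content.Paragraphs)
        Debug.WriteLine(string.Format("0x{0:X}", p.GetHashCode()));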

    If your current document has 4 paragraphs, the above code will print 8 different numbers.

    What I have found is that if I put my own ID value in a Paragraph's ID property, that value will persist in the corresponding actual paragraph data, even though I am accessing it via fresh Paragraph items.
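
    For illustration only, the marking idea could be sketched roughly as below. This is not the actual add-in code: `ParagraphMarker`, `TryMark`, `MarkNew` and the "seen-" prefix are hypothetical names. The helpers take `dynamic`/`object` so the snippet compiles without a reference to the interop assembly; in a real VSTO project the parameter would more naturally be typed as `Microsoft.Office.Interop.Word.Paragraph`. It also assumes that an ID nobody has set reads back as null or an empty string.

```csharp
using System.Collections;

// Hypothetical helper; not part of the Word object model.
public static class ParagraphMarker
{
    // Returns true if the paragraph had no ID and has now been stamped,
    // false if an ID was already present (i.e. it was seen earlier).
    public static bool TryMark(dynamic paragraph, string id)
    {
        string existing = paragraph.ID;   // Paragraph.ID is a string property
        if (!string.IsNullOrEmpty(existing))
            return false;                 // assumption: unset IDs are null or ""
        paragraph.ID = id;
        return true;
    }

    // Stamps every unmarked paragraph in the sequence and returns how many
    // were new. nextId is advanced once per newly stamped paragraph.
    public static int MarkNew(IEnumerable paragraphs, ref int nextId)
    {
        int added = 0;
        foreach (object p in paragraphs)
        {
            if (TryMark(p, "seen-" + nextId))
            {
                nextId++;
                added++;
            }
        }
        return added;
    }
}
```

    In an add-in this might be called as `int next = 1; ParagraphMarker.MarkNew(Globals.ThisAddIn.Application.ActiveDocument.Paragraphs, ref next);`. Because the value is read back from the paragraph on each call, it does not matter that every traversal hands out fresh Paragraph items.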

    This is the scenario behind my question - I'm still not sure what circumstances cause some other agent to use the Paragraph.ID property - and if I don't know that I don't know whether my solution to keeping track of the exact position and existence of paragraph data is guaranteed to work. Your reply alludes to saving documents as web pages - but I'm not clear how that behaviour is accessed in the Word GUI, and how the user actually assigns the labels. I've surfed and looked at the Web option on File->"Save as" and I am none the wiser.

    If the answer is that I cannot use Paragraph.ID for this purpose, my secondary question still stands, what Paragraph property or technique can I use? The technique in your reply only works as a one off snapshot analysis of the whole document - whereas leaving persistent markers in the actual paragraph data permits more efficient and interesting scenarios (e.g. by marking existing paragraphs and tracking selection changes, I don't need to scan the whole document to find new paragraphs).

    thanks, George.


    • Edited by GeorgeMc Friday, March 9, 2012 1:26 PM typo - missed out word
    Friday, March 9, 2012 1:20 PM
  • Hello George,

    In lieu of a hash, you can determine paragraphs.count, then
    iterate through the paragraphs using the count as the index limit. You could
    also use the count for dimensioning a two-dimensional array with the first
    element the paragraph number, and the second element as a Boolean reflecting
    whether you have processed that paragraph.

    You haven’t said where the document will be seen or used. Is
    it to be published on a SharePoint site, saved on a network share, or kept somewhere else
    that you or the end-user will be concerned about?  If you’re assigning your own ID as
    Paragraph.ID, can you access it in the Word GUI by inserting a hyperlink or
    index reference to the paragraph in the Cross-reference dialog, in the
    Links group on the ‘Insert’ tab?

    If that works, depending on who has access to the document,
    your Paragraph.ID sounds like a workable solution to your main problem –
    knowing whether you have processed a paragraph with your Add-in logic.

    If that isn’t the way to access the Paragraph.ID through the
    GUI you may want to replace Paragraph.ID with bookmarks.  A bookmark will stay with the text even if
    the end-user splits a paragraph into two or more paragraphs.  You could iterate through the paragraphs,
    select each paragraph, set a range to the selection, and a bookmark to the
    range.
    Regards,
    Chris Jensen
    Senior Technical Support Lead

    Friday, March 9, 2012 6:17 PM
  • Hi Chris,

    Let's forget about my add in. I'll mark this thread as answered if I get an answer to the first part of my original question, which I rephrase as:

    What user scenarios lead to the reading and writing of the Paragraph.ID property?

    Maybe it is something to do with saving to HTML, but I don't know how a normal user does this in a way that affects .ID properties. Maybe there is an existing deployed add in or VBA script that uses this property. Maybe it is a Sharepoint thing. Maybe someone at Microsoft knows the answer.

    I'll be happy with an answer where I can for myself observe the .ID property at work. I have an MSDN Premium subscription, so I can play with stuff like Sharepoint if that is what it takes.

    Regards, George.


    Saturday, March 10, 2012 12:35 AM
  • Hi George,

    As you already know, if you Bing Paragraph.ID, one of the results is the following:

    Paragraph.ID Property
    http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word.paragraph.id.aspx

    “Returns or sets the identifying label for the specified object when the current document is saved as a Web page”.

    Your interpretation is as valid as mine, but what I see is that you need to declare a specific paragraph as a (class) paragraph object, which then has properties including an ID string, or ‘label’.
    The MSDN content suggests the label can be a hyperlink to be used
    in other web pages as a link to the paragraph. It travels with the paragraph. At
    some time in the Microsoft Word product manager’s specification process it was
    added to the interface deliberately to provide a way to add a hyperlink
    within a web page to pull the labeled paragraph onto the browser workspace. That
    doesn’t say it cannot be used in another way.

    I don’t see a way in that discussion for the end-user to implement a Paragraph.ID through the UI. You can embed a hyperlink into a web page – e.g. an aspx – to navigate to the specific Word document paragraph when that paragraph or document has been published as a web page.

    The MSDN content is terse. Consider taking it at its face value.
    Regards,
    Chris Jensen
    Senior Technical Support Lead

    • Marked as answer by GeorgeMc Thursday, March 15, 2012 9:58 AM
    Tuesday, March 13, 2012 9:30 PM
  • Hi Chris,

    It has finally clicked with me what is happening. What was missing from the MSDN help, from my point of view, is that Word itself does not use Paragraph.ID, except to read it when writing web pages via File->"Save As"->"Web Page (.htm)". The property is provided so that unspecified add-ins and third-party scripts may set the ID. Until now my mental model was that Word writes to its properties through some known feature invoked by a user; I guess the simple answer is that Word (at least currently) does not write to this property. I also confirmed the web page behaviour by writing some IDs using my add-in and looking at the saved HTML, which does contain the following:

    <p class=MsoNormal id=2>Second Paragraph<o:p></o:p></p>

    where "id=2" occurs because I set the Paragraph.ID to "2" for the Word paragraph with the text "Second Paragraph".

    Anyway, I got there, thanks for your help. I'll mark this closed.

    George.

    • Proposed as answer by HerbF Friday, October 12, 2012 7:48 PM
    Thursday, March 15, 2012 9:58 AM
  • One quick question on the same discussion: is this Paragraph.ID persistent? I want to save an ID with each paragraph, and its value should be preserved when I open the document again.

    I tried doing this with Paragraph.ID, but the value is gone once the document is closed.

    Pushpendra

    Friday, August 26, 2016 4:44 PM
  • Neither the paragraph.ID nor the range.ID are persistent.  I wish they were since I have a use for them and right now I have a horrible workaround/hack of using an invisible bookmark to persistently mark the paragraph.

    -herb

    Tuesday, December 13, 2016 10:10 PM
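
    A rough sketch of what a bookmark-based marker along these lines might look like. It is not the code used by anyone in this thread: `BookmarkMarker`, `NameFor`, `Mark` and the "_para_" prefix are hypothetical. It relies on `Bookmarks.Exists` and `Bookmarks.Add` from the Word object model, and on Word treating bookmark names that begin with an underscore as hidden. The parameters are `dynamic` so the snippet compiles without the interop assembly; in an add-in they would normally be typed as the interop `Document` and `Paragraph`.

```csharp
// Hypothetical helper; names and the "_para_" prefix are illustrative only.
public static class BookmarkMarker
{
    // Builds the bookmark name for marker number n. The leading underscore
    // is what makes Word treat the bookmark as hidden.
    public static string NameFor(int n)
    {
        return "_para_" + n;
    }

    // Adds a bookmark over the paragraph's range unless a bookmark with
    // that name already exists. Returns true if a bookmark was added.
    public static bool Mark(dynamic document, dynamic paragraph, int n)
    {
        string name = NameFor(n);
        if (document.Bookmarks.Exists(name))
            return false;
        document.Bookmarks.Add(name, paragraph.Range);
        return true;
    }
}
```

    Unlike Paragraph.ID, a bookmark is stored in the document, so a marker added this way should still be there after the document is closed and reopened.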