Invocations of Selection.InsertCrossReference that take 60+ seconds

  • Question

  • For complex reasons, I need to take several primary Word documents and, for each document, generate a second document that contains a list of all numbered paragraphs (with text), table captions and figure captions. For each paragraph/caption I also need to track the bookmark name that Word creates and uses for internal document cross-references. I use the following code (irrelevant parts left out):

        Dim idxList() as String
        Dim i as Long
        
        idxList = ActiveDocument.GetCrossReferenceItems(wdRefTypeNumberedItem)
        With Selection
            For i = LBound(idxList) To UBound(idxList)
                ...
                Call .InsertCrossReference(wdRefTypeNumberedItem, wdNumberNoContext, i, False)
                ...
                ' Parse REF field for bookmark name
                ...
            Next i   
        End With

    After inserting the cross-reference, I parse the resulting REF field to extract the bookmark name.
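
    A minimal sketch of that parsing step, assuming the inserted field's code has the usual form " REF _Refnnnnnnnnn \h ", so that the bookmark name is the first token after REF. BookmarkNameFromRefField is a hypothetical helper name and is not part of the macro above.

    ```vba
    ' Sketch: return the bookmark name referenced by a REF field.
    ' Assumes the field code looks like " REF <bookmark> <switches> ".
    Function BookmarkNameFromRefField(ByVal fld As Field) As String
        Dim parts() As String
        Dim k As Long

        ' Only REF fields carry a bookmark name in this position
        If fld.Type <> wdFieldRef Then Exit Function

        parts = Split(Trim$(fld.Code.Text), " ")

        ' parts(0) is "REF"; the next non-empty token is the bookmark name
        For k = 1 To UBound(parts)
            If Len(parts(k)) > 0 Then
                BookmarkNameFromRefField = parts(k)
                Exit Function
            End If
        Next k
    End Function
    ```

    The function returns an empty string for any field that is not a REF field, so the caller can test for that before recording the name.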

    One of the documents is 625 pages with about 2800 numbered paragraphs. Some calls to .InsertCrossReference are effectively instantaneous (as seen stepping with the VBA Debugger). Many, however, take 65-75 seconds--yes, more than a minute. I timed one call at 90 seconds. These times are not for a body of code with many statements and loops. These times are for single calls to .InsertCrossReference. I step through one statement at a time in the debugger. Timings start when the Debugger Step command is issued and stop when the Debugger returns control and highlights the next statement to execute.

    I tried all of the following cross-reference alternatives, all with equivalent results:
                Call .InsertCrossReference(wdRefTypeNumberedItem, wdNumberNoContext, i, False)
                Call .InsertCrossReference(wdRefTypeNumberedItem, wdNumberFullContext, i, False)
                Call .InsertCrossReference(wdRefTypeNumberedItem, wdNumberNoContext, i, True)
                Call .InsertCrossReference(wdRefTypeNumberedItem, wdNumberFullContext, i, True)

    I use 'wdRefTypeNumberedItem' instead of 'wdRefTypeHeading' because there are multiple user-defined types with Outline Levels assigned to them. I don't care about List Number styles.

    I have simultaneously watched Word 2010's memory usage in Task Manager increase by ~1.5 MB per call to .InsertCrossReference. During one run attempt, Word's memory footprint went from 400 MB when I started watching, to a gigabyte when I stopped it. I get similar results with Word 2007, though 2007 is much more frugal with memory--growing to a max of ~200 MB, then actually shrinking back to 10 or 20 MB before growing again. Both versions of Word periodically stop to inform the user that "Word has encountered a problem. You will not be able to undo this action once it has completed. Do you want to continue?" A 10+ hour process MIGHT be acceptable if it would run from start to finish without intervention, but one that requires user intervention multiple times is NOT acceptable.

    I'm running Word 2010 on Windows 7 with 2 GB RAM. I'm running Word 2007 on XP with 500 MB RAM. Both Windows systems run in VMware virtual machines on a system with 8 GB of physical RAM.

    Is there either some way to make .InsertCrossReference run faster, or is there another way to get the name of a bookmark that will stay with numbered paragraphs as sequences change, and as text gets moved around in the document? One alternative, I know, would be to create my own bookmarks and assign one to each numbered paragraph. But that clutters the bookmark cross-reference list for users with literally thousands of seemingly extraneous bookmarks. I can't create names that are hidden, like Word can.

    Wednesday, April 20, 2011 12:40 AM

All replies

  • Hi Gerrie,

    You would probably do better to load all the bookmarks into a string, then test whether any of those bookmarks apply to the range you're interested in. Similarly, you can test whether any fields in the range cross-reference one of those bookmarks. In VBA:

    Sub Demo()
    Dim oBkMrk As Bookmark, StrBkMks As String
    Dim Rng As Range, i As Long, oFld As Field
    Dim StrRngBkMks As String, StrFldRefs As String
    With ActiveDocument
      Set Rng = Selection.Paragraphs(1).Range
      .Bookmarks.ShowHidden = True
      For Each oBkMrk In .Bookmarks
        StrBkMks = StrBkMks & "," & oBkMrk.Name
      Next oBkMrk
      .Bookmarks.ShowHidden = False
      With Rng
        For i = 1 To UBound(Split(StrBkMks, ","))
          If .Bookmarks.Exists(Split(StrBkMks, ",")(i)) Then
            StrRngBkMks = StrRngBkMks & vbCr & Split(StrBkMks, ",")(i)
          End If
          For Each oFld In .Fields
            If InStr(Trim(oFld.Code) & " ", Split(StrBkMks, ",")(i) & " ") > 0 Then
              StrFldRefs = StrFldRefs & vbCr & Split(StrBkMks, ",")(i)
            End If
          Next
        Next
      End With
      Set Rng = Nothing
    End With
    MsgBox "The following bookmarks apply" & vbCr & _
      "to the selected paragraph:" & StrRngBkMks
    MsgBox "The following cross-references apply" & vbCr & _
      "to the selected paragraph:" & StrFldRefs
    End Sub


    Cheers
    Paul Edstein
    [MS MVP - Word]
    Wednesday, April 20, 2011 6:24 AM
  • Hi Paul,

    Sorry it took so long to get back. I was traveling.

    Thanks for the detailed response, but it doesn't solve my problem.

    I tried something similar to this before trying to use .InsertCrossReference. I apologize for not mentioning that in my original post.

    In addition to Word's nine built-in heading styles, there are ten more user-defined styles that have assigned outline levels. My first attempt was to create a table of contents that included all these styles to all levels. But that seemed to hang and never produce anything. I think I now know why--it was undoubtedly spending all its time in the TOC equivalent of .InsertCrossReference.

    Next I tried a loop for each Outline-numbered style to Find all the paragraphs that use it, then ask for the bookmark(s) in the paragraph's range. That's effectively what you've done, though you've gone further by also finding cross-references to the paragraph's range. Since the bookmark that I'm interested in will start at the paragraph's start, I discovered that if you take the paragraph's range and collapse it to the start position, the bookmarks in that range still include the one I need.
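
    The collapse-to-start lookup can be sketched roughly as follows. RefBookmarksAtStart is a hypothetical helper name, and the code assumes the paragraph belongs to ActiveDocument and that Word's generated names begin with "_Ref".

    ```vba
    ' Sketch: collect the Word-generated "_Ref" bookmark names found at the
    ' start of a paragraph. Assumes para belongs to ActiveDocument.
    Function RefBookmarksAtStart(ByVal para As Paragraph) As String
        Dim rng As Range
        Dim bkm As Bookmark
        Dim names As String

        Set rng = para.Range
        rng.Collapse wdCollapseStart          ' zero-length range at paragraph start

        ' _Ref bookmarks are hidden, so make them visible to enumeration
        ActiveDocument.Bookmarks.ShowHidden = True
        For Each bkm In rng.Bookmarks
            If Left$(bkm.Name, 4) = "_Ref" Then
                names = names & bkm.Name & vbCr
            End If
        Next bkm
        ActiveDocument.Bookmarks.ShowHidden = False

        RefBookmarksAtStart = names           ' empty if no cross-reference exists yet
    End Function
    ```

    An empty result means Word has not yet created a bookmark for that paragraph.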

    But there are still two problems with that approach. The biggest is that Word doesn't create the bookmark if there are no cross-references to the paragraph. That puts me in the Catch-22 situation where I'm trying to avoid the expensive .InsertCrossReference call by an alternative means of discovering the bookmark, but the bookmark doesn't exist until I've executed the expensive .InsertCrossReference (or some equivalent) call.

    The second problem is, even if I get the bookmark that way, I don't know the paragraph's number, which is one of the critical pieces of information I need to record in the external document.

    The process I'm currently using works fine on a small document of ~70 pages with ~340 numbered paragraphs. It runs in a couple of minutes. I suspect that I've run up against one of Word's many limitations in dealing with large documents. It may be that Word's .InsertCrossReference algorithm never considered performance in large documents; it may have n-squared or worse performance.

    One potential alternative I do have is to let the process take its time, but eliminate the need for frequent user intervention. I will ask in a separate post if there is some way to stop Word from collecting Undo information. That should keep it from asking if it's OK to continue without Undo.
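
    One possibility, sketched here under assumptions, is to empty the undo list periodically with the document's UndoClear method. Whether this actually suppresses the "You will not be able to undo this action" prompt is untested. The interval of 50 and the procedure name are arbitrary illustrative choices.

    ```vba
    ' Sketch: insert a cross-reference per numbered item, clearing the undo
    ' list every 50 insertions. The interval is an arbitrary assumption.
    Sub InsertRefsClearingUndo()
        Dim idxList As Variant
        Dim i As Long

        idxList = ActiveDocument.GetCrossReferenceItems(wdRefTypeNumberedItem)

        For i = LBound(idxList) To UBound(idxList)
            Selection.InsertCrossReference wdRefTypeNumberedItem, _
                wdNumberNoContext, i, False
            Selection.TypeParagraph           ' one cross-reference per line

            ' Discard accumulated undo records so the list stays small
            If i Mod 50 = 0 Then ActiveDocument.UndoClear
        Next i
    End Sub
    ```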

    Does anyone else have other ideas on speeding up the process?

    Thanks again, Paul.

    -Gerrie

    Thursday, April 21, 2011 10:38 PM
  • Hi Gerrie,

    I'm not sure why you need a cross-reference for every numbered paragraph. As I think you've realised, numbered paragraphs don't get bookmarks until there's a cross-reference to them.

    Nevertheless, if that's the way you want to go, you could use the code I posted as the basis for a routine that checks whether a bookmark exists for the paragraph and, if not, inserts a cross-reference to it. This is bound to be quicker and involve less overhead overall than inserting a cross-reference to every such paragraph.

    FWIW, you can often get a considerable improvement in performance by changing:
    With Selection
    to:
    With Selection.Range.Duplicate


    Cheers
    Paul Edstein
    [MS MVP - Word]
    Thursday, April 21, 2011 11:05 PM
  • Hi Paul,

    I'm attempting to produce a semi-automatic inter-document cross-referencing mechanism, since Word doesn't support the functionality.


    My client publishes large documents that use a large number of inter-document cross-references where the referencing document needs to reference numbered paragraphs (by number, not by content) in the target document. The documents are published asynchronously, modified, then re-published asynchronously. New documents are also added to the collection periodically. Any numbered paragraph in any document is a valid target for a cross-reference. It is not possible to know when Document A, Version n is published which of its numbered paragraphs might later be a cross-reference target from Document X or Document Y.


    Furthermore, when Document A, Version n+1 is published, it is possible that some of its cross-reference targets will have their number changed from Version n. Then, when the next version of Document X is published, the cross-reference number of each target in Document A needs to be updated to reflect its new paragraph number in Document A, Version n+1—the same way that all of Document A's internal cross-references would get updated.

    Maybe I should have asked a more fundamental question, but I was trying quickly to get around an observed performance bottleneck in a solution that is otherwise workable. The only way I know to get updatable information from one Word document into another is by either the INCLUDE or INCLUDETEXT fields. INCLUDE is clearly useless—I have no need for a whole document. INCLUDETEXT cannot be used to directly access the target document for a couple of reasons. The biggest problem is that the paragraph numbers themselves are not directly accessible in the document. You can't select the number to create a bookmark for it.

    So, what I'm doing, for each published document, is creating a standalone cross-ref document that contains each numbered paragraph's paragraph number, and its text. I bookmark each paragraph number with a new bookmark local to the cross-ref document whose name includes the original bookmark name from the published document. Every time the published document changes, I generate a new cross-ref document. With some editing discipline, every paragraph in Document A retains its same "_Refnnnnnnnnn", so my next generated cross-ref document has the same bookmark name referring to the new number for the "same" original paragraph in Document A. The field in the cross-referencing Document X looks like:

             { INCLUDETEXT "A-interdoc-cross-ref.docx" interDocCrossRef_Refnnnnnnnnn \! }

    where "interDocCrossRef" is a somewhat arbitrary string that keeps the local name from starting with an underscore and is somewhat meaningful to the curious user who might inspect the field code in Document X.

    I was avoiding all this detail originally, because I figured readers would ignore my whole post.

    Now that I've provided most of the gory details, do you have any suggestions for alternate approaches?

    Thanks,
    Gerrie

    Friday, April 22, 2011 12:29 AM
  • Hi Gerrie,

    In that case, building a TOC is probably the fastest way to generate the cross-references. For that, it may or may not prove beneficial to do so without opening the document; instead you could use a TOC with an RD field in another document.


    Cheers
    Paul Edstein
    [MS MVP - Word]
    Friday, April 22, 2011 1:40 AM
  • Update: Scratch the idea of using a TOC. The cross-references created by a TOC are volatile and get deleted & recreated whenever the TOC is updated.

    Perhaps the best performance improvement you'll get is by using:
    Selection.Range.Duplicate


    Cheers
    Paul Edstein
    [MS MVP - Word]
    Friday, April 22, 2011 3:50 AM
  • Thanks, Paul,

    I hadn't tested that yet, but that was part of the reason I gave up on the TOC approach earlier. I wasn't sure they would persist.

    I think I'm going to bite the bullet and assign all my own bookmarks. It occurred to me yesterday—duh—that I have already assigned bookmarks to 1800 of my 2600 numbered paragraphs, so I would add only 800 new bookmarks. But then I never need to use .InsertCrossReference. Those 1800 have special meaning relative to document content, and I made them specifically to be able to create stable, reliable cross-references to those paragraphs. I am using them in my external cross-ref document—when there are multiple bookmarks assigned to the paragraph of interest, I always use the one I created instead of Word's "_Ref...".

    I don't see how Selection.Range.Duplicate helps. When I call Selection.InsertCrossReference, the Selection is collapsed to a point.

    Thanks for the discussion.

    -Gerrie

    Friday, April 22, 2011 4:14 PM
  • Hi Gerrie,

    Word's _Ref bookmarks are quite stable; it's the _Toc bookmarks that aren't. Indeed, the bookmarks you're assigning would probably be at greater risk of range corruption and deletion from any subsequent edits of the document.

    For whatever reason, using Selection.Range.Duplicate can cut execution dramatically in some circumstances. Try it - it might work in this case too.


    Cheers
    Paul Edstein
    [MS MVP - Word]
    Friday, April 22, 2011 9:43 PM
  • Hi Paul,

    Selection.Range.Duplicate doesn't work in my situation.

    It made no sense to me how creating a static copy (.Duplicate) of a dynamic object (Selection.Range) could help, since I then explicitly modify the selection (with Selection.InsertCrossReference) and thereby invalidate the static copy. I tried it anyway.

    I made a two-line change, wrapping my .InsertCrossReference inside a

          With Selection.Range.Duplicate   ...  End With

    then reran my previously functioning macro, and I got bizarre, inexplicable changes in the middle of my document. Nothing had ever been done in the middle before. Before invoking any of the code which inserts text and cross-references, I carefully execute

        With d
            .Characters.Last.Select
            With Selection
                .Collapse (wdCollapseEnd)
                .InsertBreak
                d.Characters.Last.Select
                Clear_ALL_Selection_Formatting
                .Collapse (wdCollapseEnd)
            End With
        End With 'd

    Where 'd' is passed in to the Subroutine as the ActiveDocument.

    Gerrie


    • Proposed as answer by Fluid Roaster Monday, April 25, 2011 8:50 PM
    • Unproposed as answer by Gerrie Shults Monday, April 25, 2011 8:56 PM
    • Edited by Gerrie Shults Monday, April 25, 2011 9:06 PM Make it more obvious that proposed change didn't work.
    Monday, April 25, 2011 2:35 PM
  • Hey Paul and Gerrie,

    Never mind my last.

    I just wanted to chime in and say that I have had similar problems using Selection.InsertCrossReference. There is a definite slowdown from Word 2007 to Word 2010. My macro went from taking 2-3 minutes to 2-3 hours. So if anyone finds a solution, please post it.

    Thanks, and sorry I couldn't add anything useful.

    John

     

    Monday, April 25, 2011 9:00 PM
  • Hi Fluid Roaster,

    My previous post indicated that With Selection.Range.Duplicate does not work. (That failure was not stated explicitly up front. I will edit that post to make that more clear.)

    My original question as to whether there is a way to eliminate the performance problem that I've encountered is still unanswered. Paul and I have discussed some ways to work around the problem, but the only workaround that avoids the performance problem is functionally different and has undesirable end-user-visible side effects. It may ultimately get my job done, but it doesn't answer the original question.

    Regards,

    Gerrie

    Monday, April 25, 2011 9:04 PM
  • Hi Gerrie,

    I'm not sure why you got inexplicable changes using 'Duplicate'. Try it this way:

    Sub Demo()
    Application.ScreenUpdating = False
    Dim oBkMrk As Bookmark, StrBkMks As String
    Dim Rng As Range, i As Long, j As Long
    Dim bBkMkMatch As Boolean, idxList() As String
    With ActiveDocument
      .Range.InsertAfter vbCr
      Set Rng = .Range.Characters.Last.Duplicate
      idxList = .GetCrossReferenceItems(wdRefTypeNumberedItem)
      .Bookmarks.ShowHidden = True
      For Each oBkMrk In .Bookmarks
        If InStr(oBkMrk.Name, "_Ref") > 0 Then _
        StrBkMks = StrBkMks & "," & oBkMrk.Name
      Next oBkMrk
      For i = LBound(idxList) To UBound(idxList)
        bBkMkMatch = False
        With .ListParagraphs(i)
          For j = 1 To UBound(Split(StrBkMks, ","))
            If .Range.Bookmarks.Exists(Split(StrBkMks, ",")(j)) Then
              bBkMkMatch = True
              Exit For
            End If
          Next
          If bBkMkMatch = False Then
            Rng.InsertCrossReference wdRefTypeNumberedItem, wdNumberNoContext, i, False
          End If
        End With
      Next i
      ' Rebuild the list from scratch so the earlier _Ref names are not duplicated
      StrBkMks = ""
      For Each oBkMrk In .Bookmarks
        If InStr(oBkMrk.Name, "_Ref") > 0 Then _
        StrBkMks = StrBkMks & "," & oBkMrk.Name
      Next oBkMrk
      .Bookmarks.ShowHidden = False
      Rng.Delete
      Set Rng = Nothing
    End With
    Application.ScreenUpdating = True
    End Sub

    With this code, only those paragraphs that don't already have an internal cross-reference bookmark get the InsertCrossReference treatment. When the macro finishes processing the ListParagraphs, looping through the bookmarks collection a second time leaves StrBkMks with a complete set of the document's internal cross-references.

    I ran the above on a 783-page document with 1616 list paragraphs, none of which had a prior internal cross-reference. On my system, it took 16 seconds to generate all 1616 cross-references.


    Cheers
    Paul Edstein
    [MS MVP - Word]




    • Edited by macropodMVP Tuesday, April 26, 2011 3:01 AM edit: With .ListParagraphs(i)
    Tuesday, April 26, 2011 12:09 AM
  • Hi Paul,

    You have a logic flaw in your code, and it still doesn't solve my problem.

    The logic flaw is in assuming there is a 1-to-1 correspondence between items produced by .GetCrossReferenceItems(wdRefTypeNumberedItem) and .ListParagraphs. Among other things, .ListParagraphs includes all bulleted paragraphs in addition to numbered paragraphs. For example, in my document .GetCrossReferenceItems(wdRefTypeNumberedItem) returns 2700+ items; .ListParagraphs returns 4300+. I haven't determined whether .GetCrossReferenceItems includes items that are not in .ListParagraphs, but that may or may not be important.

    The whole purpose of my intended operation is to produce both the associated bookmark for each numbered paragraph and the textual representation of its paragraph number. If the two lists were in 1-to-1 correspondence, I could get the paragraph number from idxList(i). But they're not, and they never will be for my document set.

    I'm now working on a solution to winnow both lists to the set of essential numbered paragraphs. If I can do that, I have the needed 1-to-1 correspondence. If I can't, I will need to execute

        .InsertCrossReference wdRefTypeBookmark, wdNumberNoContext, _
                                        oBkMrk.Name, False

    for each pre-existing bookmark and extract the paragraph number from the Field result text. That's exactly how I get the paragraph number today after executing

        .InsertCrossReference wdRefTypeNumberedItem, wdNumberNoContext, _
                                        i, False

    My first pass on list winnowing produced 2688 items from idxList and 2691 items from .ListParagraphs. I'll post more once I discover why they differ.
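
    For illustration only, one way such a winnowing pass over .ListParagraphs could look is sketched below. The ListType test is an assumed rule, not necessarily the one used here, and ListNumberedParagraphs is a hypothetical name. ListFormat.ListString supplies the number text as displayed.

    ```vba
    ' Sketch: walk ListParagraphs, skip bullets, and print each remaining
    ' paragraph's displayed number. The ListType filter is an assumption.
    Sub ListNumberedParagraphs()
        Dim para As Paragraph
        Dim n As Long

        For Each para In ActiveDocument.ListParagraphs
            Select Case para.Range.ListFormat.ListType
                Case wdListBullet, wdListPictureBullet, wdListNoNumbering
                    ' Not a numbered paragraph; ignore it
                Case Else
                    n = n + 1
                    ' ListString is the paragraph number text as Word shows it
                    Debug.Print n & vbTab & para.Range.ListFormat.ListString
            End Select
        Next para

        Debug.Print "Numbered list paragraphs: " & n
    End Sub
    ```

    The final count can then be compared against the number of items in idxList.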

    -Gerrie

    Wednesday, April 27, 2011 4:15 PM
  • Hi Paul,

    The difference between the two list counts was due to having tracked changes in the document with "Final Show Markup" selected. Turning off the display of markup brought my two processed lists into sync.
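
    The same view change can be made from code before the lists are built. This is a rough sketch that assumes the document contains at least one numbered item and that hiding markup through the View object matches the manual setting; CountWithMarkupHidden is a hypothetical name.

    ```vba
    ' Sketch: hide tracked-change markup, count the numbered items, then
    ' restore the previous view settings.
    Sub CountWithMarkupHidden()
        Dim oldShow As Boolean
        Dim oldView As Long
        Dim items As Variant

        With ActiveWindow.View
            oldShow = .ShowRevisionsAndComments
            oldView = .RevisionsView
            .RevisionsView = wdRevisionsViewFinal
            .ShowRevisionsAndComments = False
        End With

        ' Assumes at least one numbered item exists in the document
        items = ActiveDocument.GetCrossReferenceItems(wdRefTypeNumberedItem)
        Debug.Print "Numbered items: " & (UBound(items) - LBound(items) + 1)

        With ActiveWindow.View
            .RevisionsView = oldView
            .ShowRevisionsAndComments = oldShow
        End With
    End Sub
    ```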

    By the way, even though the document contains tracked changes, all of my work on this problem is being done with Change Tracking Off, so that hasn't been a factor in performance problems.

    Because of other document complexities and the need to use existing bookmarks other than the Word-generated "_Ref..." ones, I have yet to test the approach that only executes .InsertCrossReference when there is no pre-existing bookmark.

    I did, however, run one quick test with Paul's latest Demo(), to which I added timing statements. For test purposes to gather timing statistics, I called .InsertCrossReference for every numbered item. Most calls were fast, but several still took 15-30 seconds, including two calls that were for paragraphs with pre-existing bookmarks that took 26 seconds each.
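
    Timing statements of this kind can be as simple as the sketch below, which wraps a single call with VBA's Timer function. TimeOneCrossReference is a hypothetical name, and the exact statements added to Demo() may have differed.

    ```vba
    ' Sketch: time one InsertCrossReference call and log it to the
    ' Immediate window. Timer resets at midnight, so a run that spans
    ' midnight would report a negative value.
    Sub TimeOneCrossReference(ByVal itemIndex As Long)
        Dim t0 As Single

        t0 = Timer
        Selection.InsertCrossReference wdRefTypeNumberedItem, _
            wdNumberNoContext, itemIndex, False
        Debug.Print "Item " & itemIndex & ": " & _
            Format$(Timer - t0, "0.00") & " s"
    End Sub
    ```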

    -Gerrie

    Thursday, April 28, 2011 2:05 AM