Automation of document content extraction

  • Question

  • I have a routine that visits Word tables, uses ranges to find specific cell values by text matching, and retrieves those values (for later db upload). The routine works fine if I step through it with F8 or process one Word document at a time, but when I automate it, it breaks and the error is trapped. It fails only in batch mode, where consecutive Word docs are each opened in turn, and I cannot figure out why.

    Please help. I need consistency.

    Wednesday, July 18, 2012 8:55 PM

Answers

  • You say "If it is "Test Concentration <paragraph return> (mg/L)" then the error is trapped outside in the calling module" and "If it is run in batch mode, the error is raised, but only for certain Word documents". In that case, perhaps the issue has to do with your 'fCleanWord' function, which you haven't provided. Another possibility is that, in some cases, the 'recurse' value is too large - you should perhaps include a test to establish how many cells there are and use the lesser of the 'recurse' value and the cell count.


    Cheers
    Paul Edstein
    [MS MVP - Word]

    Thursday, July 19, 2012 11:43 PM
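
    The cell-count test suggested in this answer could look roughly like the sketch below. ClampRecurse and SafeCellText are hypothetical helper names, not part of the posted routine, and the sketch assumes the ranges involved sit in a simple table (no merged or nested cells).

    ```vba
    ' Hypothetical helpers, not part of the posted routine.

    ' Caps the requested number of Find passes at the number of cells in rngIn.
    Function ClampRecurse(rngIn As Word.Range, ByVal recurse As Long) As Long
        Dim nCells As Long
        ' Only ask for the cell count when the range actually touches a table
        If rngIn.Tables.Count > 0 Then nCells = rngIn.Cells.Count
        If recurse > nCells Then
            ClampRecurse = nCells
        Else
            ClampRecurse = recurse
        End If
    End Function

    ' Returns the raw text of the cellnum-th cell in rngRef (including Word's
    ' end-of-cell marker), or "" when that cell does not exist.
    Function SafeCellText(rngRef As Word.Range, ByVal cellnum As Long) As String
        SafeCellText = ""
        If rngRef.Tables.Count = 0 Then Exit Function
        If cellnum >= 1 And cellnum <= rngRef.Cells.Count Then
            SafeCellText = rngRef.Cells(cellnum).Range.Text
        End If
    End Function
    ```

    With helpers like these, the loop would run to ClampRecurse(rngIn, recurse) instead of recurse, and the call to fCleanWord would receive SafeCellText(rngRef, cellnum) instead of indexing Cells(cellnum) directly.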

All replies

  • It would help if you posted your code. Without it, we'd just be guessing. Also, check out:
    http://www.vbaexpress.com/forum/showthread.php?t=42850


    Cheers
    Paul Edstein
    [MS MVP - Word]

    Thursday, July 19, 2012 12:57 AM
  • Hi Cadet

    In addition to Paul's question, what's the error message - exact wording?


    Cindy Meister, VSTO/Word MVP

    Thursday, July 19, 2012 11:47 AM
    Moderator
  • The following function performs consistently in single Word document mode and extracts table cell contents, recursing through Word tables. If I run it in batch mode, it often fails, but only after working successfully about 14 times.

    However, I seem to have found the problem. If the table cell contents are (sLabel=) "Test Concentration (mg/L)" it works fine. If it is "Test Concentration <paragraph return> (mg/L)" then the error is trapped outside in the calling module, not here - I never get this error message.

    The new question becomes why does a paragraph return affect .Find? Do I need to strip this out within the function so .Find is fed what it wants? Sorry, still working in VB.

    Function fRetCellValFromLabel(rngIn As Word.Range, sLabel As String, _
                                                Optional cellnum = 2, _
                                                Optional recurse = 0, _
                                                Optional bMWC = False, _
                                                Optional bFirstWord = False, _
                                                Optional bUpdateRange = False) As String
    'This function returns the adjacent cell string from found sLabel. Doesn't return dashes
    'rngIn is the range within which we are searching sLabel; updates to start of found if bUpdateRange = True
    'cellnum is the number of cells over we want the value (default is 2)
    'recurse is the number of loops we want the Find method to perform
    'bMWC is the flag for MatchWildCards - turn off for ( and #
    'bFirstWord is the flag to return the first word from a cell's content in case there is more than one word
    Dim rngFind As Word.Range
    Dim rngRef As Word.Range
    Dim strTemp As String
    Dim irecurse As Integer, i As Integer
    On Error GoTo catchit
    Debug.Print "inside function fRetCellValFromLabel for " & sLabel
    Set rngFind = rngIn.Duplicate
    Set rngRef = rngIn.Duplicate
    With rngFind.Find
        
        .Text = sLabel
        .ClearFormatting
        .Forward = True
        .MatchWholeWord = False
        .MatchWildcards = bMWC
        '.Execute
        If recurse > 0 Then
            For irecurse = 1 To recurse
                .Execute
            Next
        Else
            .Execute
        End If
        
        If .Found Then
            'Set the beginning of rngRef to the rngFind's found start
            rngRef.Start = rngFind.Start
            strTemp = fCleanWord(rngRef.Cells(cellnum).Range.Text, False)
            'Exercise first word option, if present
            If bFirstWord Then strTemp = GetFirstWord(strTemp)
            fRetCellValFromLabel = strTemp
                
        Else
            fRetCellValFromLabel = ""
        End If
    End With
    If bUpdateRange Then rngIn.Start = rngRef.Start
    Xit:
        Exit Function
    catchit:
        i = MsgBox("fRetCellValFromLabel error", vbCritical, "Error trapping")
        Stop
        Resume Xit
    End Function

    Thursday, July 19, 2012 12:57 PM
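
    On the paragraph-return question in this post: Word's Find has its own codes for a paragraph mark, ^p in a plain search and ^13 when MatchWildcards is on. A minimal sketch of converting a label before assigning it to .Text follows. LabelToFindText is a hypothetical helper name, and whether the paragraph mark is really what raises the error in the posted routine is not established in the thread.

    ```vba
    ' Hypothetical helper: replaces literal paragraph returns in a label with
    ' the code Word's Find expects for a paragraph mark.
    Function LabelToFindText(ByVal sLabel As String, _
                             Optional ByVal bMWC As Boolean = False) As String
        Dim sCode As String
        ' ^p is the plain-search code; ^13 is the wildcard-search equivalent
        If bMWC Then sCode = "^13" Else sCode = "^p"
        ' Handle CR+LF first so it does not turn into two codes
        sLabel = Replace(sLabel, vbCrLf, sCode)
        sLabel = Replace(sLabel, vbCr, sCode)
        LabelToFindText = sLabel
    End Function
    ```

    Inside the function this would be used as .Text = LabelToFindText(sLabel, bMWC). In wildcard mode the parentheses in "(mg/L)" are also special characters, which is presumably why the bMWC flag is turned off for such labels.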
  • Hi Cadet

    VBA, VB6, VB.NET, some other "flavour"?

    If you comment out On Error so that the code should stop and throw the error as it happens do you get any error information?

    Just based on your description, my inclination would be to look for an infinite loop. The major pitfall when using Find in tables is that it doesn't necessarily continue on from the point you expect - it doesn't work in the same way as when you "Find" in contiguous text. There's a tendency to "kick" the starting point back to the start of the cell the Range is in, to the start of the Row, or the start of the table. So when I know Find is going to have to restart in a table, I always include some code that moves the re-starting point to the next Cell so that I don't continue finding the same text, over and over.

    I can't tell from your code whether you've allowed for that?


    Cindy Meister, VSTO/Word MVP

    Thursday, July 19, 2012 1:38 PM
    Moderator
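
    The restart technique described in this reply could be sketched as below: after each hit, the search range is reset to begin just past the cell that contained the match, so the same text cannot be found twice. CountLabelHits is a hypothetical name and the sketch is not the poster's code. It assumes a plain (non-wildcard) search.

    ```vba
    ' Hypothetical helper: counts matches of sLabel in rngIn, moving the restart
    ' point past the current cell after every hit.
    Function CountLabelHits(rngIn As Word.Range, sLabel As String) As Long
        Dim rngFind As Word.Range
        Dim lngEnd As Long, lngNext As Long, n As Long

        Set rngFind = rngIn.Duplicate
        lngEnd = rngIn.End

        With rngFind.Find
            .ClearFormatting
            .Text = sLabel
            .Forward = True
            .Wrap = wdFindStop          ' never wrap around to the start
            .MatchWildcards = False
        End With

        Do While rngFind.Find.Execute
            ' Ignore anything found beyond the original search range
            If rngFind.Start >= lngEnd Then Exit Do
            n = n + 1
            If rngFind.Information(wdWithInTable) Then
                lngNext = rngFind.Cells(1).Range.End   ' first position after this cell
            Else
                lngNext = rngFind.End
            End If
            If lngNext >= lngEnd Then Exit Do
            rngFind.SetRange lngNext, lngEnd
        Loop

        CountLabelHits = n
    End Function
    ```

    The explicit lngNext >= lngEnd check is what guarantees the loop terminates even if Find keeps reporting a hit in the last cell.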
  • I get error 5941, again inconsistently. If I run the program in single document mode, the program runs fine. If it is run in batch mode, the error is raised, but only for certain Word documents. If I put a Stop just before calling the above posted function, then resume (in batch mode), the program works fine, again only for some documents. I do update my ranges incrementally as I advance cell-by-cell, finding the target labels, and capturing the adjacent values. This is simply inconsistent behavior which is hard to troubleshoot.  I cannot trap the .Find method on a range object (of cells). Fortunately there is a work-around which may work more consistently in batch mode.


    I am using VB 6.5
    • Edited by Cadet9 Thursday, July 19, 2012 8:39 PM
    Thursday, July 19, 2012 6:44 PM
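
    Error 5941 is Word's "The requested member of the collection does not exist", which is consistent with indexing a collection (for example Cells(cellnum)) past its last item, although the thread does not confirm that this is the cause. A handler that reports the number and description, instead of a fixed message, makes this kind of failure easier to pin down in batch mode. The sketch below is a standalone demonstration with a hypothetical name. It deliberately indexes past the end of the Tables collection of the active document.

    ```vba
    ' Hypothetical demo: shows an error handler that logs Err.Number and
    ' Err.Description rather than a fixed message.
    Sub DemoErrorReport()
        On Error GoTo catchit
        ' Deliberately ask for a table that does not exist
        Debug.Print ActiveDocument.Tables(ActiveDocument.Tables.Count + 1).Range.Text
    Xit:
        Exit Sub
    catchit:
        ' Log to the Immediate window so a batch run is not blocked by a dialog
        Debug.Print "DemoErrorReport: error " & Err.Number & " - " & Err.Description
        Resume Xit
    End Sub
    ```

    Writing to the Immediate window rather than calling MsgBox or Stop keeps a batch run moving while still recording which document and which call failed.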