Log Word Interop Find Results

  • Question

  • I am using Word Interop to automate some Find & Replace operations on multiple files at work, but I need to keep a record of what was changed as a log.

    The idea is that a person wouldn't have to go through all the files and make the changes by hand, but would simply run my program, then check the log and see only the sentences that were changed in each file to verify the changes, which should be a lot less text to go through.

    After this, only the files that needed backtracking would be opened (in case of a change that wasn't required), and there would be enough information to find the exact place of the change.

    I think keeping the previous 2 or 3 words and the next 2 or 3 words would be enough for each changed location.
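
    The "few words on each side" idea can be sketched in plain C#, independent of Word. This is only an illustration: the class and method names (ContextSnippet, Around) are hypothetical, and it assumes you already have the paragraph text plus the character offset and length of the hit within it.

```csharp
using System;

public static class ContextSnippet
{
    // Returns the hit in [brackets] with up to `words` whitespace-separated
    // words on each side. If the hit starts or ends mid-word, the leftover
    // fragment counts as one of the context words.
    public static string Around(string text, int start, int length, int words)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (start < 0 || length < 0 || start + length > text.Length)
            throw new ArgumentOutOfRangeException(nameof(start));

        char[] separators = { ' ', '\t', '\r', '\n' };
        string[] before = text.Substring(0, start)
            .Split(separators, StringSplitOptions.RemoveEmptyEntries);
        string[] after = text.Substring(start + length)
            .Split(separators, StringSplitOptions.RemoveEmptyEntries);

        // Keep only the last `words` entries before the hit and the first `words` after it.
        int skip = Math.Max(0, before.Length - words);
        string left = string.Join(" ", before, skip, before.Length - skip);
        string right = string.Join(" ", after, 0, Math.Min(words, after.Length));

        string hit = text.Substring(start, length);
        return (left + " [" + hit + "] " + right).Trim();
    }
}
```

    For example, calling Around on "The quick brown fox jumps over the lazy dog" with the offset of "fox", a length of 3, and 2 context words gives "quick brown [fox] jumps over".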

    I know I can use Range.Find.HitHighlight to highlight everything before any changes are made, but how do I loop through those highlighted targets and get them into a log of some sort? It should be possible, since HitHighlight shows them all.

    UPDATE: By turning on revision tracking for the document before executing the Find & Replace, I manage to get several revision ranges with almost what I need. Here is what I have:

    foreach (Word.Range rng in doc.StoryRanges)
    {
        // Show the search text as a highlight (my initial attempt)
        rng.Find.HitHighlight("text to find", Word.WdColor.wdColorYellow, Word.WdColor.wdColorBlack, false, false, false, false, false, true, false, false, false, false, false, false, false, false);
        // Replace the text
        rng.Find.Execute("text to find", false, false, true, false, false, true, Word.WdFindWrap.wdFindContinue, false, "replacement text", Word.WdReplace.wdReplaceAll, false, false, false, false);
        // Check what changed
        foreach (Word.Revision r in rng.Revisions)
        {
            // I could check either deleted or added
            if (r.Type == Word.WdRevisionType.wdRevisionDelete)
            {
                // Expand the revision's range to the whole paragraph and show it
                Word.Range rg = r.Range;
                rg.Expand(Word.WdUnits.wdParagraph);
                MessageBox.Show(rg.Text);
            }
        }
    }

    The problem with this approach is that I get both the text I searched for and the text I replaced it with glued together, but I feel I am close to a solution.

    Will keep looking after the weekend.

    Any help is appreciated.

    best regards


    Luís Rodrigues



    • Edited by 537mfb Friday, February 10, 2012 6:30 PM Update on evolution with problem
    Friday, February 10, 2012 5:16 PM

Answers

  • << The RANGE.Find RANGE alters to include what's found and Find.Execute returns true; otherwise RANGE doesn't change and Find.Execute returns false>>

    True, but there's no way to know where in the range the change took place. In my case the range is a story, so a return value of true only tells me something was changed somewhere in the range; it doesn't help me get a range around the change. Like I said, I want at MOST a paragraph, although just a couple of words before and after the changed place would be better. Range.Find doesn't give me that unless my range is really small to begin with, which in large documents would be too cumbersome, as it would mean creating maybe over 1000 small ranges, then going to each one, applying all the changes, and checking whether any applied. And then again, if a particular change happened to span two or more words, I might have split those words into separate ranges and the change wouldn't be applied. Which leads me to my conclusion that this doesn't work.

    << if he wants to invest the resources he can do so, but if there's built-in functionality already there...>>

    Actually there isn't such a resource. Like I said, this would be done over several documents at the same time, and Word only allows that process to be done on one document at a time. The idea is to automate this over n documents so people won't have to do it on each document one at a time. Imagine having to apply 3 different replacements over a batch of 10 files and then check each for places where the change shouldn't have been made. And like I said, a particular client recently sent us a batch of 31 files. See where automation is required?

    Anyway, thanks for all the help you have been giving me. I really appreciate it. The idea of using the Revisions collection to capture the paragraphs with changes is the one I am going with, as it seems to be the only one that gives me what I want so far. Something like:

    foreach (Word.Range rng in doc.StoryRanges)
    {
        // Apply every find/replace pair from the grid to this story
        foreach (DataGridViewRow r in Subst.Rows)
        {
            if (r.Cells[0].Value != null)
                rng.Find.Execute(r.Cells[0].Value.ToString(), false, false, false, false, false, true, Word.WdFindWrap.wdFindContinue, false, r.Cells[1].Value.ToString(), Word.WdReplace.wdReplaceAll, false, false, false, false);
        }
        // Accept everything except insertions, so only the new text stays tracked
        foreach (Word.Revision r in rng.Revisions)
        {
            if (r.Type != Word.WdRevisionType.wdRevisionInsert)
                r.Accept();
        }
        // Log the paragraph around each remaining insertion, then accept it
        foreach (Word.Revision r in rng.Revisions)
        {
            if (r.Type == Word.WdRevisionType.wdRevisionInsert)
            {
                Word.Range rg = r.Range;
                rg.Expand(Word.WdUnits.wdParagraph);
                // Note: 's' is not declared in this snippet
                Log.Items.Add(s.OLEFormat.IconLabel + " => " + rg.Text);
                r.Accept();
            }
        }
    }
    Where the DataGridView holds the replacements to be made. I do a ReplaceAll, accept all changes except inserts, and then go through each insert and get a range on that paragraph.

    Luís Rodrigues




    • Marked as answer by 537mfb Tuesday, February 14, 2012 9:43 AM
    • Edited by 537mfb Tuesday, February 14, 2012 9:47 AM Fix Wording
    Tuesday, February 14, 2012 9:43 AM

All replies

  • Hi Luis

    I was going to suggest Revisions, and that it would be enough to display the documents with Revisions to the user. Using the Revisions pane or similar tools the user could cycle through the document(s) quite quickly and see the full context of what was changed. With the additional advantage that the information about what the text was originally would be available.

    Using the approach you're currently looking at, you could ACCEPT the revisions in the Range, which would leave the text as you've changed it.

    The approach I probably would have taken, if I didn't want the user to see all the revisions, would be to perform Find/Replace in a loop. The object "rng" will contain the found text, which I can then expand however I want, and transfer the content to a "log". Then the code loops again... Find returns a boolean value of true when the search is successful and false when it's not, which is what you use to control the loop.


    Cindy Meister, VSTO/Word MVP

    Saturday, February 11, 2012 7:04 AM
    Moderator
  • Hi Cindy

    Thanks for the reply.

    I considered your first solution for a while, but unfortunately in these things the boss must have what the boss wants, and he wants to avoid having people skim through every document. That is sometimes understandable: just recently we received from a single client a batch of over 30 files with several pages each. It's lengthy work. While there might be only a few changes to make in the whole batch, having people go through over 30 docs looking for where those changes are is something he doesn't see as viable. To him, just looking at a log will suffice and will reduce the job to the specific places where changes took place.

    As to your second solution, that's pretty much what I had in mind. First, accept all changes except either the deletions or the insertions, then loop again through whatever revisions are left, accept them, and get the text of each one's paragraph in the same way I mention in my question.

    As for the approach you would take, that was about what I was thinking at first. However, although I can check the result of Find.Execute to know whether a new place for a replacement was found, I couldn't figure out how to retrieve the region of the change. As you can see from my question, my 'rng' variable will always contain a story in the document, which in simpler documents might even be the whole document. I want at MOST the paragraph where the change took place (keeping it to just a couple of words before and after the changed location would be better). So I'm sorry, but I really don't see how to implement that solution. It was my first thought, though. If only find would return false on fail and a region on success...

    Update: Then again, if I could get something like line/character data for every change, that might work for me as well, if I can then use that info to get the text, for example by getting the text of the changed line. Food for thought.

    Best regards


    Luís Rodrigues



    • Edited by 537mfb Monday, February 13, 2012 10:21 AM Re-thought - Update
    Monday, February 13, 2012 10:10 AM
  • Hi Luis

    <<he wants to avoid having people skim through every document >>

    Word does have a Review tab in the Ribbon with Next and Previous buttons that let you jump from one revision to the next, without having to scroll or skim. I mean, if he wants to invest the resources he can do so, but if there's built-in functionality already there...

    Also, in the Print dialog box there's a "List markup" in the "Print what" list. That will pretty much do what you want to do, printing the information to paper.

    <<If only find would return false on fail and a region on success...>>

    It does exactly that. The RANGE.Find RANGE alters to include what's found and Find.Execute returns true; otherwise RANGE doesn't change and Find.Execute returns false. The assignment you make to the rng variable will not remain static. To get the paragraph from that range: rng.Paragraphs[1].Range.Text;
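
    A minimal sketch of such a loop, under some assumptions: the FindLogger class, the ReplaceAndLog method, and the log format are hypothetical names for illustration, the project references Microsoft.Office.Interop.Word with C# 4 or later (so the optional ref parameters of Find.Execute can be passed as named arguments), and instead of letting Find perform the replacement, the code finds one hit at a time and overwrites the found range itself.

```csharp
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

public static class FindLogger
{
    // Hypothetical helper: replaces hits one at a time and returns one
    // "before -> after" paragraph entry per hit.
    public static List<string> ReplaceAndLog(Word.Document doc, string findText, string replaceText)
    {
        var log = new List<string>();
        char[] paragraphMarks = { '\r', '\a' };

        foreach (Word.Range story in doc.StoryRanges)
        {
            // Work on a copy so the story range itself is not redefined.
            Word.Range rng = story.Duplicate;
            rng.Find.ClearFormatting();

            // wdFindStop ends the search at the end of the story instead of
            // wrapping around; Execute returns false when nothing more is found.
            while (rng.Find.Execute(FindText: findText, MatchCase: false,
                                    Forward: true, Wrap: Word.WdFindWrap.wdFindStop))
            {
                // After a successful Execute, rng covers only the found text.
                string before = rng.Paragraphs[1].Range.Text.TrimEnd(paragraphMarks);

                // Overwrite the hit; rng now covers the replacement text.
                rng.Text = replaceText;
                string after = rng.Paragraphs[1].Range.Text.TrimEnd(paragraphMarks);
                log.Add(doc.Name + " => " + before + " -> " + after);

                // Continue searching after the replacement so it is not found again.
                rng.Collapse(Word.WdCollapseDirection.wdCollapseEnd);
            }
        }
        return log;
    }
}
```

    Replacing one hit at a time is slower than wdReplaceAll, but each pass through the loop has a range that sits exactly on the change, which is what the log needs. To log only a few words of context instead of the whole paragraph, a duplicate of rng could be widened with MoveStart and MoveEnd using Word.WdUnits.wdWord before reading its Text.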


    Cindy Meister, VSTO/Word MVP

    Monday, February 13, 2012 5:58 PM
    Moderator