Parsing and splitting a Word 2007 doc

  • Question

  • Hi. My goal is to find certain words in a Word 2007 document, and for this I need to split out each word inside it. I've seen that some words that were bulleted came with the \r\a character sequence. I've also read that \u2022 is the Unicode code point for the Word bullet. I tried to find more Unicode characters so I can split everything in the doc, but couldn't find anything...

    1) Can anyone help me with the Unicode characters? (I need to split on things like numbering, bullets... basically the whole document.)

    2) What is the most efficient way to search for a bunch of words inside a Word doc?

    Thanks in advance.


    Thanks, Sharon.

    Tuesday, February 28, 2012 5:30 AM

Answers

  • The following reference can help you find the Unicode number for various symbols and keyboard characters.

    http://ascii-table.com/unicode-characters.php
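As an alternative to collecting every separator code point by hand, one option is to invert the problem and extract runs of word characters, so that anything else (\r, \a, \u2022, tabs, punctuation) acts as a delimiter. The sketch below is only an illustration, not something from the original thread. It assumes the document text has already been read into a plain string (for example from Range.Text), and the function name `split_words` is hypothetical.

```python
import re

# A "word" here is a run of Unicode word characters, optionally joined by an
# apostrophe or hyphen (so "don't" and "well-known" stay in one piece).
# Everything else, including \r, \a (\x07) and the bullet \u2022, is treated
# as a separator. This definition of a word is an assumption; adjust as needed.
WORD_PATTERN = re.compile(r"\w+(?:['\u2019-]\w+)*")


def split_words(text):
    """Return the words of `text` in order, dropping all separator characters."""
    return WORD_PATTERN.findall(text)


if __name__ == "__main__":
    # Sample text imitating a bulleted paragraph followed by a line ending in \r\a.
    sample = "\u2022 apples and pears\r\u2022 don't panic\r\x07"
    print(split_words(sample))
```

With this approach, a new kind of bullet or numbering symbol needs no extra handling as long as it is not a letter, digit, or underscore.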

    Regarding a search for various words, in my opinion the method to use would be either Selection.Find or Range.Find. Some argue that Selection.Find is faster, but I prefer Range.Find.

    If you load the specific words you are after into an array, you can loop through the array and check whether each word exists in the document. If a word is found, you'll then need additional code to determine whether it stands on its own or is part of another word. It's doable, but not a trivial task.
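The loop described above, including the check that a hit is a whole word rather than part of a longer one, could look roughly like the following. This is a sketch that works on text already extracted into a string, not on the Word object model itself. The function name `find_whole_words` and the sample data are hypothetical.

```python
import re


def find_whole_words(text, words):
    """Map each search word to the start offsets where it occurs as a whole word.

    A match counts as a whole word only if it is not directly preceded or
    followed by another word character. Matching is case-insensitive, which
    is an assumption made for this sketch.
    """
    results = {}
    for word in words:
        # re.escape keeps characters such as "." or "+" in a search word literal.
        pattern = re.compile(
            r"(?<!\w)" + re.escape(word) + r"(?!\w)", re.IGNORECASE
        )
        results[word] = [match.start() for match in pattern.finditer(text)]
    return results


if __name__ == "__main__":
    text = "The cat sat; concatenate the category."
    for word, offsets in find_whole_words(text, ["cat", "the", "dog"]).items():
        print(word, offsets)
```

Here "cat" is reported only for the standalone word, because the occurrences inside "concatenate" and "category" are rejected by the lookbehind and lookahead checks. A word that never occurs maps to an empty list.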


    Kind Regards, Rich ... http://greatcirclelearning.com

    • Marked as answer by sharonapa Wednesday, February 29, 2012 6:18 AM
    Tuesday, February 28, 2012 3:03 PM