using linq to xml to work with Word documents

  • Question

  • Hi,

    I discovered LINQ to XML.

    My project works with Word documents that are opened in memory with the Docx library.

    I need to highlight some words in these documents. Is LINQ to XML the right tool for doing this?

    Do you have some examples?

    Thank you for your reply.


    Alain

    Monday, November 27, 2017 5:48 PM

All replies

  • Hello AchLog,

    >>I need to highlight some words in these documents. Is LINQ to XML the right tool for doing this?

    LINQ to XML is a technology for querying and manipulating data in XML format. On its own it cannot open a Word document and highlight words in it. If you want to change how text is displayed, look into the Office APIs. The following is a simple example that uses the Microsoft.Office.Interop.Word namespace.

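    using System.Reflection;
    using Microsoft.Office.Interop.Word;

    class Program
    {
        static void Main(string[] args)
        {
            object missObj = Missing.Value;
            object path = @"xxx.docx";
            Application app = new Application();
            Document doc = app.Documents.Open(ref path, ref missObj, ref missObj, ref missObj, ref missObj, ref missObj, ref missObj, ref missObj, ref missObj, ref missObj, ref missObj, ref missObj, ref missObj, ref missObj, ref missObj, ref missObj);

            // Words yields Range objects. A word's Text often includes its
            // trailing space, so trim before comparing.
            foreach (Range word in doc.Words)
            {
                if (word.Text != null && word.Text.Trim() == "text")
                    word.HighlightColorIndex = WdColorIndex.wdYellow;
            }

            doc.Save();   // keep the highlighting
            doc.Close();
            app.Quit();   // do not leave a hidden Word process running
        }
    }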

    Best regards,

    Neil Hu


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    • Edited by Fei Hu (Moderator) Tuesday, November 28, 2017 5:29 AM
    • Marked as answer by AchLog Tuesday, November 28, 2017 1:03 PM
    Tuesday, November 28, 2017 5:25 AM
    Moderator
  • Hi Neil Hu,

    Thank you for your answer and the example.
    It works well.

    Best regards


    Alain

    Tuesday, November 28, 2017 1:03 PM
  • Hi,

    The method you suggested for highlighting words in a Word document works well, but it is very slow when the document is large. Is it possible to limit this work to a paragraph whose number (index) is known?

    I do not see how to do that.

    Thank you for your reply.


    Alain

    Wednesday, December 6, 2017 3:26 PM
  • Hi Alain,

    You can limit the work to specific paragraphs through the Paragraphs property of Document.

    Below is an example that highlights "hello" in red in the odd-numbered paragraphs.

    static void Main(string[] args)
    {
        object path = @"xxx.docx";
        Application app = new Application();
        Document doc = app.Documents.Open(ref path);
        Paragraphs p = doc.Paragraphs;
        Console.WriteLine("{0} paragraphs in the document.", p.Count);

        try
        {
            // Paragraph indexes are 1-based, so 1, 3, 5, ... are the odd paragraphs.
            for (int i = 1; i <= p.Count; i += 2)
            {
                foreach (Range word in p[i].Range.Words)
                {
                    // Trim because a word's Text often includes its trailing space.
                    if (word.Text != null && word.Text.Trim() == "hello")
                        word.HighlightColorIndex = WdColorIndex.wdRed;
                }
            }
            doc.Save();
        }
        catch (Exception ex)
        {
            Console.WriteLine("Exception: {0}", ex.Message);
        }
        finally
        {
            doc.Close();
            app.Quit();
        }

        Console.ReadKey();
    }

    You can use the Paragraphs property to do what you want. One thing to note: paragraph indexes start at 1, not 0.

    Best Regards,

    Charles He



    Thursday, December 7, 2017 5:28 AM
  • Hi Charles He,

    Thank you very much for your answer, which works well. I also note that the Paragraphs index starts at one.

    Can we go further and make it even faster?
    In fact, I know the location of the words to highlight more precisely. I also know the index of the element that contains each of them (w:t in the XML). Can we access this element directly?

    Thank you for your help.


    Alain

    Thursday, December 7, 2017 9:32 AM
  • Hi Alain,

    Here is an example that highlights a specific word in a specific paragraph. It highlights the 10th word in the 5th paragraph.

            static void Main(string[] args)
            {
                object path = @"xxx.docx";
                Application app = new Application();
                Document doc = app.Documents.Open(ref path);

                Paragraphs p = doc.Paragraphs;

                Console.WriteLine("{0} paragraphs in the document.", p.Count);

                try
                {
                    // Both collections are 1-based: 5th paragraph, 10th word.
                    p[5].Range.Words[10].HighlightColorIndex = WdColorIndex.wdRed;
                    doc.Save();
                }
                catch (Exception ex)
                {
                    Console.WriteLine("Exception: {0}", ex.Message);
                }
                finally
                {
                    doc.Close();
                    app.Quit();
                }

                Console.ReadKey();
            }

    Hope this will help you.

    Best Regards,

    Charles He




    Thursday, December 7, 2017 11:21 AM
  • Thank you, Charles He, for your answer. It does work.
    But word number x (10 in your example) is not quite the same as the "w:t" of rank x, because a "w:t" can contain several words.

    Is there a way to read (and highlight) the "w:t" element of rank x?

    Thank you for your reply.

    Alain

    Friday, December 8, 2017 8:13 PM
  • Hi Alain,

    Sorry, but I am a little confused. Are you following Fei Hu's suggestion and reading the Word document as "xxx.docx", or did you convert your document to an "xxx.xml" file and read that XML file? You mentioned "w:t", which is in the XML file, right?

    Best Regards,

    Charles He



    Saturday, December 9, 2017 1:23 AM
  • Hi Charles He,

    I have two versions of my document: an original version, xxx.docx, and its conversion to xxx.xml, done with the Docx library.
    I analyzed the document in its XML form. This is why I know very precisely the position of each word in the w:t elements, but not the words' indexes in the document.

    Now I want to highlight some words. I know how to modify the xml to do it. But I do not know how to transform this new xml into yyy.docx ...
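
    For reference, the kind of XML change meant here can be sketched with LINQ to XML as follows. This is a minimal sketch and not the code used in the project: the class and method names are hypothetical, the rank is assumed to be zero-based, and the XML is assumed to be WordprocessingML, where each w:t sits inside a run (w:r). The highlight applies to the whole run, so highlighting a single word inside a longer w:t would first require splitting the run.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class WtHighlighter
{
    // WordprocessingML main namespace (the "w:" prefix).
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Highlights the run that owns the w:t element at the given zero-based rank.
    // Returns false when the rank is out of range or the w:t is not inside a w:r.
    public static bool HighlightTextElement(XDocument doc, int rank, string color)
    {
        XElement text = doc.Descendants(W + "t").ElementAtOrDefault(rank);
        XElement run = text?.Parent;
        if (run == null || run.Name != W + "r") return false;

        // Run properties (w:rPr) must be the first child of the run;
        // create them when the run has none.
        XElement props = run.Element(W + "rPr");
        if (props == null)
        {
            props = new XElement(W + "rPr");
            run.AddFirst(props);
        }

        // Drop any existing highlight so the call can be repeated safely.
        // Assumption: appending w:highlight at the end of w:rPr is acceptable;
        // strict schema ordering may call for a different position.
        props.Elements(W + "highlight").Remove();
        props.Add(new XElement(W + "highlight", new XAttribute(W + "val", color)));
        return true;
    }
}
```

    A call such as WtHighlighter.HighlightTextElement(doc, x, "yellow") then marks the run around the w:t of rank x. When loading the XML, LoadOptions.PreserveWhitespace is worth considering so that w:t elements holding only spaces are not lost.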

    If I use Interop.Word to highlight, with Neil Hu's technique, I have to go through the whole paragraph looking for the words concerned, which is very slow if the paragraph is long and complex.

    What is the best way to do things?

    Thank you for your help on this problem.


    Alain

    Saturday, December 9, 2017 9:09 AM
  • Hi Alain,

    Now I understand. You have figured out how to highlight words in the .xml file, and now you want to know how to convert the .xml file to a Word document, yes? If so, you can read all the text from your .xml file and write it to a .doc file, like this:

            // Requires: using System; using System.IO; using System.Text;
            static void Main(string[] args)
            {
                string oldpath = @"xxx.xml"; // XML file
                string newpath = @"xxx.doc"; // doc file
                Convert(oldpath, newpath);

                Console.ReadKey();
            }

            static void Convert(string oldpath, string newpath)
            {
                string content = File.ReadAllText(oldpath, Encoding.UTF8);

                // Write the XML text unchanged into the .doc file.
                // The using block flushes and closes the writer.
                using (StreamWriter sw = new StreamWriter(newpath))
                {
                    sw.Write(content);
                }
            }

    If you replace .doc with .docx, you might have trouble opening the document, because a .docx file has a different format from a .doc file (it is a ZIP package rather than plain XML text). If a .doc file doesn't meet your requirements, you can open it and use Document.SaveAs2 to save it as a .docx file.
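
    Because a .docx file is a ZIP package, another possible route is to copy the original xxx.docx and swap in the modified XML. The following is only a sketch under assumptions: it assumes the modified XML is the main document part of the package, that this part lives at the usual location word/document.xml, and the class and method names are hypothetical. On .NET Framework, ZipFile also needs a reference to System.IO.Compression.FileSystem.

```csharp
using System.IO;
using System.IO.Compression;
using System.Xml.Linq;

static class DocxRepacker
{
    // Assumption: the main part sits at the conventional path inside the package.
    const string MainPartPath = "word/document.xml";

    // Copies sourceDocx to targetDocx, then replaces the main part of the copy
    // with the given XML. The source file is left untouched.
    public static void SaveAsDocx(string sourceDocx, string targetDocx, XDocument mainPart)
    {
        File.Copy(sourceDocx, targetDocx, overwrite: true);

        using (ZipArchive zip = ZipFile.Open(targetDocx, ZipArchiveMode.Update))
        {
            // Remove the old part, if present, before writing the new one.
            ZipArchiveEntry old = zip.GetEntry(MainPartPath);
            old?.Delete();

            ZipArchiveEntry entry = zip.CreateEntry(MainPartPath);
            using (Stream stream = entry.Open())
            {
                // DisableFormatting avoids adding indentation to the XML.
                mainPart.Save(stream, SaveOptions.DisableFormatting);
            }
        }
    }
}
```

    For example, DocxRepacker.SaveAsDocx(@"xxx.docx", @"yyy.docx", modifiedXml) would produce yyy.docx from the original package and the modified XML, where modifiedXml is an XDocument holding the edited main part.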

    Best Regards,

    Charles He




    Saturday, December 9, 2017 11:16 AM
  • Hello Charles He,
    Sorry for the delay in responding to your proposal. I am unfortunately unavailable for several days.

    I have to put my project on hold.
    I will resume soon.

    Best Regards


    Alain

    Tuesday, December 12, 2017 2:25 PM
  • Hi Alain,

    That's OK. We can discuss your project again when you get back.

    Best Regards,

    Charles He



    Friday, December 15, 2017 1:50 AM