Merge multiple Word 2007 documents with SDK

  • Question

  • Hello all,

    What I need to be able to do is merge some documents, then add some text, maybe a table then merge a few more documents...change some text on the document, add more text or tables, merge...etc.  It all seems to be working fine but when I open my finished document, all my text and tables are located at the end of my document while all the merged documents are at the beginning.  I was hoping that as I added the text/tables and merged documents in order they would appear that way.  Also, how do you add page breaks?  When I need to add text/tables after a merged document, it will need to be on a new page.  So now, I find the last paragraph in the document and then add a page break.  But again, it adds them to the end of the document instead of where I thought I added them.  Below is the code I am using for merging:

     

    Paragraph lParagraph = m_MainPart.Document.Descendants<Paragraph>().Last();
    AlternativeFormatImportPart chunk = m_MainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML, pAltChunkId);
    // Verbatim string so the backslash in the path is not read as an escape.
    using (FileStream fileStream = new FileStream(@"C:\Test.docx", FileMode.Open))
        chunk.FeedData(fileStream);
    AltChunk altChunk = new AltChunk();
    altChunk.Id = m_MainPart.GetIdOfPart(chunk);
    lParagraph.InsertAfterSelf(altChunk);
    m_MainPart.Document.Save();

    Pretty much my code comes from here:  http://social.msdn.microsoft.com/Forums/en-US/oxmlsdk/thread/51d18120-9ee3-47f2-9fac-e5f21a264a5b/ (and here   http://blogs.msdn.com/b/ericwhite/archive/2008/10/27/how-to-use-altchunk-for-document-assembly.aspx --which basically has the same code as the one I am using). 

    What am I missing here?  Is this the proper way to merge or is there another way?  Any help and suggestion would greatly be appreciated!  Thanks in advance!!!

    Wednesday, June 23, 2010 9:25 PM

All replies

  • Hi all....Anything?   We were hoping to get this part (merging) done soon as we liked to go online pretty quick.  This is a pretty important aspect in our ability to provide reports to our clients so if the way we thought cannot be done (please see above), then we need to figure out what we can do to make the reports work.  Please, any suggestion will be appreciated!
    Tuesday, June 29, 2010 1:28 PM
  • Hello,

    I'm not sure I understand your problem. You have to place the AltChunk at the point in the document where you want the merged content to appear. All your merged files end up at the end of the document because you insert the AltChunk after the last paragraph of your document:

    Paragraph lParagraph = m_MainPart.Document.Descendants<Paragraph>().Last();

    ...

    lParagraph.InsertAfterSelf(altChunk);
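
    To illustrate the point above, here is a minimal sketch of a helper that places the AltChunk after whatever element you pass in, rather than always after the last paragraph. The class name AltChunkHelper, the method name InsertChunkAfter and its parameters are hypothetical; only the Open XML SDK calls already used in the question are assumed.

```csharp
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, not part of the Open XML SDK.
static class AltChunkHelper
{
    // Imports the file at sourcePath as an alternative format part and
    // inserts the matching AltChunk element directly after `anchor`.
    // altChunkId must be unique within the main document part.
    public static AltChunk InsertChunkAfter(
        MainDocumentPart mainPart,
        OpenXmlElement anchor,
        string sourcePath,
        string altChunkId)
    {
        AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
            AlternativeFormatImportPartType.WordprocessingML, altChunkId);

        using (FileStream stream = File.OpenRead(sourcePath))
        {
            chunk.FeedData(stream);
        }

        AltChunk altChunk = new AltChunk { Id = mainPart.GetIdOfPart(chunk) };

        // The position of `anchor` decides where the merged document appears.
        return anchor.InsertAfterSelf(altChunk);
    }
}
```

    The returned AltChunk can itself serve as the anchor for the next table, paragraph or chunk, which keeps the pieces in the order you add them.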



    Amine

    Monday, July 5, 2010 7:57 AM
  • Hello Amine,

    Thank you for answering.  I'm sorry if I did not explain my problem very well.  And as I've said, I don't know if what I need to do is doable or if I am even doing this correctly. 

    The problem does not lie with the actual merging of files; this part seems to be just fine.  It's how the merging works when I need to add text, tables to my document, all in certain order. 

    Example:  DocA is my template.  I need to merge 3 other documents: Doc1, Doc2, Doc3 into DocA.  Open DocA.  Merge Doc1 (all works).  Now I need to add a few tables/text from the database into DocA (this also works fine).  Merge Doc2 into DocA.  Add more tables/text into DocA.  Merge Doc3 into DocA.  Add more tables/text into DocA.  Done.....but when I open the Document, I have all the merges at the beginning of the document and all the tables and text that I added at the end of the document instead of the order I need them in.

    Do I need to use content controls???  Also, why do I have a document.xml and 3 afchunk.docxs?  Is there any way just to get what I need from another document and add it to a main document without using AltChunk?   Please any suggestion or direction would be greatly appreciated!!!  Thank you.

     

    Tuesday, July 6, 2010 7:39 PM
  • Hello ChileKitty,

    The three afchunk.docx files are in your template because the three documents (Doc1, Doc2 and Doc3) are imported into your template as separate parts when you use the AltChunk element. When you open the document in Word 2007 and save it, Word merges the content of each docx file into the body and deletes the afchunk files.

    I think you need to use empty content controls to mark where you want to insert each document. You can use the SdtAlias property of a content control (its title) to locate it in your template, so that whatever you add is inserted at the specified place.

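    MainDocumentPart mainPart = _wordDocument.MainDocumentPart;
    // Find the placeholder content control whose title (alias) is "Doc1".
    // The null-conditional operators skip content controls that have no alias.
    SdtBlock sdt = mainPart.Document.Descendants<SdtBlock>()
        .FirstOrDefault(s => s.SdtProperties?.GetFirstChild<SdtAlias>()?.Val?.Value == "Doc1");

    // Put the AltChunk where the placeholder was, then drop the placeholder.
    sdt.InsertAfterSelf(altChunk);
    sdt.Remove();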

     

    There is another way to merge files with a small API (DocumentBuilder: http://blogs.msdn.com/b/ericwhite/archive/2010/01/08/updated-documentbuilder-to-work-with-dec09-ctp-of-open-xml-sdk-v2.aspx) but  I think that you can do what you want with the Altchunk Method.

    Amine

     

    • Edited by AminB Friday, July 9, 2010 1:14 PM
    Wednesday, July 7, 2010 9:17 AM
  • Hello Amine,

    Thank you for your response and explanation.  I can see how using an empty content control would help but the issue would be that I'd need to add it on the fly, programmatically.  See what we have is a template which contains styles and defaults that are needed but that is all.  We use this template for all our reports in which some reports require other documents to be merged in conjunction with adding tables/text based on data from the database.  Obviously, we were able to accomplish this in our earlier client software by automating office.  We have since moved to a web server and automating office is not an option; thus open xml. 

    So pretty much I will need to add a content control on the fly in order to merge and add tables/text together and have it come out in the order I need, correct?  If this is the case, then do you have an example or suggestion on how to add a content control? 

    My last question is, is there a way without using AltChunk or DocumentBuilder to retrieve just the body of another document and just add to the main document body (or something like this)??? 

    Thank you for your help and patience!!!

    Thursday, July 8, 2010 2:35 PM
  • Hello,

    I don't think there is a way to merge just the body of a docx without using one of these two methods, because of the relationships between document.xml and the other files of the docx (images, styles...).

    This is an example of creating a content control on the fly and inserting it into the main document part:

             
    DocumentFormat.OpenXml.Wordprocessing.Run run = new DocumentFormat.OpenXml.Wordprocessing.Run(
        new DocumentFormat.OpenXml.Wordprocessing.RunProperties(
            new RunStyle() { Val = "PlaceholderText" }),
        new DocumentFormat.OpenXml.Wordprocessing.Text(" New Text"));

    DocumentFormat.OpenXml.Wordprocessing.Paragraph paragraph = new DocumentFormat.OpenXml.Wordprocessing.Paragraph(run);

    SdtProperties sdtPr = new SdtProperties(
        new SdtAlias { Val = "Doc1" },
        new Tag { Val = "_myContentControl" });
    SdtContentBlock sdtCBlock = new SdtContentBlock(paragraph);

    SdtBlock contentControl = new SdtBlock(sdtPr, sdtCBlock);

    // now that we have a content control, we have to insert it at a specified place

    contentControl.InsertAfter(.....);

    .....

     

    Amine

    • Proposed as answer by Ji.ZhouModerator Friday, July 9, 2010 8:32 AM
    • Marked as answer by ChileKitty Monday, July 19, 2010 9:10 PM
    Friday, July 9, 2010 8:19 AM
  • Thank you Amine for your quick response and attention.....I will try this out and hopefully this will be the resolution I am looking for. 

    I will update as soon as I am able.

    Thanks again!!!

    Monday, July 12, 2010 2:41 PM
  • Hello Amine,

    Using a content control on the fly is what we needed to retain the building order for each report :)

    I got a little confused on the "contentControl.InsertAfter(.....);" part but figured it out --> "body.InsertAfter(contentControl, paragraphLast);"

     

    Please one last question/issue we have:

     

    You said the 3 afchunk files will merge when the document is opened and saved in Word.  Is this the only way for everything to get 'merged' and reordered?  Is there something, anything I can do when they open it from the memory stream?

    Thank you for all your help…!!

     

    Monday, July 12, 2010 10:51 PM
  • Hello,

    When you insert the content control, you can insert your AltChunk after it and then remove the content control, like this:

    contentControl.InsertAfterSelf(altChunk);

    contentControl.Remove();
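
    The original question also asked how to add a page break before text or tables that follow a merged document. A minimal sketch, assuming each new piece is simply appended to the end of the body in build order; the class name BodyAppender and both method names are hypothetical:

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, not part of the Open XML SDK.
static class BodyAppender
{
    // Adds `element` at the end of the body content. If the body ends with
    // section properties (sectPr), the element goes just before them so the
    // section properties stay last.
    public static void AppendInOrder(Body body, OpenXmlElement element)
    {
        SectionProperties sectPr = body.Elements<SectionProperties>().LastOrDefault();
        if (sectPr != null)
        {
            sectPr.InsertBeforeSelf(element);
        }
        else
        {
            body.AppendChild(element);
        }
    }

    // A paragraph whose only run contains a page break.
    public static Paragraph CreatePageBreak()
    {
        return new Paragraph(new Run(new Break { Type = BreakValues.Page }));
    }
}
```

    Calling AppendInOrder(body, altChunk), then AppendInOrder(body, BodyAppender.CreatePageBreak()), then AppendInOrder(body, table) should leave the three elements in exactly that order in document.xml.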

    For your second question, I don't think it is possible to merge the afchunk files without using Word. But the user will not see any difference in the content, because when they open the file, Word starts by resolving the afchunk files. If the user then saves the file, Word will not need to resolve the afchunk files again the next time it is opened.

     

    Amine.

     

     

     

    Tuesday, July 13, 2010 10:21 AM
  • Hello Amine,

    Thank you for all your help and patience on this issue.  Another reason I ask on the afchunk is because once we merge the document into the main document, we'd like to replace some text in the 'merged' document with data from the database. 

    Since 'merged' documents are not part of the body in the main document at this time, how would I go about replacing text?  Any suggestion on this would be appreciated!

    Tuesday, July 13, 2010 3:05 PM
  • Hello,

    I don't know, but could you replace the text with the data from the database first and then merge the document? I think that would be easier.

    Amine

    Thursday, July 15, 2010 8:09 AM
  • I noticed your thread and have a related question, just in case you have bumped into this problem.

    I have multiple word documents with rich text content controls.  I'm inserting xml data into the content controls using the MainDocumentPart.CustomXmlParts, this works fine.

    All of these merged documents are saved separately to disk.  I can open them and see that the content controls have been filled with the xml data correctly.

    I'm now attempting to combine these word documents using the method you describe above, using Chunk/AltChunk from a filestream and a content control added on the fly.  I arbitrarily choose one of my merged documents as the main document, and the rest of my merged documents I append to this main document. 

    I then open the combined document.  In the first part of the document, which comes from the main document, the content controls have their xml data.  But in the rest of the document, which comes from the documents that were added to the main document, the content controls have lost their data (they show the default "Click here to enter text" message).

    I'm guessing that the Chunk/AltChunk method of appending whole documents does not look to see if the content controls have xml data.

    Do you happen to know, is there a property or method I can set on either the Chunk or AltChunk objects that will tell it to fetch the xml data along with the content controls?


    Tom Regan
    Friday, July 16, 2010 4:07 PM
  • Hello Amine,

    First, thank you for all your help on this; it made a whole lot of difference in how we want to build our documents.  I will look into your suggestion and see.

    Thanks!!  ChileKitty :)

    Monday, July 19, 2010 9:10 PM
  • Tom, did you find an answer to this??

    I'm experiencing the exact same issue: this works fine in Office 2007 but not in Office 2010, as you describe.
    /Stefan

    Saturday, September 18, 2010 5:43 PM
  • Examine the combined word doc as an archive and you'll see the problem.  The custom xml does not get loaded properly when documents are merged.  Of course it could be fixed with a lot of low-level shunting about of xml, or perhaps designing a schema, but I don't have time budgeted for all that.  I stopped trying to combine documents.

    I found one 3rd party control, from Aspose, that works very well at combining documents that have different styles, but even it did not successfully combine the custom xml.  Far as I know there is no elegant way to do it using the OpenXml sdk, but if you find there is please add to this thread, I'd be interested in seeing it.


    Tom Regan
    Saturday, September 18, 2010 10:28 PM