none
Open XML and MailMerge RRS feed

  • Question

  • Hi All
    I’m using Word 2013 and a SQL Server 2005.

    I’ve got a VBA solution in a Word Template which is handling the mail merge action for end users. The end user choses the type of document and the recipients with our database solution and we then call our Word 2013 template where we ask the final questions and then execute the mail merge for the end user. We’ve used that technique since Word 2000 and it worked very well (Generated 50 pages in 6 seconds).

    Word 2013 is enormously slow doing the mail merge. It takes 3 times the time as Word 2000. I’ve been told to look into Open XML and that’s where I’m struggling.

    We have hundreds of Word templates setup to be used for the mail merge having lots of MERGEFIELDS placed in the document. We even use the Field DATABASE to populate tables.
    I plan to keep the VBA code dealing with all the pre-mailmerge tasks interacting with the end user before he hits the button «Do the letters for me». At the point where I execute the VBA code to do the mail merge I’d like to call a C# DLL which I’ve created with Visual Studio 2010 using Open XML doing the mail merge with Open XML.

    At that time I’d have the Word template and the dataset (in a SQL string or all data saved in a RTF file as data source) ready.

    Is it possible to write that cute little C# black box DLL using Open XML virtually replacing the Mail Merge VBA command from Word hoping it works faster the good old word mail merge engine? If yes can anybody point me to the right direction for examples, tutorials or help material?

    Thanks a lot.
    Martin

    Wednesday, September 11, 2013 3:04 PM

Answers

  • Hi Martin

    Sorry, I've been busy/away and apparently there's little to no support here from the MS side these days...

    The place to start for learning is OpenXMLDeveloper.org. There are tutorials there which should give you an idea of the basics.

    The other thing I found helpful when getting started was to open the Zip file package and look at the "guts" of the document. More specifically, locate the XML elements you need to work with and note in which xml file they are. (In your case, almost certainly and exclusively document.xml, which is a blessing for you.) Look at their position in the XML - what are their parent elements? What attributes and child elements do they use? Which are relevant for your requirement?

    In the case of field codes, they can be "simple" or "complex". (There's a blog article and, I believe, a video on openxmldeveloper.org about simple and complex fields, you don't need to understand everything that material explains, but you need to recognize simple vs. complex.) You need to see if the mergefields are either, or both, as that will affect the code you need to interact with them.

    In brief, I believe the code you require will need to go through document.xml, picking up all the merge fields. You'll need to retrieve the field name of each, then lookup the data for that field for the current record being processed. The mergefield should be deleted and the database content inserted in its place.

    Here's some sample code to illustrate the process, roughly. This sample does work, but isn't tested for robustness.

        //Will not deal with mergefields nested in other fields (IF, for example)
        private void btnGetMergeFields_Click(object sender, EventArgs e)
        {
            string fileName = @"C:\test\MergeFields.docx";
            using (WordprocessingDocument pkgDoc = WordprocessingDocument.Open(fileName, true))
            {
                string fieldList = string.Empty;
                Document doc = pkgDoc.MainDocumentPart.Document;
                //Get all field code elements in the document
                IEnumerable<FieldChar> fldChars = doc.Descendants<FieldChar>();
                if (fldChars == null) return; //No field codes in the document
    
                // bool fldStart = false;
                FieldChar fldCharStart = null;
                FieldChar fldCharEnd = null;
                FieldChar fldCharSep = null;
                FieldCode fldCode = null;
                string fldContent = String.Empty;
                foreach (FieldChar fldChar in fldChars)
                {
                    string fldCharPart = fldChar.FieldCharType.ToString();
                    switch (fldCharPart)
                    {
                        case "begin": //start of the field
                            fldCharStart = fldChar;
                            //get the field code, which will be an instrText element
                            // either as sibling or as a child of the parent sibling
                            fldCode = fldCharStart.Parent.Descendants<FieldCode>().FirstOrDefault();
                            if (fldCode == null) //complex field
                            {
                                fldCode = fldCharStart.Parent.NextSibling<Run>().Descendants<FieldCode>().FirstOrDefault();
                            }
                            if (fldCode != null && fldCode.InnerText.Contains("MERGEFIELD"))
                            {
                                fldContent = fldCode.InnerText;
                                fieldList += fldContent + "\n";
                            }
                            break;
                        case "end": // end of the field
                            fldCharEnd = fldChar;
                            break;
                        case "separate": //complex field with text result
                            //we want to put the database content in this text run
                            //yet still remove the field code
                            //If there's no "separate" field char for the current field,
                            //we need to insert it somewhere else
                            fldCharSep = fldChar;
                            break;
                        default:
                            break;
                    }
                    if ((fldCharStart != null) && (fldCharEnd != null)) //start and end field codes have been found
                    {
                        if (fldCharSep != null)
                        {
                            DocumentFormat.OpenXml.Wordprocessing.Text elemText = (DocumentFormat.OpenXml.Wordprocessing.Text)fldCharSep.Parent.NextSibling().Descendants<DocumentFormat.OpenXml.Wordprocessing.Text>().FirstOrDefault();
                            elemText.Text = fldContent;
                            //Delete all the field chars with their runs
                            DeleteFieldChar(fldCharStart);
                            DeleteFieldChar(fldCharEnd);
                            DeleteFieldChar(fldCharSep);
                            fldCode.Remove();
                        }
                        else
                        {
                            DocumentFormat.OpenXml.Wordprocessing.Text elemText = new DocumentFormat.OpenXml.Wordprocessing.Text(fldContent);
                            fldCode.Parent.Append(elemText);
                            fldCode.Remove();
                            //Delete all the field chars with their runs
                            DeleteFieldChar(fldCharStart);
                            DeleteFieldChar(fldCharEnd);
                            DeleteFieldChar(fldCharSep);
                        }
                        fldCharStart = null;
                        fldCharEnd = null;
                        fldCharSep = null;
                        fldCode = null;
                        fldContent = string.Empty;
                    }
    
                }
                this.txtMessages.Text = fieldList;
            }
        }
        private void DeleteFieldChar(OpenXmlElement fldCharStart)
        {
            Run fldRun = (Run) fldCharStart.Parent;
            fldRun.RemoveAllChildren();
            fldRun.Remove();
        }
        }
    


    Cindy Meister, VSTO/Word MVP, my blog

    • Marked as answer by Martin Koenig Thursday, October 17, 2013 6:00 AM
    Thursday, October 3, 2013 5:35 PM
    Moderator

All replies

  • Hi Martin

    Sorry, I've been busy/away and apparently there's little to no support here from the MS side these days...

    The place to start for learning is OpenXMLDeveloper.org. There are tutorials there which should give you an idea of the basics.

    The other thing I found helpful when getting started was to open the Zip file package and look at the "guts" of the document. More specifically, locate the XML elements you need to work with and note in which xml file they are. (In your case, almost certainly and exclusively document.xml, which is a blessing for you.) Look at their position in the XML - what are their parent elements? What attributes and child elements do they use? Which are relevant for your requirement?

    In the case of field codes, they can be "simple" or "complex". (There's a blog article and, I believe, a video on openxmldeveloper.org about simple and complex fields, you don't need to understand everything that material explains, but you need to recognize simple vs. complex.) You need to see if the mergefields are either, or both, as that will affect the code you need to interact with them.

    In brief, I believe the code you require will need to go through document.xml, picking up all the merge fields. You'll need to retrieve the field name of each, then lookup the data for that field for the current record being processed. The mergefield should be deleted and the database content inserted in its place.

    Here's some sample code to illustrate the process, roughly. This sample does work, but isn't tested for robustness.

        //Will not deal with mergefields nested in other fields (IF, for example)
        private void btnGetMergeFields_Click(object sender, EventArgs e)
        {
            string fileName = @"C:\test\MergeFields.docx";
            using (WordprocessingDocument pkgDoc = WordprocessingDocument.Open(fileName, true))
            {
                string fieldList = string.Empty;
                Document doc = pkgDoc.MainDocumentPart.Document;
                //Get all field code elements in the document
                IEnumerable<FieldChar> fldChars = doc.Descendants<FieldChar>();
                if (fldChars == null) return; //No field codes in the document
    
                // bool fldStart = false;
                FieldChar fldCharStart = null;
                FieldChar fldCharEnd = null;
                FieldChar fldCharSep = null;
                FieldCode fldCode = null;
                string fldContent = String.Empty;
                foreach (FieldChar fldChar in fldChars)
                {
                    string fldCharPart = fldChar.FieldCharType.ToString();
                    switch (fldCharPart)
                    {
                        case "begin": //start of the field
                            fldCharStart = fldChar;
                            //get the field code, which will be an instrText element
                            // either as sibling or as a child of the parent sibling
                            fldCode = fldCharStart.Parent.Descendants<FieldCode>().FirstOrDefault();
                            if (fldCode == null) //complex field
                            {
                                fldCode = fldCharStart.Parent.NextSibling<Run>().Descendants<FieldCode>().FirstOrDefault();
                            }
                            if (fldCode != null && fldCode.InnerText.Contains("MERGEFIELD"))
                            {
                                fldContent = fldCode.InnerText;
                                fieldList += fldContent + "\n";
                            }
                            break;
                        case "end": // end of the field
                            fldCharEnd = fldChar;
                            break;
                        case "separate": //complex field with text result
                            //we want to put the database content in this text run
                            //yet still remove the field code
                            //If there's no "separate" field char for the current field,
                            //we need to insert it somewhere else
                            fldCharSep = fldChar;
                            break;
                        default:
                            break;
                    }
                    if ((fldCharStart != null) && (fldCharEnd != null)) //start and end field codes have been found
                    {
                        if (fldCharSep != null)
                        {
                            DocumentFormat.OpenXml.Wordprocessing.Text elemText = (DocumentFormat.OpenXml.Wordprocessing.Text)fldCharSep.Parent.NextSibling().Descendants<DocumentFormat.OpenXml.Wordprocessing.Text>().FirstOrDefault();
                            elemText.Text = fldContent;
                            //Delete all the field chars with their runs
                            DeleteFieldChar(fldCharStart);
                            DeleteFieldChar(fldCharEnd);
                            DeleteFieldChar(fldCharSep);
                            fldCode.Remove();
                        }
                        else
                        {
                            DocumentFormat.OpenXml.Wordprocessing.Text elemText = new DocumentFormat.OpenXml.Wordprocessing.Text(fldContent);
                            fldCode.Parent.Append(elemText);
                            fldCode.Remove();
                            //Delete all the field chars with their runs
                            DeleteFieldChar(fldCharStart);
                            DeleteFieldChar(fldCharEnd);
                            DeleteFieldChar(fldCharSep);
                        }
                        fldCharStart = null;
                        fldCharEnd = null;
                        fldCharSep = null;
                        fldCode = null;
                        fldContent = string.Empty;
                    }
    
                }
                this.txtMessages.Text = fieldList;
            }
        }
        private void DeleteFieldChar(OpenXmlElement fldCharStart)
        {
            Run fldRun = (Run) fldCharStart.Parent;
            fldRun.RemoveAllChildren();
            fldRun.Remove();
        }
        }
    


    Cindy Meister, VSTO/Word MVP, my blog

    • Marked as answer by Martin Koenig Thursday, October 17, 2013 6:00 AM
    Thursday, October 3, 2013 5:35 PM
    Moderator
  • That was quite helpful, thank you! I believe this is the blog Cindy references. http://ericwhite.com/blog/map/generating-open-xml-wordprocessingml-documents-blog-post-series/
    Tuesday, October 15, 2013 8:26 PM