Creating a WordprocessingDocument from XML
-
November 16, 2009, 11:10 PM
I have an XML stream that contains a valid Word XML document (if I save it to a file, I can open it with Word). How can I create a WordprocessingDocument from this XML without cycling it through Word?
Ockert
All replies
-
November 17, 2009, 1:07 AM
Try using WordprocessingDocument.Create(stream, wordprocessingDocumentType).
Z.J.
Marked as answer by Lanqing Brownell (Microsoft Employee), January 18, 2010, 5:00 AM
Unmarked as answer by Lanqing Brownell (Microsoft Employee), January 18, 2010, 5:04 AM
-
November 17, 2009, 1:40 PM
I'm using the following code, and it fails with "The OpenXmlPackage.Validate method found an error in the document." If I open the same XML with Word, it opens without errors or warnings.
string xmlText;
.
.
.
byte[] xmlBytes;
MemoryStream xmlStream;
xmlBytes = (new UTF8Encoding()).GetBytes(xmlText);
xmlStream = new MemoryStream();
doc = WordprocessingDocument.Create(xmlStream, WordprocessingDocumentType.Document);
xmlStream.Write(xmlBytes, 0, xmlBytes.Length);
doc.Validate(new OpenXmlPackageValidationSettings());
Ockert -
November 18, 2009, 9:31 AM
To see more detail about the cause of a validation failure, create an event handler method that accepts OpenXmlPackageValidationEventArgs and attach it to the EventHandler event of your OpenXmlPackageValidationSettings instance.
-
November 18, 2009, 3:30 PM
For some reason the WordprocessingDocument object does not parse the XML after the stream is written. Am I missing a step?
Ockert -
November 19, 2009, 4:05 AM
You cannot convert XML to a Word document directly. A Word document is a package with an internal structure. (If you create a document with Word and change the file extension from ".docx" to ".zip", you can see that structure.) So you need to build the internal structure when you create a document.
The following sample shows the idea; I hope it helps.
==============================================================================================
string xmlText;
... ...
byte[] xmlBytes = (new UTF8Encoding()).GetBytes(xmlText);
using (WordprocessingDocument package = WordprocessingDocument.Create(@"d:\test.docx", WordprocessingDocumentType.Document))
{
    MainDocumentPart mainDocumentPart = package.AddMainDocumentPart();
    mainDocumentPart.GetStream().Write(xmlBytes, 0, xmlBytes.Length);
}
===============================================================================================
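The original question asked for a stream rather than a file on disk. The same idea can be sketched against a MemoryStream. The names DocxBuilder and BuildDocx are made up for illustration, and documentXml is assumed to hold a complete w:document element; this is a minimal sketch, not a full solution for documents with styles, images, or other parts.

```csharp
using System.IO;
using System.Text;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;

public static class DocxBuilder
{
    // Hypothetical helper: wraps a complete <w:document> XML string in a
    // .docx package held in memory and returns the package bytes.
    public static byte[] BuildDocx(string documentXml)
    {
        using (var output = new MemoryStream())
        {
            using (WordprocessingDocument package =
                WordprocessingDocument.Create(output, WordprocessingDocumentType.Document))
            {
                MainDocumentPart mainPart = package.AddMainDocumentPart();
                using (var source = new MemoryStream(Encoding.UTF8.GetBytes(documentXml)))
                {
                    // Copy the XML into the main document part.
                    mainPart.FeedData(source);
                }
            } // Disposing the package writes the ZIP structure to 'output'.

            return output.ToArray();
        }
    }
}
```

The returned bytes can be stored in a database or written to a response stream without touching the file system.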
Z.J. -
November 19, 2009, 4:07 AM
Sorry, one thing I should have mentioned: xmlText must be valid Open XML, such as:
string xmlText = "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
+ "<w:document xmlns:wp=\"http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing\" xmlns:a=\"http://schemas.openxmlformats.org/drawingml/2006/main\" xmlns:pic=\"http://schemas.openxmlformats.org/drawingml/2006/picture\" xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\" xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">"
+ "<w:body>"
+ "<w:p w:rsidR=\"00A2180E\" w:rsidRDefault=\"00EC4DA7\">"
+ "<w:r>"
+ "<w:t>t</w:t>"
+ "</w:r>"
+ "</w:p>"
+ "<w:sectPr w:rsidR=\"00A2180E\" w:rsidSect=\"00A2180E\">"
+ "<w:pgSz w:w=\"11906\" w:h=\"16838\" />"
+ "<w:pgMar w:top=\"1440\" w:right=\"1800\" w:bottom=\"1440\" w:left=\"1800\" w:header=\"851\" w:footer=\"992\" w:gutter=\"0\" />"
+ "<w:cols w:space=\"425\" />"
+ "<w:docGrid w:type=\"lines\" w:linePitch=\"312\" />"
+ "</w:sectPr>"
+ "</w:body>"
+ "</w:document>";
Z.J. -
November 19, 2009, 2:59 PM
It is hard to believe that the WordprocessingDocument object has no native way to parse an XML Word document. I'm not sure whether that is on the roadmap. The bottom line is that you need to create each part and stream the content into the newly added part.
The code below performs the basic function. It is by no means complete. The only media type it handles is image parts.
// Document, xmlDoc, nsm, partName, validationSettings and the helpers GetNSM,
// GetPackageName and GetPackageID are defined elsewhere and are not shown here.
Document = WordprocessingDocument.Create(saveFileDialog1.FileName, DocumentFormat.OpenXml.WordprocessingDocumentType.Document);
XmlNode relPart = xmlDoc.SelectSingleNode("pkg:package/pkg:part[@pkg:name='/word/_rels/document.xml.rels']/pkg:xmlData", nsm).FirstChild;
XmlNodeList parts = xmlDoc.SelectNodes("pkg:package/pkg:part", GetNSM(xmlDoc));
MainDocumentPart mainPart = null;
Stream tempStream;
byte[] tempBytes;
string PackageName, id;
foreach (XmlNode part in parts)
{
    switch (part.Attributes["pkg:name"].Value)
    {
        case "/word/document.xml":
            // Add the main document part
            partName = "Main part";
            mainPart = Document.AddMainDocumentPart();
            // Stream the content to the main part
            tempStream = mainPart.GetStream();
            tempBytes = (new UTF8Encoding()).GetBytes(part.FirstChild.InnerXml);
            tempStream.Write(tempBytes, 0, tempBytes.Length);
            break;
        case "/word/settings.xml":
            // Add the settings part
            partName = "Settings part";
            // Get the settings name
            PackageName = GetPackageName(part);
            // Find the settings' id in the relations tag
            id = GetPackageID(relPart, PackageName);
            DocumentSettingsPart settingPart = mainPart.AddNewPart<DocumentSettingsPart>(id);
            // Stream the content to the settings part
            tempStream = settingPart.GetStream();
            tempBytes = (new UTF8Encoding()).GetBytes(part.FirstChild.InnerXml);
            tempStream.Write(tempBytes, 0, tempBytes.Length);
            break;
        case "/word/webSettings.xml":
            // Add the web settings part
            partName = "Web settings part";
            // Get the web settings name
            PackageName = GetPackageName(part);
            // Find the web settings' id in the relations tag
            id = GetPackageID(relPart, PackageName);
            WebSettingsPart webSettingPart = mainPart.AddNewPart<WebSettingsPart>(id);
            // Stream the content to the web settings part
            tempStream = webSettingPart.GetStream();
            tempBytes = (new UTF8Encoding()).GetBytes(part.FirstChild.InnerXml);
            tempStream.Write(tempBytes, 0, tempBytes.Length);
            break;
        case "/docProps/core.xml":
            // Add the core file properties part
            partName = "Core file properties part";
            CoreFilePropertiesPart corePart = Document.AddCoreFilePropertiesPart();
            // Stream the content to the core file properties part
            tempStream = corePart.GetStream();
            tempBytes = (new UTF8Encoding()).GetBytes(part.FirstChild.InnerXml);
            tempStream.Write(tempBytes, 0, tempBytes.Length);
            break;
        case "/docProps/app.xml":
            // Add the extended file properties part
            partName = "Extended file properties part";
            ExtendedFilePropertiesPart extendedPart = Document.AddExtendedFilePropertiesPart();
            // Stream the content to the extended properties part
            tempStream = extendedPart.GetStream();
            tempBytes = (new UTF8Encoding()).GetBytes(part.FirstChild.InnerXml);
            tempStream.Write(tempBytes, 0, tempBytes.Length);
            break;
        case "/word/fontTable.xml":
            // Add the font table part
            partName = "Font table part";
            // Get the font table name
            PackageName = GetPackageName(part);
            // Find the font table's id in the relations tag
            id = GetPackageID(relPart, PackageName);
            FontTablePart fontPart = mainPart.AddNewPart<FontTablePart>(id);
            // Stream the content to the font part
            tempStream = fontPart.GetStream();
            tempBytes = (new UTF8Encoding()).GetBytes(part.FirstChild.InnerXml);
            tempStream.Write(tempBytes, 0, tempBytes.Length);
            break;
        case "/word/styles.xml":
            // Add the style part
            partName = "Style part";
            // Get the style name
            PackageName = GetPackageName(part);
            // Find the style's id in the relations tag
            id = GetPackageID(relPart, PackageName);
            StyleDefinitionsPart stylePart = mainPart.AddNewPart<StyleDefinitionsPart>(id);
            // Stream the content to the style part
            tempStream = stylePart.GetStream();
            tempBytes = (new UTF8Encoding()).GetBytes(part.FirstChild.InnerXml);
            tempStream.Write(tempBytes, 0, tempBytes.Length);
            break;
        default:
            // Add the media parts
            if (part.Attributes["pkg:name"].Value.Contains("/word/media"))
            {
                // Add the image parts
                if (part.Attributes["pkg:contentType"].Value.Contains("image"))
                {
                    partName = "Image part";
                    // Get the image type, e.g. "image/jpeg" -> "Jpeg"
                    string[] imageTypeParts = part.Attributes["pkg:contentType"].Value.Split('/');
                    string imageTypeName = string.Format("{0}{1}", imageTypeParts[1].Substring(0, 1).ToUpper(), imageTypeParts[1].Substring(1));
                    ImagePartType imagePartType = (ImagePartType)Enum.Parse(typeof(ImagePartType), imageTypeName);
                    // Get the package name
                    PackageName = GetPackageName(part);
                    // Find the image's id in the relations tag
                    id = GetPackageID(relPart, PackageName);
                    ImagePart imagePart = mainPart.AddImagePart(imagePartType, id);
                    // Stream the image data to the image part
                    tempBytes = Convert.FromBase64String(part.SelectSingleNode("pkg:binaryData", GetNSM(xmlDoc)).InnerText);
                    tempStream = new MemoryStream(tempBytes);
                    imagePart.FeedData(tempStream);
                }
            }
            // Add the theme
            if (part.Attributes["pkg:name"].Value.Contains("/word/theme"))
            {
                partName = "Theme part";
                // Get the package name
                PackageName = GetPackageName(part);
                // Find the theme's id in the relations tag
                id = GetPackageID(relPart, PackageName);
                mainPart.AddNewPart<ThemePart>(id);
                ThemePart themePart = mainPart.ThemePart;
                // Stream the content to the theme part
                tempStream = themePart.GetStream();
                tempBytes = (new UTF8Encoding()).GetBytes(part.FirstChild.InnerXml);
                tempStream.Write(tempBytes, 0, tempBytes.Length);
            }
            break;
    }
}
Document.Validate(validationSettings);
Document.Close();
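For readers on a later SDK release: the pkg:package format handled above is usually called Flat OPC, and newer versions of the Open XML SDK appear to convert it directly. The sketch below assumes version 2.9.0 or later, where, as far as I know, WordprocessingDocument.FromFlatOpcString is available; check the API reference for the version you use. FlatOpcConverter and ToDocxBytes are made-up names.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;

public static class FlatOpcConverter
{
    // Hypothetical helper: turns a Flat OPC string (the pkg:package XML that
    // Word saves as "Word XML Document") into the bytes of a .docx package.
    // Assumes an SDK version that provides FromFlatOpcString.
    public static byte[] ToDocxBytes(string flatOpcXml)
    {
        using (var output = new MemoryStream())
        {
            using (WordprocessingDocument doc =
                WordprocessingDocument.FromFlatOpcString(flatOpcXml, output, true))
            {
                // The document could be edited here before it is closed.
            }
            return output.ToArray();
        }
    }
}
```

On the CTP discussed in this thread no such method exists, so the part-by-part approach remains the way to go there.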
Ockert
Marked as answer by Ockert Labuschagne, November 19, 2009, 2:59 PM
-
November 20, 2009, 2:58 AM
The Open XML SDK is complex at its current stage; after all, it is only a CTP. I believe it will improve in the future :)
Z.J. -
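November 20, 2009, 3:46 AM
The old validation method, OpenXmlPackage.Validate(), needs an event handler to show detailed information. You can instead try the new validation feature, for example:
string testfile = @"d:\test.docx";
OpenXmlValidator validator = new OpenXmlValidator();
// Validate takes an opened package (or a part or element), not a file path.
using (WordprocessingDocument doc = WordprocessingDocument.Open(testfile, false))
{
    var errors = validator.Validate(doc); // enumerate the errors while doc is open
}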
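To actually see why a document fails, each returned error can be printed. This is a minimal sketch assuming the released SDK 2.0 or later, where OpenXmlValidator.Validate returns ValidationErrorInfo items; ValidationReport and Print are names made up for illustration.

```csharp
using System;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Validation;

public static class ValidationReport
{
    // Hypothetical helper: prints every validation error found in the file
    // and returns the number of errors.
    public static int Print(string path)
    {
        int count = 0;
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, false))
        {
            var validator = new OpenXmlValidator();
            foreach (ValidationErrorInfo error in validator.Validate(doc))
            {
                count++;
                Console.WriteLine("{0}: {1}", error.ErrorType, error.Description);
                // Part and Path locate the offending element inside the package.
                Console.WriteLine("  part: {0}", error.Part?.Uri);
                Console.WriteLine("  path: {0}", error.Path?.XPath);
            }
        }
        return count;
    }
}
```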
December 30, 2009, 2:53 AM
Adding an image seems *MUCH* too complex. I should be able to append an image into a body like a paragraph, or some text, or a run, rather than having to go way up into the Main document just to create an "imagepart". This should NOT exist:
ImagePart imagePart = mainPart.AddImagePart(imagePartType, id);
If I simply want to "paste" an image somewhere in a sequence of items being appended to a body, it should be more like:
Dim I as Image = New Image("C:\MyPic.jpg")
myBody.Append(I)
Done! -
January 7, 2010, 8:26 AM
Hi Carl Cook,
To add an image to a body, we not only need to call MainDocumentPart.AddImagePart(), we also have to append an image reference to the document body. It is somewhat complex. My suggestion is:
1. Create a new document and insert a picture to paragraph with Office Word, then close it.
2. Open the document with Open XML SDK2.0 Productivity Tool for Microsoft Office (which can be downloaded from: http://www.microsoft.com/downloads/details.aspx?FamilyID=C6E744E5-36E9-45F5-8D8C-331DF206E0D0&displaylang=en)
3. Use "Reflect Code" to see how to generate it using SDK2.0
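The first half of that work, storing the image in the package, is short. The sketch below covers only that half; ImageHelper and AddJpeg are made-up names, and the returned relationship id still has to be referenced from drawing markup in the body, which is the part the Productivity Tool can generate.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;

public static class ImageHelper
{
    // Hypothetical helper: stores a JPEG file as an image part and returns the
    // relationship id that the drawing markup in the body must reference.
    public static string AddJpeg(WordprocessingDocument doc, string imagePath)
    {
        MainDocumentPart mainPart = doc.MainDocumentPart;
        ImagePart imagePart = mainPart.AddImagePart(ImagePartType.Jpeg);
        using (FileStream stream = File.OpenRead(imagePath))
        {
            imagePart.FeedData(stream);
        }
        return mainPart.GetIdOfPart(imagePart);
    }
}
```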
I hope this helps. If you have any questions, please let me know.
Thanks,
Lu -
January 7, 2010, 6:25 PM
Do you have this for the other direction too? DOCX -> Word XML?
Windward Reports - World's Greatest SharePoint Reporting & DocGen -
January 15, 2010, 4:48 PM
I have been working on that. The process is to first create the core structure of the XML document:
WordprocessingDocument doc = WordprocessingDocument.Open(<DOCX document path>, true);
private static XNamespace ns_rel = "http://schemas.openxmlformats.org/package/2006/relationships";
private static XNamespace ns_pkg = "http://schemas.microsoft.com/office/2006/xmlPackage";
XProcessingInstruction app = new XProcessingInstruction("mso-application", "progid=\"Word.Document\"");
XDocument part;
OpenXmlPart xmlPart;
XElement FileRelationshipsPart,fileRelationships,DocumentRelationshipsPart,documentRelationships;
X_Document = new XDocument(app
,new XElement(ns_pkg + "package"
,new XAttribute(XNamespace.Xmlns + "pkg", ns_pkg)
,FileRelationshipsPart = new XElement(ns_pkg + "part"
,new XAttribute(ns_pkg + "name", "/_rels/.rels")
,new XAttribute(ns_pkg + "contentType", "application/vnd.openxmlformats-package.relationships+xml")
,new XAttribute(ns_pkg + "padding", "512")
,new XElement(ns_pkg + "xmlData"
,fileRelationships = new XElement(ns_rel + "Relationships"
,new XAttribute("xmlns", ns_rel))))
,DocumentRelationshipsPart = new XElement(ns_pkg + "part"
,new XAttribute(ns_pkg + "name", "/word/_rels/document.xml.rels")
,new XAttribute(ns_pkg + "contentType", "application/vnd.openxmlformats-package.relationships+xml")
,new XAttribute(ns_pkg + "padding", "256")
,new XElement(ns_pkg + "xmlData"
,documentRelationships = new XElement(ns_rel + "Relationships"
,new XAttribute("xmlns", ns_rel))))));
fileRelationships.Add(new XElement(ns_rel + "Relationship"
,new XAttribute("Id", "rId3")
,new XAttribute("Type", "http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties")
,new XAttribute("Target", "docProps/app.xml"))
,new XElement(ns_rel + "Relationship", new XAttribute("Id", "rId2")
,new XAttribute("Type", "http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties")
,new XAttribute("Target", "docProps/core.xml"))
,new XElement(ns_rel + "Relationship"
,new XAttribute("Id", "rId1")
,new XAttribute("Type", "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument")
,new XAttribute("Target", "word/document.xml")));
Then add the XML for each part by iterating through the parts in the document and adding each part's XML to the XML document:
xmlPart = doc.MainDocumentPart;
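The post stops at this point. One way the per-part step could look is sketched below; it is an assumption about the intended approach, not the poster's code. FlatOpcParts and ToPkgPart are made-up names, and the sketch handles only XML parts (binary parts such as images would need a pkg:binaryData element with Base64 content instead).

```csharp
using System.IO;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;

public static class FlatOpcParts
{
    private static readonly XNamespace Pkg =
        "http://schemas.microsoft.com/office/2006/xmlPackage";

    // Hypothetical helper: wraps the XML content of one package part in a
    // pkg:part element that can be added to the pkg:package element.
    public static XElement ToPkgPart(OpenXmlPart part)
    {
        using (Stream stream = part.GetStream(FileMode.Open, FileAccess.Read))
        {
            XDocument content = XDocument.Load(stream);
            return new XElement(Pkg + "part",
                new XAttribute(Pkg + "name", part.Uri.ToString()),
                new XAttribute(Pkg + "contentType", part.ContentType),
                new XElement(Pkg + "xmlData", content.Root));
        }
    }
}
```

A call such as FlatOpcParts.ToPkgPart(doc.MainDocumentPart) would then yield the element for /word/document.xml; the matching Relationship entries still have to be added to the relationships parts built above.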
Ockert -
January 18, 2010, 5:10 AM
Hi Ockert,
A Word document as a whole cannot simply be represented as an XML stream. It uses ZIP technology to pack the individual parts. Is your XML stream in Office 2003 format?
Also, the Open() method on WordprocessingDocument can open a stream and parse its content. The stream must conform to the Open XML file format (Word 2007 and later), though.
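As a small illustration of Open() on a stream, the sketch below reads the body text from a stream that already holds a .docx package (ZIP), not raw XML. DocxReader and ReadBodyText are names made up for this example.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;

public static class DocxReader
{
    // Hypothetical helper: opens a .docx package from a stream read-only and
    // returns the concatenated text of the document body.
    public static string ReadBodyText(Stream docxStream)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docxStream, false))
        {
            return doc.MainDocumentPart.Document.Body.InnerText;
        }
    }
}
```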
Thanks,
--L. -
January 20, 2010, 4:46 PM
Lanqing,
The main advantage of Office 2007 and later for me is that documents can be represented as an XML stream.
The Open XML SDK for Microsoft Office is a great library that provides all sorts of API calls for working on documents, especially version 2.0. Unfortunately, it does not give me a way to manipulate the document the way I need to. My project stores the XML streams of document fragments (each a valid XML representation of a document) in a database. I then need to make small changes to the stored documents as I assemble a set of them into a new document. I have been using the Office object model, but it has proven somewhat unpredictable: Office throws COM errors that are difficult to manage from C#, and it requires Office to run unattended, which is not recommended. Being able to manipulate the XML will give me much more flexibility and should be more reliable.
Thanks
Ockert
Ockert -
July 22, 2010, 4:30 PM
Guys,
Any update on this? I am facing the same problem. Has this been solved in Open XML SDK 2.0?
Noam
-
August 16, 2010, 5:00 PM
A Word 2007 document can be represented as an XML string as a whole.
Ockert
Proposed as answer by Hemant Sir, July 24, 2012, 7:08 AM
Unproposed as answer by Hemant Sir, July 24, 2012, 7:18 AM
-
July 24, 2012, 7:18 AM
A Word XML document can be converted to .docx format. I worked on this recently and found a solution. If you have the Word XML document saved in a database or on your system, load it into an XmlDocument. I did it as follows:
XmlDocument xDoc = new XmlDocument();
xDoc.Load(@"D:\xmlDoc.xml");
XmlNodeList bodycontent = xDoc.GetElementsByTagName("w:body"); // extract the body from the Word XML document
XmlNode body_node = bodycontent[0];
using (WordprocessingDocument mainDocument = WordprocessingDocument.Create(@"D:\RawDoc.docx", DocumentFormat.OpenXml.WordprocessingDocumentType.Document))
{
    MainDocumentPart mainPart = mainDocument.AddMainDocumentPart();
    // Create the document structure and append the extracted body.
    mainPart.Document = new DocumentFormat.OpenXml.Wordprocessing.Document();
    XElement tempBody = XElement.Parse(body_node.OuterXml);
    mainDocument.MainDocumentPart.Document.AppendChild(new Body(tempBody.ToString()));
    mainDocument.MainDocumentPart.Document.Save();
    mainDocument.Package.Flush();
}
Edited by Hemant Sir, August 18, 2012, 9:48 AM

