XPS to .doc or .docx conversion

    Question

  • Word can Save As .xps, but it cannot open .xps files.

     

    Is there ever likely to be a plugin to open .xps directly in Word?

    Are there any other convertors out there?

     

    Thanks

    Michael

    Monday, July 16, 2007 9:32 AM

All replies

  • Most likely it's going to be a plugin, if it's not already in the newer versions of Word. The only way I know to convert right now is to write the program yourself. However, there are some limitations, as XPS supports features (I would imagine) that DOC doesn't, such as absolute positioning. (I'm not sure about that; I'm not very familiar with the Office file formats.)
    Tuesday, July 17, 2007 3:49 PM

  • XPS is a fixed-document (and print/spool file) format, not a flow-document format. Why do you want to open (and/or edit) it in Word? Do you also want to open your PS/PCL files directly in Word? (c;

    I don't think there'll be a plugin for Word from Microsoft, maybe from a third party... but you never know...

    Wednesday, July 18, 2007 10:08 AM
  • Yes, I'm aware it's a fixed document format, etc., and no, I don't need to open PS/PCL files in Word! The question is really directed at anyone at Microsoft who can offer an authoritative reply.

     

    There are going to be many situations where users will want to open and edit documents saved in XPS format.  Being a fixed document format it is paginated and saved for a particular page size and margins but that shouldn't prevent it from being loaded into an editor, modified and repaginated for a different page setup if desired.

     

    My app needs to display various user-generated documents. The new version uses WPF, which can load XPS docs directly. It therefore makes sense for my users to save some of their documents (policies, minutes of meetings and many others) in XPS format. The alternative, which is the way the previous version handled this, is to in-place activate Word and load DOC files. At the moment it looks like we will need to keep both versions, but that means the user has to remember to Save As .XPS and .DOC every time they make a change, with plenty of scope for forgetting and getting the two out of sync. It would make much more sense if they could standardize on .XPS for these particular documents.

     

    As far as I'm aware, there are XPS to PDF convertors and PDF to DOC convertors but that doesn't really make life any easier.

     

    There is no reason that I can think of as to why XPS docs shouldn't be editable.  It would be nice to know from someone at Microsoft if they have any plans for a plugin.  The more people use WPF and XPS, the more this question will get asked.

     

    Thanks

    Michael

     

     

    Wednesday, July 18, 2007 2:32 PM
  • I'm also interested in direct conversion. I’ve found two programs that may help, but my experimentation proved them to be more cumbersome than the standard copy and paste. I’m sure you’re aware that you can copy and paste into Word from XPS and MDI. The programs, should you want to check them out:

    http://nixps.com/download.html

    http://www.html-to-pdf.net/xps-library.aspx

     

    Friday, July 27, 2007 3:28 AM
  • Thanks for the links. I'll keep an eye on them, but I'm really looking for interoperability with Word. I might consider using .rtf, but that is likely to confuse my users as well. I just wonder why Microsoft remains silent on this: a) 'no, it will never happen' or b) 'maybe in a service pack or future version' would help me plan a strategy.

     

    Thanks

    Michael

    Tuesday, July 31, 2007 2:55 PM

  • Why not contact xpsinfo@microsoft.com and ask them directly?
    If you get an answer, please post it here for everyone else as well. Thanks.

    Wednesday, August 01, 2007 4:21 PM
  • I am looking for a way to export a WPF visual object to DOC/DOCX format. Failing that, is there any way to convert XPS into DOC/DOCX?
    I noticed that all the posts here are more than a year old. I was hoping that Microsoft would have made some progress on this by now.

    Is anyone aware of any way to convert XPS into DOC/DOCX?
    • Proposed as answer by JBrophy Thursday, December 18, 2008 7:27 PM
    Tuesday, December 16, 2008 7:48 PM
  • Not an official Microsoft answer, but I've been trying to do this kind of thing for years with earlier formats (PDF, PCL, RTF, HTML, XSL-FO, etc.), and the short answer is: it can be done, but only in a way that makes it useless for all but maybe 2 people in the entire country.

    The problem is that XPS is fixed layout, which is only a hair's breadth away from being a picture of the document, similar to a scanned fax copy of the document. And in fact, if you read up on how XPS represents the document, each page is essentially just that: a picture. Of course it's a picture that might be drawn using some high-level routines provided by XAML and WPF, but it's still just a picture.

    Word's DOCX format on the other hand is an editable format - which includes conceptual layout ideas such as: margins, paragraphs, line-wrapping, repeating headers and footers, etc.

    The primary means to produce an XPS document today is to manually use the XPS printer driver from Microsoft.  Of course there are more and more places out there writing direct-to-XPS tools, but none of those are really mainstream yet.

    And since most of the document sources that are printing into XPS are pre-Vista, they will be using the old Windows printing API to draw each page content using the old Windows GDI graphics commands.  And for some software (Adobe Reader specifically) for which the Windows GDI is insufficient, they do their own rendering of each page down to a bitmap - and then send that bitmap to the GDI printer.  The XPS can only contain the information that was sent to it via the printer driver:  All graphics drawing commands.

    As a result, an XPS "document" contains no knowledge of those layout concepts mentioned above - and does not even have a means by which they can be specified.

    As I stated to begin with, you can convert an XPS document to DOCX (probably by writing your own tool), but the DOCX will contain only the same information that was contained in the XPS - which means that there are no margins, no paragraphs, no headers, no footers, and possibly no text.  If the XPS was produced with each page as a bitmap (such as when printed by Adobe Reader) - then each page of the DOCX will simply contain one page-sized image.

    Even if the XPS was created to contain text, each word of the text will probably be positioned independent of the remaining text.   The best this could be represented in Word would be to have each text word (or possibly line of text) placed in a separate Text Box with a fixed location on the page.

    In any case, because the XPS is missing most of the layout metadata that Word needs, even if you could get an accurate conversion from XPS back to DOCX, the resulting document could be viewed or printed from Word, but not edited.

    And since you can already view or print XPS, the only reason to convert to DOCX is for editing.  This is why nobody bothers.

    Alternatively, if all you want is the document content and you couldn't care less about the layout, you could write a conversion tool that simply extracts the text (if present) and creates a simple TXT file that can be loaded into Word. Of course, since you've lost all the layout information, it probably won't be useful without a lot of manual editing and layout work.
    Thursday, December 18, 2008 7:47 PM
  • There are third party products that do a decent job of converting from fixed formats (like XPS and PDF) to flow formats (like Office Open XML). An illustrative example is PDF Convertor 5 from Nuance.

    Typically, there are three levels of behavior that a conversion application can adopt. It can convert the fixed format 'as seen', resulting in a flow document that may or may not have content in the same order as you'd read it. It can take advantage of additional document structure provided in the fixed format file (i.e. non-visible metadata describing how content is related, which is used by, among other things, screen readers for accessibility) to help ensure the converted content flows together correctly. And it can use OCR-like techniques to identify regions of the document (footers, columns, text blocks, etc.) to regenerate flow structure.
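
    As a rough illustration of the first ('as seen') level, here is a minimal sketch. It assumes the positioned text fragments have already been pulled out of the fixed page as (x, y, text) tuples; the function name `runs_to_lines` and the `y_tolerance` parameter are hypothetical and do not come from any product mentioned in this thread.

```python
def runs_to_lines(runs, y_tolerance=2.0):
    """Group (x, y, text) fragments into text lines in naive reading order.

    Fragments whose y position is within y_tolerance of the first fragment
    on the current line are treated as belonging to that line.
    """
    lines = []  # each entry: [y_of_first_fragment, [(x, text), ...]]
    # Visit fragments top to bottom, then left to right.
    for x, y, text in sorted(runs, key=lambda run: (run[1], run[0])):
        if lines and abs(y - lines[-1][0]) <= y_tolerance:
            lines[-1][1].append((x, text))
        else:
            lines.append([y, [(x, text)]])
    # Within each line, join the fragments from left to right.
    return [" ".join(text for _, text in sorted(parts)) for _, parts in lines]


fragments = [
    (120.0, 100.5, "World!"),
    (20.0, 100.0, "Hello"),
    (20.0, 130.0, "Second"),
    (90.0, 130.0, "line"),
]
print(runs_to_lines(fragments))
```

    A purely positional pass like this interleaves the columns of a multi-column page, which is one way the result can end up in a different order from how you would read it. That is where the second and third levels come in.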

    /aiddy
    http://blogs.msdn.com/adrianford
    Saturday, December 20, 2008 2:49 AM
  • I have a theory, however immature and inadequate it may be. A.) Get your .XPS file to fit on your screen. B.) Press the "Print Screen" button. Nothing visible happens when you do this, but it adds the picture to your clipboard. C.) Open Microsoft Paint or another picture application. D.) Paste your picture and print it. After you print it, scanner tools are a lot easier to find than conversion tools for this matter. Find a tool that scans documents for editing ( http://office.microsoft.com/en-us/word/HA102548791033.aspx ). E.) When you scan the document into Word, you HAVE to proofread it, because it usually isn't perfect. However, it will retain a vast majority of the wording (for editing) and the pictures, if any.

    Hope this helps,
    Sorry I'm not a MS Pro,

    Josh
    Wednesday, September 02, 2009 8:54 PM
  • There is no need for any converter. You can download an XPS viewer from the Microsoft website.
    Thursday, February 11, 2010 11:08 AM
  • Hi Rahul,

    I tried doing the same, but it was of no use. I booked tickets on a site, and my brother at his office cannot open the .xps file. Downloading the viewer is not a good option either: he cannot install any programs, so it doesn't help.

     

    I guess there should be a way to get it right. I'm not technical, so I'm facing this problem. Please help me with this, as it creates problems every now and then.

     

    Thanks

    Thursday, April 08, 2010 7:39 AM
  • Hi Christ fox,

    Well, there are lots of XPS viewers out there, but I haven't come across a real editor (except maybe one that can edit XAML) that is able to edit XPS the way Word edits a document. As already mentioned, XPS is a fixed (print/spool file) format, which means all positions and text are fixed.
    Think of a paragraph in Word, for example, where Word arranges text on the fly so that it fits into each line, depending on your page size and margins. This is not the case in XPS: everything is fixed (text, position, the (subsetted) fonts used, etc.), hence it's paginated and saved for a particular page size and margin. Of course it would be possible to add more text, but XPS doesn't rearrange text if you insert a line between two others, like Word does; you would have to recalculate lines and positions on your own. And like I said, I haven't come across a real XPS editor that could do this.

    Hope you get an idea of what I mean. By the way, the post from JBrophy already explains it well.
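
    To make the contrast concrete, here is a minimal sketch of what "arranging text on the fly" means. It is only an analogy: it uses Python's standard `textwrap` module and counts characters per line instead of measuring real fonts and margins, and the `reflow` helper is a hypothetical name.

```python
import textwrap


def reflow(text, chars_per_line):
    """Break a paragraph into lines that fit the given line width."""
    return textwrap.wrap(text, width=chars_per_line)


paragraph = (
    "Word arranges text on the fly so that it fits "
    "the current page width and margins."
)

# The same paragraph, laid out for two different "page widths".
for width in (40, 25):
    print(f"--- {width} characters per line ---")
    print("\n".join(reflow(paragraph, width)))
```

    A flow format stores only the paragraph and recomputes the line breaks whenever the width changes. A fixed format such as XPS stores the already-broken lines with their positions, so nothing is recomputed when you edit.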

    Jo

     

    Monday, August 23, 2010 10:55 AM
  • Thanks for taking the time to provide a clear explanation of why a seemingly simple process I was trying was not working. In other words, simply using CTRL+A to select everything and then pasting into Word or Excel was not coming out right format-wise.

    Thanks again.

     

    Monday, July 04, 2011 2:15 AM
  • If you rename an .XPS file to .ZIP and unpack it, you can find your pages under Documents/1/Pages/#.fpage. Open one of those files and take a look at what it contains: it's a XAML-like layout made mostly of Path and Glyphs objects, with OriginX and OriginY specified for every single Glyphs object, much like on a Canvas in WPF. You might also note that every Glyphs object has a FontUri property pointing to some .odttf font, and an Indices property. The XPS format is designed primarily as a portable fixed-layout snapshot of the document that is guaranteed to be displayed correctly on any machine you open it on, regardless of which fonts are available on that machine. So when Word or the XPS printer creates an XPS document on your local machine, where you have the full fonts available (and some Chinese fonts can be more than 100 MB in size), it inspects all the text in your document and creates a stripped-down version of the fonts that you used. The ODTTF font file is therefore a subset of the original TTF file containing only the glyphs that were used in your document, whereas the full font contains glyphs for every character it supports.
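
    The inspection described above can also be done without renaming the file, because an XPS package is an ordinary ZIP archive. The following is a minimal sketch using Python's standard `zipfile` module; the helper name `list_fixed_pages` and the file name `document.xps` are hypothetical.

```python
import zipfile


def list_fixed_pages(xps_file):
    """Return the names of the page parts (*.fpage) inside an XPS package.

    xps_file may be a path or a binary file-like object.
    The names are sorted alphabetically, which is not necessarily page order
    (for example, "10.fpage" sorts before "2.fpage").
    """
    with zipfile.ZipFile(xps_file) as package:
        return sorted(
            name for name in package.namelist()
            if name.lower().endswith(".fpage")
        )


# Example usage (assumes a file called document.xps exists):
# for name in list_fixed_pages("document.xps"):
#     print(name)
```

    Each returned name can then be read with `ZipFile.read(name)` to get the page markup.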


    Now assume you converted a document containing only the text "Hello World!" at FontSize=20, written in MyFavouriteFont installed on your machine, into XPS format. As output, you get a file with an ODTTF font that contains glyphs from MyFavouriteFont only for the characters H, e, l, o, W, r, d and !, plus a Glyphs object that records the size (20). The XPS also specifies that this string is offset 100 px from the top and 20 px from the left (how it looked on your machine). That is ALL the information contained in the XPS. Yes, it also contains a helper "UnicodeString" property on the Glyphs, but that's it.


    Now think about how this could possibly be converted back to DOCX. You can theoretically get the raw text content by relying on the UnicodeString property values of the Glyphs, but first of all you will lose your fonts if you want to edit, because you have no idea which font the document was originally written in. Also, some algorithm needs to guess how the paragraphs were originally laid out and which parts of the text were tables, because in XPS they all look like a bunch of separate text strings and path objects.
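
    As a sketch of the "raw text" route, the code below reads the UnicodeString, OriginX and OriginY attributes from the Glyphs elements of a single page. The sample markup is a hand-written, simplified page rather than output from a real XPS producer, and the function name `extract_glyph_runs` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified page markup for illustration only.
SAMPLE_PAGE = b"""<FixedPage xmlns="http://schemas.microsoft.com/xps/2005/06"
           Width="816" Height="1056">
  <Glyphs OriginX="20" OriginY="100" FontRenderingEmSize="20"
          FontUri="/Resources/Fonts/example.odttf"
          UnicodeString="Hello World!" Fill="#FF000000" />
</FixedPage>"""


def extract_glyph_runs(fpage_xml):
    """Return (x, y, text) for every Glyphs element that carries text."""
    root = ET.fromstring(fpage_xml)
    runs = []
    for element in root.iter():
        # Tags look like "{namespace}Glyphs"; compare the local name only.
        if element.tag.rsplit("}", 1)[-1] != "Glyphs":
            continue
        text = element.get("UnicodeString", "")
        if text.startswith("{}"):
            text = text[2:]  # "{}" is an escape prefix, not part of the text
        if not text:
            continue  # runs drawn only from Indices have no text to recover
        x = float(element.get("OriginX", "0"))
        y = float(element.get("OriginY", "0"))
        runs.append((x, y, text))
    return runs


print(extract_glyph_runs(SAMPLE_PAGE))
```

    Note what comes back: a position and a string, nothing more. There is no paragraph, table or style information to hand to Word, which is exactly the limitation described above.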

    Hope this helps.


    Wednesday, July 06, 2011 7:49 PM