Saving data from url to docx corrupts its format

  • Question

  • I am fetching data from a URL using DownloadData and saving it to a .doc file using BinaryWriter, which works perfectly fine. If I save the same data to a .docx file, it does not open and Word reports that the format is corrupted. How can I save the URL content to a .docx file without corrupting the format?
    Friday, March 11, 2011 2:26 PM

Answers

All replies

  • Hi himania,

    Thanks for posting in the MSDN Forum.

    Are you writing the data to the Word document via the Word object model? If so, could you show me a snippet that reproduces the issue?

    Have a good day,

    Tom


    Tom Xu [MSFT]
    MSDN Community Support | Feedback to us
    Get or Request Code Sample from Microsoft
    Please remember to mark the replies as answers if they help and unmark them if they provide no help.

    Monday, March 14, 2011 6:01 AM
    Moderator
  • Hi Tom,

    I am not using the Word object model. The code snippet is:

        // Requires: using System; using System.IO; using System.Net;
        string fileName = Path.Combine(activeDir, title);
        string extension = ".doc";

        using (WebClient request = new WebClient())
        {
            request.UseDefaultCredentials = true;
            byte[] fileContent = request.DownloadData(pinnedItem.Url);
            string type = request.ResponseHeaders[Constants.ContentType];

            // Only HTML responses are saved; the downloaded bytes are written unchanged.
            if (type != null && type.StartsWith(Constants.HtmlContent, StringComparison.OrdinalIgnoreCase))
            {
                fileName += extension;
                File.WriteAllBytes(fileName, fileContent);
            }
        }

    Thanks & Regards,

    Himani

    Monday, March 14, 2011 1:15 PM
  • Hi himania,

    As far as I can see, a .docx file is a package containing several XML files, whereas a .doc file is just a single binary file. That is why you can save a .doc file directly from a byte array. If you want to save the content to a .docx file, you need to use the Word object model and parse your HTML content into the .docx document.
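
    To illustrate the package structure described above, here is a minimal sketch, assuming .NET 4.5 or later for System.IO.Compression.ZipArchive. The names MinimalDocx, Write and AddPart are hypothetical. It stores plain text only and does not convert HTML, which still needs Word or another converter.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Security;
using System.Text;

// Hypothetical helper: writes the three parts a minimal .docx package needs.
static class MinimalDocx
{
    const string XmlDecl = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>";

    const string ContentTypes = XmlDecl +
        "<Types xmlns=\"http://schemas.openxmlformats.org/package/2006/content-types\">" +
        "<Default Extension=\"rels\" ContentType=\"application/vnd.openxmlformats-package.relationships+xml\"/>" +
        "<Default Extension=\"xml\" ContentType=\"application/xml\"/>" +
        "<Override PartName=\"/word/document.xml\" ContentType=\"application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml\"/>" +
        "</Types>";

    const string RootRels = XmlDecl +
        "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">" +
        "<Relationship Id=\"rId1\" Type=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument\" Target=\"word/document.xml\"/>" +
        "</Relationships>";

    public static void Write(string path, string text)
    {
        // The main document part: one paragraph holding the escaped text.
        string body = XmlDecl +
            "<w:document xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">" +
            "<w:body><w:p><w:r><w:t xml:space=\"preserve\">" +
            SecurityElement.Escape(text) +
            "</w:t></w:r></w:p></w:body></w:document>";

        using (FileStream fs = File.Create(path))
        using (var zip = new ZipArchive(fs, ZipArchiveMode.Create))
        {
            AddPart(zip, "[Content_Types].xml", ContentTypes);
            AddPart(zip, "_rels/.rels", RootRels);
            AddPart(zip, "word/document.xml", body);
        }
    }

    static void AddPart(ZipArchive zip, string name, string xml)
    {
        ZipArchiveEntry entry = zip.CreateEntry(name);
        // UTF-8 without a byte order mark, matching the XML declaration.
        using (var writer = new StreamWriter(entry.Open(), new UTF8Encoding(false)))
        {
            writer.Write(xml);
        }
    }
}
```

    Calling MinimalDocx.Write(fileName + ".docx", "some text") produces a ZIP archive, which is why raw HTML bytes written under a .docx extension are rejected as corrupt.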

    Have a good day,

    Tom


    Tom Xu [MSFT]

    Wednesday, March 16, 2011 11:04 AM
    Moderator
  • If you are downloading data to a file, you do not have control over the format of that data - if it's in .doc format the file will be a .doc file, if it's in .docx format the file will be a .docx file.

    If you want to convert the file from .doc format to .docx format, you would be best to use Word. If all you want to do is make sure you save with the right extension, you can check the file signature. It's not a perfect solution, but if you know it's a Word document to begin with, it will allow you to distinguish .doc from .docx.
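
    A signature check along those lines might look like the sketch below. The names WordSignature and GuessExtension are hypothetical. It relies on .docx files being ZIP archives (starting with "PK") and binary .doc files being OLE compound files. As noted above, the check is imperfect: other ZIP-based or OLE-based formats carry the same leading bytes.

```csharp
using System;

// Hypothetical helper: guesses a Word extension from the leading bytes.
static class WordSignature
{
    // ZIP local file header, used by .docx (and every other ZIP file).
    static readonly byte[] Zip = { 0x50, 0x4B, 0x03, 0x04 };

    // OLE compound file header, used by binary .doc (and .xls, .ppt, ...).
    static readonly byte[] Ole = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // Returns ".docx", ".doc", or null when neither signature matches.
    public static string GuessExtension(byte[] data)
    {
        if (StartsWith(data, Zip)) return ".docx";
        if (StartsWith(data, Ole)) return ".doc";
        return null;
    }

    static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data == null || data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

    For the downloaded HTML bytes in this thread, GuessExtension returns null, which shows that the data is in neither Word format.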


    Enjoy,
    Tony
    www.WordArticles.com
    Wednesday, March 16, 2011 11:41 AM
  • Hi Tom Xu,

    Could you please post some sample code that parses the content of an HTML URL and saves it as a .docx file?

    Regards,

    Himani

    Tuesday, March 22, 2011 2:53 PM
  • Hi Tony,

    I am not downloading a file. I am downloading the content of a URL and saving that content as a Word .docx document.

    Regards,

    Himani

    Tuesday, March 22, 2011 3:59 PM
  • What do you mean by the content of a url, if not a file? A stream? Whatever it is, it is in a format decided by whoever put it there. If you want to change the format you must use software that understands what you are changing from, and to.

     


    Enjoy,
    Tony
    www.WordArticles.com
    Tuesday, March 22, 2011 6:03 PM
  • Hi Tony,

    I mean, for example, that we have the URL of a Wikipedia page. I need to get the content of that page and save it in a .docx file.

     

    Regards,

    Himani

    Wednesday, March 23, 2011 4:39 PM
  • How do you convert that to a .doc file at the moment? I would have thought the only practical way would be to use Word, and I would suggest the same approach to create a .docx, although you might be able to write an XSLT transform to do the job.
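
    Using Word for the conversion could be sketched as follows. This assumes Word 2010 or later is installed, the project references the Microsoft.Office.Interop.Word assembly, and the downloaded HTML has already been saved to a local file. HtmlToDocx and Convert are hypothetical names, and error handling is kept to a minimum.

```csharp
using System.IO;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper: lets Word import an HTML file and save it as .docx.
static class HtmlToDocx
{
    public static void Convert(string htmlPath, string docxPath)
    {
        var app = new Word.Application();
        Word.Document doc = null;
        try
        {
            // Word's own HTML import does the parsing; absolute paths are required.
            doc = app.Documents.Open(FileName: Path.GetFullPath(htmlPath), ReadOnly: true);

            // wdFormatXMLDocument is the .docx (Open XML) format.
            doc.SaveAs2(FileName: Path.GetFullPath(docxPath),
                        FileFormat: Word.WdSaveFormat.wdFormatXMLDocument);
        }
        finally
        {
            if (doc != null)
            {
                doc.Close(SaveChanges: Word.WdSaveOptions.wdDoNotSaveChanges);
            }
            app.Quit();
        }
    }
}
```

    With the code from earlier in the thread, the downloaded bytes would first be written to a temporary .html file and that path passed to Convert.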
    Enjoy,
    Tony
    www.WordArticles.com
    Thursday, March 24, 2011 7:58 AM
  • Many hours ago, I clicked on "Report As Abuse" - one click (which could easily be done by mistake) and nothing appeared to happen. Nothing has continued to happen and I am forced to wonder whether my click actually did anything at all; I have just clicked it again and, once again, nothing appears to have happened. Does anything happen? Does anybody care?

     


    Enjoy,
    Tony
    www.WordArticles.com
    Thursday, March 24, 2011 5:24 PM
  • Hi Tony,

    This same attack hit a bunch of threads and I saw it marked as Already Reported in those. In this thread the Report As Abuse link was still active and I clicked to report. After doing so, I got the same behavior as you've seen ... nothing marked.

    Maybe they're working on it. :-)


    Regards
    Thursday, March 24, 2011 5:37 PM
    • Marked as answer by himania Thursday, November 10, 2011 10:29 AM