Processing this item failed because of a Word parser error.

  • Question

  • I'm trying to resolve three instances of this search crawl error:

    Processing this item failed because of a Word parser error. ( Error parsing document 'http:/.../.../[somefile].docx'. Document is not a valid Open XML Word Processing document.; ; SearchID = 5AF2FEF5-8692-4C6B-8BA5-13967702B958 )

    The crawler reads other DOCX files without trouble. Curiously, I can open the problematic documents in Microsoft Word 2010 with no errors. When I inspect a document (Word > Info > Check for Issues), it reports Document Properties, Custom XML, and Headers and Footers, but nothing else. I tried removing the custom XML from one of these documents, uploading it again, and re-crawling it, but the crawler still reported the same error. I also tried "Save As" under a new document title, then uploaded and re-crawled the copy; the same error appeared.

    Has anyone in the community encountered and resolved this kind of crawl error? I would be grateful for your input.


    General

    Thursday, May 7, 2015 3:32 PM

Answers

All replies

  • Did you ever get this resolved? I'm having the same issue and cannot figure out what is wrong. I've even changed all of the search services to run as the farm admin account to rule out permissions as the culprit, but that doesn't fix it. I'm also getting PDF parser and parser timeout errors.
    • Edited by RoboPA Tuesday, July 21, 2015 8:06 PM
    Tuesday, July 21, 2015 8:05 PM
  • Unfortunately not.  The document owners simply deleted the documents.
    Tuesday, July 21, 2015 9:54 PM
  • Hi,

    We have the same problem. The cause is a malformed hyperlink URI in the Word document.

    When you remove the hyperlink, the document is crawled successfully. You can easily check whether a document is affected by editing it in Office Web Apps, which normally reports that the document is corrupt. In Office 365, editing in Word Online shows a message that the document contains malformed hyperlinks.
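
    For a large number of files, opening each one in Office Web Apps is slow. The following is a minimal Python sketch for flagging candidates offline. It relies on the fact that a .docx file is a ZIP package whose external hyperlink targets are stored in the `.rels` parts. The function names and the `looks_malformed` heuristic are assumptions made for illustration; they are not the rules the SharePoint Word parser actually applies, so treat the output as a list of suspects rather than a definitive diagnosis.

```python
import re
import zipfile
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespaces defined by the Open Packaging Conventions / Office Open XML.
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"
HYPERLINK_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
)


def looks_malformed(target):
    """Heuristic only (an assumption, not the crawler's real check)."""
    # Empty targets or targets containing raw whitespace are suspicious.
    if not target.strip() or re.search(r"\s", target):
        return True
    try:
        parts = urlsplit(target)
        parts.port  # raises ValueError for a non-numeric port
    except ValueError:
        return True
    # An http(s) link without a host, e.g. "http:///page".
    if parts.scheme in ("http", "https") and not parts.netloc:
        return True
    return False


def suspicious_hyperlinks(docx):
    """Return (part_name, target) pairs for hyperlinks that look malformed.

    `docx` may be a file path or a binary file-like object.
    """
    found = []
    with zipfile.ZipFile(docx) as package:
        for name in package.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(package.read(name))
            for rel in root.iter(REL_NS + "Relationship"):
                if rel.get("Type") != HYPERLINK_TYPE:
                    continue
                target = rel.get("Target", "")
                if looks_malformed(target):
                    found.append((name, target))
    return found
```

    Calling `suspicious_hyperlinks("somefile.docx")` (a placeholder file name) returns an empty list when nothing looks wrong. Scanning every `.rels` part, rather than only the main document part, also covers hyperlinks in headers, footers, and footnotes.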

    I’m going to open a case because this is a real problem. We have more than 1000 documents that are affected by this problem.

    The same problem occurs in Office 365.

    Regards,

    Johan

    Thursday, May 26, 2016 1:03 PM
  • Hi Johan, 

    Did you get a resolution to this?  I have a large volume of files with this error as well.

    Thanks,

    Brian

    Friday, February 10, 2017 3:02 PM
  • Sorry, forgot to respond.


    Yes, this specific problem with malformed URIs is fixed for SP2013 when you install the September 2016 CU.
    For SharePoint 2016, their current estimate for delivering the fix is Q1 2017.

    https://support.microsoft.com/en-us/help/3118269/september-13,-2016,-update-for-sharepoint-server-2013-kb3118269


    "Word documents and PowerPoint presentations that have invalid hyperlinks aren't searchable."


    Regards,


    Johan

    • Marked as answer by Stephan Bren Monday, February 13, 2017 2:38 PM
    Friday, February 10, 2017 3:14 PM