Search server is not indexing the content of pdf files

  • Question

  • Hi

    I have installed Search Server 2008 Express (SSE 2008) on our Windows Server 2003 domain controller.

    The Search Server is working well but for some reason it is not indexing the contents of pdf files.

    If I search for a pdf by filename the server will find it. If I search for content within the pdf no results are returned.

    All data (c. 122GB) being crawled is located in a single share on a separate server. The share permissions and security permissions for every folder and file give the domain administrator account full control. The share is also completely open: the Everyone account has full control as well.

    The crawl account is set to the domain administrator account. Nothing is excluded from the search index.

    I added the pdf extension to the list of file types that the server should index. The entry was successful and was listed as 'Adobe Acrobat Document'.

    The metadata is set so that virtually all classifications are included in scopes.

    I have searched the net for solutions, and the resources I have read agree that, so long as the latest version of Adobe Reader is installed, the search server should be able to index the contents of PDF files. The only PDF IFilter information I can find relates to problems dating back to 2007, and even then it seems to apply mostly to x64 systems. Is the IFilter required on a 32-bit (x86) system even if Adobe Reader 9 is installed (and it was installed before SSE was installed)?
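
    For anyone who wants to check this on their own server: as far as Windows itself is concerned, the IFilter for an extension is located by following a chain of registry keys under HKEY_CLASSES_ROOT (extension, then persistent handler, then the IFilter class, then its DLL). Below is a minimal Python sketch of that lookup. It is not something from this thread, the names `find_ifilter` and `winreg_read_default` are made up for illustration, and it only inspects the system-wide registration, so any separate registration that Search Server keeps for itself is not covered.

```python
# IID of the IFilter interface; the persistent handler lists its filter under it.
IID_IFILTER = "{89BCB740-6119-101A-BCB7-00DD010655AF}"


def winreg_read_default(path):
    """Return the default value of HKEY_CLASSES_ROOT\\<path>, or None (Windows only)."""
    import winreg  # imported lazily so this file still loads on other platforms

    try:
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, path) as key:
            value, _value_type = winreg.QueryValueEx(key, "")
            return value
    except OSError:
        # Key or default value does not exist.
        return None


def find_ifilter(extension, read_default=winreg_read_default):
    """Follow extension -> persistent handler -> IFilter CLSID -> DLL path.

    Returns a dict; a step that could not be resolved is left as None.
    """
    result = {"handler": None, "filter_clsid": None, "dll": None}

    handler = read_default(extension + "\\PersistentHandler")
    if not handler:
        return result
    result["handler"] = handler

    filter_clsid = read_default(
        "CLSID\\" + handler + "\\PersistentAddinsRegistered\\" + IID_IFILTER
    )
    if not filter_clsid:
        return result
    result["filter_clsid"] = filter_clsid

    result["dll"] = read_default("CLSID\\" + filter_clsid + "\\InprocServer32")
    return result
```

    On the server this would be called as `find_ifilter(".pdf")`; a `None` in the result marks the step where the chain breaks. A 32-bit Python on 64-bit Windows may see the 32-bit registry view, so the result should be read with that in mind.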

    If anyone can tell me what I may be doing wrong, or has any useful suggestions, I will be most grateful.

    Thanks

    Thursday, February 12, 2009 12:40 PM

All replies

  • If Adobe Reader 9.0 is installed on the server, it should provide the PDF indexing for which it was previously necessary to install the (6.0) IFilter. Reader 9.0 includes the IFilter functionality.

    After installing a new IFilter, or in this case after installing Adobe Reader 9.0 on the server, you will need to do a new full crawl. Have you done that?
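
    To illustrate why the full crawl matters, here is a toy Python model. It is only a sketch and not how Search Server is actually implemented: the `ToyIndex` class, its per-extension `filters` table and the timestamp check are all assumptions made for the example. The idea is that an incremental crawl skips items that have not changed, so files indexed before a filter existed keep their content-less entries until a full crawl reprocesses them.

```python
import os


class ToyIndex:
    """Toy crawler with per-extension text filters (illustrative only)."""

    def __init__(self):
        self.filters = {}   # extension -> function(raw content) -> text
        self.entries = {}   # path -> (mtime, searchable text)

    def crawl(self, files, full=False):
        """Index `files`, a dict mapping path -> (mtime, raw content)."""
        for path, (mtime, content) in files.items():
            known = self.entries.get(path)
            if not full and known is not None and known[0] == mtime:
                continue  # incremental crawl: unchanged item is skipped
            extension = os.path.splitext(path)[1].lower()
            extract = self.filters.get(extension)
            # Without a filter only the file name becomes searchable.
            body = extract(content) if extract else ""
            self.entries[path] = (mtime, path + " " + body)

    def search(self, term):
        term = term.lower()
        return sorted(
            path for path, (_mtime, text) in self.entries.items()
            if term in text.lower()
        )


files = {"invoice.pdf": (1, "total amount due")}
index = ToyIndex()

index.crawl(files)                              # no PDF filter registered yet
by_name = index.search("invoice")               # found by file name
by_content_before = index.search("amount")      # nothing: content never extracted

index.filters[".pdf"] = lambda content: content  # "install" a filter
index.crawl(files)                              # incremental: file unchanged, skipped
by_content_incremental = index.search("amount")

index.crawl(files, full=True)                   # full crawl reprocesses everything
by_content_full = index.search("amount")
```

    In this model the file is found by name from the start, but its content only becomes searchable after the full crawl that follows the filter installation.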

    P.S. I find it odd that you installed the Adobe Reader on the server before you installed MSSX.

    You might consider an uninstall and a re-install of Reader 9.0 followed by a new full crawl.

    It is also worth checking the Adobe web site to see whether they list anything else you need to do.

    WSS FAQ sites: WSS 2.0: http://wssv2faq.mindsharp.com WSS 3.0 and MOSS 2007: http://wssv3faq.mindsharp.com
    Total list of WSS 3.0 and MOSS 2007 Books (including foreign language titles) http://wss.asaris.de/sites/walsh/Lists/WSSv3%20FAQ/V%20Books.aspx
    Thursday, February 12, 2009 12:49 PM
  • Replying to Mike Walsh MVP:

    Hi Mike

    Thanks for replying.

    Adobe Reader had been installed on the DC for several months before I installed MSSX, which is why I am a little confused that the contents of PDFs are not being indexed. Adobe Reader is installed automatically through Group Policy. I'll uninstall it, do a new installation from Adobe's website, run a new full crawl and let you know how it goes.

    Cheers!
    Thursday, February 12, 2009 2:10 PM
  • I share your confusion. However, over the years there have been several reports of problems with the Adobe IFilters that a reinstallation and a new crawl solved, so it's worth trying.

    Otherwise, the paid PDF IFilter from Foxit

    http://www.foxitsoftware.com/pdf/ifilter/

    is always available to you. It is said to be noticeably faster and also to just work.
    Friday, February 13, 2009 5:14 AM
  • Hi,

    I had the same issue, but I followed the steps in this excellent blog post: http://nickwhite.spaces.live.com/blog/cns!94355F53A65D0989!734.entry and it works now.
    • Proposed as answer by Christian_Fe Monday, March 2, 2009 7:38 AM
    Friday, February 13, 2009 2:11 PM
  • I've followed the instructions in this and other blog posts and I still can't get this running. I'm on Server 2008 x64 with WSS 3.0 and Search Server 2008 Express.

    I installed the 9.0 x64 IFilter from Adobe and still can't find the content of the files in a search. I checked the crawl logs and I can see the PDF files being crawled, but no search on their content returns results.
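
    On an x64 server, one thing that can at least be ruled out is a bitness mismatch, since a 32-bit filter DLL cannot be loaded into a 64-bit process. Whether that is the cause here is not known; the sketch below is just a way to check. It reads the machine field from a DLL's PE header; the function names are made up for illustration, and the path of the IFilter DLL to pass in has to be looked up on the machine in question.

```python
import struct

# Machine values from the PE/COFF header for the two architectures of interest.
MACHINE_NAMES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}


def pe_machine(data):
    """Return the architecture named in a PE image's COFF header."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE image: missing MZ signature")
    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image: missing PE signature")
    # The 2-byte Machine field follows the 4-byte signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, "unknown (0x%04X)" % machine)


def dll_architecture(path):
    """Read a DLL from disk and report its architecture."""
    with open(path, "rb") as handle:
        return pe_machine(handle.read())
```

    If `dll_architecture` reports a 32-bit DLL for the filter registered on a 64-bit search server, that would be worth fixing before looking any further.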


    Monday, August 24, 2009 7:06 PM
  • Gerome42,
    Send me an email at naijacoder@hotmail and I can send you some documentation on this.
    Patrick
    Tuesday, August 25, 2009 3:44 AM
  • Hello,

     

    I am also facing the same issue.

    I am able to find the PDF files in a search, but I cannot find them by searching for their contents.

    I have IFilter 9 installed on a SharePoint 2007 server.

    Please let me know if you have found something.

    Thanks,

    SP~

    Friday, August 6, 2010 1:33 PM