Crawling issue

  • Question

  • Hi,

     

    We're encountering a strange crawling issue: under one directory, some subdirectories can be crawled and some can't.

     

    For example:

    http://test.com/test/A/

    http://test.com/test/B/

    http://test.com/test/C/

    http://test.com/test/D/

     

    Directories A and B can be crawled, but directories C and D can't.

    The crawl log for C and D shows:

    Access is denied. Verify that either the Default Content Access Account has access to this repository, or add a crawl rule to crawl this repository. If the repository being crawled is a SharePoint repository, verify that the account you are using has "Full Read" permissions on the SharePoint Web Application being crawled. (The item was deleted because it was either not found or the crawler was denied access to it.)

     

    The permissions for the /test/ directory and all of its subdirectories are the same. C and D are not newly created directories; A, B, C, and D all existed before we installed Search Server 2008.

     

    Any help/advice would be highly appreciated!!

     

    JCGarden

     

    Tuesday, November 27, 2007 4:42 PM

All replies

  • Go to Search Administration --> Default Content Access Account and fill in the server administrator's account information.

     

    Monday, December 31, 2007 12:09 PM
  • After wrestling with Search Server for an entire day and reading all the posts, I finally got it to crawl my two intranet sites. To get it to work, I first created the content source, then I created a crawl rule for that content source with http://yoursite/* in the path, using an account that has full access to the SharePoint site. I also selected "Include all items in this path" and "Crawl SharePoint content as HTTP pages."

    So it looks like you need a content source and a complementary rule for the search to work properly. 

     

    I set it to do a full crawl initially, then changed it to an incremental crawl, once a day, after the initial crawl finished.

     

    Hope this helps.

     

    Hank for change.

    Wednesday, May 28, 2008 3:18 PM
  • This looks like an authentication issue. The search server will not crawl a web application on a different machine if that web application uses Kerberos authentication. It will only work with NTLM or basic authentication (or forms or anonymous, but I guess that's not the case here). You could change the authentication type to NTLM to see if it solves your problem, but that's only a short-term solution.
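
    A quick way to check this from another machine is to request the site without credentials and look at which authentication schemes the web application offers in its 401 challenge. The following is only a rough diagnostic sketch in Python, not something from this thread; it reuses the placeholder host from the original question. A site that advertises only "Negotiate" is typically set up for Kerberos, while an "NTLM" entry means NTLM is offered directly.

      # Diagnostic sketch: list the auth schemes a site offers in its 401 challenge.
      # The URL is the placeholder host from the original question.
      import urllib.error
      import urllib.request

      url = "http://test.com/test/C/"

      try:
          urllib.request.urlopen(url, timeout=10)
          print("No challenge; the site allowed anonymous access.")
      except urllib.error.HTTPError as e:
          if e.code == 401:
              # The WWW-Authenticate headers name the accepted schemes,
              # e.g. "Negotiate" (Kerberos) and/or "NTLM".
              print("Offered schemes:", e.headers.get_all("WWW-Authenticate"))
          else:
              print("HTTP error:", e.code)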

    Tuesday, June 10, 2008 7:23 AM
  • JC,

    Have you tried creating a separate crawl rule for the folders that are not being crawled? Give that a shot and see what happens. Make sure that the account you use has rights to those folders.
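
    One way to confirm that outside the crawler is to request each subdirectory with the content access account's credentials and compare the status codes. The sketch below is only an illustration, not something from this thread: it assumes the third-party requests and requests_ntlm Python packages, reuses the example URLs from the original question, and the DOMAIN\crawl_account name and password are placeholders for the real Default Content Access Account.

      # Hedged sketch: probe each subdirectory as the content access account over NTLM.
      import requests
      from requests_ntlm import HttpNtlmAuth

      ACCOUNT = r"DOMAIN\crawl_account"   # placeholder account name
      PASSWORD = "password-here"          # placeholder password

      urls = [
          "http://test.com/test/A/",
          "http://test.com/test/B/",
          "http://test.com/test/C/",
          "http://test.com/test/D/",
      ]

      for url in urls:
          resp = requests.get(url, auth=HttpNtlmAuth(ACCOUNT, PASSWORD), timeout=10)
          # 200 means the account can read the folder; 401 or 403 points to a
          # permissions or authentication problem for that specific path.
          print(url, resp.status_code)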

     

    Tuesday, June 10, 2008 1:38 PM