Some .aspx pages not being crawled
-
Thursday, August 02, 2012 1:29 PM
I have a weird situation where some .aspx pages are not being crawled by SharePoint Search while other .aspx pages in that same library are. The crawl log shows the following warning for the skipped pages: "Content for this URL is excluded by the server because a no-index attribute."
Only about 10% of the .aspx pages are crawled without that warning. Other content (doc, docx, list items, pdf) does not show this problem.
- SharePoint 2007 Enterprise
- ContentSource: SharePoint site: Start address='http://sharepoint.customer.lan' --> Crawl everything under the hostname for each start-address
- 4 server farm: 2x WebFrontEnd (NLB), 1x Crawler, 1x SQL Cluster
Things I have checked so far:
- Crawl rules: Deleted all crawl rules just to be sure
- Search visibility for (sub)site: Enabled; tried setting it to "Always" and "Only if not using fine-grained permissions"
- Search visibility for library: Enabled; display items in this library in search result
- Content Access Account: Account is a valid AD account and has (read) permissions for all items
- Versioning: Items are checked in and published. No approval required
- ContentType and PageLayout are identical between items that are being crawled and items that are not
- Permissions: Identical (pages inherit permissions)
- File extensions: .aspx is included
- Robots meta tags: noindex and nohtmlindex are not present in the page source
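
To make the last check repeatable across many pages, the scan for robots meta tags could be scripted. The snippet below is only a minimal sketch: the function name `find_blocking_directives` and the class `RobotsMetaFinder` are made up for illustration, and it assumes the HTML of each page has already been saved to a string (ideally as served to the content access account, since that can differ from what a browser shows). It only looks at `<meta name="robots">` tags and does not cover other ways a page can be excluded.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots" ...> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        # content is a comma-separated list, e.g. "NOINDEX, NOFOLLOW".
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.append(token)


def find_blocking_directives(html):
    """Return the noindex/nohtmlindex directives found in the page source."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return [d for d in finder.directives if d in ("noindex", "nohtmlindex")]


if __name__ == "__main__":
    # Hypothetical page source, used only to demonstrate the check.
    sample = '<html><head><meta name="ROBOTS" content="NOHTMLINDEX, NOFOLLOW" /></head></html>'
    print(find_blocking_directives(sample))  # ['nohtmlindex']
```

An empty list for a page that still gets the warning would suggest the no-index attribute is coming from somewhere other than a robots meta tag in the markup that was scanned.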
Alain de Klerk
- Edited by Alain de Klerk Thursday, August 02, 2012 1:40 PM

