Site contents are not crawled with no error SharePoint 2013 Enterprise Search

  • Question

  • Hi All,

    I have just created a new SharePoint farm and deployed a branded site with the variations feature for en (English) and fr (French). A Search service application has been created, and I am trying to crawl the site content. The site will be a public-facing anonymous site, with SSL offloaded at the F5 load balancer.

    E.g. site urls:

    Default site URL: http://hostname/

    Extended Internet Zone URL: http://portal.site.com/ (Inside f5)

    Internet URL: https://portal.site.com/ (Outside f5)

    Configuration:

    SP2013 Build: August 2014, 15.0.4641.1000

    Server: 2*Application server, 2*WFE and 2*DB with SQL AlwaysOn

    Search Topology: both WFE servers host the Query Processing and Index Partition components; the other components are on the application servers.

    • The default content source uses the internal extended-zone URL http://portal.site.com
    • The crawl rules are http://portal.site.com/*.*, http://portal.site.com/en/* and http://portal.site.com/fr/*
    • Server Name Mapping maps the address used for indexing, http://portal.site.com, to the address shown in search results, https://portal.site.com
    • The Pages libraries of both "en" and "fr" are enabled for crawling and for display in search results (Allow items from this document library to appear in search results? Yes; Allow items from this document library to be downloaded to offline clients? Yes).
    • Whenever I start a full crawl, it completes within 2 minutes and crawls only one link, i.e. http://portal.site.com
    • No error or warning messages appear in the crawl logs
    • The site can be browsed from the application server
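
As a rough illustration of what the Server Name Mapping described above is meant to do, the sketch below rewrites the address prefix used for indexing into the address shown in search results. It is a plain Python approximation, not SharePoint's implementation; the function name `apply_server_name_mapping` and the sample page path are assumptions made for the example.

```python
def apply_server_name_mapping(crawled_url, mappings):
    """Replace a crawled address prefix with the prefix shown in results.

    `mappings` is a dict of {address used for indexing: address in results}.
    This only mimics the idea of a prefix rewrite (hypothetical helper).
    """
    for crawl_prefix, display_prefix in mappings.items():
        # Host names are case-insensitive, so compare prefixes in lower case.
        if crawled_url.lower().startswith(crawl_prefix.lower()):
            return display_prefix + crawled_url[len(crawl_prefix):]
    # No mapping applies: the URL is left as crawled.
    return crawled_url


# Mapping taken from the post; the page path below is a made-up example.
MAPPINGS = {"http://portal.site.com": "https://portal.site.com"}
shown = apply_server_name_mapping(
    "http://portal.site.com/en/Pages/default.aspx", MAPPINGS
)
print(shown)
```

The mapping only changes how a URL is displayed; it does not by itself make the crawler follow more links, which matters here because the crawl stopped at a single item.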

    Could you please help me resolve this issue? I have tried a lot of possibilities today without success, so I am turning to you all.

    I guess you may have the resolution at your fingertips. Please share your thoughts; I hope you can spot my mistake or misconfiguration.

    Thanks in advance.

    Regards,

    Sathiya | http://sathiya.io

    Saturday, November 19, 2016 8:22 PM

All replies

  • Hi Sathiya,

    Please modify the URL in your content source to be the default zone URL http://hostname/ and then run a full crawl.

    It is recommended to crawl the default zone in SharePoint:

    https://blogs.msdn.microsoft.com/sharepoint_strategery/2013/02/20/beware-crawling-the-non-default-zone-for-a-sharepoint-2013-web-application/

    Best Regards,

    Victoria

    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com

    Monday, November 21, 2016 7:50 AM
    Moderator
  • Thank you, Victoria. As per your suggestion, I have changed the content source to the default zone URL. I am getting the same kind of response from Search.

    I did an index reset before starting the full crawl of the default zone site. It completed with the http://hostname item only.

    If I add the Server Name Mapping, the crawl covers only the https://portal.site.com root.

    Friday, November 25, 2016 6:56 AM
  • Hi Sathiya,

    Please check crawl log in Search Service Application now to see if the items have been crawled.

    Best Regards,

    Victoria



    Friday, November 25, 2016 7:57 AM
    Moderator
  • Hi Victoria,

    Thanks, I am getting this message:

    The content for this address was excluded by the crawler because this item was marked with a no-index meta-tag. To index this item, remove the meta-tag and recrawl. ( SearchID = BEFA5500-1659-4B4A-AF2C-44E11A80117C )
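
One way to confirm what the crawler is reporting is to look at the HTML of the affected page for a robots meta tag. The sketch below uses only the Python standard library; the class name `RobotsMetaFinder`, the helper `has_noindex`, and the sample markup are hypothetical and are not taken from the site in question.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html):
    """Return True if the page carries a robots meta tag with 'noindex'."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


# Hypothetical markup standing in for the page source of the root site.
SAMPLE = '<html><head><meta name="ROBOTS" content="NOINDEX, NOFOLLOW" /></head></html>'
print(has_noindex(SAMPLE))
```

To test a real page, the HTML could be fetched with `urllib.request.urlopen` under the crawl account's context and passed to `has_noindex`; a tag found this way comes from the page or master page, not from robots.txt.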

    I have also added the entries below to robots.txt, but that did not help.

    User-agent: *
    Allow: /
    Disallow: /_layouts/
    Disallow: /_vti_bin/
    Disallow: /_catalogs/
    Disallow: /temp/
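
A quick way to see whether these robots.txt rules could be blocking the variation pages is to run them through a robots.txt parser. The sketch below uses Python's standard `urllib.robotparser`, whose rule evaluation may differ from the SharePoint crawler's; the user agent string `ExampleCrawler` and the page path are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Allow: /
Disallow: /_layouts/
Disallow: /_vti_bin/
Disallow: /_catalogs/
Disallow: /temp/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if this parser would let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical crawler name and page path.
pages_allowed = is_allowed(
    ROBOTS_TXT, "ExampleCrawler", "http://portal.site.com/en/Pages/default.aspx"
)
print(pages_allowed)
```

If the content pages come back as allowed, robots.txt is unlikely to be the source of the no-index message, which points back to a meta tag in the page itself.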

    Thanks once again.


    Friday, November 25, 2016 8:19 AM
  • Hi Sathiya,

    Please remove all the disallows from the robots.txt and then check if the settings below have been set to Yes for each site:

    After that, please do a full crawl again and then check the crawl log.

    Best Regards,

    Victoria



    Friday, November 25, 2016 8:51 AM
    Moderator
  • Hi Victoria,

    Thanks very much for spending time on my post. But no luck; it was already turned on. Would the screenshots below help to find a solution?

    Note: I tried recreating the Search Service Application from scratch. The content source was supposed to pick up the default web application URL, but in my case that did not happen.
    Friday, November 25, 2016 5:49 PM
  • Hi Sathiya,

    Did the crawl log show the same error in the new Search Service Application?

    And did this issue occur with all the items crawled?

    What is the authentication provider used in the default zone and extended zone for the web application?

    Best Regards,

    Victoria



    Monday, November 28, 2016 4:52 AM
    Moderator
  • Hi Victoria,

    No luck! The same error still persists; that is, only the top-level site URL is crawled.

    Both zones use claims-based authentication with Windows authentication (NTLM) enabled, which is the default configuration.

    Kindly have a look at the things I have tried in order to sort out this issue:

    • Added crawl rules http://hostname/en/pages/* and http://hostname/fr/pages/*
    • Configured server name mapping with crawl URL http://hostname/ mapped to https://portal.site.com/ in the index
    • In farm search administration, "Ignore SSL warnings" is set to No
    • The default content access account has Full Read permission on the web application
    • 4 noderunner.exe processes are running on the crawler machines
    • On the web application side, Search and Offline Availability has been enabled
    • Allow items from this document library to appear in search results - Yes (Pages library of both variation sites)
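
To sanity-check which URLs the wildcard crawl rules in the list above would cover, a small pattern-matching sketch can help. It approximates the `*` wildcard with Python's `fnmatch`, which also treats `?` and `[...]` as special and so is not an exact model of SharePoint crawl rules; the helper name `matching_rules` and the test URLs are assumptions.

```python
from fnmatch import fnmatchcase


def matching_rules(url, rules):
    """Return the rule patterns that match `url`, ignoring case.

    Case is folded on both sides as an assumption that the rules were
    created without case-sensitive matching.
    """
    lowered = url.lower()
    return [rule for rule in rules if fnmatchcase(lowered, rule.lower())]


# Rule paths from the post; the page URL is a made-up example.
RULES = ["http://hostname/en/pages/*", "http://hostname/fr/pages/*"]
page_hits = matching_rules("http://hostname/en/Pages/default.aspx", RULES)
root_hits = matching_rules("http://hostname/", RULES)
print(page_hits, root_hits)
```

In this sketch the English page matches the first rule while the root URL matches none, so the rules alone do not explain why only the root was crawled.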

    I hope this helps you narrow down the issue. I guess there is no issue with the Search Service Application itself. Do you want me to check anything on the web application, such as the variation site settings or search settings?

    Thanks again for your valuable time.

    Tuesday, November 29, 2016 6:59 AM
  • Hi Sathiya,

    Please provide a screenshot of your Alternate Access Mappings settings in Central Administration here for further research.

    And what is the full URL of your Source variation site?

    Best Regards,

    Victoria



    Tuesday, November 29, 2016 12:01 PM
    Moderator
  • Hi,

    Kindly have a look at the screenshots below.

    Source Variation Site Url

    AAM

    Thanks.

    Tuesday, November 29, 2016 6:38 PM
  • Hi Sathiya,

    Please go to Site Settings > Search and Offline Availability > set the "Always index all Web Parts on this site" to true in each site. After that do a full crawl again.

    Best Regards,

    Victoria



    Thursday, December 1, 2016 4:04 AM
    Moderator
  • Hi Victoria,

    It has been enabled on all sites, including the root site. This is what I am getting in the index.

    Thanks again!

    Thursday, December 1, 2016 7:11 AM
  • Hi Sathiya,

    From the screenshot, it seems that you are still crawling the internet zone URL instead of default zone URL in your content source.

    Please change the URL to http://guj-wbapp-v01 in your content source and then do a full crawl.

    Best Regards,

    Victoria



    Thursday, December 1, 2016 9:12 AM
    Moderator
  • Hi,

    I have configured only the default zone URL in the content source, with Server Name Mapping from http://guj-wbapp-v01 to https://wab1.***.ly

    I guessed the server name mapping might be changing the URLs in that search index list, so I removed it and tried again, but no luck!

    Solved: http://sathiya.io/sharepoint/sharepoint-crawling-not-working-with-non-default-zone-public-facing-site.php



    Friday, December 2, 2016 6:46 AM