Search of Content Source stops after 1 second?!

  • Question

  • Hello,

     

    I have a website that runs on port 81. In IIS, under 'Advanced Web Site Settings', I have also configured multiple identities for this web site:

     

    TCP PORT    HOST HEADER VALUE
    81          (none)
    80          (none)
    80          myhomepage

     

    So the homepage is reachable internally at the following addresses:

    http://pcname/

    http://pcname:81/

    http://myhomepage/
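The three addresses follow from the identities configured in IIS. A minimal Python sketch of that mapping, assuming the machine name `pcname` from the post; the function name `binding_urls` and the tuple layout are hypothetical and not an IIS API:

```python
def binding_urls(machine_name, bindings):
    """Return the HTTP URL that each (port, host_header) identity answers on."""
    urls = []
    for port, host_header in bindings:
        # An empty host header means the site answers on the machine name.
        host = host_header or machine_name
        # Port 80 is the HTTP default and is normally omitted from the URL.
        suffix = "" if port == 80 else f":{port}"
        urls.append(f"http://{host}{suffix}/")
    return urls


# Identities from the post: (TCP port, host header value).
BINDINGS = [(81, ""), (80, ""), (80, "myhomepage")]

if __name__ == "__main__":
    for url in binding_urls("pcname", BINDINGS):
        print(url)
```

Only the third identity depends on the name `myhomepage` being resolvable by the client that makes the request.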

     

    When I create a content source of type 'Web Site' with the start address http://pcname:81/ and start the crawl, everything works and I get several hundred successes.

     

    When I enter the start address http://myhomepage/ instead, the crawl stops after one second with no successes?!

     

    What is wrong?

     

    Thanks for any reply!

     

    Regards from Germany,

    Steven***

     

     

    Thursday, February 21, 2008 9:08 AM

All replies

  • Here is what the Crawl Log says:

     

    http://myhomepage
    The specified address was excluded from the index. The crawl rules may have to be modified to include this address. (The item was deleted because it was either not found or the crawler was denied access to it.)

     

    - On my search server I have NO crawl rules!

    - The web site is accessible under http://myhomepage

     

    I have a similar setup on another server, and there it works!

     

     

    Thursday, February 21, 2008 9:23 AM
  •  

    Can you check whether there is a robots.txt on the server, or a JavaScript redirection? Or maybe the redirected page is not in the same domain as the start address.
    Thursday, February 21, 2008 12:39 PM
  • Yes, there is a robots.txt with the following content:

     

    User-agent: *
    Disallow: /

     

    There is no redirection. When you browse to http://myhomepage/, that host name stays the domain for all pages.

     

    But I think these two things should not be a problem, because the same web site also runs on port 81, and there the search works fine!
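For reference, the robots.txt quoted in this post tells every compliant crawler to skip the whole site. A minimal sketch using Python's standard `urllib.robotparser` shows how such a file is evaluated; the user-agent string `ExampleCrawler` is a placeholder, and whether the SharePoint crawler applies the file identically on both ports is an assumption that this thread does not confirm:

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content quoted in the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://myhomepage/", "http://myhomepage/desktopdefault.aspx"):
        print(url, is_allowed(ROBOTS_TXT, "ExampleCrawler", url))
```

A rule of `Disallow: /` matches every path, so both the root and any page below it are refused for all user agents.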

    Thursday, February 21, 2008 1:01 PM
  •  

    I think this is a robots.txt problem. Can you remove the robots.txt and try again?
    Thursday, February 21, 2008 1:05 PM
  • After removing the robots.txt the search stopped after 1 second again!

     

    Could it be that it stops because the content is already in the index? I have already crawled the web site via http://pcname:81/, and now I am crawling the same homepage via http://myhomepage/.

    Thursday, February 21, 2008 1:14 PM
  •  

    They should be treated as different sources even if the contents are the same. Can you use a direct link to the start page as the start address? For example, http://myhomepage/index.php or something like that.
    Thursday, February 21, 2008 1:17 PM
  • Yes, I've tried the start page http://myhomepage/desktopdefault.aspx, but there are no results?!

     

    Thursday, February 21, 2008 1:22 PM
  • Is it the same error message as with http://myhomepage/?

    Thursday, February 21, 2008 1:24 PM
  • Yes, it is!

     

    http://myhomepage/desktopdefault.aspx
    The specified address was excluded from the index. The crawl rules may have to be modified to include this address. (The item was deleted because it was either not found or the crawler was denied access to it.)

    Thursday, February 21, 2008 1:26 PM
  •  

    This is quite strange. Can you remove and recreate this content source, then recrawl it? Or maybe server name mapping would be a workaround.
    Saturday, February 23, 2008 7:38 AM
  • Removing and recreating doesn't work!

     

    What do you mean by 'server name mapping'?

     

    Monday, February 25, 2008 9:48 AM
  • You can map http://internal site to http://external.somecompany.com by using server name mapping (SNM). Sometimes it helps.
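Conceptually, a server name mapping rewrites the address prefix the crawler used into the prefix shown in search results. The sketch below only illustrates that idea in Python; `apply_mapping` is a hypothetical helper, not a SharePoint API, and mapping http://pcname:81/ to http://myhomepage/ is an assumed example built from the addresses in this thread:

```python
def apply_mapping(url, crawled_prefix, displayed_prefix):
    """Rewrite a crawled URL so that results show the displayed prefix."""
    if url.startswith(crawled_prefix):
        return displayed_prefix + url[len(crawled_prefix):]
    # URLs outside the mapped prefix are left unchanged.
    return url


if __name__ == "__main__":
    print(apply_mapping("http://pcname:81/desktopdefault.aspx",
                        "http://pcname:81/", "http://myhomepage/"))
```

Under that assumption, the crawl would keep using the address that already works, while users would see the host header address in their results.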

    Tuesday, February 26, 2008 3:47 AM
  •  

    You mean that I first set up the mapping and then crawl the content source again?!

     

    I will try...

     

    Thanks for help!

    Tuesday, February 26, 2008 8:54 AM
  • No success! The search still stops after a short time with zero successes...

     

    I don't understand the problem: I can open the homepage in a web browser at http://myhomepage/, but the search doesn't work?!
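One thing worth ruling out, since a browser on one machine can succeed where the crawler fails: the host header name must also resolve, and the page must answer, on the machine that runs the crawl. The Python sketch below is only a diagnostic idea, meant to be run on the search server; `resolves` and `probe` are hypothetical helper names, and nothing in this thread confirms that name resolution is the cause:

```python
import socket
import urllib.error
import urllib.request


def resolves(host_name):
    """Return True if host_name resolves to an address on this machine."""
    try:
        socket.gethostbyname(host_name)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def probe(url, timeout=5):
    """Return the HTTP status code for url, or a short error description."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 401 or 404.
        return err.code
    except (urllib.error.URLError, OSError) as err:
        # No usable answer at all: DNS failure, refused connection, timeout.
        return f"error: {err}"


if __name__ == "__main__":
    for host in ("pcname", "myhomepage"):
        print(host, "resolves:", resolves(host))
    for url in ("http://pcname:81/", "http://myhomepage/"):
        print(url, "->", probe(url))
```

If `myhomepage` fails here while `pcname` succeeds, the difference lies in name resolution or access from the search server rather than in the content source itself.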

     

    Tuesday, February 26, 2008 9:00 AM

  • Have you solved the problem? It has been a long time now.
    Saturday, February 12, 2011 10:21 PM