Analysis is looping through my site recursively

  • Question

  • User1491546438 posted
    I have a problem with one of my sites: the analysis process is looping through the site recursively. In the report I see the following:


    • http://sitename/filename
    • http://sitename//filename
    • http://sitename///filename
    • http://sitename////filename etc


    So I guess my questions are:

    • How can I stop the analysis tool from doing this?
    • Is this a problem with my site or with the analysis tool?
    • How can I trace where the problem starts?

    I'm guessing that one (or more) of my pages contains a bad link and the analysis tool just keeps spidering. If I go to http://sitename////////default.aspx and view the source, I don't see any links with multiple slashes (//////), so I guess that's why it keeps going...

    Any help, much appreciated.



    Thursday, June 4, 2009 10:05 AM

All replies

  • User-47214744 posted

    Could you find one of those "deep routes", right-click it, and use "View Routes for this Page"?

    It should show the way we got to this page (sort of like a call stack, starting at the bottom). Then double-click each of the entries and see where the link is being used.

    I have a feeling that it might be a redirect (302) that generates an incorrect link in the markup, but please let us know what you see.


    Friday, June 5, 2009 12:46 PM
  • User1491546438 posted

    Thanks, yes it is a redirect that is causing the problem.

    The response headers are: status 302, Location: /page.aspx
    The content body of the response contains some HTML with a link to %2Fpage.aspx
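
    For anyone who wants to check for this pattern outside the tool, here is a minimal Python sketch. It takes the Location header and the body of a 302 response and lists any links in the body that differ from the header as written but become identical to it once percent-decoded. The function name `suspicious_links` and the sample body are made up for illustration; the real markup returned by the site may look different.

```python
from html.parser import HTMLParser
from urllib.parse import unquote


class HrefCollector(HTMLParser):
    """Collects the raw href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def suspicious_links(location, body):
    """Return body links that only match the Location header after decoding.

    Hypothetical helper: `location` is the Location header value and
    `body` is the HTML body of the same 302 response.
    """
    collector = HrefCollector()
    collector.feed(body)
    return [
        href
        for href in collector.hrefs
        if href != location and unquote(href) == location
    ]


# Sample body is an assumption, modelled on the values reported above.
body = '<html><body><a href="%2Fpage.aspx">here</a></body></html>'
print(suspicious_links("/page.aspx", body))  # ['%2Fpage.aspx']
```

    A crawler that parses this body treats %2Fpage.aspx as a different URL from /page.aspx, which fits the behaviour in the report.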

    Tuesday, June 9, 2009 5:40 AM
  • User-47214744 posted

    This is a bug in the ASP.NET redirection logic: it incorrectly encodes the link attribute in the markup, which makes it look like a different URL.

    We will probably disable parsing the markup for 302 responses in future versions, since by now most browsers do not even display the content and just follow the Location header.

    What you can try for now is adding a robots.txt file in the root of your site that includes something like:

    User-agent: iisbot
    Disallow: /*//

    This will tell us not to follow any URLs that contain consecutive slashes (//), which hopefully will only be this case. Either way, when you re-run the analysis you will see an informational entry for each URL we decided not to visit based on robots.txt, which will confirm that. This should prevent the infinite loop.
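
    To get a feel for which paths a wildcard rule like this covers, here is a rough Python sketch. It assumes the common wildcard convention: `*` matches any run of characters, a trailing `$` anchors the end, and the rule otherwise matches as a prefix of the path. Whether the analysis tool matches rules in exactly this way is an assumption, and the helper names `rule_to_regex` and `is_disallowed` are made up for illustration.

```python
import re


def rule_to_regex(rule):
    """Translate a robots.txt path rule into a compiled regex.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end, and the rule otherwise matches as a path prefix.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_disallowed(path, rule="/*//"):
    """True if `path` matches `rule` from its first character."""
    return rule_to_regex(rule).match(path) is not None


for path in ["/filename", "//filename", "///filename", "/dir//filename"]:
    print(path, is_disallowed(path))
# /filename False
# //filename False
# ///filename True
# /dir//filename True
```

    Under these assumed semantics the rule's leading slash consumes one character, so //filename is not matched while ///filename and deeper are. That would still end the loop after one extra level. The informational entries in the re-run report are the reliable way to see what was actually skipped.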

    Let us know if this helped.

    Tuesday, June 9, 2009 12:48 PM