Finding broken links

  • Question

  • User-1374374791 posted

    I recently had to restore a site after a server crash. I know that some PDF files are missing and that they are hyperlinked from multiple pages. How can I use the SEO Toolkit to find out which PDF files are missing?

     Thanks,

     Eric

    Monday, March 19, 2012 10:06 AM

All replies

  • User-1374374791 posted

     I found the problem. If the site uses custom error pages, the broken links do not show up in the SEO Toolkit report.

     

    Eric

    Monday, March 19, 2012 11:10 AM
  • User-47214744 posted

    A couple of ways you might still be able to find them are:

    1) Query for all "*.pdf" URLs that returned an HTML content type. For example, save the following as query.xml and open it with Report->Open Query:

    <?xml version="1.0" encoding="utf-8"?>
    <query dataSource="urls">
      <filter>
        <expression field="ContentTypeNormalized" operator="Equals" value="text/html" />
        <expression field="URL" operator="Ends" value=".pdf" />
      </filter>
      <displayFields>
        <field name="URL" />
        <field name="StatusCode" />
        <field name="Title" />
        <field name="Description" />
      </displayFields>
    </query>

    2) Alternatively, if your error pages are shown via a redirect, you can query for all .pdf URLs that resulted in a redirect:

    <?xml version="1.0" encoding="utf-8"?>
    <query dataSource="urls">
      <filter>
        <expression field="StatusCode" operator="Equals" value="Found" />
        <expression field="URL" operator="Ends" value=".pdf" />
      </filter>
      <displayFields>
        <field name="URL" />
        <field name="StatusCode" />
        <field name="Title" />
        <field name="Description" />
      </displayFields>
    </query>
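
    If you already have a list of the PDF URLs, the same two checks can be approximated outside the SEO Toolkit. The following is only a rough Python sketch, not part of the toolkit: the function names `looks_like_missing_pdf` and `fetch_status` are made up for illustration, and it assumes your custom error page answers either with an HTML content type or with a redirect.

```python
import urllib.error
import urllib.request
from urllib.parse import urlparse

# Assumption: any of these status codes means the server redirected
# the request (for example to a custom error page).
REDIRECT_CODES = {301, 302, 303, 307, 308}


def looks_like_missing_pdf(url, status_code, content_type):
    """Return True when a .pdf URL appears to be answered by an error page.

    Mirrors the two queries above: the URL ends in .pdf and the response
    is either a redirect or has an HTML content type.
    """
    # Only the path is checked, so a query string after ".pdf" is ignored.
    path = urlparse(url).path.lower()
    if not path.endswith(".pdf"):
        return False
    if status_code in REDIRECT_CODES:
        return True
    # Strip parameters such as "; charset=utf-8" before comparing.
    media_type = (content_type or "").split(";")[0].strip().lower()
    return media_type == "text/html"


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the 3xx status stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_status(url, timeout=10):
    """Hypothetical helper: return (status code, Content-Type) for a URL.

    Uses a HEAD request; some servers may not support HEAD.
    """
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status, response.headers.get("Content-Type")
    except urllib.error.HTTPError as error:
        # Unfollowed redirects and 4xx/5xx responses are raised as HTTPError.
        return error.code, error.headers.get("Content-Type")
```

    To use it, call `fetch_status(url)` for each PDF link and pass the result to `looks_like_missing_pdf(url, status, content_type)`; the URLs for which it returns True are the ones worth checking by hand.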
    Monday, March 19, 2012 12:56 PM
  • User-1941904720 posted

    I would suggest the brokenlinkcheck website; it is a free broken link checker. For small websites, you can also use site explorer.

    Tuesday, August 27, 2019 6:32 AM