Broken Link Checker Checker

Source code is on Bitbucket

Echoback to see the request headers for this browser

Http Status Codes to see HTTP status codes and have this site return them.

Edge Cases

The x's should show up as broken links

I've encountered lots of strange behaviour whilst doing broken link checking, usually to do with anti-scraping mechanisms. These links are part of the test suite I run against my own tool. They are also here to test and help out other broken link checkers.

These links should not redirect to something else (that is tested below)

2. LinkedIn - working link. Usually returns a 999 status code or hits a security check through Puppeteer

2x. LinkedIn - not working link. Usually returns a 999 status code or hits a security check through Puppeteer

3. Drupal.org - working link

3x. Drupal.org - not working link

4. mouser.co.uk Akamai problem?

4x. mouser.co.uk Akamai problem?

5. element14.com can timeout - webserver security.

5x. element14.com can timeout - webserver security.

6. cert-manager.io/docs Can be strange

6x. cert-manager.io/docsXXX Can be strange

7. zillow.com Hits a captcha

7x. zillow.com Hits a captcha

8. autohotkey.com cloudflare fronted

8x. autohotkey.com cloudflare fronted

9. autohotkey.com/boards cloudflare fronted

9x. autohotkey.com/boards cloudflare fronted

10. rayner.com normal wordpress

10x. rayner.com normal wordpress

11. https://www.amazon.co.uk amazon - blocks HEAD

11x. https://www.amazon.co.uk/XXX

20. https://www.dell.com/support/article/en-au/sln311129/dell-command-update?lang=en

20x. https://www.dell.com/support/article/en-uk/sln311129XXX/dell-command-update?lang=en HEAD doesn't work, GET does

24. https://twitter.com/dave_mateer twitter - working link

24x. https://twitter.com/dave_mateerXXX twitter - not working link but hard to test
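Several entries above note that HEAD is blocked while GET works, and that some sites answer with a non-standard 999. The following is a minimal sketch of how a checker might cope with both, not the method this site's own tool uses. The names `fetch_status`, `classify`, `check_link`, the `BLOCKED_STATUSES` set and the user agent string are all assumptions made for illustration.

```python
import urllib.error
import urllib.request

# Assumption: these statuses mean "the site refused the checker", not
# "the link is broken". 999 is the non-standard code noted for LinkedIn;
# 403 and 429 are common anti-scraping answers.
BLOCKED_STATUSES = {403, 429, 999}


def fetch_status(url, method, timeout=10.0):
    """Send one request and return the final HTTP status code.

    urllib follows redirects itself, so this is the status of the last hop.
    DNS failures and timeouts still raise urllib.error.URLError.
    """
    request = urllib.request.Request(
        url, method=method, headers={"User-Agent": "link-check-sketch/0.1"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx (and 999) arrive as exceptions; the code is what matters.
        return error.code


def classify(status):
    """Map a status code to 'ok', 'blocked' or 'broken'."""
    if 200 <= status < 400:
        return "ok"
    if status in BLOCKED_STATUSES:
        return "blocked"
    return "broken"


def check_link(url, fetch=fetch_status):
    """Try HEAD first, then fall back to GET if HEAD reports an error."""
    status = fetch(url, "HEAD")
    if status >= 400:
        status = fetch(url, "GET")
    return status, classify(status)
```

Reporting "blocked" separately from "broken" keeps anti-scraping responses from being counted as dead links; whether 403 belongs in that set is a judgment call for each checker.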

Internal Redirects to Canonical URL

A canonical URL is your preferred URL.

http://www.brokenlinkcheckerchecker.com/sc/200.html which does a 301 permanent redirect to http://brokenlinkcheckerchecker.com/sc/200.html

Redirects and https and www variants

1. http://davemateer.com/brokenurl which should redirect to https://davemateer.com/404.html

2. http://www.davemateer.com/brokenurl which should redirect twice to https://davemateer.com/404.html

3. https://davemateer.com/brokenurl (canonical) which should return a 404 https://davemateer.com/404.html

4. https://www.davemateer.com/brokenurl which should redirect to https://davemateer.com/404.html
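To verify how many hops a link takes, a checker has to follow redirects one at a time instead of letting the HTTP library do it silently. Below is a rough sketch using only the Python standard library. `fetch_once`, `trace_redirects` and the `max_hops` limit are hypothetical names chosen for this example.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

REDIRECT_STATUSES = {301, 302, 303, 307, 308}


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects automatically."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None declines the redirect, so urllib raises HTTPError
        # for the 3xx response and we can inspect it ourselves.
        return None


def fetch_once(url, timeout=10.0):
    """Return (status, Location header or None) for a single request."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status, None
    except urllib.error.HTTPError as error:
        return error.code, error.headers.get("Location")


def trace_redirects(url, fetch=fetch_once, max_hops=10):
    """Follow redirects hop by hop and return a list of (url, status)."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_STATUSES or not location:
            return hops
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError(f"more than {max_hops} redirects, possible loop")
```

The length of the returned list minus one is the number of redirects, which can be compared with the "should redirect twice" style expectations listed above.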

Internal broken links

Internal broken link to a page /brokenurl that doesn't exist and goes to the global 404

Internal broken link2 with a trailing slash which should go to the 404 as above

Broken link to an asset image .png

Non-Existent Domain Name

Nodomainhere link: a link to a domain name that does not exist
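A non-existent domain never produces an HTTP status at all; the failure happens at DNS lookup. One possible way to report that case separately is sketched below. `domain_resolves` is a hypothetical helper, and the `resolve` parameter exists only so the lookup can be swapped out in tests.

```python
import socket
from urllib.parse import urlsplit


def domain_resolves(url, resolve=socket.getaddrinfo):
    """Return True if the URL's host name resolves, False otherwise."""
    host = urlsplit(url).hostname
    if not host:
        # No host part at all, e.g. a relative or malformed URL.
        return False
    try:
        resolve(host, None)
    except socket.gaierror:
        # Raised by getaddrinfo when the name cannot be resolved.
        return False
    return True
```

A checker could call this before making the HTTP request and report "domain not found" instead of a generic connection error.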

Files

test_image.jpg A 4KB jpg with MIME type: image/jpeg (source wikipedia: https://en.wikipedia.org/wiki/File:Test_image.jpg)

pizigani_10mb.jpg A 10MB jpg with MIME type: image/jpeg (source: https://commons.wikimedia.org/wiki/File:Pizigani_1367_Chart_10MB.jpg)

a17.pdf A 20MB pdf with MIME type: (source: https://www.hq.nasa.gov/alsj/a17/A17_FlightPlan.pdf)

cp43.pdf A 100MB pdf with MIME type: (source: https://cartographicperspectives.org/index.php/journal/article/view/cp43-complete-issue/pdf)

200MB.zip A 200MB zip with MIME type: (source: https://www.thinkbroadband.com/download)

Big Pages

big-html-page.html A large 2.4MB html file with data from the CIA World Fact Book (source: https://corpus.canterbury.ac.nz/descriptions/).

big-html-page2.html A large 38MB html file with repeated data from the CIA World Fact Book (source: https://corpus.canterbury.ac.nz/descriptions/).

Blank Hyperlink

Link with nothing in it, i.e. a href="" blank link
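A blank href is easy to mishandle, because resolving an empty string against the page URL yields the page itself. A small sketch of separating blank links from real ones with the standard library parser follows. `LinkCollector` and `find_links` are names invented for this example, and treating whitespace-only values as blank is an assumption.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the raw href value of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        if "href" in attributes:
            # The value is "" for href="" and None for a bare href attribute.
            self.links.append(attributes["href"])


def find_links(html):
    """Return (list of non-blank hrefs, number of blank hrefs)."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    valid = [href for href in collector.links if href and href.strip()]
    return valid, len(collector.links) - len(valid)
```

Whether a blank link counts as broken is a policy choice; counting it separately lets the checker report it without issuing a request.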

Rate Limits

These pages return a 429 Too Many Requests when requests arrive faster than the stated limit.

/ratelimit/index.html 1 request per second

/ratelimit10s/index.html 1 request every 10 seconds
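A checker that receives a 429 can wait and retry before declaring the link broken. The sketch below honours a numeric Retry-After header when one is present and otherwise backs off exponentially. Whether this site sends Retry-After is not stated here, so that part is an assumption, as are the names `fetch_status_and_retry_after`, `get_with_retry`, `max_attempts` and `default_wait`.

```python
import time
import urllib.error
import urllib.request


def fetch_status_and_retry_after(url, timeout=10.0):
    """Return (status, Retry-After header value or None) for one GET."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.headers.get("Retry-After")
    except urllib.error.HTTPError as error:
        return error.code, error.headers.get("Retry-After")


def get_with_retry(url, fetch=fetch_status_and_retry_after,
                   sleep=time.sleep, max_attempts=4, default_wait=1.0):
    """Retry on 429 and return the last status code seen."""
    status = None
    for attempt in range(max_attempts):
        status, retry_after = fetch(url)
        if status != 429:
            return status
        if attempt == max_attempts - 1:
            break  # out of attempts, give up without sleeping again
        if retry_after and retry_after.isdigit():
            wait = float(retry_after)  # server said how long to wait
        else:
            wait = default_wait * (2 ** attempt)  # 1s, 2s, 4s, ...
        sleep(wait)
    return status
```

If the function still returns 429 after all attempts, the link is best reported as rate limited, not broken.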

Thank you for using Broken Link Checker Checker!!