This document describes common edge cases and workarounds for checking links to various sites.
Please add your own findings and send us a pull request if you can.
GitHub has a quite aggressive rate limiter.
If you're seeing errors like:
GitHub token not specified. To check GitHub links reliably, use `--github-token` flag / `GITHUB_TOKEN` env var.
That means you're getting rate-limited. As per the message, you can make lychee
use a GitHub personal access token to circumvent this.
For more details, see the "GitHub token" section in README.md.
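As a minimal sketch, the token can be supplied through the `GITHUB_TOKEN` environment variable named in the error message. The token value and the `README.md` input are placeholders, and the `command -v` guard is only there so the snippet is safe to paste on a machine where lychee is not installed.

```shell
# Placeholder value: substitute your own GitHub personal access token.
export GITHUB_TOKEN="your-token-here"

# README.md is a placeholder input file.
if command -v lychee >/dev/null 2>&1; then
  # lychee reads GITHUB_TOKEN from the environment; the equivalent
  # flag form would be: lychee --github-token "$GITHUB_TOKEN" README.md
  lychee README.md
else
  echo "lychee not found in PATH" >&2
fi
```

Exporting the variable keeps the token out of the command line itself, which is usually preferable in CI logs.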
The number of concurrent network requests (`MAX_CONCURRENCY`) is set to 128 by default.
Every network request maps to an open socket, which is represented as a file on UNIX systems.
If you see error messages like "error trying to connect: tcp open error: Too
many open files (os error 24)" then you ran out of file handles.
You have two options:
- Lower the concurrency by setting `--max-concurrency` to something more conservative like 32. This works, but it also comes with a performance penalty.
- Increase the maximum number of file handles. See instructions here or here.
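The second option can be sketched in a POSIX shell with the standard `ulimit` builtin. The target of 4096 is an arbitrary illustration, the change applies only to the current shell session, and it cannot exceed the hard limit reported by `ulimit -Hn`.

```shell
# Illustrative target; pick a value that suits your system.
target=4096

# Current soft limit on open file descriptors for this shell.
current=$(ulimit -Sn)
echo "soft limit: $current, hard limit: $(ulimit -Hn)"

# Raise the soft limit only if it is currently lower than the target.
if [ "$current" != "unlimited" ] && [ "$current" -lt "$target" ]; then
  ulimit -Sn "$target" || echo "could not raise the limit to $target" >&2
fi

echo "soft limit is now: $(ulimit -Sn)"

# Alternative (first option): leave the limit alone and lower concurrency.
# README.md is a placeholder input.
# lychee --max-concurrency 32 README.md
```

Because the new limit is inherited only by processes started from this shell, lychee has to be run from the same session for the change to take effect.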
Some websites don't respond with a `200` (OK) status code.
Instead they might send `204` (No Content), `206` (Partial Content), or
something else entirely.
If you run into such issues you can work around that by providing a custom
list of accepted status codes, such as `--accept 200,204,206`.
Some sites expect one or more custom headers to return a valid response.
For example, crates.io expects an `Accept: text/html` header or else it
will return a 404.
To fix that you can pass additional headers like so: `--header "accept=text/html"`.
You can use that argument multiple times to add more headers.
Or, you can accept all content/MIME types: `--header "accept=*/*"`.
See more info about the Accept header over at MDN.
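A sketch of passing several headers at once, using the `key=value` form shown above. The second header (`accept-language=en`) and the `README.md` input are illustrative assumptions, not something crates.io is known to require; the `command -v` guard only keeps the snippet harmless where lychee is not installed.

```shell
# Build the argument list: each --header flag adds one request header.
# accept-language=en is a hypothetical second header for illustration.
set -- --header "accept=text/html" --header "accept-language=en"

# README.md is a placeholder input file.
if command -v lychee >/dev/null 2>&1; then
  lychee "$@" README.md
else
  echo "lychee not found in PATH" >&2
fi
```

Collecting the flags in the positional parameters keeps quoting intact, so header values containing spaces or `*` are passed through unchanged.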
We use https://github.com/reacherhq/check-if-email-exists for email checking. You can test your mail address with curl:
curl -X POST \
'https://api.reacher.email/v0/check_email' \
-H 'content-type: application/json' \
-H 'authorization: test_api_token' \
-d '{"to_email": "box@domain.test"}'
Some settings on your mail server (such as `SPF` Policy, `DNSBL`) may prevent
your email from being verified. If you have an error with checking a working
email, you can disable this check using the command-line parameter `--exclude-mail`.