Excluding Links
lychee lets you selectively check links with the --include and --exclude parameters. Both accept regular expressions, giving you precise control over which links are checked or ignored.
Basic Usage
Here are some basic examples to get you started:
    # Exclude all links that contain "example.com" or "example.org"
    lychee --exclude 'example\.com' --exclude 'example\.org'
    # Same as above, but using a single exclude parameter
    lychee --exclude 'example\.(com|org)'
    # Check only links that contain "twitter.com"
    lychee --exclude '.*' --include 'twitter\.com'
Advanced Examples
Let’s look at some more advanced, real-world scenarios:
    # Exclude links to specific file types
    lychee --exclude '\.(pdf|zip|png|jpg)$'
    # Exclude links to social media platforms
    lychee --exclude '(facebook|twitter|linkedin|instagram)\.com'
    # Check only links within your own domain
    lychee --include '^https?://yourdomain\.com'
    # Exclude links to specific subdomains
    lychee --exclude '^https?://blog\.example\.com'
    # Exclude links with certain URL parameters
    lychee --exclude '\?utm_source='
    # Exclude links to specific sections of a website
    lychee --exclude 'example\.com/blog/\d{4}/'
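To get a feel for what patterns like these select, one option is to run them against a few sample URLs with grep -E before handing them to lychee. The sketch below is only an approximation: the URLs and the preview helper are made up for illustration, and grep -E does not understand \d, so [0-9] is used in its place.

```shell
# Hypothetical sample URLs, one per line (not taken from a real site).
urls='https://example.com/files/report.pdf
https://example.com/blog/2023/launch/
https://example.com/about?utm_source=newsletter
https://example.com/contact'

# Hypothetical helper: print the sample URLs that a pattern matches.
preview() {
  printf '%s\n' "$urls" | grep -E -- "$1"
}

echo "File types:"; preview '\.(pdf|zip|png|jpg)$'
echo "UTM links:";  preview '\?utm_source='
echo "Dated blog:"; preview 'example\.com/blog/[0-9]{4}/'
```

Each call prints only the URLs the pattern would catch, which makes it easy to spot a pattern that is too broad or too narrow.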
Important Notes
- Full URL Matching: The regex is matched against the full URL, including the scheme (http:// or https://). Anchoring the pattern at the scheme keeps the match precise, whereas an unanchored pattern can match anywhere in the URL and may catch more than you intend. For example:
    # This will work:
    lychee --exclude '^https://www\.linkedin\.com'
    # This might not work as expected:
    lychee --exclude 'linkedin\.com'
- Precedence: Includes take precedence over excludes, so a link that matches both an --include and an --exclude pattern is still checked. You can use this to create complex filtering rules.
- Escaping Special Characters: Remember to escape special regex characters like . with a backslash.
- Testing Your Regex: It’s a good idea to test your regex patterns before using them with lychee. You can use online regex testers, grep, or ripgrep for this purpose.
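The precedence rule can be tried out with grep before running lychee. The sketch below is a simplified model for the case where both an include and an exclude pattern are given. It is an assumption made for illustration, not lychee's implementation, and the would_check helper and the URLs are hypothetical.

```shell
# Simplified model of the precedence rule (an assumption for illustration,
# not lychee's code): an include match wins, then an exclude match skips
# the link, and anything else is checked.
would_check() {
  url=$1 include=$2 exclude=$3
  if printf '%s\n' "$url" | grep -Eq -- "$include"; then
    echo "check"
  elif printf '%s\n' "$url" | grep -Eq -- "$exclude"; then
    echo "skip"
  else
    echo "check"
  fi
}

# Exclude all of example.com except its docs section.
would_check 'https://example.com/docs/intro' 'example\.com/docs/' '^https?://example\.com'
would_check 'https://example.com/pricing'    'example\.com/docs/' '^https?://example\.com'
```

The first call prints check because the include pattern matches even though the exclude pattern matches too. The second prints skip because only the exclude pattern applies.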
Permanently Excluding Links
If you have links you always want to exclude, you can put them into a .lycheeignore file in the root of your project:
    https://www.zombo.com/
    # This is a comment, which will be ignored
    https://www.youtube.com/watch?v=dQw4w9WgXcQ
    # Regex is also supported in here
    https?:\/\/(www\.)?reddit\.com\/r\/(funny|videos)
    ^mailto: # Ignore all mailto links
This way, you don’t have to specify them with --exclude every time you run lychee, and you can check them into version control.
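Before committing an ignore file, it can help to preview which URLs its patterns would catch. The sketch below writes a small hypothetical ignore file and URL list to a temporary directory, strips comment and blank lines, and feeds the remaining patterns to grep -E -f. It approximates the matching with grep and does not call lychee. The file contents and URLs are assumptions.

```shell
# Work in a temporary directory so no real project files are touched.
tmp=$(mktemp -d)

# Hypothetical ignore file; the patterns stick to syntax grep -E supports.
cat > "$tmp/.lycheeignore" <<'EOF'
# Site we never check
https://www\.zombo\.com/
# All mailto links
^mailto:
EOF

# Hypothetical URLs found in a project.
cat > "$tmp/urls.txt" <<'EOF'
https://www.zombo.com/
mailto:team@example.com
https://example.com/docs
EOF

# Drop comment and blank lines, then list the URLs the patterns match.
grep -Ev '^[[:space:]]*(#|$)' "$tmp/.lycheeignore" > "$tmp/patterns.txt"
ignored=$(grep -E -f "$tmp/patterns.txt" "$tmp/urls.txt")
printf 'Would be ignored:\n%s\n' "$ignored"

rm -rf "$tmp"
```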
Tips for Effective Link Exclusion
- Start Specific: Begin with specific exclusions and broaden them if needed.
- Use Comments: In .lycheeignore, use comments to explain complex patterns.
- Group Similar Exclusions: Use regex alternation (a|b) to group similar exclusions.
- Review Regularly: Periodically review your exclusions to ensure they’re still relevant.
- Combine with Other Flags: Use exclusions in combination with other lychee flags like --include for fine-grained control.
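When the list of similar exclusions grows, the alternation can be generated from a plain list instead of being typed by hand. This is a minimal sketch, assuming a hypothetical list of site names that all share the .com suffix.

```shell
# Hypothetical list of site names to exclude, one per line.
domains='facebook
twitter
linkedin
instagram'

# Join the names with "|" and wrap them into one grouped pattern.
pattern="($(printf '%s\n' "$domains" | paste -sd '|' -))\\.com"
printf '%s\n' "$pattern"
```

The resulting string could then be passed to lychee as --exclude "$pattern", which keeps the list of names easy to review and extend.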
Troubleshooting
If your exclusions aren’t working as expected:
- Make sure you use single quotes (') around your regex patterns to prevent shell expansion. Depending on your shell, double quotes (") might not work as expected, and single quotes are the only way to ensure the regex is passed to lychee unchanged.
- Check that your regex is correct and escapes special characters properly.
- Use an online regex tester to validate your patterns.
- Ensure you’re matching against the full URL, including the scheme. So if you want to exclude example.com, use ^https?://example\.com.
- Use lychee’s verbose output (-v) to see which links are being checked or excluded.
- For complex setups, consider breaking your check into multiple lychee runs with different exclusion/inclusion rules.
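The quoting advice is easy to verify without lychee. The sketch below uses a hypothetical show helper that prints the argument it receives, so you can see what a program would be handed under each quoting style in a POSIX-style shell.

```shell
# Hypothetical helper: print the argument exactly as the program receives it.
show() { printf '%s\n' "$1"; }

show example\.com      # unquoted: the shell removes the backslash
show "example\.com"    # double quotes: the backslash survives in this case
show 'example\.com'    # single quotes: passed through verbatim
```

The unquoted form arrives as example.com, where the dot matches any character, while the quoted forms keep the escaped dot.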
Remember, the goal is to strike a balance between thorough link checking and avoiding false positives or unnecessary checks. Tailor your exclusions to your specific needs and the structure of your content.