ScanEmail: Restrict Access in WeasyPrint #459
Closed
Add `local_fetch_only` Function to Restrict External Network Access in WeasyPrint
Description
This PR introduces a custom URL fetcher function, `local_fetch_only`, for WeasyPrint. Its purpose is to prevent any external network access while resources are fetched. It allows only local file paths, base64-encoded data, and relative URLs. All other URLs, including HTTP, HTTPS, and FTP URLs and URLs that point at IP addresses, are blocked. Previously, WeasyPrint was observed making external calls for resources such as CSS files; this should be restricted.
Implementation
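The `local_fetch_only` function is designed to:
- Allow base64-encoded data passed through the `data` scheme.
- Allow local file paths passed through the `file` scheme.
For all other URL schemes (e.g., `http`, `https`, `ftp`), the function returns an empty response, effectively blocking the request.
Code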
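The behaviour described above can be sketched roughly as follows. This is an illustrative sketch, not the code from this PR: the `ALLOWED_SCHEMES` constant, the use of the standard library's `urlopen` for the allowed schemes, and the `text/plain` MIME type on the empty response are all assumptions. It also assumes relative URLs have already been resolved against the document's base URL by the time the fetcher is called, so only absolute URLs reach it.

```python
from urllib.parse import urlparse
from urllib.request import urlopen

# Assumption: only these two schemes are treated as local.
ALLOWED_SCHEMES = {"data", "file"}


def local_fetch_only(url):
    """Hypothetical URL fetcher that never touches the network.

    Returns a dict in the shape WeasyPrint expects from a custom
    url_fetcher: the payload under "string" plus a "mime_type".
    """
    scheme = urlparse(url).scheme.lower()

    if scheme in ALLOWED_SCHEMES:
        # data: and file: URLs are resolved locally by urllib.
        with urlopen(url) as response:
            return {
                "string": response.read(),
                "mime_type": response.headers.get_content_type(),
            }

    # Everything else (http, https, ftp, ...) gets an empty response,
    # which blocks the request without raising an error.
    return {"string": b"", "mime_type": "text/plain"}
```

A fetcher like this would typically be passed to WeasyPrint through the `url_fetcher` argument, for example `HTML(string=html, url_fetcher=local_fetch_only)`. Returning an empty response instead of raising keeps rendering going when a blocked resource is referenced, at the cost of silently dropping that resource.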
Reasoning
The primary motivation for this implementation is security. By blocking external network requests, we ensure that WeasyPrint cannot inadvertently leak data or fetch resources from untrusted sources. Some resources will no longer load as a result, but the overall behavior is safer.
Control
Allowing only local file paths and base64 encoded data provides fine-grained control over the resources that can be accessed. Relative URLs are permitted to ensure that internal resources can still be referenced without specifying the full URL.
Use Cases
Describe testing procedures
An additional test with an `eml` file was created to test the retrieval of external (fake) resources. This test will produce a thumbnail of an image without additional tags.
Sample output
If this change modifies Strelka's output, then please include a sample of the output here.
Checklist