Sanitizer exception for IMG SRC attribute not being applied #16020
Comments
I'm having the same issue as described by @mjfs to get
|
The bluemonday issue microcosm-cc/bluemonday#51 (comment) suggests that the policy allowing src must be implemented as something like
rather than the straightforward gitea/modules/markup/sanitizer.go Line 114 in b3ef6a6
Issue #3025 suggests that a valid configuration exists, and it includes a request for an example to be added to the docs. It would be great if the solution (now or after a bugfix) were added as an example to https://docs.gitea.io/en-us/external-renderers/#appini-file-configuration (which currently has only a TeX example).
This works for me:
[markdown]
CUSTOM_URL_SCHEMES = data
[markup.docx]
ENABLED = true
FILE_EXTENSIONS = .docx
RENDER_COMMAND = "pandoc --from docx --to html --self-contained"
IS_INPUT_FILE = false
It is not the src attribute that gets blocked, but the data URL. With this configuration the images are present, but Firefox does not render them for me. The standalone pandoc output works, but not when it is embedded into Gitea. That may be a separate problem, though.
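To tell whether the src attribute is missing or whether a data URL is present but broken, one can inspect the rendered HTML directly. The following is a minimal sketch using only the Python standard library. The names `ImgSrcCollector` and `check_images` are made up for illustration and are not part of Gitea or pandoc.

```python
import base64
import binascii
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute (or None) of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <img/>.
        if tag == "img":
            self.sources.append(dict(attrs).get("src"))


def check_images(html):
    """Return one short status string per <img> element found in html."""
    collector = ImgSrcCollector()
    collector.feed(html)
    collector.close()

    results = []
    for src in collector.sources:
        if not src:
            results.append("missing src")
            continue
        if not src.startswith("data:"):
            results.append("not a data URI")
            continue
        header, sep, payload = src.partition(",")
        if not sep or not header.endswith(";base64"):
            results.append("data URI without base64 payload")
            continue
        try:
            # validate=True rejects characters outside the Base64 alphabet.
            base64.b64decode(payload, validate=True)
            results.append("ok")
        except binascii.Error:
            results.append("invalid base64")
    return results
```

Running `check_images` on both the standalone pandoc output and the HTML that Gitea serves should show at which stage the images stop being valid.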
@KN4CK3R: Your proposal does actually produce a non-empty IMG SRC attribute. Unfortunately, the data URI gets corrupted, probably in the sanitizing phase. The result is an invalid image, since the content cannot be Base64-decoded into a valid JPG (or whatever other format was used as input). It appears that the payload is still treated as a regular URI during processing and is therefore shortened (e.g. multiple slashes are reduced to a single one).
The instructions below are not directly related to the open issue, but might be helpful to someone else trying to determine how to use
To avoid composing entire
HTML file
To test it outside Gitea on the command line, you can use the following (with
cat Sample.docx | pandoc --from docx --to html --metadata title=" " --self-contained --template /usr/bin/Blank.html > Sample.html
Instead of the above, one could also cut redundant lines from the
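Why collapsing slashes would break the image can be illustrated in a few lines: "/" is part of the Base64 alphabet, so runs of it are legitimate payload data. The sketch below is only an approximation of the behaviour observed above. The regular expression stands in for whatever normalization actually happens and is an assumption, not Gitea's or bluemonday's code, and the function names are hypothetical.

```python
import base64
import binascii
import re


def collapse_slashes(payload):
    """Mimic a URI normalizer that reduces runs of '/' to a single '/'."""
    return re.sub(r"/{2,}", "/", payload)


def survives_normalization(data):
    """Return True if data can still be recovered after slash collapsing."""
    payload = base64.b64encode(data).decode("ascii")
    damaged = collapse_slashes(payload)
    try:
        return base64.b64decode(damaged, validate=True) == data
    except binascii.Error:
        # The shortened payload is no longer valid Base64 at all.
        return False
```

Binary data containing runs of 0xFF bytes (common near the start of JPEG files) encodes to runs of "/", so `survives_normalization(b"\xff" * 6)` is false, while a payload that happens to contain no such run, for example `b"hello"`, passes through unchanged.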
The problem with some Jupyter files is invalid data URI images. If the input file contains Base64 images whose lines are separated by newlines, the sanitizer drops them, because a data URI must not contain control characters. You may need to convert the Jupyter input or output and strip those newlines. Sample input with
You could use a wrapper script that removes the newlines before passing the file to nbconvert.
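A minimal sketch of the cleaning step such a wrapper could perform is shown here. It assumes the usual notebook JSON layout (cells, each with an outputs list whose entries carry a data mapping keyed by MIME type). The function name `strip_image_newlines` is hypothetical, and SVG is skipped because it is stored as text rather than Base64.

```python
import json


def strip_image_newlines(notebook):
    """Remove whitespace from Base64 image payloads in a notebook dict.

    Assumes the layout cells -> outputs -> data -> {mime type: payload}.
    The notebook is modified in place and also returned.
    """
    for cell in notebook.get("cells", []):
        for output in cell.get("outputs", []):
            data = output.get("data", {})
            for mime, value in list(data.items()):
                if not mime.startswith("image/") or mime == "image/svg+xml":
                    continue
                # Multi-line strings may be stored as a list of lines.
                if isinstance(value, list):
                    value = "".join(value)
                # split() without arguments drops newlines, tabs and spaces.
                data[mime] = "".join(value.split())
    return notebook


def clean_notebook_text(text):
    """Parse notebook JSON text, clean it, and serialize it again."""
    return json.dumps(strip_image_newlines(json.loads(text)))
```

A wrapper would read the notebook, pass its contents through `clean_notebook_text`, and hand the result to nbconvert as before.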
A wrapper will no longer be needed once we upgrade bluemonday (see microcosm-cc/bluemonday#123).
Description
When using an external markup renderer, the sanitizer exception is not being applied. The attribute is consequently removed from the output.
I am using Pandoc to render an Office Open XML document (docx extension). No matter what combination of sanitizer configuration and markup renderer I choose, the data URI value of the src attribute on the img element is always removed from Gitea's final HTML output for any docx file previewed in the browser (i.e. only <img/> remains).
As I understand the Gitea documentation (as well as the cheat sheet), the configuration below should work:
I was not able to find any workaround for this scenario (one that achieves the desired end result) in the documentation, so if another solution is generally used as an alternative for this use case (e.g. externalizing document resources), that will also do.