Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inline Images get stripped #51

Closed
mstaack opened this issue Dec 18, 2017 · 15 comments
Closed

Inline Images get stripped #51

mstaack opened this issue Dec 18, 2017 · 15 comments
Assignees

Comments

@mstaack
Copy link

mstaack commented Dec 18, 2017

Using this content:

<img src="
    DJdYTIeanOpT+DOTuANXi/bGOrWj6CONzv2sPjv2CmV1unU4zPgISg6DJnJ3ImTh8Mtbs00aNP1CZSGy0YqLEn47RgXW8amasW
    7XWsmmvX2iuXiwAAAAAEAAQAAAFVyAgjmRpnihqGCkpDQPbGkNUOFk6DZqgHCNGg2T4QAQBoIiRSAwBE4VA4FACKgkB5NGReAS
    FZEmxsQ0whPDi9BiACYQAInXhwOUtgCUQoORFCGt/g4QAIQA7">

and this policy:

	// Define a policy, we are using the UGC policy as a base.
	p := bluemonday.UGCPolicy()

	// Allow images to be embedded via data-uri
	p.AllowDataURIImages()

i get an empty string back.... whats wrong with my policy?

cheers max

@grafana-dee
Copy link
Contributor

There is nothing wrong with the policy, but the image data is truncated or incomplete.

The data itself is parsed as base64 using https://github.com/microcosm-cc/bluemonday/blob/master/helpers.go#L223-L226

But the input is not valid https://play.golang.org/p/-eq8Xsr8Qy :

package main

import (
	"encoding/base64"
	"fmt"
)

func main() {
	s := `R0lGODdhEAAQAMwAAPj7+FmhUYjNfGuxYYDJdYTIeanOpT+DOTuANXi/bGOrWj6CONzv2sPjv2CmV1unU4zPgISg6DJnJ3ImTh8Mtbs00aNP1CZSGy0YqLEn47RgXW8amasW7XWsmmvX2iuXiwAAAAAEAAQAAAFVyAgjmRpnihqGCkpDQPbGkNUOFk6DZqgHCNGg2T4QAQBoIiRSAwBE4VA4FACKgkB5NGReASFZEmxsQ0whPDi9BiACYQAInXhwOUtgCUQoORFCGt/g4QAIQA7`

	if _, err := base64.StdEncoding.DecodeString(s); err != nil {
		// this one fires
		fmt.Println(err.Error())
	} else {
		fmt.Println("parsed as base64")
	}
}

Were you to provide a valid base64 input, it would work (following example based on the unit test):

package main

import (
	"fmt"

	"github.com/microcosm-cc/bluemonday"
)

func main() {
	p := bluemonday.NewPolicy()
	p.AllowImages()
	p.AllowDataURIImages()

	s := `<img src="">`

	// this returns the image
	fmt.Printf("'%s'\n", p.Sanitize(s))
}

@grafana-dee grafana-dee self-assigned this Dec 18, 2017
@mstaack
Copy link
Author

mstaack commented Dec 18, 2017

@buro9 thanks for you feedback, will check base64 on my side ;)

@mstaack mstaack closed this as completed Dec 18, 2017
@mstaack
Copy link
Author

mstaack commented Dec 18, 2017

@buro9 weird
this is my base64 img code:
https://play.golang.org/p/O0j5Dmlvmp so its valid.

but my html response has an empty image tag: <img alt="">

this is my go code:

	// Define a policy, we are using the UGC policy as a base.
	p := bluemonday.UGCPolicy()

	// HTML email is often displayed in iframes and needs to preserve core
	// structure
	p.AllowDocType(true)

	// HTML email frequently contains obselete and basic HTML
	p.AllowElements("html", "head", "body", "label", "input", "font", "main", "nav", "header", "footer", "kbd", "legend", "map", "title")

	p.AllowAttrs("type").Matching(StyleType).OnElements("style")
	p.AllowAttrs("style").Globally()


	// Custom attributes on elements
	p.AllowAttrs("type", "media").OnElements("style")
	p.AllowAttrs("face", "size").OnElements("font")
	p.AllowAttrs("name", "content", "http-equiv").OnElements("meta")
	p.AllowAttrs("name", "data-id").OnElements("a")
	p.AllowAttrs("for").OnElements("label")
	p.AllowAttrs("type").OnElements("input")
	p.AllowAttrs("rel", "href").OnElements("link")
	p.AllowAttrs("topmargin", "leftmargin", "marginwidth", "marginheight", "yahoo").OnElements("body")
	p.AllowAttrs("xmlns").OnElements("html")

	p.AllowAttrs("style", "vspace", "hspace", "face", "bgcolor", "color", "border", "cellpadding", "cellspacing").Globally()

	// HTML email tends to see the use of obselete spacing and styling attributes
	p.AllowAttrs("bgcolor", "color", "align").OnElements("basefont", "font", "hr", "table", "td")
	p.AllowAttrs("border").OnElements("img", "table", "basefont", "font", "hr", "td")
	p.AllowAttrs("cellpadding", "cellspacing", "valign", "halign").OnElements("table")

	// Allow "class" attributes on all elements
	p.AllowStyling()

	// Allow images to be embedded via data-uri
	p.AllowDataURIImages()

	// Add "rel=nofollow" to links
	p.RequireNoFollowOnLinks(true)
	p.RequireNoFollowOnFullyQualifiedLinks(true)

	// Open external links in a new window/tab
	p.AddTargetBlankToFullyQualifiedLinks(true)

	// Read input from stdin so that this is a nice unix utility and can receive
	// piped input
	dirty, err := ioutil.ReadAll(os.Stdin)
	if err != nil {
		log.Fatal(err)
	}

	// Apply the policy and write to stdout
	fmt.Fprint(
		os.Stdout,
		p.Sanitize(
			string(dirty),
		),
	)

@grafana-dee
Copy link
Contributor

Hmmm, if I take your image example from the linked Go Playground... it works for me.
When I use the base64 input within a src="" if does not work.
If I remove the line breaks and whitespace... it works for me.

Do you have whitespace or line breaks in your input data URL content?

@mstaack
Copy link
Author

mstaack commented Dec 18, 2017

ahh yeah...okay
image

@mstaack
Copy link
Author

mstaack commented Dec 18, 2017

but i cant just replace all \n to an empty string

@mstaack
Copy link
Author

mstaack commented Dec 18, 2017

hmm okay fixed the base64 part...
image

but this still gives an empty image tag... weird

@grafana-dee
Copy link
Contributor

This is a valid bug... I'll fix it :)

Line feeds in data URI attributes in HTML5 are valid, so I should be stripping those out beforehand.

@grafana-dee grafana-dee reopened this Dec 18, 2017
@mstaack
Copy link
Author

mstaack commented Dec 18, 2017

@buro9 your work is awesome! will recheck when your done. thanks!

@grafana-dee
Copy link
Contributor

Thanks... do you have a test file btw... with the full input you are using (for this part of the file)... to avoid me copying it from a screenshot :)

@mstaack
Copy link
Author

mstaack commented Dec 18, 2017

@mstaack
Copy link
Author

mstaack commented Dec 18, 2017

@mstaack
Copy link
Author

mstaack commented Dec 18, 2017

both should work

@grafana-dee
Copy link
Contributor

These would still return empty string because image/image/png is not a valid content-type.

But assuming image/png is used I'll make the new line one work too.

@mstaack
Copy link
Author

mstaack commented Dec 18, 2017

yeah you are totally right... will fix that on my side ;)

grafana-dee pushed a commit that referenced this issue Dec 18, 2017
In pre-go1.8 there were bugs with encoding issues that we did not encounter if
we just rejected URLs that contained whitespace. As whitespace is only valid
when data: URLs are used, we will restrict allowing whitespace only to data:
scheme URLs
grafana-dee pushed a commit that referenced this issue Dec 18, 2017
Resolves #51 by permitting spaces in URLs within HTML
nixypanda added a commit to trilogy-group/kayako-bluemonday that referenced this issue Dec 19, 2017
* 'master' of github.com:microcosm-cc/bluemonday: (21 commits)
  Resolves microcosm-cc#51 Adjusted to be safe go pre-go1.8
  Resolves microcosm-cc#51 by permitting spaces in URLs within HTML
  Travis tests go1.1 to go1.9 and tip
  Rename LICENCE.md to LICENSE.md
  Add Go1.9 to Travis CI
  Remove .gitignore.
  Testing on go1.8 and go1.9rc2 tip
  Do not vendor dependencies
  Fixed build conditional for < go1.8
  Fixes 42 by using conditional compilation of tests
  Add center and marquee to whitelist of elements allowed without attributes
  Issue 37 case tag erroneously was 'javascript' not 'script'
  Added tip back to travis with allow failures
  tip is weird sometimes. It's not me, it's you.
  fmt -s, Makefile cleanup
  Resolves microcosm-cc#35
  Updated to reflect recent changes
  Add Go1.7 to testing
  Resolves microcosm-cc#33
  Added Gufran to credits
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants