W3CDom attribute names case sensitivity #981
The change that fixes this is quite simple:
index 81ac932..281e3d7 100644
--- a/src/main/java/org/jsoup/helper/W3CDom.java
+++ b/src/main/java/org/jsoup/helper/W3CDom.java
@@ -124,7 +124,7 @@ public class W3CDom {
// valid xml attribute names are: ^[a-zA-Z_:][-a-zA-Z0-9_:.]
String key = attribute.getKey().replaceAll("[^-a-zA-Z0-9_:.]", "");
if (key.matches("[a-zA-Z_:][-a-zA-Z0-9_:.]*"))
- el.setAttribute(key, attribute.getValue());
+ el.setAttribute(key.toLowerCase(), attribute.getValue());
}
}
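For anyone who wants to see the effect without building jsoup, here is a minimal sketch that uses only the JDK's built-in W3C DOM. The class `AttributeCaseSketch` and the helpers `normalizeKey` and `newElement` are hypothetical names made up for this illustration and are not part of jsoup. The sanitising regexes are copied from the diff above; `Locale.ROOT` is my own addition so that lower-casing does not depend on the default locale.

```java
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class AttributeCaseSketch {

    // Hypothetical helper: same sanitising steps as the diff, plus lower-casing.
    // Returns null when the cleaned key is not a usable XML attribute name.
    static String normalizeKey(String rawKey) {
        String key = rawKey.replaceAll("[^-a-zA-Z0-9_:.]", "");
        if (!key.matches("[a-zA-Z_:][-a-zA-Z0-9_:.]*")) {
            return null;
        }
        return key.toLowerCase(Locale.ROOT);
    }

    // Hypothetical helper: creates a root element in a fresh, empty W3C document.
    static Element newElement(String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element el = doc.createElement(tagName);
            doc.appendChild(el);
            return el;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Without lower-casing, the W3C DOM keeps the name exactly as given,
        // so a lookup by the lower-case name misses.
        Element asIs = newElement("div");
        asIs.setAttribute("ID", "main");
        System.out.println(asIs.hasAttribute("id")); // false

        // With the normalised key, the lower-case lookup succeeds.
        Element lowered = newElement("div");
        lowered.setAttribute(normalizeKey("ID"), "main");
        System.out.println(lowered.hasAttribute("id")); // true
    }
}
```

The W3C DOM built by the JDK treats attribute names as case-sensitive, which is why the lookup only works once the key has been lower-cased before `setAttribute` is called.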
Thanks - this was fixed many, many moons ago! Apologies for the late reply.
And, if the XML parser is used instead of the HTML parser, the attribute names and tag names are output with their original case.
According to the HTML specification (http://w3c.github.io/html-reference/documents.html#case-insensitivity), both tag and attribute names are case-insensitive. However, in the current implementation tag names are converted to lower case, but attribute names are left as-is.
Example HTML:
will make the following test case fail: