fix #199: mime-type was incorrectly parsed from content-type when cha… #200
@dportabella thanks for the PR! Love seeing this 😃 Looks like you got hit by checkstyle: https://travis-ci.org/archivesunleashed/aut/jobs/370216592#L4855-L4856
Lines are limited to 80 characters? 😖
@dportabella oh, you can create a new commit and push, and it'll automatically update here. I'll squash things down on merge.
Codecov Report
@@           Coverage Diff           @@
##           master    #200   +/-   ##
=======================================
  Coverage   67.29%  67.29%
=======================================
  Files          32      32
  Lines         636     636
  Branches      124     124
=======================================
  Hits          428     428
  Misses        167     167
  Partials       41      41
Continue to review full report at Codecov.
@lintool can I get one more set of eyes on this before I hit merge?
@@ -96,8 +96,12 @@ public static WARCRecord fromBytes(final byte[] bytes) throws IOException {
   */
  public static String getWarcResponseMimeType(final byte[] contents) {
    // This is a somewhat janky way to get the MIME type of the response.
    // Moreover the parser is not fully complaint to the specification.
I think you might mean "compliant"?
How about - "Moreover, this simple regular expression is not compliant with the specification."
Minor typo, otherwise lgtm.
@dportabella Thanks for this pull request! I think if you incorporate or consider @lintool's comments above (you can just push the changes to the branch and the PR will be updated) we'll be able to merge this into AUT.
done.
Thanks for being a contributor, @dportabella!
welcome :)
fix #199: mime-type was incorrectly parsed from content-type when charset param exists
GitHub issue(s):
If you are responding to an issue, please mention their numbers below.
#199
What does this Pull Request do?
Calling `WarcRecordUtils.getWarcResponseMimeType` with a header such as `Content-Type: text/html;charset=ISO-8859-1` should return `text/html`. However, it was returning `text/html;charset=ISO-8859-1`. Also, the function `keepValidPages` was not working properly because of this.
This pull request fixes this issue.
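For reviewers who want to see the expected behavior in isolation, here is a minimal sketch of stripping parameters from a Content-Type value. It is not the code in this pull request: the class `MimeTypeSketch` and the method `extractMimeType` are hypothetical names, and the sketch assumes the header value has already been separated from the `Content-Type:` field name. It simply keeps whatever precedes the first `;`.

```java
// Hypothetical illustration only; not the implementation merged in this PR.
final class MimeTypeSketch {
  private MimeTypeSketch() {}

  /**
   * Returns the bare media type from a Content-Type header value,
   * dropping parameters such as charset. Returns null when the input
   * is null or contains no media type.
   */
  static String extractMimeType(final String contentType) {
    if (contentType == null) {
      return null;
    }
    // Parameters follow the first ';', so keep only what precedes it.
    final int semicolon = contentType.indexOf(';');
    final String mediaType = semicolon >= 0
        ? contentType.substring(0, semicolon)
        : contentType;
    final String trimmed = mediaType.trim();
    return trimmed.isEmpty() ? null : trimmed;
  }

  public static void main(final String[] args) {
    // Both calls print: text/html
    System.out.println(extractMimeType("text/html;charset=ISO-8859-1"));
    System.out.println(extractMimeType("text/html"));
  }
}
```

Like the parser discussed in the review, this sketch does not attempt to be fully compliant with the specification (for example, it does not validate the type/subtype syntax).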
How should this be tested?
A unit test was added:
WarcLoaderTest.testContentTypeWithCharset
Additional Notes:
Any additional information that you think would be helpful when reviewing this PR.
Example:
Does this change require documentation to be updated?
No
Does this change add any new dependencies?
No
Could this change or impact execution of existing code?
No
Interested parties
Tag (@ mention) interested parties.
Thanks in advance for your help with the Archives Unleashed Toolkit!