Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add feature to allow for parsing of self closing script tags #124

Merged
merged 2 commits into from
Oct 30, 2024

Conversation

Seanstoppable
Copy link
Contributor

I discovered that my company was maintaining a fork to handle unclosed script tags.
Since functionality exists to enable handling for unclosed iframes and other unclosed tags, this brings in a similar feature to close that gap.

@rbri
Copy link
Member

rbri commented Oct 28, 2024

Thanks @Seanstoppable will have a look....

@rbri
Copy link
Member

rbri commented Oct 29, 2024

Hi @Seanstoppable,

i'm more in favor to remove features than adding new ones ;-). Can you tell me please a bit about the motivation behind this?

@Seanstoppable
Copy link
Contributor Author

Sure, absoluately understand.

I am currently maintaining a bunch of crawlers. At one point, we apparently decided we wanted to capture self closing script tags in our parsing (despite them being non-compliant). In these cases, we are still attempting to parse the attributes, in particular the script src attribute. We have a fork for essentially these lines, which I am attempting to bring up to date. Rather than maintaining a fork for these lines, I am hoping to contribute back so we can go back to using official published jars.

Since self-closed iframes are also supported (despite also not being compliant), I was hoping this would bring parity for this tag and a reasonable addition.

@rbri
Copy link
Member

rbri commented Oct 29, 2024

Ok, sounds reasonable - and having more users is always good for a product ;-)

It will be great, if you can change the way you wrote the tests. Please remove the added junit tests and instead create some test cases like we have for the iframe feature.

Tests are usually done by placing an html file, an optional settings file (to enable some features) and files for the expected results in the correct folder. And the file names have to start with 'test-'.

You can use this bunch of files as sample
image

And feel free to add as many test cases as you like - if required i will add similar tests to the iframe stuff.

Thanks a lot for bringing this up.

@Seanstoppable
Copy link
Contributor Author

Absolutely. Tests added.
One of them is failing, the frg one. It is saying there is an extra newline being generated (but ONLY for this test), which has left me scratching my head.

@rbri
Copy link
Member

rbri commented Oct 30, 2024

@Seanstoppable will merge this and have a look

@rbri rbri merged commit eed52ca into HtmlUnit:master Oct 30, 2024
2 checks passed
@rbri
Copy link
Member

rbri commented Oct 30, 2024

@Seanstoppable fixed (see commit) and made a new snapshot build available.

Thanks for your contribution

@Seanstoppable
Copy link
Contributor Author

Awesome, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants