Remove invisible spans from HTML #148

Merged
merged 1 commit into from
Sep 16, 2021

Conversation

stephanebisson
Collaborator

https://phabricator.wikimedia.org/T289500

Sometimes, the API returns the following HTML for the author of an image: <div class=\"fn value\">\nUnknown artist<span style=\"display: none;\">Unknown artist</span>\n</div>

The span is hidden, so it is not a problem when the HTML is rendered. However, when we extract the text content from the HTML we get all of the text, including the hidden span, so we end up with "Unknown artistUnknown artist".

Added explicit removal of hidden spans from the strip function.
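
For illustration only, here is a minimal Python sketch of the idea, not the code in this commit. It assumes a hypothetical `strip_html` helper that collects text nodes and skips anything nested inside a `<span>` whose inline `style` contains `display: none`. The names `strip_html`, `_VisibleText`, and `_HIDDEN` are made up for this example, and only inline styles are handled (not CSS classes or stylesheets).

```python
import re
from html.parser import HTMLParser

# Matches an inline "display: none" declaration, tolerating spacing and case.
_HIDDEN = re.compile(r"display\s*:\s*none", re.IGNORECASE)


class _VisibleText(HTMLParser):
    """Collects text nodes, ignoring text inside hidden <span> elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.hidden_depth = 0  # > 0 while inside a hidden <span>

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        style = dict(attrs).get("style") or ""
        # Spans nested in a hidden span are counted too, so that their
        # closing tags do not end the hidden region early.
        if self.hidden_depth or _HIDDEN.search(style):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag == "span" and self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if not self.hidden_depth:
            self.parts.append(data)


def strip_html(html):
    """Return the visible text of an HTML fragment (hypothetical helper)."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse whitespace runs left behind by removed markup.
    return " ".join("".join(parser.parts).split())


author = ('<div class="fn value">\nUnknown artist'
          '<span style="display: none;">Unknown artist</span>\n</div>')
print(strip_html(author))
```

With the author HTML quoted above, this sketch yields a single "Unknown artist" instead of the duplicated string.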

@stephanebisson stephanebisson requested a review from a team September 15, 2021 18:08
@hueitan hueitan merged commit 222226f into main Sep 16, 2021
@hueitan hueitan deleted the T289500 branch September 16, 2021 08:48
@stephanebisson stephanebisson restored the T289500 branch July 28, 2022 12:57
@stephanebisson stephanebisson deleted the T289500 branch July 28, 2022 12:58
2 participants