[pornhub] Fix like and dislike count extraction #27356
Yeah, my bad, the video I tested with didn't have K in it. You should still keep the regex relaxed along with your changes.
IMO relaxing it doesn't make too much sense, because the data relies on Or I'd prefer
I'm just quoting their doc on regex: https://github.com/ytdl-org/youtube-dl#make-regular-expressions-relaxed-and-flexible 🤷♀️
Yeah, I know. I read and thought about it. Let the maintainer decide.
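To make the disagreement above concrete, here is a minimal sketch using plain `re`. The sample markup is an assumption made for illustration (it is not copied from the site), and `first_group` is a hypothetical helper, not part of youtube-dl. It shows why the old text-based pattern misses an abbreviated count such as `1K`, and why a fully literal pattern breaks as soon as the tag gains another attribute.

```python
import re

# Hypothetical markup (an assumption, not taken from the real page): the
# visible text is abbreviated with "K", data-rating holds the full number.
SAMPLE = '<span class="votesUp" data-rating="1234">1K</span>'

# Old pattern from the diff: reads the visible text of the element.
OLD_PATTERN = r'<span[^>]+class="votesUp"[^>]*>([\d,\.]+)</span>'
# Pattern proposed in the PR: reads the attribute, but matches the tag literally.
STRICT_PATTERN = r'<span class="votesUp" data-rating="(\d+)">'


def first_group(pattern, html):
    """Return group 1 of the first match, or None when nothing matches."""
    match = re.search(pattern, html)
    return match.group(1) if match else None


# "1K" is not made of digits, commas and dots only, so the old pattern fails.
old_result = first_group(OLD_PATTERN, SAMPLE)
# The attribute-based pattern recovers the full number.
strict_result = first_group(STRICT_PATTERN, SAMPLE)
# The same element with one extra attribute no longer matches the strict pattern.
with_extra_attr = SAMPLE.replace('<span ', '<span id="x" ')
strict_on_extra_attr = first_group(STRICT_PATTERN, with_extra_attr)

print(old_result, strict_result, strict_on_extra_attr)
```

The last case is the situation the "relaxed and flexible" guideline is meant to guard against.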
@@ -354,9 +354,9 @@ def add_video_url(video_url):
 view_count = self._extract_count(
     r'<span class="count">([\d,\.]+)</span> [Vv]iews', webpage, 'view')
 like_count = self._extract_count(
-    r'<span[^>]+class="votesUp"[^>]*>([\d,\.]+)</span>', webpage, 'like')
+    r'<span class="votesUp" data-rating="(\d+)">', webpage, 'like')
-    r'<span class="votesUp" data-rating="(\d+)">', webpage, 'like')
+    r'class="votesUp".+?data-rating="(\d+)"', webpage, 'like')
 dislike_count = self._extract_count(
-    r'<span[^>]+class="votesDown"[^>]*>([\d,\.]+)</span>', webpage, 'dislike')
+    r'<span class="votesDown" data-rating="(\d+)">', webpage, 'dislike')
-    r'<span class="votesDown" data-rating="(\d+)">', webpage, 'dislike')
+    r'class="votesDown".+?data-rating="(\d+)"', webpage, 'dislike')
- Do not remove old patterns.
- New patterns must be relaxed similarly.
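One way to read the two requests above is to try the new attribute-based pattern first and fall back to the old text-based one. The sketch below uses plain `re`; `extract_count` and both sample strings are hypothetical and only illustrate the idea. As far as I know, youtube-dl's `_search_regex` also accepts a list or tuple of patterns, so the real extractor would not need its own loop.

```python
import re

# Patterns are tried in order: relaxed data-rating pattern first, then the
# old element-text pattern kept as a fallback.
LIKE_PATTERNS = (
    r'<span[^>]+class="votesUp"[^>]+data-rating="(\d+)"',
    r'<span[^>]+class="votesUp"[^>]*>([\d,\.]+)</span>',
)


def extract_count(patterns, html):
    """Return the first matching count as an int, or None if nothing matches.

    Hypothetical helper: separators are stripped before conversion, so
    "1,234" becomes 1234.
    """
    for pattern in patterns:
        match = re.search(pattern, html)
        if match:
            return int(re.sub(r'[,\.]', '', match.group(1)))
    return None


# Both markup samples are assumptions made for this example.
new_markup = '<span class="votesUp" data-rating="1234">1K</span>'
old_markup = '<span class="votesUp">1,234</span>'

print(extract_count(LIKE_PATTERNS, new_markup))
print(extract_count(LIKE_PATTERNS, old_markup))
print(extract_count(LIKE_PATTERNS, '<div>no votes here</div>'))
```

Keeping the old pattern costs one extra tuple entry and keeps extraction working for any page that is still served with the previous markup.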
-    r'<span class="votesUp" data-rating="(\d+)">', webpage, 'like')
+    r'<span[^>]+class="votesUp"[^>]+data-rating="(\d+)"[^>]*>', webpage, 'like')
-    r'<span class="votesDown" data-rating="(\d+)">', webpage, 'dislike')
+    r'<span[^>]+class="votesDown"[^>]+data-rating="(\d+)"[^>]*>', webpage, 'dislike')
Please follow the guide below
Put an x into all the boxes [ ] relevant to your pull request (like that [x])
Before submitting a pull request make sure you have:
In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:
What is the purpose of your pull request?
Description of your pull request and other information
Related to #27234
In my limited test, the structure has become