Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added login support for PornHub and PornHub Premium. #24294

Closed
wants to merge 5 commits into from

Conversation

twaddington
Copy link
Contributor

Please follow the guide below

  • You will be asked some questions, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your pull request (like that [x])
  • Use Preview tab to see how your pull request will actually look like

Before submitting a pull request make sure you have:

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

Resolves #18797

The pornhub extractor has been updated with support for --netrc and
--username/password authentication.

self._set_cookie(host, 'age_verified', '1')

# Authenticate, if required
self._login_if_required(host)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be in _real_initialize. Same for all other occurrences.

Copy link
Contributor Author

@twaddington twaddington Mar 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that would be ideal. Unfortunately, this can't be in _real_initialize because we need to know the URL so we can load the appropriate credentials for either pornhub.com or pornhubpremium.com.

self._set_cookie(host, 'age_verified', '1')

# Authenticate, if required
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove pointless comments. This is already clear from method name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this comment.

Comment on lines 75 to 79
login_form_url = 'https://%s/premium/login' % host
login_post_url = 'https://www.%s/front/authenticate' % host
else:
login_form_url = 'https://%s/login' % host
login_post_url = 'https://www.%s/front/authenticate' % host
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DRY.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

)


class PornHubBaseIE(InfoExtractor):

_NETRC_MACHINE = 'pornhub' # or 'pornhubpremium'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does premium login also work on pornhub.com site? In any case it's better to have separate extractors with separate _NETRC_MACHINE with common base class in case one may want to use different credentials.

Copy link
Contributor Author

@twaddington twaddington Mar 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. The account logins are completely separate. You can have a pornhub.com and a pornhubpremium.com account that are totally different.

I took the approach of having separate extractors before but that lead to more duplicated code. The extractors are exactly the same except for the login process.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at line 54 you'll see we load the appropriate credentials based on the URL.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then extractors must be separate. Move common code in the base class and there will be no duplicated code.

Copy link
Contributor Author

@twaddington twaddington Mar 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move common code in the base class and there will be no duplicated code. — @dstftw

Yeah, the only way to do this would be to move all the code from the subclasses (playlist extractors) up into the base class and then create a new PornHubPremiumIE that extends from PornHubBaseIE.

That will create a much larger pull-request. Are you comfortable reviewing that? I avoided that approach because I didn't think a larger pull would be accepted.

This right here is the simplest and most straightforward change and IMO the best way to implement this.

Copy link
Contributor Author

@twaddington twaddington Mar 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An earlier attempt I made tried this approach. Please take a quick look here and see if you're okay with this direction:

https://github.com/twaddington/youtube-dl/blob/phpremium/youtube_dl/extractor/pornhubpremium.py#L84-L134

Note that there is still some duplicated code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class structure would look something like this:

PornHubBaseIE
  - PornHubIE
    - PornHubUserIE
    - PornHubPagedVideoListIE
    - PornHubUserVideosUploadIE
  - PornHubPremiumIE
    - PornHubPremiumUserIE
    - PornHubPremiumPagedVideoListIE
    - PornHubPremiumUserVideosUploadIE

@twaddington
Copy link
Contributor Author

Any updates? I'd love to get this merged in as-is. It's a relatively small patch, is less likely to cause regressions or undesirable changes in behavior, and resolves an issue that's been open since Jan 10, 2019.

cc @dstftw @remitamine

@twaddington twaddington mentioned this pull request Mar 17, 2020
9 tasks
@dstftw
Copy link
Collaborator

dstftw commented Mar 17, 2020

I've already pointed out changes that should be made.

if all(login_info) and not cookies:
self._login(host, login_info)
Copy link
Collaborator

@dstftw dstftw Mar 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Having arbitrary cookies set for domain does not mean user is logged in so this will potentially skip login when it should not.
  2. With current approach cookies will always be non empty since age_verified cookie is always set by the extractor.

Copy link
Contributor Author

@twaddington twaddington Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an attempt to avoid submitting multiple "logged-in" status checks when downloading playlists. I will remove as I can see it's causing inconsistent behavior.

@ytdl-org ytdl-org deleted a comment Mar 20, 2020
@footar64
Copy link

footar64 commented Mar 27, 2020

I have attempted to use this PR but as of today I'm getting a "cannot extract title" error:

ERROR: Unable to extract title; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "c:\users\dubif\p2\misc\youtube-dl\youtube_dl\YoutubeDL.py", line 797, in extract_info
    ie_result = ie.extract(url)
  File "c:\users\dubif\p2\misc\youtube-dl\youtube_dl\extractor\common.py", line 530, in extract
    ie_result = self._real_extract(url)
  File "c:\users\dubif\p2\misc\youtube-dl\youtube_dl\extractor\pornhub.py", line 261, in _real_extract
    webpage, 'title', group='title')
  File "c:\users\dubif\p2\misc\youtube-dl\youtube_dl\extractor\common.py", line 1014, in _html_search_regex
    res = self._search_regex(pattern, string, name, default, fatal, flags, group)
  File "c:\users\dubif\p2\misc\youtube-dl\youtube_dl\extractor\common.py", line 1005, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract title; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output

Is this a bug with the PR or is it my config that's broken?

The pornhub extractor has been updated with support for --netrc and
--username/password authentication. This change allows authenticated
users to archive content they have purchased.
@twaddington
Copy link
Contributor Author

@footar64 try now. There was a bug with a cookies check.

@twaddington
Copy link
Contributor Author

twaddington commented Mar 31, 2020

I've already pointed out changes that should be made. — @dstftw

I fixed the issue with the cookies check. I'm happy to correct any functional issues to get this pull-request approved, but I don't plan on rearchitecting the entire extractor for you. You're welcome to build on the work I've done here if you'd prefer a different architecture.

@twaddington
Copy link
Contributor Author

Just checking in. Are there lingering concerns with the functionality of this patch or just the minimal approach I've taken for implementation?

Premium is free right now so you can easily create an account to verify this change. Premium works exactly the same way as pornhub.com in almost every way except login.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Site Request] Pornhub Premium
3 participants