Added login support for PornHub and PornHub Premium. #24294

twaddington · 2020-03-08T19:24:42Z

Please follow the guide below

You will be asked some questions, please read them carefully and answer honestly
Put an x into all the boxes [ ] relevant to your pull request (like that [x])
Use Preview tab to see how your pull request will actually look like

Before submitting a pull request make sure you have:

At least skimmed through adding new extractor tutorial and youtube-dl coding conventions sections
Searched the bugtracker for similar pull requests
Checked the code with flake8

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

I am the original author of this code and I am willing to release it under Unlicense
I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

Bug fix
Improvement
New extractor
New feature

Description of your pull request and other information

Resolves #18797

The pornhub extractor has been updated with support for --netrc and
--username/password authentication.

dstftw · 2020-03-10T15:45:10Z

youtube_dl/extractor/pornhub.py

        self._set_cookie(host, 'age_verified', '1')

+        # Authenticate, if required
+        self._login_if_required(host)


This should be in _real_initialize. Same for all other occurrences.

Yeah, that would be ideal. Unfortunately, this can't be in _real_initialize because we need to know the URL so we can load the appropriate credentials for either pornhub.com or pornhubpremium.com.
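
The constraint described in this reply, that the credentials depend on which site the URL points at, can be illustrated with a small standalone sketch. It uses only Python's standard `netrc` and `urllib.parse` modules rather than youtube-dl's own helpers. The function names `netrc_machine_for` and `load_credentials` and the sample URLs are hypothetical. The machine names `pornhub` and `pornhubpremium` come from the `_NETRC_MACHINE` comment in this pull request.

```python
import netrc
import os
import tempfile
from urllib.parse import urlparse


def netrc_machine_for(url):
    """Map a URL to a netrc machine name (hypothetical helper)."""
    host = urlparse(url).hostname or ''
    # The two sites have separate accounts, so each gets its own netrc entry.
    if host == 'pornhubpremium.com' or host.endswith('.pornhubpremium.com'):
        return 'pornhubpremium'
    return 'pornhub'


def load_credentials(url, netrc_path):
    """Return (username, password) for the site in `url`, or (None, None)."""
    entry = netrc.netrc(netrc_path).authenticators(netrc_machine_for(url))
    if entry is None:
        return None, None
    login, _account, password = entry
    return login, password


# Demo with a throwaway netrc file holding one entry per site.
with tempfile.NamedTemporaryFile('w', suffix='.netrc', delete=False) as f:
    f.write('machine pornhub login free_user password free_pw\n')
    f.write('machine pornhubpremium login premium_user password premium_pw\n')
    demo_path = f.name

free_creds = load_credentials(
    'https://www.pornhub.com/view_video.php?viewkey=abc', demo_path)
premium_creds = load_credentials(
    'https://www.pornhubpremium.com/view_video.php?viewkey=abc', demo_path)
os.remove(demo_path)
```

Because the machine name is derived from the URL, this lookup cannot run before a URL is known, which is the objection raised against moving the call into `_real_initialize` while a single extractor class serves both sites.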

dstftw · 2020-03-10T15:45:35Z

youtube_dl/extractor/pornhub.py

        self._set_cookie(host, 'age_verified', '1')

+        # Authenticate, if required


Remove pointless comments. This is already clear from method name.

I removed this comment.

dstftw · 2020-03-10T15:45:50Z

youtube_dl/extractor/pornhub.py

+            login_form_url = 'https://%s/premium/login' % host
+            login_post_url = 'https://www.%s/front/authenticate' % host
+        else:
+            login_form_url = 'https://%s/login' % host
+            login_post_url = 'https://www.%s/front/authenticate' % host


dstftw · 2020-03-10T15:48:16Z

youtube_dl/extractor/pornhub.py

 )


 class PornHubBaseIE(InfoExtractor):
+
+    _NETRC_MACHINE = 'pornhub'  # or 'pornhubpremium'


Does premium login also work on pornhub.com site? In any case it's better to have separate extractors with separate _NETRC_MACHINE with common base class in case one may want to use different credentials.

No. The account logins are completely separate. You can have a pornhub.com and a pornhubpremium.com account that are totally different.

I took the approach of having separate extractors before, but that led to more duplicated code. The extractors are exactly the same except for the login process.

If you look at line 54 you'll see we load the appropriate credentials based on the URL.

Then extractors must be separate. Move common code in the base class and there will be no duplicated code.

Move common code in the base class and there will be no duplicated code. — @dstftw

Yeah, the only way to do this would be to move all the code from the subclasses (playlist extractors) up into the base class and then create a new PornHubPremiumIE that extends from PornHubBaseIE.

That will create a much larger pull-request. Are you comfortable reviewing that? I avoided that approach because I didn't think a larger pull would be accepted.

This right here is the simplest and most straightforward change and IMO the best way to implement this.

An earlier attempt I made tried this approach. Please take a quick look here and see if you're okay with this direction:

https://github.com/twaddington/youtube-dl/blob/phpremium/youtube_dl/extractor/pornhubpremium.py#L84-L134

Note that there is still some duplicated code.

The class structure would look something like this:

- PornHubBaseIE
- PornHubIE
- PornHubUserIE
- PornHubPagedVideoListIE
- PornHubUserVideosUploadIE
- PornHubPremiumIE
- PornHubPremiumUserIE
- PornHubPremiumPagedVideoListIE
- PornHubPremiumUserVideosUploadIE
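
The layout requested in this thread, separate extractors that share a base class and differ only in `_NETRC_MACHINE` and login details, might look roughly like the sketch below. It is not youtube-dl code: `BaseIE` is a stand-in for `InfoExtractor`, the `_HOST` and `_LOGIN_PATH` attributes and the `credentials` dictionary are assumptions made for illustration, and no network request is made. The two login paths are the ones shown in the diff above.

```python
class BaseIE(object):
    """Stand-in for youtube-dl's InfoExtractor; NOT the real class."""

    _NETRC_MACHINE = None

    def __init__(self, credentials):
        # `credentials` maps a netrc machine name to (username, password).
        self._credentials = credentials
        self.login_form_url = None
        self.logged_in_as = None
        # The real class calls _real_initialize lazily; this stub calls it
        # eagerly to keep the example short.
        self._real_initialize()

    def _real_initialize(self):
        pass


class PornHubBaseIE(BaseIE):
    _HOST = None            # hypothetical attribute, set by subclasses
    _LOGIN_PATH = '/login'  # hypothetical attribute

    def _real_initialize(self):
        # The host is a class attribute, so login does not need the URL.
        self._login()

    def _login(self):
        username, password = self._credentials.get(
            self._NETRC_MACHINE, (None, None))
        if not username or not password:
            return
        # A real extractor would download the form and POST it here.
        self.login_form_url = 'https://%s%s' % (self._HOST, self._LOGIN_PATH)
        self.logged_in_as = username


class PornHubIE(PornHubBaseIE):
    _NETRC_MACHINE = 'pornhub'
    _HOST = 'pornhub.com'


class PornHubPremiumIE(PornHubBaseIE):
    _NETRC_MACHINE = 'pornhubpremium'
    _HOST = 'pornhubpremium.com'
    _LOGIN_PATH = '/premium/login'


creds = {
    'pornhub': ('free_user', 'free_pw'),
    'pornhubpremium': ('premium_user', 'premium_pw'),
}
free = PornHubIE(creds)
premium = PornHubPremiumIE(creds)
```

In this arrangement the shared login logic lives once in the base class. The cost raised in the thread still applies: each playlist extractor would need a premium counterpart that overrides the same few attributes.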

twaddington · 2020-03-17T17:46:00Z

Any updates? I'd love to get this merged in as-is. It's a relatively small patch, is less likely to cause regressions or undesirable changes in behavior, and resolves an issue that's been open since Jan 10, 2019.

cc @dstftw @remitamine

dstftw · 2020-03-17T17:50:20Z

I've already pointed out changes that should be made.

dstftw · 2020-03-17T18:30:14Z

youtube_dl/extractor/pornhub.py

+        if all(login_info) and not cookies:
+            self._login(host, login_info)


Having arbitrary cookies set for the domain does not mean the user is logged in, so this can skip login when it should not.

With the current approach, cookies will always be non-empty, since the age_verified cookie is always set by the extractor.

This was an attempt to avoid submitting multiple "logged-in" status checks when downloading playlists. I will remove it, as I can see it's causing inconsistent behavior.
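
The problem with the cookie check can be reproduced in isolation. The sketch below uses a plain dictionary in place of the extractor's cookie jar, and both function names are hypothetical. The first function mirrors the condition from the diff. The second shows one possible alternative (an assumption, not what the pull request ended up doing) that tracks login state explicitly.

```python
def should_login_with_cookie_check(login_info, cookies):
    """Condition from the diff: skip login as soon as any cookie exists."""
    return all(login_info) and not cookies


def should_login(login_info, logged_in):
    """Alternative (assumption): rely on an explicit logged-in flag."""
    return all(login_info) and not logged_in


login_info = ('user', 'secret')

# The extractor always sets age_verified before the check runs, so the
# cookie container is never empty at that point.
cookies = {}
cookies['age_verified'] = '1'

skipped = not should_login_with_cookie_check(login_info, cookies)
attempted = should_login(login_info, logged_in=False)
```

Here `skipped` ends up true even though no login has taken place, which matches the reviewer's point that the presence of cookies says nothing about authentication.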

footar64 · 2020-03-27T17:52:13Z

I have attempted to use this PR but as of today I'm getting a "cannot extract title" error:

ERROR: Unable to extract title; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "c:\users\dubif\p2\misc\youtube-dl\youtube_dl\YoutubeDL.py", line 797, in extract_info
    ie_result = ie.extract(url)
  File "c:\users\dubif\p2\misc\youtube-dl\youtube_dl\extractor\common.py", line 530, in extract
    ie_result = self._real_extract(url)
  File "c:\users\dubif\p2\misc\youtube-dl\youtube_dl\extractor\pornhub.py", line 261, in _real_extract
    webpage, 'title', group='title')
  File "c:\users\dubif\p2\misc\youtube-dl\youtube_dl\extractor\common.py", line 1014, in _html_search_regex
    res = self._search_regex(pattern, string, name, default, fatal, flags, group)
  File "c:\users\dubif\p2\misc\youtube-dl\youtube_dl\extractor\common.py", line 1005, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract title; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output

Is this a bug with the PR or is it my config that's broken?

twaddington · 2020-03-31T03:19:24Z

@footar64 try now. There was a bug with a cookies check.

twaddington · 2020-03-31T03:21:59Z

I've already pointed out changes that should be made. — @dstftw

I fixed the issue with the cookies check. I'm happy to correct any functional issues to get this pull-request approved, but I don't plan on rearchitecting the entire extractor for you. You're welcome to build on the work I've done here if you'd prefer a different architecture.

twaddington · 2020-04-22T16:12:35Z

Just checking in. Are there lingering concerns with the functionality of this patch or just the minimal approach I've taken for implementation?

Premium is free right now so you can easily create an account to verify this change. Premium works exactly the same way as pornhub.com in almost every way except login.

dstftw requested changes Mar 10, 2020

dstftw added the pending-fixes label Mar 10, 2020

twaddington mentioned this pull request Mar 17, 2020

[Site Request] Pornhub Premium #18797

Closed

9 tasks

dstftw reviewed Mar 17, 2020

ytdl-org deleted a comment Mar 20, 2020

twaddington added 5 commits March 30, 2020 20:08

Added login support for PornHub and PornHub Premium.

24fa01b

The pornhub extractor has been updated with support for --netrc and --username/password authentication. This change allows authenticated users to archive content they have purchased.

Added default _NETRC_MACHINE value to make tests happy.

93e2812

Feedback

a052d5a

Feedback

a571ae7

Fixed login issue by removing cookies check.

1b0793f

twaddington force-pushed the phpremium-lite branch from 652b0e3 to 1b0793f Compare March 31, 2020 03:19

dstftw force-pushed the master branch from 7b956a1 to 5e26784 Compare September 13, 2020 13:49

dstftw closed this in e22ff4e Feb 3, 2021
