[AnimeLab] Add new extractor #13600

Closed
mariuszskon wants to merge 38 commits

mariuszskon

Please follow the guide below

  • You will be asked some questions, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your pull request (like that [x])
  • Use Preview tab to see how your pull request will actually look like

Before submitting a pull request make sure you have:

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

This pull request adds basic support for the Australian / New Zealand streaming site AnimeLab. More features are planned, but I wanted to get a basic version out for now.

The site is run by Madman Entertainment, and is legal.

The site requires authentication, but a free account is simple to create (it only requires an email address, and the site does not even confirm that the address exists) and can access most of the content (although the site is geo-restricted to Aus/NZ).

It would be great if someone has a premium account and can verify that premium content is accessible (and does not interfere with the tests).

Please leave feedback on what needs to be changed :)

youtube_dl/extractor/animelab.py Outdated Show resolved Hide resolved
@mariuszskon
Author

mariuszskon commented Jul 9, 2017

@dstftw I have resolved the issues I could so far. When you reply to my other queries, I will solve the related issues. Hopefully I made the code easier to read by inlining the methods I only used once :)

@mariuszskon
Author

@dstftw I believe I have resolved the issues. Please review at your convenience.

@mariuszskon
Author

@dstftw Bump. I also removed a superfluous test and added thumbnail extraction. In case I was unclear, I plan on adding playlist extraction once this is merged (that is, I am ready for this to be merged when you have reviewed my submission). Thanks in advance for your review.

@mariuszskon
Author

mariuszskon commented Sep 9, 2017

@dstftw Could I ask for a re-review? Quite a bit has changed ;)
I'm ready for a merge when you are.
Thanks.

@BelSean21

@mariuszskon Thanks for creating it

@mariuszskon
Author

mariuszskon commented Apr 21, 2020

@inks007 the problem with --download-archive has been fixed in the latest commit 395f0e7. It was indeed on my end rather than youtube-dl's. Please update and let me know if all is good now on your end too :)

@mariuszskon
Author

@inks007 I cannot reproduce your issue regarding getting heights:

$ python -m youtube_dl --netrc https://www.animelab.com/player/my-hero-academia-episode-1
[AnimeLab] Downloading login page
[AnimeLab] Logging in
[AnimeLab] my-hero-academia-episode-1: Downloading requested URL
[download] Destination: My Hero Academia - Episode 1 - Izuku Midoriya - Origin-6401.mp4

If you could provide an exact test command/URL that causes this behaviour that would be awesome.

@BelSean21

Hey @mariuszskon I'm seeing the same issue as @inks007

pi@raspberrypi:~/Projects/youtube-dl_Animelab/youtube-dl $ ./youtube-dl -u BelSean21@somewhere.xxx.au -p password https://www.animelab.com/player/gleipnir-episode-1
[AnimeLab] Downloading login page
[AnimeLab] Logging in
[AnimeLab] gleipnir-episode-1: Downloading requested URL
WARNING: [AnimeLab] Could not get height of video
WARNING: [AnimeLab] Could not get height of video
[download] Destination: Gleipnir - Episode 1 - Something Inside of Me-55226.mp4
[download]   1.2% of 455.50MiB at 312.45KiB/s ETA 24:34

It doesn't seem to affect the resulting download. The video plays fine.

@inks007

inks007 commented Apr 21, 2020

@mariuszskon --download-archive looks to be working correctly, thank you.

The warnings seem to depend on the video (probably happens with the newer ones).

~/youtube-dl $ python3 -m youtube_dl --netrc https://www.animelab.com/player/arte-episode-3
[AnimeLab] Downloading login page
[AnimeLab] Logging in
[AnimeLab] arte-episode-3: Downloading requested URL
WARNING: [AnimeLab] Could not get height of video
WARNING: [AnimeLab] Could not get height of video
[download] Destination: Arte - Episode 3 - First Job-55155.mp4

I tried my-hero-academia-episode-1 like you did (it is a very old episode) and there aren't any warnings there.

@mariuszskon
Author

Hmm this is a very interesting problem. @BelSean21 Gleipnir is working correctly on my end. @inks007 I cannot test your link because it is too new and requires a premium account, which I do not have. Episode 2 seems to work fine.
Since the warning is appearing twice, that means two of the formats do not have an exposed height but are otherwise considered valid by the extractor (which afaik means they have a URL). If you use --list-formats you can usually see more than two formats.
After writing most of this I realised what the problem is - since I do not have premium, I do not have access to the same formats you do. This is good because we can deal with this edge case right here 😄
To minimise the amount of data for me to sift through, and to avoid leaking URLs to copyrighted content, please add the following line after line 156 in youtube_dl/extractor/animelab.py:

print(quality_data)

(Make sure it has the right indentation, i.e. the same number of leading spaces as the line before it.)

Then you can post the results of attempting to get those height-less videos 😄

@inks007

inks007 commented Apr 22, 2020

@mariuszskon Sent you a subscription code instead. Much easier.

@BelSean21

@mariuszskon Like this?

pi@raspberrypi:~/Projects/youtube-dl_Animelab/youtube-dl $ ./youtube-dl -u BelSean21@somewhere.xxx.au -p password  https://www.animelab.com/player/gleipnir-episode-1
[AnimeLab] Downloading login page
[AnimeLab] Logging in
[AnimeLab] gleipnir-episode-1: Downloading requested URL
{u'videoQualityType': u'SD', u'description': u'360p', u'videoFormat': {u'videoFormatType': u'PROGRESSIVE', u'name': u'MP4'}, u'id': 1, u'name': u'360p'}
{u'videoQualityType': u'HD', u'description': u'720p', u'videoFormat': {u'videoFormatType': u'PROGRESSIVE', u'name': u'MP4'}, u'id': 3, u'name': u'720p'}
{u'videoQualityType': u'SD', u'description': u'480p', u'videoFormat': {u'videoFormatType': u'PROGRESSIVE', u'name': u'MP4'}, u'id': 2, u'name': u'480p'}
{u'videoQualityType': u'HD', u'description': u'1080p', u'videoFormat': {u'videoFormatType': u'PROGRESSIVE', u'name': u'MP4'}, u'id': 4, u'name': u'1080p'}
{u'videoQualityType': u'SD', u'description': u'SD HLS', u'videoFormat': {u'videoFormatType': u'ADAPTIVE', u'name': u'HLS'}, u'id': 6, u'name': u'SD HLS'}
{u'videoQualityType': u'HD', u'description': u'HD HLS', u'videoFormat': {u'videoFormatType': u'ADAPTIVE', u'name': u'HLS'}, u'id': 7, u'name': u'HD HLS'}
WARNING: [AnimeLab] Could not get height of video
{u'videoQualityType': u'SD', u'description': u'SD MPEG-DASH', u'videoFormat': {u'videoFormatType': u'ADAPTIVE', u'name': u'MPEG-DASH'}, u'id': 8, u'name': u'SDDASH'}
{u'videoQualityType': u'HD', u'description': u'HD MPEG-DASH', u'videoFormat': {u'videoFormatType': u'ADAPTIVE', u'name': u'MPEG-DASH'}, u'id': 9, u'name': u'HDDASH'}
WARNING: [AnimeLab] Could not get height of video
[download] Resuming download at byte 12167225
[download] Destination: Gleipnir - Episode 1 - Something Inside of Me-55226.mp4
[download]   3.3% of 455.50MiB at 897.90KiB/s ETA 08:22?
ERROR: Interrupted by user
pi@raspberrypi:~/Projects/youtube-dl_Animelab/youtube-dl $

@BelSean21

@mariuszskon Getting this with some of the Movies

pi@raspberrypi:~/Videos $ ./youtube-dl -u BelSean21@somewhere.xxx.au -p password https://www.animelab.com/player/hentai-kamen
[AnimeLab] Downloading login page
[AnimeLab] Logging in
[AnimeLab] hentai-kamen: Downloading requested URL
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "./youtube-dl/__main__.py", line 19, in <module>
  File "./youtube-dl/youtube_dl/__init__.py", line 474, in main
  File "./youtube-dl/youtube_dl/__init__.py", line 464, in _real_main
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 2018, in download
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 796, in extract_info
  File "./youtube-dl/youtube_dl/extractor/common.py", line 530, in extract
  File "./youtube-dl/youtube_dl/extractor/animelab.py", line 135, in _real_extract
AttributeError: 'NoneType' object has no attribute 'get'

pi@raspberrypi:~/Videos $ ./youtube-dl -u BelSean21@somewhere.xxx.au -p password https://www.animelab.com/player/attack-on-titan-live-action-part-1
[AnimeLab] Downloading login page
[AnimeLab] Logging in
[AnimeLab] attack-on-titan-live-action-part-1: Downloading requested URL
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "./youtube-dl/__main__.py", line 19, in <module>
  File "./youtube-dl/youtube_dl/__init__.py", line 474, in main
  File "./youtube-dl/youtube_dl/__init__.py", line 464, in _real_main
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 2018, in download
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 796, in extract_info
  File "./youtube-dl/youtube_dl/extractor/common.py", line 530, in extract
  File "./youtube-dl/youtube_dl/extractor/animelab.py", line 135, in _real_extract
AttributeError: 'NoneType' object has no attribute 'get'
pi@raspberrypi:~/Videos $

@mariuszskon
Author

@inks007 you are extremely generous! I will do my best to take advantage of your donation to make AnimeLab extraction as good as it can be!

@BelSean21 thanks for following my instructions; the output is exactly what I wanted to see. Thanks to the donated premium access, I can now confirm this myself as well. If you look carefully, AnimeLab sometimes reports the quality as 'HD' instead of '1080p'. This is a simple fix: I could have hacked something together, but I realised youtube-dl already has built-in features for this kind of analysis :) This has been implemented in c3dca17.
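To illustrate why some formats end up without a height, here is a minimal sketch. It assumes the height is read from the quality's name field, and the helper name height_from_quality_name is hypothetical. It is not the code from c3dca17, only a model of the symptom using the labels from the output posted above.

```python
import re


def height_from_quality_name(name):
    """Return the pixel height encoded in a label such as '720p', else None."""
    match = re.search(r'(\d{3,4})p\b', name or '')
    return int(match.group(1)) if match else None


# Labels copied from the 'name' fields in the quality_data output above.
labels = ['360p', '720p', '480p', '1080p', 'SD HLS', 'HD HLS', 'SDDASH', 'HDDASH']

# Progressive labels carry a number; the adaptive (HLS/DASH) labels do not,
# so a purely name-based lookup has nothing to report for them.
heights = {label: height_from_quality_name(label) for label in labels}

for label, height in heights.items():
    print(label, '->', height)
```

For the adaptive entries the real heights only become known once the HLS or DASH manifest itself is parsed, which is consistent with the resolutions that --list-formats shows for those formats later in this thread.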

I also noticed that the English or Japanese option is not handled smoothly (it should get both options in a single command), so I fixed that to an extent in 2f6d029. It's not the most elegant solution, but it works.

@BelSean21 I have fixed the movie extraction issue in 2cf8283.
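For context on the traceback above, the AttributeError means .get was called on a value that turned out to be None. A generic way to guard against that is sketched below. The helper dig and the field names (season, name) are hypothetical and chosen only for illustration; this is not necessarily what 2cf8283 does, and the real AnimeLab data may be shaped differently.

```python
def dig(data, *keys):
    """Follow keys through nested dicts, returning None once a level is missing."""
    for key in keys:
        if not isinstance(data, dict):
            return None
        data = data.get(key)
    return data


# Hypothetical metadata: an episode nested under a season, and a movie
# for which the nested object is absent (None).
episode = {'name': 'Episode 1', 'season': {'name': 'Season 1'}}
movie = {'name': 'Some Movie', 'season': None}

print(dig(episode, 'season', 'name'))  # nested value is present
print(dig(movie, 'season', 'name'))    # no exception, just None
```

Calling movie['season'].get('name') directly would raise the same kind of AttributeError as in the traceback, whereas the guarded lookup lets the caller treat the field as optional.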

@inks007

inks007 commented Apr 23, 2020

@mariuszskon Thanks for all the fixes.
I see that --list-formats now shows all the DASH formats

There is some minor weirdness with picking up different formats:

python3 -m youtube_dl --netrc https://www.animelab.com/player/kakushigoto-episode-4 --list-formats
[AnimeLab] Downloading login page
[AnimeLab] Logging in
[AnimeLab] kakushigoto-episode-4: Downloading URL https://www.animelab.com/player/kakushigoto-episode-4/subtitles
[AnimeLab] 55326: Downloading m3u8 information
[AnimeLab] 55326: Downloading m3u8 information
[AnimeLab] 55326: Downloading MPD manifest
[AnimeLab] 55326: Downloading MPD manifest
[AnimeLab] kakushigoto-episode-4: Downloading URL https://www.animelab.com/player/kakushigoto-episode-4/dubbed
[AnimeLab] 55326: Downloading m3u8 information
[AnimeLab] 55326: Downloading m3u8 information
[AnimeLab] 55326: Downloading MPD manifest
[AnimeLab] 55326: Downloading MPD manifest
[info] Available formats for 55326:
format code                               extension  resolution note
157821_yeshardsubbed_ja-JP-audio-ENGLISH  m3u8       audio only [en]
157822_yeshardsubbed_ja-JP-audio-ENGLISH  m3u8       audio only [en]
157823_yeshardsubbed_ja-JP-0              m4a        audio only DASH audio  209k , m4a_dash container, mp4a.40.2 (48000Hz)
157824_yeshardsubbed_ja-JP-0              m4a        audio only DASH audio  209k , m4a_dash container, mp4a.40.2 (48000Hz)
157821_yeshardsubbed_ja-JP-583            m3u8       426x240     583k , avc1.42c01e, video only
157822_yeshardsubbed_ja-JP-583            m3u8       426x240     583k , avc1.42c01e, video only
157823_yeshardsubbed_ja-JP-1              mp4        426x240    DASH video  714k , mp4_dash container, avc1.42c01e, video only
157824_yeshardsubbed_ja-JP-1              mp4        426x240    DASH video  714k , mp4_dash container, avc1.42c01e, video only
157821_yeshardsubbed_ja-JP-725            m3u8       640x360     725k , avc1.640828, video only
157822_yeshardsubbed_ja-JP-725            m3u8       640x360     725k , avc1.640828, video only
157821_yeshardsubbed_ja-JP-911            m3u8       853x480     911k , avc1.640828, video only
157822_yeshardsubbed_ja-JP-911            m3u8       853x480     911k , avc1.640828, video only
157823_yeshardsubbed_ja-JP-2              mp4        640x360    DASH video 1212k , mp4_dash container, avc1.640828, video only
157824_yeshardsubbed_ja-JP-2              mp4        640x360    DASH video 1212k , mp4_dash container, avc1.640828, video only
157822_yeshardsubbed_ja-JP-1264           m3u8       1280x720   1264k , avc1.640828, video only
157823_yeshardsubbed_ja-JP-3              mp4        854x480    DASH video 1997k , mp4_dash container, avc1.640828, video only
157824_yeshardsubbed_ja-JP-3              mp4        854x480    DASH video 1997k , mp4_dash container, avc1.640828, video only
157822_yeshardsubbed_ja-JP-2078           m3u8       1920x1080  2078k , avc1.640829, video only
157824_yeshardsubbed_ja-JP-4              mp4        1280x720   DASH video 3503k , mp4_dash container, avc1.640828, video only
157824_yeshardsubbed_ja-JP-5              mp4        1920x1080  DASH video 4769k , mp4_dash container, avc1.640829, video only
157819_yeshardsubbed_ja-JP                mp4        360p       [ja-JP]
157817_yeshardsubbed_ja-JP                mp4        480p       [ja-JP]
157818_yeshardsubbed_ja-JP                mp4        720p       [ja-JP]
157820_yeshardsubbed_ja-JP                mp4        1080p      [ja-JP]  (best)

-f 157824_yeshardsubbed_ja-JP-4 works
-f "mp4[format_id*=ja-JP-4]" works
-f "[format_id*=ja-JP-4]" doesn't work
-f "[format_id*=157818]" works

Can't seem to specify picking up the newer DASH formats without specifying the extension (mp4 or m4a). This is probably minor, because all of the following work as intended:
-f "bestvideo+bestaudio" picks up the best DASH video
-f "bestvideo[height<=720]+bestaudio" picks up the best DASH 720p video
-f "best" picks up the best muxed video

@mariuszskon
Author

@inks007 Indeed, we now have lots more formats thanks to actually correctly handling mpd and m3u8 😅

The format list shown is what my extractor reports to youtube-dl. Control then passes to the rest of youtube-dl, which selects the format, performs the media download, and so on. Therefore it is most likely that the problem is not in my extractor. However, I have been wrong before, so I will take a look. My response will be a bit delayed.

@BelSean21

Thank you @mariuszskon. Looking good :-)

@mariuszskon
Author

@inks007 after a bunch of research and experimenting I have come to a few conclusions regarding the issue of format selection.
First, it has nothing to do with the particular extractor (other than the list of formats) but a lot to do with youtube-dl's implementation of format selection. There is a TL;DR at the bottom if you do not care about the technical details.
In YoutubeDL.py:

        FormatSelector = collections.namedtuple('FormatSelector', ['type', 'selector', 'filters'])

Debugging this with your given format specifiers gives the following results:

  • -f 157824_yeshardsubbed_ja-JP-4 => [FormatSelector(type='SINGLE', selector='157824_yeshardsubbed_ja-JP-4', filters=[])]
  • -f "mp4[format_id*=ja-JP-4]" => [FormatSelector(type='SINGLE', selector='mp4', filters=['format_id*=ja-JP-4'])]
  • -f "[format_id*=ja-JP-4]" => [FormatSelector(type='SINGLE', selector='best', filters=['format_id*=ja-JP-4'])]
  • -f "[format_id*=157818]" => [FormatSelector(type='SINGLE', selector='best', filters=['format_id*=157818'])]

See the problem? best appears seemingly out of nowhere. This is because some selector is necessary. The problem with best is that it requires audio and video to be within the one file. 157824_yeshardsubbed_ja-JP-4 is video only. Therefore -f "[format_id*=ja-JP-4]" is equivalent to -f "best[format_id*=ja-JP-4]" which understandably fails because there is no combined audio+video file which matches that filter.

Similarly, you will see that -f "[format_id*=157824]", which is in the format of your "working" last example, produces [FormatSelector(type='SINGLE', selector='best', filters=['format_id*=157824'])] and does not work for 157824_yeshardsubbed_ja-JP-4. -f "[format_id*=157818]" works because the target file does have both audio and video.

TL;DR: If you only specify things in brackets [ ] then youtube-dl not only wants it to match that filter, but also be the best. Sometimes it cannot find a best that is also filtered down, because it requires both audio and video in the same file. Solution/workaround: specify something as the selector, such as the extension. Or maybe, you do want the particular video and you wish to mux it with the best audio: -f "bestvideo[format_id*=ja-JP-4]+bestaudio"
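The behaviour described above can be reproduced with a small model. This is a simplified sketch and not youtube-dl's actual selection code: parse_single and select are hypothetical helpers, only the *= filter on string fields is modelled, the format list is cut down to two entries from the listing above, and the codec values are placeholders.

```python
import collections

FormatSelector = collections.namedtuple('FormatSelector', ['type', 'selector', 'filters'])


def parse_single(spec):
    """Split 'selector[filter]...' apart; an empty selector defaults to 'best'."""
    selector, _, rest = spec.partition('[')
    filters = [f.rstrip(']') for f in rest.split('[')] if rest else []
    return FormatSelector('SINGLE', selector or 'best', filters)


def select(spec, formats):
    """Return the chosen format dict, or None when nothing qualifies."""
    sel = parse_single(spec)
    candidates = formats
    for flt in sel.filters:
        key, _, needle = flt.partition('*=')  # only the 'contains' operator
        candidates = [f for f in candidates if needle in f.get(key, '')]
    if sel.selector == 'best':
        # 'best' only accepts files that carry both video and audio.
        candidates = [f for f in candidates
                      if f['vcodec'] != 'none' and f['acodec'] != 'none']
    else:
        # Otherwise treat the selector as an extension or an exact format id.
        candidates = [f for f in candidates
                      if sel.selector in (f['ext'], f['format_id'])]
    return candidates[-1] if candidates else None  # list is sorted worst to best


formats = [
    # Video-only DASH stream (codec strings are placeholders).
    {'format_id': '157824_yeshardsubbed_ja-JP-4', 'ext': 'mp4',
     'vcodec': 'avc1', 'acodec': 'none'},
    # Muxed progressive file with both audio and video.
    {'format_id': '157818_yeshardsubbed_ja-JP', 'ext': 'mp4',
     'vcodec': 'avc1', 'acodec': 'mp4a'},
]

for spec in ('157824_yeshardsubbed_ja-JP-4', 'mp4[format_id*=ja-JP-4]',
             '[format_id*=ja-JP-4]', '[format_id*=157818]'):
    print(spec, '->', select(spec, formats))
```

In this model the bare filter [format_id*=ja-JP-4] finds nothing because the implied best discards the video-only stream, while the same filter with mp4 in front succeeds, matching the four results reported earlier.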

@mariuszskon
Author

@dstftw I understand that you are very busy since youtube-dl is a massive project, but I would love it if you could re-review my code.

@inks007

inks007 commented Apr 29, 2020

@mariuszskon that makes a lot of sense, thanks.

@aristaeus

Animelab's being shut down so you may as well close this PR :'(

@mariuszskon
Author

> Animelab's being shut down so you may as well close this PR :'(

Thanks for the update, indeed it is. Apparently you can log in to Funimation with the same details, see https://www.animelab.com/blog/animelab-is-becoming-funimation-in-australia-and-new-zealand/

Whether you like the move or not, the good part from youtube-dl's perspective is there is less code to maintain 😄
