[npr] Add new extractor #13446

Open
wants to merge 3 commits into base: master

gfabiano
Contributor

Before submitting a pull request make sure you have:

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

I have rewritten the Npr extractor and added support for #13440. However, there is a problem with the m3u8 playlists from NprVideo: they return a 404 error.

regards,
Giuseppe Fabiano

        return formats


class NprPlaylistIE(NprBaseIE):
Collaborator

Don't change extractor name.

class NprIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?npr\.org/player/v2/mediaPlayer\.html\?.*\bid=(?P<id>\d+)'
class NprBaseIE(InfoExtractor):
    def extract_info(self, id):
Collaborator

Don't shadow built-in names.
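
A minimal sketch of why the comment matters, using two hypothetical helpers that are not part of the PR: a parameter named `id` hides Python's built-in `id()` for the whole function body, while a descriptive name such as `video_id` leaves it alone.

```python
def lookup_shadowing(id):
    # Inside this function `id` is the argument (a string here),
    # so the built-in id() can no longer be called.
    return callable(id)


def lookup_renamed(video_id):
    # `video_id` does not collide with anything, so `id` still
    # refers to the built-in function.
    return callable(id)


shadowed = lookup_shadowing('533198237')
renamed = lookup_renamed('533198237')
```

Here `shadowed` is `False` and `renamed` is `True`, which is the difference the reviewer is pointing at.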

    }

    def _real_extract(self, url):
        display_id = self._match_id(url)
Collaborator

It's not a display id.
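
As an illustration, the value captured by the `id` group of `_VALID_URL` is the numeric media id, so a name like `video_id` describes it better than `display_id`, which is usually reserved for a human-readable alternative such as a URL slug. The sketch below is standalone and only mimics the matching step with `re`; the helper name and the sample URL are assumptions made for the example.

```python
import re

# Pattern copied from the _VALID_URL shown in the diff context.
VALID_URL = r'https?://(?:www\.)?npr\.org/player/v2/mediaPlayer\.html\?.*\bid=(?P<id>\d+)'


def match_video_id(url):
    """Return the numeric id captured by VALID_URL, or None if there is no match."""
    mobj = re.match(VALID_URL, url)
    return mobj.group('id') if mobj else None


# Hypothetical player URL built around an id that appears in the log output.
video_id = match_video_id(
    'https://www.npr.org/player/v2/mediaPlayer.html?action=1&id=533198237')
```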


        return json_data['list']['story'][0]

    def extract_formats(self, media_info):
Collaborator

This should be private.

class NprIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?npr\.org/player/v2/mediaPlayer\.html\?.*\bid=(?P<id>\d+)'
class NprBaseIE(InfoExtractor):
    def extract_info(self, id):
Collaborator

This should be private.

            'title': story.get('title', {}).get('$text'),
            'id': video.get('id'),
            'duration': int_or_none(video.get('duration', {}).get('$text')),
            'formats': self.extract_formats(video),
Collaborator

No url and formats at the same time.
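
One way to read this comment: an info dict should describe its media either with a single top-level `url` or with a `formats` list, not both. The checker below is a hypothetical helper written for illustration, not something youtube-dl provides, and it assumes that "exactly one of the two" is the rule being asked for.

```python
def check_media_fields(info):
    """Raise ValueError unless exactly one of 'url' / 'formats' is set."""
    has_url = bool(info.get('url'))
    has_formats = bool(info.get('formats'))
    if has_url and has_formats:
        raise ValueError("set either 'url' or 'formats', not both")
    if not has_url and not has_formats:
        raise ValueError("one of 'url' or 'formats' is required")
    return info


# Sample dict using the id and playpath that appear in the log output.
ok = check_media_fields({
    'id': '533198237',
    'formats': [{'url': 'rtmp://flash.npr.org/ondemand/',
                 'play_path': 'npr-mp4/npr/ascvid/2017/06/20170619_ascvid_tigersjaw-n-1500000.mp4'}],
})
```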

        return {
            'url': url,
            'display_id': display_id,
            'title': story.get('title', {}).get('$text'),
Collaborator

Title is mandatory.
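
A sketch of the distinction, assuming the JSON shape implied by the diff (`story['title']['$text']`, `video['duration']['$text']`): a mandatory field is read with plain indexing so that a missing title fails loudly, while an optional field such as the duration keeps the tolerant `.get()` lookup. `build_fields` and `to_int_or_none` are hypothetical stand-ins written for this example; the latter only imitates the `int_or_none` helper used in the PR.

```python
def to_int_or_none(value):
    """Local stand-in for int_or_none: int(value), or None if not convertible."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def build_fields(story, video):
    return {
        # Mandatory: plain indexing raises KeyError when the title is absent.
        'title': story['title']['$text'],
        # Optional: a missing duration simply becomes None.
        'duration': to_int_or_none((video.get('duration') or {}).get('$text')),
    }


fields = build_fields({'title': {'$text': 'Sample title'}},
                      {'duration': {'$text': '95'}})
```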

@gfabiano
Contributor Author

OK, the pending fixes are resolved, and so is the problem with the m3u8 formats. @dstftw
Now I have a problem with the SMIL formats:

[NprVideo] 533198237: Downloading JSON metadata
[NprVideo] 533201718: Downloading SMIL file
[NprVideo] 533201718: Downloading m3u8 information
[NprVideo] 533201718: Downloading m3u8 information
[NprVideo] 533201718: Downloading m3u8 information
[NprVideo] 533201718: Downloading m3u8 information
[info] Writing video description metadata as JSON to: test_NprVideo_533198237.info.json
[debug] Invoking downloader on 'rtmp://flash.npr.org/ondemand/'
[download] Destination: test_NprVideo_533198237.flv
[debug] rtmpdump command line: rtmpdump --verbose -r "rtmp://flash.npr.org/ondemand/" -o test_NprVideo_533198237.flv.part --playpath npr-mp4/npr/ascvid/2017/06/20170619_ascvid_tigersjaw-n-1500000.mp4 --stop 1 --resume --skip 1
[rtmpdump] RTMPDump v2.1
[rtmpdump] (c) 2009 Andrej Stepanchuk, Howard Chu, The Flvstreamer Team; license: GPL
[rtmpdump] DEBUG: Parsing...
[rtmpdump] DEBUG: Parsed protocol: 0
[rtmpdump] DEBUG: Parsed host    : flash.npr.org
[rtmpdump] DEBUG: Parsed app     : ondemand
[rtmpdump] DEBUG: Number of skipped key frames for resume: 1
[rtmpdump] DEBUG: Protocol : RTMP
[rtmpdump] DEBUG: Hostname : flash.npr.org
[rtmpdump] DEBUG: Port     : 1935
[rtmpdump] DEBUG: Playpath : npr-mp4/npr/ascvid/2017/06/20170619_ascvid_tigersjaw-n-1500000.mp4
[rtmpdump] DEBUG: tcUrl    : rtmp://flash.npr.org:1935/ondemand
[rtmpdump] DEBUG: swfUrl   : <NULL>
[rtmpdump] DEBUG: pageUrl  : <NULL>
[rtmpdump] DEBUG: app      : ondemand
[rtmpdump] DEBUG: auth     : <NULL>
[rtmpdump] DEBUG: subscribepath : <NULL>
[rtmpdump] DEBUG: flashVer : WIN 10,0,22,87
[rtmpdump] DEBUG: live     : no
[rtmpdump] DEBUG: timeout  : 120 sec
[rtmpdump] DEBUG: Failed to get last keyframe.
[rtmpdump] DEBUG: Closing connection.

@rpvcg
rpvcg commented Jun 23, 2017

The issue with #13440 is a site error. The HLS URL in the page source returns a 404:

http://ivideo-i.akamaihd.net/i/npr-mp4/npr/ascvid/2017/06/20170619_ascvid_tigersjaw-n-,200000,500000,1000000,1500000,.mp4.csmil/master.m3u8

This is because the 200000 quality is not actually present on the server. An edited hls url does work:

http://ivideo-i.akamaihd.net/i/npr-mp4/npr/ascvid/2017/06/20170619_ascvid_tigersjaw-n-,500000,600000,1000000,1200000,1500000,2000000,.mp4.csmil/master.m3u8

There is also an HTML5 progressive MP4 URL (600000 quality) in the page source, which does work. It can be extended to the higher qualities using dirty heuristics or the quality values from the HLS URL. Best quality:

https://ondemand.npr.org/npr-mp4/npr/ascvid/2017/06/20170619_ascvid_tigersjaw-n-2000000.mp4
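
The idea of reusing the quality values from the HLS URL could be sketched roughly as follows. This is only an illustration: the function name is hypothetical, the mapping from the Akamai path to the `https://ondemand.npr.org/` host is inferred from the two URLs quoted above, and only the 600000 and 2000000 variants are reported as working in this thread, so every generated URL would still need to be checked before use.

```python
import re


def progressive_urls_from_hls(hls_url, base='https://ondemand.npr.org/'):
    """Expand an Akamai-style '.csmil' HLS URL into one progressive URL per bitrate."""
    mobj = re.search(
        r'/i/(?P<prefix>[^,]+),(?P<rates>[\d,]+),(?P<ext>\.\w+)\.csmil/', hls_url)
    if not mobj:
        return []
    # Drop empty items in case the bitrate list contains doubled commas.
    rates = [rate for rate in mobj.group('rates').split(',') if rate]
    return [base + mobj.group('prefix') + rate + mobj.group('ext')
            for rate in rates]


# The edited HLS URL quoted above.
HLS_URL = ('http://ivideo-i.akamaihd.net/i/npr-mp4/npr/ascvid/2017/06/'
           '20170619_ascvid_tigersjaw-n-,500000,600000,1000000,1200000,'
           '1500000,2000000,.mp4.csmil/master.m3u8')
candidates = progressive_urls_from_hls(HLS_URL)
```

With this input, the last candidate is the best-quality URL given above.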

@gfabiano
Contributor Author

Interesting, I will add this to the extractor. Any suggestions for the RTMP problem? @stinkteeth

@rpvcg
rpvcg commented Jun 23, 2017

rtmp seems to work okay if --resume is removed. Not sure why.
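
If dropping `--resume` turns out to be the fix, one possible approach is to flag the RTMP formats with the `no_resume` format field, which youtube-dl documents as "the server does not support resuming the (HTTP or RTMP) download". The sketch below is standalone, the helper name is hypothetical, and whether the flag actually keeps `--resume` off the rtmpdump command line is an assumption to verify against the RTMP downloader.

```python
def mark_rtmp_no_resume(formats):
    """Set 'no_resume' on every format whose URL uses an rtmp-style scheme."""
    for fmt in formats:
        if fmt.get('url', '').startswith('rtmp'):
            # Assumption: the downloader honours this flag for RTMP.
            fmt['no_resume'] = True
    return formats


formats = mark_rtmp_no_resume([
    {'url': 'rtmp://flash.npr.org/ondemand/',
     'play_path': 'npr-mp4/npr/ascvid/2017/06/20170619_ascvid_tigersjaw-n-1500000.mp4'},
    {'url': 'https://ondemand.npr.org/npr-mp4/npr/ascvid/2017/06/20170619_ascvid_tigersjaw-n-2000000.mp4'},
])
```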
