[Loom] Add new extractor #28039
… test/test_unicode_literals.py
youtube_dl/extractor/loom.py
Outdated
def _extract_video_info_json(self, webpage, video_id):
    info = self._html_search_regex(
        r'window.loomSSRVideo = (.+?);',
        webpage,
        'info')
    return self._parse_json(info, 'json', js_to_json)

def _get_url_by_id_type(self, video_id, type):
    request = compat_urllib_request.Request(
        self._BASE_URL + 'api/campaigns/sessions/' + video_id + '/' + type,
        {})
    json_doc = self._download_json(request, video_id)
    return (url_or_none(json_doc.get('url')), json_doc.get('part_credentials'))
Inline.
Updated at 34e6a6b
youtube_dl/extractor/loom.py
Outdated
request = compat_urllib_request.Request(
    self._BASE_URL + 'api/campaigns/sessions/' + video_id + '/' + type,
    {})
Move into _download_json.
Updated at 70b8045
youtube_dl/extractor/loom.py
Outdated
def _get_m3u8_formats(self, url, video_id, credentials):
    format_list = self._extract_m3u8_formats(url, video_id)
    for item in format_list:
        item['protocol'] = 'm3u8_native'
        item['url'] += '?' + credentials
        item['ext'] = 'mp4'
        item['format_id'] = 'hls-' + str(item.get('height', 0))
        item['extra_param_to_segment_url'] = credentials
    return format_list
Inline.
Updated at 34e6a6b
youtube_dl/extractor/loom.py
Outdated
ext = self._search_regex(
    r'\.([a-zA-Z0-9]+)\?',
    url, 'ext', default=None)
if(ext != 'm3u8'):
No parens.
Updated at 34e6a6b
ext = self._search_regex(
    r'\.([a-zA-Z0-9]+)\?',
    url, 'ext', default=None)
Read coding conventions.
For this part, I may need to extract the file extension from a URL.
Would you prefer a relaxed regex such as \.([^.?]+)\? instead?
Or should I send a HEAD request to the URL and derive the extension from the Content-Type header with mimetype2ext(mt)?
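The two options raised in this reply can be illustrated with a minimal standard-library sketch. Both helper names (`ext_from_url`, `ext_from_content_type`) and the small MIME table are hypothetical and written only for illustration; they are not youtube-dl's own utilities, and the HEAD request itself is left out.

```python
import posixpath
from urllib.parse import urlparse


def ext_from_url(url, default=None):
    """Hypothetical helper: extension of the URL path, ignoring query and fragment."""
    path = urlparse(url).path
    ext = posixpath.splitext(path)[1]  # e.g. '.m3u8', or '' when there is none
    return ext[1:].lower() if len(ext) > 1 else default


def ext_from_content_type(content_type, default=None):
    """Hypothetical helper: map a Content-Type header value to an extension."""
    # Illustrative table only; a real extractor would rely on the library's mapping.
    known = {
        'video/mp4': 'mp4',
        'video/webm': 'webm',
        'application/x-mpegurl': 'm3u8',
        'application/vnd.apple.mpegurl': 'm3u8',
    }
    mime = content_type.split(';')[0].strip().lower()
    return known.get(mime, default)
```

Parsing the URL path avoids relying on a trailing `?` being present, which the original regex requires.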
youtube_dl/extractor/loom.py
Outdated
'width': try_get(info, lambda x: x['video_properties']['width']),
'height': try_get(info, lambda x: x['video_properties']['height'])
int_or_none
Updated at 29c4168
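As a rough illustration of why the `int_or_none` wrapper matters here, the sketch below uses simplified stand-ins for `try_get` and `int_or_none`. They are assumptions written for this example and only approximate the behaviour of the real helpers in `youtube_dl.utils`; the sample `info` dict is made up.

```python
def try_get(src, getter, expected_type=None):
    """Simplified stand-in: apply getter, returning None on a failed lookup."""
    try:
        value = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(value, expected_type):
        return value
    return None


def int_or_none(value, default=None):
    """Simplified stand-in: coerce to int, or return default if that is impossible."""
    if value in (None, ''):
        return default
    try:
        return int(value)
    except (ValueError, TypeError, OverflowError):
        return default


# Made-up metadata: width arrives as a string, height is null.
info = {'video_properties': {'width': '1920', 'height': None}}
width = int_or_none(try_get(info, lambda x: x['video_properties']['width']))
height = int_or_none(try_get(info, lambda x: x['video_properties']['height']))
missing = int_or_none(try_get({}, lambda x: x['video_properties']['width']))
```

With the wrapper, a numeric string becomes an integer and a missing or null value stays `None` instead of leaking an unexpected type into the format dict.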

return {
    'id': info.get('id'),
    'title': info.get('name'),
Mandatory. Read coding conventions.
For the id, I may provide a fallback value from the URL. However, the title does not have another fallback source other than the embedded JSON.
Any advice?
Afterthoughts:
Is it okay if I use [video_id] or the word Loom as the fallback title?
youtube_dl/extractor/loom.py
Outdated

for i in range(len(folder_info['entries'])):
    video_id = folder_info['entries'][i]
    folder_info['entries'][i] = LoomIE(self._downloader)._real_extract(url_or_none(self._BASE_URL + 'share/' + video_id))
url_result.
Updated at 1b2651e
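The `url_result` suggestion amounts to returning deferred entries that point at each video's share URL, instead of instantiating `LoomIE` and calling `_real_extract` directly. A minimal sketch of that shape follows. The standalone `url_result` function is a stand-in that mimics the kind of dict such a result carries, `folder_entries` is a hypothetical helper, and the `BASE_URL` value and the `'Loom'` key are assumptions not confirmed by the diff.

```python
BASE_URL = 'https://www.loom.com/'  # assumption: the value of _BASE_URL is not shown in the diff


def url_result(url, ie=None, video_id=None):
    """Stand-in mimicking the shape of a deferred 'url' result."""
    result = {'_type': 'url', 'url': url, 'ie_key': ie}
    if video_id is not None:
        result['id'] = video_id
    return result


def folder_entries(video_ids):
    """Hypothetical helper: one deferred entry per video id in the folder."""
    return [
        url_result(BASE_URL + 'share/' + video_id, ie='Loom', video_id=video_id)
        for video_id in video_ids
    ]


entries = folder_entries(['abc123', 'def456'])
```

Each entry is resolved later by whichever extractor matches its URL, so the folder extractor does not need to know how a single video is extracted.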

ext = self._search_regex(
    r'\.([a-zA-Z0-9]+)\?',
    url, 'ext', default=None)
if ext != 'm3u8':
    formats.append({
        'url': url,
        'ext': ext,
        'format_id': type,
        'width': int_or_none(try_get(info, lambda x: x['video_properties']['width'])),
        'height': int_or_none(try_get(info, lambda x: x['video_properties']['height']))
    })
else:
    credentials = compat_urllib_parse_urlencode(part_credentials)
    m3u8_formats = self._extract_m3u8_formats(url, video_id)
    for item in m3u8_formats:
        item['protocol'] = 'm3u8_native'
        item['url'] += '?' + credentials
        item['ext'] = 'mp4'
        item['format_id'] = 'hls-' + str(item.get('height', 0))
        item['extra_param_to_segment_url'] = credentials
    for i in range(len(m3u8_formats)):
        formats.insert(
            (-1, len(formats))[i == len(m3u8_formats) - 1],
            m3u8_formats[i])
octet-stream support required
#27957 (comment)
@dstftw or @wongyiuhang can I help move this along?
Yes, of course. I'm sorry for holding the pull request. Is there anything that I need to do? 👀
@wongyiuhang since the PR has not yet been merged, I tried to git clone your repo and check out the 'loom' branch. I then installed it with 'pip3 install -e .', but downloading the link above does not work.
This is my youtube-dl -v:
Hey folks, you were almost there!
I suggest an optimisation, which is to use the Here's how to the
I'm happy to give a hand. What else is needed to get this one through? @dstftw Anyone else that we can tag?
Release 2021.12.17
I checked out @wongyiuhang's code from his loom branch and it doesn't work (anymore). It merely downloads a 4 KB MP4. Somebody on Reddit suggests having to download the manifest. Also, embedded URLs are invalid currently and look like this
Hello, make sure not to forget about this.
Is anyone available to keep moving this along? Support for Loom would be great.
It works perfectly
We just need to add a little more metadata from mine and we're done
The first test for LoomFolderIE is giving 404 on JSON download. The API URL with
Description of your pull request and other information
In response to site request #27957, this PR adds a new extractor for loom.com.
Closes #27957