Issues with News section on vod.tvp.pl #7799

Closed
hubertbanas opened this issue Dec 8, 2015 · 13 comments

@hubertbanas

TV shows download without issues, but should we expect tvp.py to handle the news section on vod.tvp.pl, such as Wiadomosci, Teleexpress, and Panorama?

Wiadomosci

youtube-dl -jv http://vod.tvp.pl/22704887/08122015-1500
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-jv', u'http://vod.tvp.pl/22704887/08122015-1500']
[debug] Encodings: locale ANSI_X3.4-1968, fs ANSI_X3.4-1968, out ANSI_X3.4-1968, pref ANSI_X3.4-1968
[debug] youtube-dl version 2015.12.06
[debug] Python version 2.7.9 - Linux-3.16.0-4-amd64-x86_64-with-debian-8.2
[debug] exe versions: none
[debug] Proxy map: {}
WARNING: Falling back on generic information extractor.
ERROR: Unsupported URL: http://vod.tvp.pl/22704887/08122015-1500
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 1285, in _real_extract
    doc = compat_etree_fromstring(webpage.encode('utf-8'))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 248, in compat_etree_fromstring
    doc = _XML(text, parser=etree.XMLParser(target=etree.TreeBuilder(element_factory=_element_factory)))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 237, in _XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
ParseError: mismatched tag: line 78, column 123
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 663, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 290, in extract
    return self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 1890, in _real_extract
    raise UnsupportedError(url)
UnsupportedError: Unsupported URL: http://vod.tvp.pl/22704887/08122015-1500

Teleexpress

youtube-dl -jv http://vod.tvp.pl/22705460/08122015-1700
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-jv', u'http://vod.tvp.pl/22705460/08122015-1700']
[debug] Encodings: locale ANSI_X3.4-1968, fs ANSI_X3.4-1968, out ANSI_X3.4-1968, pref ANSI_X3.4-1968
[debug] youtube-dl version 2015.12.06
[debug] Python version 2.7.9 - Linux-3.16.0-4-amd64-x86_64-with-debian-8.2
[debug] exe versions: none
[debug] Proxy map: {}
WARNING: Falling back on generic information extractor.
ERROR: Unsupported URL: http://vod.tvp.pl/22705460/08122015-1700
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 1285, in _real_extract
    doc = compat_etree_fromstring(webpage.encode('utf-8'))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 248, in compat_etree_fromstring
    doc = _XML(text, parser=etree.XMLParser(target=etree.TreeBuilder(element_factory=_element_factory)))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 237, in _XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
ParseError: mismatched tag: line 78, column 123
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 663, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 290, in extract
    return self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 1890, in _real_extract
    raise UnsupportedError(url)
UnsupportedError: Unsupported URL: http://vod.tvp.pl/22705460/08122015-1700

Panorama

youtube-dl -jv http://vod.tvp.pl/22702654/07122015-1540
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-jv', u'http://vod.tvp.pl/22702654/07122015-1540']
[debug] Encodings: locale ANSI_X3.4-1968, fs ANSI_X3.4-1968, out ANSI_X3.4-1968, pref ANSI_X3.4-1968
[debug] youtube-dl version 2015.12.06
[debug] Python version 2.7.9 - Linux-3.16.0-4-amd64-x86_64-with-debian-8.2
[debug] exe versions: none
[debug] Proxy map: {}
WARNING: Falling back on generic information extractor.
ERROR: Unsupported URL: http://vod.tvp.pl/22702654/07122015-1540
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 1285, in _real_extract
    doc = compat_etree_fromstring(webpage.encode('utf-8'))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 248, in compat_etree_fromstring
    doc = _XML(text, parser=etree.XMLParser(target=etree.TreeBuilder(element_factory=_element_factory)))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 237, in _XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
ParseError: mismatched tag: line 78, column 123
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 663, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 290, in extract
    return self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 1890, in _real_extract
    raise UnsupportedError(url)
UnsupportedError: Unsupported URL: http://vod.tvp.pl/22702654/07122015-1540

Working TV Show example

youtube-dl -jv http://vod.tvp.pl/seriale/obyczajowe/na-sygnale/sezon-2-27-/odc-39/17834272
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-jv', u'http://vod.tvp.pl/seriale/obyczajowe/na-sygnale/sezon-2-27-/odc-39/17834272']
[debug] Encodings: locale ANSI_X3.4-1968, fs ANSI_X3.4-1968, out ANSI_X3.4-1968, pref ANSI_X3.4-1968
[debug] youtube-dl version 2015.12.06
[debug] Python version 2.7.9 - Linux-3.16.0-4-amd64-x86_64-with-debian-8.2
[debug] exe versions: none
[debug] Proxy map: {}
{"display_id": "17834272", "extractor": "tvp.pl", "protocol": "m3u8", "_filename": "Na sygnale, odc. 39-17834272.mp4", "format": "10407 - 1920x1080", "requested_subtitles": null, "tbr": 10407, "height": 1080, "preference": null, "format_id": "10407", "playlist_index": null, "playlist": null, "thumbnails": [{"url": "http://s.tvp.pl/images2/0/6/a/uid_06ac4295470c8be8d18d5934f8b50a051417639984396_width_720_play_0_pos_0_gs_0_height_405.jpg", "id": "0"}], "title": "Na sygnale, odc. 39", "url": "http://46.28.242.18/token/video/vod/17834272/20151208/1593557513/7ab3d917-c042-43d9-9abe-6f3c57fab0bd/video.ism/video-audio%3D96000-video%3D9722000.m3u8", "extractor_key": "Tvp", "vcodec": "mp4a", "http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/20.0 (Chrome)"}, "id": "17834272", "width": 1920, "ext": "mp4", "webpage_url": "http://vod.tvp.pl/seriale/obyczajowe/na-sygnale/sezon-2-27-/odc-39/17834272", "formats": [{"http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/20.0 (Chrome)"}, "protocol": "m3u8", "format": "meta - multiple (Quality selection URL)", "url": "http://46.28.242.18/token/video/vod/17834272/20151208/1593557513/7ab3d917-c042-43d9-9abe-6f3c57fab0bd/video.ism/video.m3u8", "format_note": "Quality selection URL", "ext": "mp4", "preference": -1, "format_id": "meta", "resolution": "multiple"}, {"http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Accept": 
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/20.0 (Chrome)"}, "protocol": "m3u8", "format": "101 - unknown", "url": "http://46.28.242.18/token/video/vod/17834272/20151208/1593557513/7ab3d917-c042-43d9-9abe-6f3c57fab0bd/video.ism/video-audio%3D96000.m3u8", "vcodec": "mp4a", "tbr": 101, "ext": "mp4", "preference": null, "format_id": "101"}, {"http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/20.0 (Chrome)"}, "protocol": "m3u8", "format": "736 - 398x224", "url": "http://46.28.242.18/token/video/vod/17834272/20151208/1593557513/7ab3d917-c042-43d9-9abe-6f3c57fab0bd/video.ism/video-audio%3D96000-video%3D599000.m3u8", "vcodec": "mp4a", "tbr": 736, "height": 224, "width": 398, "ext": "mp4", "preference": null, "format_id": "736", "acodec": "avc1"}, {"http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/20.0 (Chrome)"}, "protocol": "m3u8", "format": "984 - 480x270", "url": "http://46.28.242.18/token/video/vod/17834272/20151208/1593557513/7ab3d917-c042-43d9-9abe-6f3c57fab0bd/video.ism/video-audio%3D96000-video%3D833000.m3u8", "vcodec": "mp4a", "tbr": 984, "height": 270, "width": 480, "ext": "mp4", "preference": null, "format_id": "984", "acodec": "avc1"}, {"http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": 
"Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/20.0 (Chrome)"}, "protocol": "m3u8", "format": "1444 - 640x360", "url": "http://46.28.242.18/token/video/vod/17834272/20151208/1593557513/7ab3d917-c042-43d9-9abe-6f3c57fab0bd/video.ism/video-audio%3D96000-video%3D1267000.m3u8", "vcodec": "mp4a", "tbr": 1444, "height": 360, "width": 640, "ext": "mp4", "preference": null, "format_id": "1444", "acodec": "avc1"}, {"http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/20.0 (Chrome)"}, "protocol": "m3u8", "format": "1979 - 800x450", "url": "http://46.28.242.18/token/video/vod/17834272/20151208/1593557513/7ab3d917-c042-43d9-9abe-6f3c57fab0bd/video.ism/video-audio%3D96000-video%3D1771000.m3u8", "vcodec": "mp4a", "tbr": 1979, "height": 450, "width": 800, "ext": "mp4", "preference": null, "format_id": "1979", "acodec": "avc1"}, {"http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/20.0 (Chrome)"}, "protocol": "m3u8", "format": "3160 - 960x540", "url": "http://46.28.242.18/token/video/vod/17834272/20151208/1593557513/7ab3d917-c042-43d9-9abe-6f3c57fab0bd/video.ism/video-audio%3D96000-video%3D2886000.m3u8", "vcodec": "mp4a", "tbr": 3160, "height": 540, "width": 960, "ext": "mp4", "preference": null, "format_id": "3160", "acodec": "avc1"}, {"http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 
(X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/20.0 (Chrome)"}, "protocol": "m3u8", "format": "5990 - 1280x720", "url": "http://46.28.242.18/token/video/vod/17834272/20151208/1593557513/7ab3d917-c042-43d9-9abe-6f3c57fab0bd/video.ism/video-audio%3D96000-video%3D5555000.m3u8", "vcodec": "mp4a", "tbr": 5990, "height": 720, "width": 1280, "ext": "mp4", "preference": null, "format_id": "5990", "acodec": "avc1"}, {"http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/20.0 (Chrome)"}, "protocol": "m3u8", "format": "10407 - 1920x1080", "url": "http://46.28.242.18/token/video/vod/17834272/20151208/1593557513/7ab3d917-c042-43d9-9abe-6f3c57fab0bd/video.ism/video-audio%3D96000-video%3D9722000.m3u8", "vcodec": "mp4a", "tbr": 10407, "height": 1080, "width": 1920, "ext": "mp4", "preference": null, "format_id": "10407", "acodec": "avc1"}], "fulltitle": "Na sygnale, odc. 39", "thumbnail": "http://s.tvp.pl/images2/0/6/a/uid_06ac4295470c8be8d18d5934f8b50a051417639984396_width_720_play_0_pos_0_gs_0_height_405.jpg", "webpage_url_basename": "17834272", "acodec": "avc1"}
@tomaszg7

tomaszg7 commented Mar 6, 2016

It's not only news programs. Some other sections are also not recognized: e.g. http://vod.tvp.pl/24153473/tegie-chlopy

@rathann

rathann commented Mar 14, 2016

Another example that does not work: http://vod.tvp.pl/13364434/nela-mala-reporterka

@rathann

rathann commented Mar 14, 2016

FYI, here's a script that does work (save it as a bookmark, open the URL above, and click the bookmark):
http://miniskrypt.blogspot.com/2014/04/3-w-1.html
Use any of the "Pobieracz" links there, they contain embedded JS code.

@tomaszg7

tomaszg7 commented May 29, 2017

It seems that vod.tvp.pl is broken again:

youtube-dl -v  https://vod.tvp.pl/video/kariera-nikodema-dyzmy,odc-1,1156628
[debug] System config: []
[debug] User config: ['--prefer-free-formats', '--merge-output-format', 'mkv']
[debug] Custom config: []
[debug] Command-line args: ['-v', 'https://vod.tvp.pl/video/kariera-nikodema-dyzmy,odc-1,1156628']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2017.05.23
[debug] Python version 3.4.5 - Linux-4.9.29-gentoo-x86_64-AMD_FX-8370E_Eight-Core_Processor-with-gentoo-2.3
[debug] exe versions: ffmpeg 3.3.1, ffprobe 3.3.1, rtmpdump 2.4
[debug] Proxy map: {}
[generic] kariera-nikodema-dyzmy,odc-1,1156628: Requesting header
WARNING: Falling back on generic information extractor.
[generic] kariera-nikodema-dyzmy,odc-1,1156628: Downloading webpage
[generic] kariera-nikodema-dyzmy,odc-1,1156628: Extracting information
ERROR: Unsupported URL: https://vod.tvp.pl/video/kariera-nikodema-dyzmy,odc-1,1156628
Traceback (most recent call last):
  File "/usr/lib64/python3.4/site-packages/youtube_dl/YoutubeDL.py", line 760, in extract_info
    ie_result = ie.extract(url)
  File "/usr/lib64/python3.4/site-packages/youtube_dl/extractor/common.py", line 433, in extract
    ie_result = self._real_extract(url)
  File "/usr/lib64/python3.4/site-packages/youtube_dl/extractor/generic.py", line 2795, in _real_extract
    raise UnsupportedError(url)
youtube_dl.utils.UnsupportedError: Unsupported URL: https://vod.tvp.pl/video/kariera-nikodema-dyzmy,odc-1,1156628

Also, "pobieracz" linked above doesn't seem to work either :(

@rathann

rathann commented May 31, 2017

They redesigned the website. As you can see, the URLs are different now.

@rathann

rathann commented May 31, 2017

After some digging, I found that for example for https://vod.tvp.pl/video/jak-to-dziala,kompresja-danych,29644524 I can get the video from the URL of the <iframe>:

        <iframe class="fit-height" scrolling="no" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""
                src="https://vod.tvp.pl/sess/player/video/29644524"
                        ></iframe>

youtube-dl works just fine when given the above URL directly, so apparently only extraction of that URL is broken now.
However, this appears to be true only for newer episodes. Older episodes, such as https://vod.tvp.pl/video/jak-to-dziala,roboty,9117411, are not parsed at all.

Going to https://vod.tvp.pl/sess/tvplayer.php?object_id=9117411&autoplay=true reveals the video source URL (in the <video> tag): https://sdt-node192-208.tvp.pl/token/video/vod/9117411/20170531/3568614854/7d81f39c-0707-421d-a66d-b5fdbcc1cb2c .
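
The iframe lookup described above can be sketched in Python. This is only an illustration of the idea, not how youtube-dl's tvp.py extractor does it; the class and function names are hypothetical, and the sample markup is the iframe quoted in this comment.

```python
from html.parser import HTMLParser


class IframeSrcCollector(HTMLParser):
    """Collect the src attribute of every <iframe> in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_player_urls(page_html):
    """Return iframe URLs that point at the /sess/player/video/ player."""
    collector = IframeSrcCollector()
    collector.feed(page_html)
    return [s for s in collector.sources if "/sess/player/video/" in s]


# Sample markup copied from the comment above.
sample = '''
<iframe class="fit-height" scrolling="no" allowfullscreen=""
        src="https://vod.tvp.pl/sess/player/video/29644524"></iframe>
'''
print(find_player_urls(sample))
```

The returned URL is the one that youtube-dl accepted when given directly. Older episodes, which do not embed this iframe, would return an empty list here.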

@bezik46

bezik46 commented Sep 22, 2018

Still cannot download from, e.g.:

https://vod.tvp.pl/video/miasto-skarbow,odc-1-cezanne,33724038

edit:
I think this happens when some fragments are unavailable for plain download (either physically missing or somehow encrypted). Strange, since watching in the browser seems to work fine.
The missing-fragment errors show up when running:
ffmpeg -i input.m3u8 -c copy -bsf:a aac_adtstoasc "output.mp4"

[http @ 0000016821e90980] HTTP error 502 Bad Gateway
[hls,applehttp @ 0000016821e89dc0] Failed to open segment 1 of playlist 0

So the "solution" was to run youtube-dl with the -F option and choose an -f http-xxxx stream, which downloaded fine.

W:\Media>youtube-dl -v https://vod.tvp.pl/sess/player/video/33724038

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', 'https://vod.tvp.pl/sess/player/video/33724038']
[debug] Encodings: locale cp1252, fs mbcs, out cp850, pref cp1252
[debug] youtube-dl version 2018.06.14
[debug] Python version 3.4.4 (CPython) - Windows-10-10.0.14393
[debug] exe versions: ffmpeg N-80085-g9591ca7, rtmpdump 2.4
[debug] Proxy map: {}
[tvp] 33724038: Downloading webpage
[tvp:embed] 33724038: Downloading webpage
[tvp:embed] 33724038: Downloading ISM manifest
WARNING: Failed to download ISM manifest: HTTP Error 502: Bad Gateway
[tvp:embed] 33724038: Downloading f4m manifest
[tvp:embed] 33724038: Downloading m3u8 information
[tvp:embed] 33724038: Checking video URL
[tvp:embed] 33724038: video URL is invalid, skipping
[tvp:embed] 33724038: Checking video URL
[tvp:embed] 33724038: Checking video URL
[tvp:embed] 33724038: Checking video URL
[tvp:embed] 33724038: Checking video URL
[tvp:embed] 33724038: Checking video URL
[tvp:embed] 33724038: video URL is invalid, skipping
[tvp:embed] 33724038: Checking video URL
[tvp:embed] 33724038: video URL is invalid, skipping
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'http://sdt-l3-04-199.tvp.pl/token/video/vod/33724038/20180922/1380265253/3c6e67e0-f1dc-49b2-b103-c6db690e5621/video.ism/video-audio%3D97000-video%3D9440000.m3u8'
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 663
[download] Destination: Miasto skarbów, odc. 1 – Cezanne-33724038.mp4
ERROR: giving up after 10 retries
File "main.py", line 19, in
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl_init_.py", line 472, in main
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl_init_.py", line 462, in _real_main
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl\YoutubeDL.py", line 2001, in download
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl\YoutubeDL.py", line 803, in extract_info
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl\YoutubeDL.py", line 895, in process_ie_result
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl\YoutubeDL.py", line 857, in process_ie_result
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl\YoutubeDL.py", line 1635, in process_video_result
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl\YoutubeDL.py", line 1908, in process_info
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl\YoutubeDL.py", line 1847, in dl
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl\downloader\common.py", line 364, in download
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl\downloader\hls.py", line 144, in real_download
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl\downloader\fragment.py", line 102, in _download_fragment
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl\downloader\common.py", line 364, in download
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl\downloader\http.py", line 353, in real_download
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl\downloader\common.py", line 165, in report_error
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl\YoutubeDL.py", line 620, in report_error
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\rg3\tmp6bcc20cp\build\youtube_dl\YoutubeDL.py", line 582, in trouble
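
The workaround of picking an http-xxxx stream by hand could be automated roughly as follows. This is a sketch under assumptions: it works on a list of format dictionaries shaped like the "formats" array in the JSON output earlier in this thread (using the "format_id" and "tbr" fields), the function name is hypothetical, and the example ids and bitrates are made up.

```python
def pick_http_format(formats):
    """Return the format_id of the highest-bitrate 'http-' format, or None."""
    http_formats = [
        f for f in formats
        if str(f.get("format_id", "")).startswith("http-")
    ]
    if not http_formats:
        return None
    # Treat a missing or null bitrate as 0 so max() never compares None.
    best = max(http_formats, key=lambda f: f.get("tbr") or 0)
    return best["format_id"]


# Hypothetical format list; ids and bitrates are invented for illustration.
example_formats = [
    {"format_id": "hls-9537", "tbr": 9537},
    {"format_id": "http-1250", "tbr": 1250},
    {"format_id": "http-9100", "tbr": 9100},
]
print(pick_http_format(example_formats))
```

The resulting id is what would be passed to youtube-dl with -f, in place of reading the -F listing manually.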

@the-researcher

the-researcher commented Feb 12, 2019

The solution for the "news" section on TVP is somewhat trivial.

@scerazy gave an example, so I'll flesh it out here a bit.

https://vod.tvp.pl/video/teleexpress,11022019-1700,40998112

That's the link to Teleexpress on TVP. Passing that link to youtube-dl gives an error. However, if you take the last numerical value after the comma (,) and place it into a direct player link, youtube-dl can see the video:

https://vod.tvp.pl/sess/player/video/40998112

Therefore, the regex for TVP VOD needs to be fixed in general to pick up the last numerical string in the URL and pass it to the player module on TVP. That last string of digits seems to be a site-wide unique ID for TVP content.

https://vod.tvp.pl/video/serwis-info,05022019-1529,40998237
                                                   vvvvvvvv REGEX conversion to /sess/player/video/ link
              https://vod.tvp.pl/sess/player/video/40998237
https://vod.tvp.pl/video/panorama,11022019-1100,40998053
                                                vvvvvvvv REGEX conversion to /sess/player/video/ link
           https://vod.tvp.pl/sess/player/video/40998053
https://vod.tvp.pl/video/teleexpress,10022019-1715,40943166
                                                   vvvvvvvv REGEX conversion to /sess/player/video/ link
              https://vod.tvp.pl/sess/player/video/40943166

If the TVP VOD regex is changed to take the last string of digits in the URL and append it to the /sess/player/video/ link, then I think that would solve this problem for TVP VOD site-wide.
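
A minimal Python sketch of that conversion follows. The pattern and function name are assumptions made for illustration, not the expression used in youtube-dl's tvp.py; it takes the digits after the last comma or slash and builds the /sess/player/video/ link.

```python
import re

PLAYER_URL = "https://vod.tvp.pl/sess/player/video/{video_id}"

# Digits that follow the last ',' or '/' in the URL, optionally followed
# by a trailing slash and a query string or fragment.
_TRAILING_ID = re.compile(r"[,/](\d+)/?(?:[?#].*)?$")


def to_player_url(page_url):
    """Map a vod.tvp.pl page URL to its direct player URL, or None."""
    match = _TRAILING_ID.search(page_url)
    if match is None:
        return None
    return PLAYER_URL.format(video_id=match.group(1))


for url in (
    "https://vod.tvp.pl/video/serwis-info,05022019-1529,40998237",
    "https://vod.tvp.pl/video/panorama,11022019-1100,40998053",
    "https://vod.tvp.pl/video/teleexpress,10022019-1715,40943166",
):
    print(to_player_url(url))
```

Because the pattern only looks at the end of the URL, it does not care how many commas precede the ID. Old-style links such as http://vod.tvp.pl/22704887/08122015-1500, where the ID is not the last component, return None with this sketch.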

@yan12125 ...?

@hubertbanas

@the-researcher that did it. Thanks for reporting your findings.

Can we please have this issue re-opened, so we can track the fix?

@the-researcher

the-researcher commented Feb 16, 2019

I hate flagging people, but I almost have a fix for this issue. I just have no clue how to modify the source to add the fix.

@hubertbanas @remitamine @jbuchbinder

Here's the REGEX that seems to ALMOST work with unit testing in Python code:

https://regex101.com/r/aubLoI/4

The problem is that the regex works only if the URL contains more than one comma. With a single comma in the URL, the example regex does not parse the string correctly.

Somewhat frustrating that I can't get the single comma to work, so help would be appreciated.

EDIT 1: Never mind, the single-comma link has a different structure: it uses /website/ instead of /video/, so the regex has to account for that. I think it's fixed now. I shortened the regex to ignore whether the path is /video/ or /website/, since we only want the trailing digits of the link as the ID, which is then passed to the generalized VOD /sess/ player for youtube-dl extraction. Hooray...!...?

@hubertbanas

@the-researcher
Awesome! Will you be creating a pull request for it?

@the-researcher

@hubertbanas I have no clue how to create a pull request.

@the-researcher

Opened a new issue, and referenced it here. That will probably fix it...?
