New York Times Closed Caption Not Being Detected #28132

Open
michaeljohnm opened this issue Feb 10, 2021 · 0 comments

I was able to download this video without problems, and it plays fine in Chrome. The New York Times embedded player also offers closed captions for it, but when I try to list the available subtitles for this particular video, youtube-dl tells me that the video has no subtitles. The verbose output from my attempt is below.

I have tested a couple of videos, and youtube-dl gives the same result for all of them. Thanks for the help! I appreciate your time.

youtube-dl -F https://www.nytimes.com/video/health/100000007056651/covid-ards-acute-respiratory-distress-syndrome.html --list-sub -v
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-F', 'https://www.nytimes.com/video/health/100000007056651/covid-ards-acute-respiratory-distress-syndrome.html', '--list-sub', '-v']
[debug] Encodings: locale cp1252, fs mbcs, out cp850, pref cp1252
[debug] youtube-dl version 2021.02.04
[debug] Python version 3.4.4 (CPython) - Windows-10-10.0.19041
[debug] exe versions: ffmpeg N-99692-gde59826703, ffprobe N-99692-gde59826703
[debug] Proxy map: {}
[NYTimes] 100000007056651: Downloading video JSON
[NYTimes] 100000007056651: Downloading m3u8 information
[NYTimes] 100000007056651: Downloading m3u8 information
100000007056651 has no subtitles
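
For context: in an HLS (m3u8) master playlist, subtitle tracks are declared with `#EXT-X-MEDIA:TYPE=SUBTITLES` tags, separate from the video variants. An extractor that reads only the video variants would report no subtitles even when the player shows captions. Whether that is exactly what happens for this video is an assumption. The sketch below is not youtube-dl code; the sample playlist and the function name `list_subtitle_renditions` are hypothetical and only illustrate how such entries could be found.

```python
import re

# Hypothetical master playlist; not taken from the New York Times site.
SAMPLE_MASTER = """#EXTM3U
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",LANGUAGE="en",URI="subs/en.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=1280000,SUBTITLES="subs"
video/720p.m3u8
"""

# Matches KEY=VALUE pairs where VALUE is either a quoted string or a bare token.
_ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')
_MEDIA_TAG = '#EXT-X-MEDIA:'


def list_subtitle_renditions(master_playlist):
    """Return {language: [uri, ...]} for subtitle renditions in a master playlist."""
    subtitles = {}
    for line in master_playlist.splitlines():
        if not line.startswith(_MEDIA_TAG):
            continue
        attrs = {
            key: value.strip('"')
            for key, value in _ATTR_RE.findall(line[len(_MEDIA_TAG):])
        }
        # Only subtitle renditions that actually point at a playlist are useful.
        if attrs.get('TYPE') != 'SUBTITLES' or 'URI' not in attrs:
            continue
        # Fall back to the display name, then to "und" (undetermined language).
        lang = attrs.get('LANGUAGE') or attrs.get('NAME') or 'und'
        subtitles.setdefault(lang, []).append(attrs['URI'])
    return subtitles


print(list_subtitle_renditions(SAMPLE_MASTER))
```

A playlist with no `#EXT-X-MEDIA` subtitle entries yields an empty dictionary, which is the case a "has no subtitles" message would correspond to.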

pukkandan added a commit to yt-dlp/yt-dlp that referenced this issue Apr 28, 2021
Authored by fstirlitz
Modified from: ytdl-org/youtube-dl#6144

Closes: #73
Fixes:
ytdl-org/youtube-dl#6106
ytdl-org/youtube-dl#14977
ytdl-org/youtube-dl#21438
ytdl-org/youtube-dl#23609
ytdl-org/youtube-dl#28132

Might also fix (untested):
ytdl-org/youtube-dl#15424
ytdl-org/youtube-dl#18267
ytdl-org/youtube-dl#23899
ytdl-org/youtube-dl#24375
ytdl-org/youtube-dl#24595
ytdl-org/youtube-dl#27899

Related:
ytdl-org/youtube-dl#22379
ytdl-org/youtube-dl#24517
ytdl-org/youtube-dl#24886
ytdl-org/youtube-dl#27215

Notes:
* The functions `extractor.common._extract_..._formats` are still kept for compatibility
* Only some extractors have currently been moved to using `_extract_..._formats_and_subtitles`
* Direct subtitle manifests (without a master) are not supported and are wrongly identified as containing video formats
* AES support is untested
* The fragmented TTML subtitles extracted from DASH/ISM are valid, but are unsupported by `ffmpeg` and most video players
    * Their XML fragments can be dumped using `ffmpeg -i in.mp4 -f data -map 0 -c copy out.ttml`.
        Once the unnecessary headers are stripped from the output, it becomes a valid, self-contained TTML file
    * The TTML subs downloaded from DASH manifests can also be opened directly with <https://github.com/SubtitleEdit>
* Fragmented WebVTT files extracted from DASH/ISM are also unsupported by most tools
    * Unlike the TTML files, the fragments of these cannot be dumped using `ffmpeg`
    * The WebVTT subs extracted from DASH can be parsed by <https://github.com/gpac/gpac>
    * The validity of those extracted from ISM is untested
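
The first two notes describe a compatibility pattern: a newer helper returns both formats and subtitles, while the older formats-only helper stays available for extractors that have not been migrated. A minimal sketch of that pattern follows. The function names and the manifest structure are hypothetical stand-ins for illustration and are not the actual helpers in `extractor.common`.

```python
def extract_formats_and_subtitles(manifest_entries):
    """Hypothetical new-style helper: returns (formats, subtitles)."""
    formats = [e for e in manifest_entries if e['kind'] == 'video']
    subtitles = {}
    for entry in manifest_entries:
        if entry['kind'] == 'subtitles':
            # Group subtitle tracks by language code.
            subtitles.setdefault(entry['lang'], []).append({'url': entry['url']})
    return formats, subtitles


def extract_formats(manifest_entries):
    """Hypothetical old-style helper, kept so existing callers keep working."""
    formats, _subtitles = extract_formats_and_subtitles(manifest_entries)
    # Subtitles are discarded here; unmigrated callers never see them.
    return formats


# Hypothetical parsed manifest, for illustration only.
entries = [
    {'kind': 'video', 'url': 'video/720p.m3u8'},
    {'kind': 'subtitles', 'lang': 'en', 'url': 'subs/en.m3u8'},
]
formats, subtitles = extract_formats_and_subtitles(entries)
print(len(formats), sorted(subtitles))
```

Under this arrangement, an extractor that still calls the formats-only helper continues to work but reports no subtitles until it is switched to the combined helper.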
nixxo pushed a commit to nixxo/yt-dlp that referenced this issue Nov 22, 2021