[YouTube] Tab Extractor only extracts first page (30 videos) sometimes #28075

Closed
6 tasks done
coletdjnz opened this issue Feb 4, 2021 · 15 comments

Comments

@coletdjnz
Contributor

coletdjnz commented Feb 4, 2021

Checklist

  • I'm reporting a broken site support issue
  • I've verified that I'm running youtube-dl version 2021.02.04.1
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar bug reports including closed ones
  • I've read bugs section in FAQ

Verbose log

On this particular large channel sometimes it will only extract one page:

youtube-dl https://www.youtube.com/c/willaxtv/videos --flat-playlist --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://www.youtube.com/c/willaxtv/videos', '--flat-playlist', '--verbose']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.02.04.1
[debug] Python version 3.9.1 (CPython) - Linux-5.10.7-3-MANJARO-x86_64-with-glibc2.32
[debug] exe versions: ffmpeg 4.3.1, ffprobe 4.3.1, rtmpdump 2.4
[debug] Proxy map: {}
[youtube:tab] willaxtv: Downloading webpage
[download] Downloading playlist: Willax Television - Videos
[youtube:tab] playlist Willax Television - Videos: Downloading 30 videos
[download] Downloading video 1 of 30
[download] Downloading video 2 of 30
[download] Downloading video 3 of 30
[download] Downloading video 4 of 30
[download] Downloading video 5 of 30
[download] Downloading video 6 of 30
[download] Downloading video 7 of 30
[download] Downloading video 8 of 30
[download] Downloading video 9 of 30
[download] Downloading video 10 of 30
[download] Downloading video 11 of 30
[download] Downloading video 12 of 30
[download] Downloading video 13 of 30
[download] Downloading video 14 of 30
[download] Downloading video 15 of 30
[download] Downloading video 16 of 30
[download] Downloading video 17 of 30
[download] Downloading video 18 of 30
[download] Downloading video 19 of 30
[download] Downloading video 20 of 30
[download] Downloading video 21 of 30
[download] Downloading video 22 of 30
[download] Downloading video 23 of 30
[download] Downloading video 24 of 30
[download] Downloading video 25 of 30
[download] Downloading video 26 of 30
[download] Downloading video 27 of 30
[download] Downloading video 28 of 30
[download] Downloading video 29 of 30
[download] Downloading video 30 of 30
[download] Finished downloading playlist: Willax Television - Videos

Running again...

youtube-dl https://www.youtube.com/c/willaxtv/videos --flat-playlist --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://www.youtube.com/c/willaxtv/videos', '--flat-playlist', '--verbose']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.02.04.1
[debug] Python version 3.9.1 (CPython) - Linux-5.10.7-3-MANJARO-x86_64-with-glibc2.32
[debug] exe versions: ffmpeg 4.3.1, ffprobe 4.3.1, rtmpdump 2.4
[debug] Proxy map: {}
[youtube:tab] willaxtv: Downloading webpage
[download] Downloading playlist: Willax Television - Videos
[youtube:tab] Downloading page 1
[youtube:tab] Downloading page 2
[youtube:tab] Downloading page 3
[youtube:tab] Downloading page 4
[youtube:tab] Downloading page 5
[youtube:tab] Downloading page 6
[youtube:tab] Downloading page 7
... (and so on; this one would have thousands of pages.)

Description

When using --flat-playlist on a /videos tab it sometimes only downloads the first page.

Possibly similar to #27981. I accidentally wrote my debugging in there but realized this is probably different.
It seems that in these cases YouTube provides slightly different JSON data, so the extractor fails to process the next continuation page and we get fewer pages than there should be.
My debugging of this: #27981 (comment)
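
To make the failure mode concrete, here is a minimal sketch of continuation-based paging. It is not youtube-dl's actual code: the page layout, `collect_all_videos`, `top_level_lookup` and the token `"t1"` are all hypothetical names made up for illustration. It only shows why a token lookup that misses a slightly different JSON layout ends the run silently after the first 30 entries.

```python
from typing import Any, Callable, Dict, List, Optional

Page = Dict[str, Any]


def collect_all_videos(
    first_page: Page,
    fetch_page: Callable[[str], Page],
    get_token: Callable[[Page], Optional[str]],
) -> List[str]:
    """Follow continuation tokens until a page yields none."""
    videos: List[str] = []
    page = first_page
    while True:
        videos.extend(page.get("videos", []))
        token = get_token(page)
        if token is None:
            # Either this really is the last page, or the token lives
            # somewhere this lookup does not know about.
            return videos
        page = fetch_page(token)


def top_level_lookup(page: Page) -> Optional[str]:
    """Hypothetical lookup that only knows one token location."""
    return page.get("continuation")


# Fake second page, keyed by an invented token.
PAGES = {"t1": {"videos": ["v%d" % i for i in range(30, 60)]}}

# Same first page in two hypothetical layouts.
usual_layout = {"videos": ["v%d" % i for i in range(30)], "continuation": "t1"}
other_layout = {"videos": ["v%d" % i for i in range(30)], "moved": {"continuation": "t1"}}

full = collect_all_videos(usual_layout, PAGES.__getitem__, top_level_lookup)
short = collect_all_videos(other_layout, PAGES.__getitem__, top_level_lookup)
print(len(full), len(short))  # 60 30
```

With the second layout nothing raises; the loop just stops early, which matches the "Downloading 30 videos" log above.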

@garbled1

garbled1 commented Feb 4, 2021

I just noticed my nightly run started doing this exact same thing this morning. Previously it was pulling up multiple pages, but now suddenly certain channels I had on --playlist-reverse are only finding 30...

@pixelcmtd

happening to me too rn

@garbled1

Previously I could get it to work by just running it again, and then it would start functioning. Now it appears to be completely broken: every time, I just get the 30 latest videos.

/usr/local/bin/youtube-dl -o '/video/Misc/Youtube/%(uploader)s [%(channel_id)s]/%(upload_date)s - %(title)s [%(id)s].%(ext)s' http://www.youtube.com/channel/UCGGM8r6qjSRqPy9eAIFT32w/videos --flat-playlist --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-o', '/video/Misc/Youtube/%(uploader)s [%(channel_id)s]/%(upload_date)s - %(title)s [%(id)s].%(ext)s', 'http://www.youtube.com/channel/UCGGM8r6qjSRqPy9eAIFT32w/videos', '--flat-playlist', '--verbose']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2021.02.04.1
[debug] Python version 3.7.3 (CPython) - Linux-4.19.0-10-amd64-x86_64-with-debian-10.5
[debug] exe versions: ffmpeg 4.1.6-1, ffprobe 4.1.6-1, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
[youtube:tab] UCGGM8r6qjSRqPy9eAIFT32w: Downloading webpage
[download] Downloading playlist: Marine Depot Aquarium Supplies - Videos
[youtube:tab] playlist Marine Depot Aquarium Supplies - Videos: Downloading 30 videos
[download] Downloading video 1 of 30
[download] Downloading video 2 of 30
[download] Downloading video 3 of 30

Whereas even earlier this morning, I was getting all 27 pages.
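
The earlier "just run it again" workaround can be sketched as a bounded retry. This is an illustration only: `fetch_entries` is a hypothetical callable standing in for one flat-playlist run, and the page size of 30 is taken from the logs in this thread. Treating "exactly 30 entries" as suspicious is a heuristic, and it cannot help once every run returns only the first page, as described above.

```python
from typing import Callable, List

PAGE_SIZE = 30  # size of the first page seen in the logs in this thread


def extract_with_retry(
    fetch_entries: Callable[[], List[str]], attempts: int = 3
) -> List[str]:
    """Re-run the extraction while the result looks like a lone first page."""
    best: List[str] = []
    for _ in range(attempts):
        entries = fetch_entries()
        if len(entries) > len(best):
            best = entries
        if len(entries) != PAGE_SIZE:
            # Not exactly one page, so assume paging worked.
            break
    return best


# Fake extractor: truncated on the first call, complete on the second.
calls: List[int] = []


def flaky_fetch() -> List[str]:
    calls.append(1)
    return ["video"] * (30 if len(calls) == 1 else 90)


result = extract_with_retry(flaky_fetch)
print(len(result), len(calls))  # 90 2
```

A channel that really has exactly 30 videos would be re-fetched needlessly, which is why the number of attempts is capped.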

@al-rib

al-rib commented Feb 10, 2021

Having the same issue. If the devs want me to try a patch or something, I'm willing to help.

@dyewts

dyewts commented Feb 10, 2021

Same here. It happens on different channels, on both version 2021.01.08 and version 2021.02.04.1.

@mechalincoln

mechalincoln commented Feb 10, 2021

This was working perfectly earlier today. I can confirm it stops at 30 for playlists larger than 30 videos using version 2021.02.04.1.

I upgraded a few days ago so nothing changed on my end.

The behavior still persisted when I tried to go back and use the 2021.01.24.1 version.

@liamengland1

liamengland1 commented Feb 10, 2021

YouTube changed where the continuation token is located in the initialData JSON. There is no need for anyone else to confirm that they can reproduce this.
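
One way to tolerate a token that moves around is to search the whole JSON tree for any known token shape instead of reading one fixed path. The sketch below does that. The key names in `CANDIDATES` and in the sample data are assumptions for illustration, not a verified copy of YouTube's schema, and this is not the fix that was actually applied in youtube-dl.

```python
from typing import Any, Dict, Iterator, Optional, Sequence, Tuple

# Assumed (wrapper key, token key) pairs; illustrative only.
CANDIDATES: Tuple[Tuple[str, str], ...] = (
    ("nextContinuationData", "continuation"),
    ("continuationCommand", "token"),
)


def walk(node: Any) -> Iterator[Dict[str, Any]]:
    """Yield every dict nested anywhere inside node, depth-first."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)


def find_continuation(
    data: Any, candidates: Sequence[Tuple[str, str]] = CANDIDATES
) -> Optional[str]:
    """Return the first string token found under any candidate wrapper."""
    for mapping in walk(data):
        for wrapper, key in candidates:
            inner = mapping.get(wrapper)
            if isinstance(inner, dict) and isinstance(inner.get(key), str):
                return inner[key]
    return None


# Two made-up layouts that carry the token in different places.
layout_a = {"contents": {"continuations": [
    {"nextContinuationData": {"continuation": "abc"}}]}}
layout_b = {"contents": [{"continuationItemRenderer": {
    "continuationEndpoint": {"continuationCommand": {"token": "xyz"}}}}]}

print(find_continuation(layout_a))  # abc
print(find_continuation(layout_b))  # xyz
print(find_continuation({"contents": []}))  # None
```

A full-tree search trades precision for robustness: it survives the token being moved, but it could pick up an unrelated token if the same wrapper key appears elsewhere in the response.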

@pukkandan
Contributor

I have a fix: pukkandan/empty@a1b535b
But I'm tired of making PRs that never get merged. Feel free to cherry-pick it.

@garbled1

I can't get the above patch to apply to 2021.02.04.1.

@pukkandan
Contributor

the conflict is literally caused by a newline, lol
Here's the patch applied on current master: https://github.com/pukkandan/youtube-dl-1/tree/pagination

@garbled1

Confirmed fixed with that copy of youtube.py replacing mine. Thanks.

@serl
Contributor

serl commented Feb 11, 2021

I think it has been fixed for #28130

@pukkandan
Contributor

Yes, this is fixed. I believe the search extractor is still broken.

@coletdjnz
Contributor Author

> I think it has been fixed for #28130

Yep I can't seem to reproduce it anymore with the latest update.

Should we close this issue? Or is this still relevant to the search extractor @pukkandan mentioned?

@coletdjnz changed the title from "[YouTube] /videos tab extractor doesn't extract all pages sometimes" to "[YouTube] Tab Extractor only extracts first page (30 videos) sometimes" on Feb 19, 2021
@coletdjnz
Contributor Author

coletdjnz commented Feb 19, 2021

Changed title to be more specific to this issue to prevent confusion, and since this appears to be fixed now I'm going to close this :)
