[iTunes] Add new extractor (Closes #2097) #9590

TRox1972 · 2016-05-23T17:34:32Z

The extractor only works for free content, such as most podcasts; it does not download the 30-second previews of paid songs.

dstftw · 2016-05-25T15:07:21Z

youtube_dl/extractor/itunes.py

+
+        webpage = self._download_webpage(sanitized_Request(self._html_search_regex(
+            r'<string>\s*(https?://itunes.apple.com/[^>]+)</string>', self._download_webpage(
+            request, display_id), 'iTunes url'), headers={'User-Agent': self._USER_AGENT}), display_id)


Avoid such cumbersome code. That's unreadable.
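
One way to read this request is to split the nested call into named steps. The following is only a sketch: `fetch()` and `FAKE_PAGES` are hypothetical stand-ins for `self._download_webpage()` and real HTTP responses, so that the example runs on its own, and the URLs and User-Agent value are made up.

```python
import re

# Pattern adapted from the diff above (dot escaped, '<' excluded from the URL).
ITUNES_URL_RE = r'<string>\s*(https?://itunes\.apple\.com/[^<>]+)</string>'

# Hypothetical canned responses standing in for real HTTP requests.
FAKE_PAGES = {
    'https://example.invalid/start':
        '<dict><string>https://itunes.apple.com/us/itunes-u/example/id1</string></dict>',
    'https://itunes.apple.com/us/itunes-u/example/id1': 'final page',
}


def fetch(url, headers=None):
    """Hypothetical stand-in for self._download_webpage(); returns canned text."""
    return FAKE_PAGES[url]


def extract_itunes_page(request_url, user_agent):
    # Step 1: download the initial response.
    initial_page = fetch(request_url)

    # Step 2: pull the iTunes URL out of that response.
    mobj = re.search(ITUNES_URL_RE, initial_page)
    if mobj is None:
        raise ValueError('iTunes url not found')
    itunes_url = mobj.group(1)

    # Step 3: download the iTunes page with the custom User-Agent.
    headers = {'User-Agent': user_agent}
    return fetch(itunes_url, headers=headers)
```

Each intermediate value gets a name, so a failure in any step is easy to locate.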

yan12125 · 2016-05-28T11:44:44Z

It seems that without an iTunes User-Agent, the response from https://itunes.apple.com/us/itunes-u/uc-davis-symphony-orchestra/id403834767 already contains everything we want. If you're going to keep the current approach, use self._download_xml() and xpath_text() instead of parsing XML by hand.
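
To illustrate the difference between a regex and a real XML parser, here is a minimal sketch using only the standard library. In the extractor itself, `self._download_xml()` would supply the parsed tree and `xpath_text()` would do the lookup; the `SAMPLE_XML` layout and the `find_itunes_url()` helper below are assumptions made up for this example, not the actual response format.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body; the real document layout may differ.
SAMPLE_XML = '''<plist version="1.0">
  <dict>
    <key>url</key>
    <string>
      https://itunes.apple.com/us/itunes-u/uc-davis-symphony-orchestra/id403834767
    </string>
  </dict>
</plist>'''

ITUNES_PREFIXES = ('http://itunes.apple.com/', 'https://itunes.apple.com/')


def find_itunes_url(xml_text):
    """Return the first <string> value that is an iTunes URL, or None."""
    root = ET.fromstring(xml_text)
    for node in root.iter('string'):
        # The parser handles whitespace and entities; no regex needed.
        text = (node.text or '').strip()
        if text.startswith(ITUNES_PREFIXES):
            return text
    return None
```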

yan12125 · 2016-05-28T11:49:42Z

youtube_dl/extractor/itunes.py

+# coding: utf-8
+from __future__ import unicode_literals
+
+import datetime


No longer used.

TRox1972 · 2016-05-28T11:53:37Z

@yan12125 You're right. I'll change the approach to not spoof the user agent.

yan12125 · 2016-05-28T11:58:33Z

youtube_dl/extractor/itunes.py

+        webpage = self._download_webpage(url, display_id)
+
+        video_infos = re.findall(r'var\s+__desc_popup_d_\d+\s*=\s*({[^><]+});', webpage)
+        html_entries = re.findall(r'<tr\s+[^>]*role="row"[^>]+>', webpage)


get_element_by_attribute() may be useful.

Note that it returns the content between <tr> and </tr>, but not the attributes inside the opening tag, such as <tr preview-duration="485000">
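
If the attributes of the opening tag are what is needed, an HTML parser can collect them directly. This is a sketch under assumptions: it uses Python 3's standard `html.parser` rather than any youtube-dl helper, and `SAMPLE_HTML`, `RowAttributeParser` and `row_attributes()` are names invented for the example.

```python
from html.parser import HTMLParser

# Hypothetical markup modelled on the <tr role="row" ...> rows mentioned above.
SAMPLE_HTML = (
    '<table>'
    '<tr role="row" preview-duration="485000" preview-title="Example">'
    '<td>cell</td></tr>'
    '<tr><td>other</td></tr>'
    '</table>'
)


class RowAttributeParser(HTMLParser):
    """Collects the attributes of every <tr role="row"> opening tag."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def handle_starttag(self, tag, attrs):
        attr_dict = dict(attrs)
        if tag == 'tr' and attr_dict.get('role') == 'row':
            self.rows.append(attr_dict)


def row_attributes(html):
    parser = RowAttributeParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Calling `row_attributes(SAMPLE_HTML)` yields one dict per matching row, from which `preview-duration` can be read by key.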

yan12125 · 2016-05-28T12:06:01Z

Don't git push for each commit, or Travis CI will be flooded.

TRox1972 · 2016-05-28T12:07:52Z

@yan12125 Sorry, I didn't know that.

TRox1972 · 2016-06-25T22:06:44Z

@yan12125 @dstftw Does this seem good?

Lomanic · 2017-01-17T17:48:11Z

youtube_dl/extractor/itunes.py

+
+
+class iTunesIE(InfoExtractor):
+    _VALID_URL = r'https?://itunes\.apple\.com/[a-z]{2}/[a-z0-9-]+/(?P<display_id>[a-z0-9-]+)?/(?:id)?(?P<id>[0-9]+)'


It should be the following, since the (valid) URL variations of https://itunes.apple.com/us/itunes-u/uc-davis-symphony-orchestra/id403834767 listed below don't match otherwise:

_VALID_URL = r'https?://itunes\.apple\.com/[a-z]{2}?/?[a-z0-9-]+/?(?P<display_id>[a-z0-9-]+)?/(?:id)?(?P<id>[0-9]+)'

https://itunes.apple.com/itunes-u/id403834767

https://itunes.apple.com/us/itunes-u/id403834767

https://itunes.apple.com/itunes-u/uc-davis-symphony-orchestra/id403834767
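
A quick way to check a candidate pattern is to run it against all of these variations. One detail worth noting: in Python's `re`, `[a-z]{2}?` is a lazy "exactly two", not an optional group, so the pattern above only matches the shorter URLs through backtracking. The sketch below is an alternative that spells the optional segments out as optional groups; `VALID_URL` is an assumption for illustration, not the pattern that was proposed or merged.

```python
import re

# Alternative sketch: the country code and display id are explicit optional groups.
VALID_URL = (
    r'https?://itunes\.apple\.com/'
    r'(?:[a-z]{2}/)?'                      # optional country code, e.g. "us/"
    r'[a-z0-9-]+/'                         # section, e.g. "itunes-u/"
    r'(?:(?P<display_id>[a-z0-9-]+)/)?'    # optional display id
    r'(?:id)?(?P<id>[0-9]+)'               # numeric id, optionally prefixed by "id"
)

URLS = [
    'https://itunes.apple.com/us/itunes-u/uc-davis-symphony-orchestra/id403834767',
    'https://itunes.apple.com/itunes-u/id403834767',
    'https://itunes.apple.com/us/itunes-u/id403834767',
    'https://itunes.apple.com/itunes-u/uc-davis-symphony-orchestra/id403834767',
]


def parse(url):
    """Return (display_id, id) for a matching URL, or None if it does not match."""
    mobj = re.match(VALID_URL, url)
    if mobj is None:
        return None
    return mobj.group('display_id'), mobj.group('id')


results = [parse(url) for url in URLS]
```

Every URL in the list matches and yields the same numeric id; `display_id` is `None` for the variations that omit it.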

TRox1972 changed the title ~~[iTunes] Add new extractor~~ [iTunes] Add new extractor (Closes #2097) May 23, 2016

TRox1972 force-pushed the p18 branch from fcc0006 to 3e3d445 Compare May 23, 2016 17:49

dstftw reviewed May 25, 2016

yan12125 reviewed May 28, 2016

TRox1972 force-pushed the p18 branch 2 times, most recently from d4bf26f to 5762b67 Compare May 28, 2016 12:20

[iTunes] Add new extractor

16000e0

TRox1972 force-pushed the p18 branch from 5762b67 to 16000e0 Compare May 28, 2016 12:23

Lomanic reviewed Jan 17, 2017


dstftw force-pushed the master branch from fa77986 to 0c7a631 Compare June 24, 2017 22:03

dstftw force-pushed the master branch from 4991699 to 1141e91 Compare August 5, 2017 00:42

NewsGuyTor mentioned this pull request Sep 14, 2017

Add support for iTunes U (was: ERROR: Unsupported URL: https://itunes.apple.com/...) #2097

Open

kristofferR mentioned this pull request Sep 14, 2017

[iTunes] Add new extractor #14202

Closed

8 tasks

dstftw force-pushed the master branch from 293617b to af0f742 Compare October 11, 2017 16:48

dstftw force-pushed the master branch from 37318e1 to 65220c3 Compare January 27, 2018 22:49

dstftw force-pushed the master branch from 8d14fa1 to 5399ab3 Compare February 4, 2018 00:55

dstftw force-pushed the master branch from c486aa9 to 5ee7ae5 Compare December 9, 2018 15:38

dstftw force-pushed the master branch from 8cd780c to de0359c Compare January 4, 2019 20:44

dstftw force-pushed the master branch from d99bab0 to e118a87 Compare January 23, 2019 18:40

dstftw force-pushed the master branch from 5e26784 to da2069f Compare September 13, 2020 13:52

cypheron mentioned this pull request Feb 3, 2021

Evaluation / overview of new proposed extractors / sites #28054

Open

dirkf force-pushed the master branch from 01bf89e to 4c6fba3 Compare August 26, 2022 07:51

dirkf closed this Aug 1, 2023

dirkf added the defunct PR source branch is not accessible label Oct 2, 2023
