[heise] fix title extraction, modify test accordingly #15784

kayb94 · 2018-03-06T21:56:34Z

Before submitting a pull request make sure you have:

At least skimmed through adding new extractor tutorial and youtube-dl coding conventions sections
Searched the bugtracker for similar pull requests
Checked the code with flake8

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

I am the original author of this code and I am willing to release it under Unlicense

What is the purpose of your pull request?

Bug fix

Description of your pull request and other information

I fixed the title extraction for heise. All three unit tests now pass (I had to adjust one).
This fixes #15496 and also makes my own older pull request #14108 obsolete, so it can be closed.
It can also replace #15026 (corrected wrong pull request number).

dstftw · 2018-03-06T22:02:43Z

youtube_dl/extractor/heise.py

@@ -76,7 +74,9 @@ def _real_extract(self, url):
        if not title or title == "c't":
            title = self._search_regex(
                r'<div[^>]+class="videoplayerjw"[^>]+data-title="([^"]+)"',
-                webpage, 'title')
+                webpage, 'title', default=None, fatal=False)


default and fatal are not used together.
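
To illustrate the point, here is a minimal sketch of how a `default`/`fatal` pair is typically resolved. The `search_regex` function, the `NO_DEFAULT` sentinel, and the `RegexNotFoundError` class below are simplified stand-ins written for this example and are assumptions, not the extractor's actual code. In this sketch, once a `default` is supplied it is returned on a miss and `fatal` is never consulted, so passing both is redundant.

```python
import re

NO_DEFAULT = object()  # sentinel meaning "the caller passed no default"


class RegexNotFoundError(Exception):
    """Raised when a required pattern does not match (stand-in class)."""


def search_regex(pattern, string, name, default=NO_DEFAULT, fatal=True):
    """Simplified stand-in for a regex helper with default/fatal handling."""
    mobj = re.search(pattern, string)
    if mobj:
        return mobj.group(1)
    if default is not NO_DEFAULT:
        # A supplied default wins on a miss; fatal is not looked at.
        return default
    if fatal:
        raise RegexNotFoundError('Unable to extract %s' % name)
    return None


page = '<div class="other"></div>'  # made-up page with no matching element
pattern = r'<div[^>]+class="videoplayerjw"[^>]+data-title="([^"]+)"'

# Both calls behave identically: the extra fatal=False adds nothing.
with_both = search_regex(pattern, page, 'title', default=None, fatal=False)
default_only = search_regex(pattern, page, 'title', default=None)
```

Under these assumptions, `default=None` alone already makes the lookup non-fatal.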

dstftw · 2018-03-06T22:02:57Z

youtube_dl/extractor/heise.py

-                webpage, 'title')
+                webpage, 'title', default=None, fatal=False)
+        if not title:
+            self._og_search_title(webpage)


This has no effect.
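
The reason can be shown with a small sketch: calling a function that returns a value without binding the result leaves `title` unchanged. The `og_search_title` helper and the sample page below are hypothetical stand-ins for illustration only.

```python
import re


def og_search_title(webpage):
    """Hypothetical stand-in: return the og:title content, or None."""
    mobj = re.search(r'<meta property="og:title" content="([^"]+)"', webpage)
    return mobj.group(1) if mobj else None


page = '<meta property="og:title" content="Example title">'

title = None
if not title:
    og_search_title(page)  # the result is computed and then thrown away
discarded = title

if not title:
    title = og_search_title(page)  # the result is bound to the name
assigned = title
```

Only the second form changes `title`.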

kayb94 · 2018-03-06T22:21:14Z

Obviously you were right! Thanks a lot... ^^

dstftw · 2018-03-07T16:15:34Z

youtube_dl/extractor/heise.py

-                webpage, 'title')
+                webpage, 'title', default=None)
+        if not title:
+            title = self._og_search_title(webpage)


Adding another fallback here does not make much sense since title is not used when delegated to kaltura anyway.

kayb94 · 2018-03-14T21:54:54Z

Is this ready, or shall I change anything?

Regards!

dstftw · 2018-03-15T02:37:15Z

youtube_dl/extractor/heise.py

            title = self._search_regex(
                r'<div[^>]+class="videoplayerjw"[^>]+data-title="([^"]+)"',
-                webpage, 'title')
+                webpage, 'title', default=None)


Nothing changed.

kayb94 · 2018-03-15

What do you mean by "nothing changed"? With the version on master, title extraction still fails for some videos, in particular the one in the unit test (the latest c't uplink episode worked out just fine).

Regards

dstftw · 2018-03-15

Adding another fallback here does not make much sense since title is not used when delegated to kaltura anyway.

kayb94 · 2018-03-15

As I said, without this change some videos can't be downloaded because the title extraction fails. Also, I think not all videos get delegated to KalturaIE (I can't provide an example right now, though).

Is there any way to pass the extracted title to KalturaIE?

Regards!

[heise] fix title extraction, modify test accordingly

18f011f

dstftw requested changes Mar 6, 2018

dstftw added the pending-fixes label Mar 6, 2018

[heise] fix my non-sense in title extraction

60b76bd

dstftw requested changes Mar 7, 2018

[heise] Further simplify title extraction

47e2c48

dstftw reviewed Mar 15, 2018

dstftw closed this in 8e70c1b Mar 15, 2018
