[heise] fix title extraction, modify test accordingly #15784

Closed
kayb94 wants to merge 3 commits

Contributor

kayb94 commented Mar 6, 2018

Before submitting a pull request make sure you have:

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense

What is the purpose of your pull request?

  • Bug fix

Description of your pull request and other information

I fixed the title extraction for heise. All three unit tests now pass (I had to adjust one).
This fixes #15496 and makes my own older pull request #14108 obsolete, so that one can be closed.
It can also replace #15026 (edited to correct a wrong pull request number).

@@ -76,7 +74,9 @@ def _real_extract(self, url):
         if not title or title == "c't":
             title = self._search_regex(
                 r'<div[^>]+class="videoplayerjw"[^>]+data-title="([^"]+)"',
-                webpage, 'title')
+                webpage, 'title', default=None, fatal=False)
Collaborator

default and fatal are not used together.
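
A minimal sketch of why passing both arguments is redundant, assuming a failed match is resolved by checking `default` first and `fatal` only afterwards. The function below is a simplified stand-in written for illustration, not youtube-dl's actual `_search_regex`; `search_regex`, `NO_DEFAULT` and `RegexNotFoundError` as defined here are local to the sketch.

```python
import re
import sys

NO_DEFAULT = object()  # sentinel meaning "caller passed no default"


class RegexNotFoundError(Exception):
    """Raised by the sketch when a required pattern does not match."""


def search_regex(pattern, string, name, default=NO_DEFAULT, fatal=True):
    """Return group 1 of the first match, or fall back as configured."""
    mobj = re.search(pattern, string)
    if mobj:
        return mobj.group(1)
    if default is not NO_DEFAULT:
        # A default was supplied: return it. `fatal` is never consulted.
        return default
    if fatal:
        raise RegexNotFoundError('Unable to extract %s' % name)
    # Non-fatal and no default: warn and return None.
    sys.stderr.write('WARNING: unable to extract %s\n' % name)
    return None


pattern = r'<div[^>]+class="videoplayerjw"[^>]+data-title="([^"]+)"'
page_without_player = '<html><body>no player here</body></html>'

# Both calls take the `default` branch, so they behave identically.
a = search_regex(pattern, page_without_player, 'title', default=None)
b = search_regex(pattern, page_without_player, 'title', default=None, fatal=False)
```

Under that assumption, `default=None` alone already makes the lookup non-fatal, which is why the extra `fatal=False` adds nothing.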

-                webpage, 'title')
+                webpage, 'title', default=None, fatal=False)
+        if not title:
+            self._og_search_title(webpage)
Collaborator

This has no effect.
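
The remark presumably refers to the return value being discarded: the helper is called, but its result is never bound to `title`. A small sketch of the difference, where `og_search_title` is a hypothetical stand-in for the extractor helper and the sample page is made up:

```python
import re


def og_search_title(webpage):
    """Hypothetical stand-in: return the og:title content, or None."""
    mobj = re.search(r'<meta property="og:title" content="([^"]+)"', webpage)
    return mobj.group(1) if mobj else None


def fallback_discarded(webpage, title=None):
    if not title:
        og_search_title(webpage)  # value is computed, then thrown away
    return title


def fallback_assigned(webpage, title=None):
    if not title:
        title = og_search_title(webpage)  # value is kept
    return title


sample_page = '<meta property="og:title" content="Some video">'
discarded = fallback_discarded(sample_page)
assigned = fallback_assigned(sample_page)
```

Only the second variant changes `title`; the first leaves it as it was.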

Contributor Author

kayb94 commented Mar 6, 2018

Obviously you were right! Thanks a lot... ^^

-                webpage, 'title')
+                webpage, 'title', default=None)
+        if not title:
+            title = self._og_search_title(webpage)
Collaborator

Adding another fallback here does not make much sense since title is not used when delegated to kaltura anyway.
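
To illustrate the point, here is a toy model of delegation. It assumes that a plain `url` result hands the URL to the target extractor and returns whatever that extractor produces, while a `url_transparent` result lets fields set by the delegating extractor override the delegate's. `resolve`, `fake_kaltura` and all sample values are hypothetical; this is not youtube-dl's code.

```python
def fake_kaltura(url):
    """Hypothetical delegate extractor returning its own metadata."""
    return {'id': '1_abc', 'title': 'title from kaltura', 'url': url}


def resolve(result, extractors):
    """Toy resolution of a delegated result."""
    inner = extractors[result['ie_key']](result['url'])
    if result['_type'] == 'url':
        # Plain delegation: nothing from the outer result survives.
        return inner
    if result['_type'] == 'url_transparent':
        # Outer fields that are set take precedence over the delegate's.
        merged = dict(inner)
        for key, value in result.items():
            if key not in ('_type', 'url', 'ie_key') and value is not None:
                merged[key] = value
        return merged
    raise ValueError('unsupported result type: %r' % result['_type'])


extractors = {'Kaltura': fake_kaltura}
outer = {'url': 'kaltura:1:2', 'ie_key': 'Kaltura', 'title': 'title from heise'}

plain = resolve(dict(outer, _type='url'), extractors)
transparent = resolve(dict(outer, _type='url_transparent'), extractors)
```

In this model the title extracted by the delegating code only matters in the `url_transparent` case; with a plain `url` result it is dropped, so an extra fallback for it would be wasted effort.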

Contributor Author

kayb94 commented Mar 14, 2018

Is this ready, or shall I change anything?

Regards!

             title = self._search_regex(
                 r'<div[^>]+class="videoplayerjw"[^>]+data-title="([^"]+)"',
-                webpage, 'title')
+                webpage, 'title', default=None)
Collaborator

Nothing changed.

Contributor Author

What do you mean, "nothing changed"? With the version on master, title extraction still fails for some videos, notably the one in the unit test (the latest c't uplink episode worked fine).

Regards

Collaborator

Adding another fallback here does not make much sense since title is not used when delegated to kaltura anyway.

Contributor Author

As I said, if I don't do this, some videos can't be downloaded because the title extraction fails. Also, I don't think all videos get delegated to KalturaIE (though I can't provide an example right now).

Is there any way of giving the extracted title to KalturaIE?

Regards!

dstftw closed this in 8e70c1b Mar 15, 2018
Development

Successfully merging this pull request may close these issues.

[Heise] ERROR: Unable to extract title;
2 participants