Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GetVideosAsync fix #821

Closed
wants to merge 3 commits into from
Closed

GetVideosAsync fix #821

wants to merge 3 commits into from

Conversation

hunterlan
Copy link

Closes #749

For some reasons, YouTube provided videos, which were already added and this caused to stop function to continue fetch next videos.
Added a separate list, which filter response.Videos by searching added video ID in encounteredIds.

Added additional variable, which will not contain videos, that were already added.
[Fact]
public async Task I_can_get_videos_included_in_a_buggy_playlist()
{
var youtube = new YoutubeClient();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var youtube = new YoutubeClient();
// Arrange
var youtube = new YoutubeClient();

@@ -189,4 +189,16 @@ public async Task I_can_get_a_subset_of_videos_included_in_a_playlist()
// Assert
videos.Should().HaveCount(10);
}

[Fact]
public async Task I_can_get_videos_included_in_a_buggy_playlist()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a better description?

Suggested change
public async Task I_can_get_videos_included_in_a_buggy_playlist()
public async Task I_can_get_videos_included_in_a_playlist_with_a_lot_of_duplicates()

var videos = await youtube.Playlists.GetVideosAsync(PlaylistIds.EnormousDuplicates);

// Assert
videos.Should().HaveCountGreaterOrEqualTo(3_900);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also worth asserting that there are no duplicates

Comment on lines 95 to 100
var originalVideos = response.Videos.Where(v =>
{
return !encounteredIds.Any(e => string.Equals(e.Value, v.Id));
});

foreach (var videoData in response.Videos)
foreach (var videoData in originalVideos)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused what fundamentally changed between the old and the current implementation. Can you explain in more detail?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused what fundamentally changed between the old and the current implementation. Can you explain in more detail?

Sorry for late answer. To be honest it was hard for me to detect it (and even harder to explain it), but I noticed that sometimes our duplicates can be as a last list element and while they are not added to encounteredId, they are still used in JSON key playlistIndex. That's why when we are reaching value of 2039 in encounteredId, we have value 811 of lastVideoIndex. So, to be sure that we don't work with duplicates, we can remove them before foreach.

It's probably make sense to remove if condition in foreach and just add videoId to encounteredId, since we removing duplicates.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I realised that duplicates might be in the beginning and then my suggestion doesn't cover it, so I will improve solution.

Added all suggested changes from PR and
improved duplicate filtration.
Comment on lines +96 to +107
IEnumerable<PlaylistVideoData> originalVideos;
if (!encounteredIds.Any())
{
originalVideos = response.Videos.DistinctBy(v => v.Id);
}
else
{
originalVideos = response.Videos.Where(v =>
{
return !encounteredIds.Any(e => string.Equals(e.Value, v.Id));
});
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think the old code and the new code work the same 🤔

Previously, we used a HashSet to deduplicate video IDs. Using the following block:

  if (!encounteredIds.Add(videoId))
                    continue; // this will execute if the `videoId` already exists

And now we use this instead:

            if (!encounteredIds.Any())
            {
                originalVideos = response.Videos.DistinctBy(v => v.Id);
            }
            else
            {
                originalVideos = response.Videos.Where(v =>
                {
                    return !encounteredIds.Any(e => string.Equals(e.Value, v.Id));
                });
            }

Which seems to do the same.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can run it by yourself and see that my changes allow to download more videos. I can't explain by words, why this changes works better than code on master tree. My observation just showed that for some reasons, on some point (probably when last item is duplicate), it starts to return videos, that were already added to HashSet.

@Tyrrrz Tyrrrz added the invalid label Nov 6, 2024
@Tyrrrz Tyrrrz closed this Nov 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GetVideosAsync does not return all videos of a playlist
2 participants