
DASH/HLS/SS downloads: Parallelize & merge requests to improve download speed #5978

Closed
ojw28 opened this issue May 31, 2019 · 37 comments

ojw28 (Contributor) commented May 31, 2019

ExoPlayer currently requests segments one at a time when downloading segmented media. This means that the network is not fully utilized, particularly in cases where there's significant latency between client and server. There are two approaches for improving this:

  1. Parallelize segment downloads so that the network can remain more fully utilized.
  2. Where consecutive segments have the same URL and consecutive byte ranges (normally true for DASH VOD, and also supported in HLS with #EXT-X-BYTERANGE), it's possible to request multiple segments in a single HTTP request (see the sketch below).

In both cases, any implementation should ensure that demuxed tracks are downloaded at approximately equal rates, because users expect that if something is ~33% downloaded, approximately the first ~33% of that piece of content will be playable.
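
To illustrate the second approach, here's a minimal sketch of coalescing runs of same-URL, adjacent-byte-range segments so each run can be fetched with one HTTP Range request. The Segment class and mergeRanges helper are hypothetical illustrations, not ExoPlayer API:

import java.util.ArrayList;
import java.util.List;

final class Segment {
  final String url;
  final long position; // Byte offset of the segment within the resource.
  final long length;   // Byte length of the segment.

  Segment(String url, long position, long length) {
    this.url = url;
    this.position = position;
    this.length = length;
  }
}

final class SegmentMerger {
  // Coalesces runs of segments that share a URL and have adjacent byte
  // ranges, so each run can be fetched with a single HTTP Range request.
  static List<Segment> mergeRanges(List<Segment> segments) {
    List<Segment> merged = new ArrayList<>();
    for (Segment s : segments) {
      int last = merged.size() - 1;
      if (last >= 0
          && merged.get(last).url.equals(s.url)
          && merged.get(last).position + merged.get(last).length == s.position) {
        // Adjacent to the previous run: extend it instead of starting a new one.
        Segment prev = merged.remove(last);
        merged.add(new Segment(prev.url, prev.position, prev.length + s.length));
      } else {
        merged.add(s);
      }
    }
    return merged;
  }
}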

ghost commented Jun 27, 2019

What is the progress on this issue? The performance of our product's offline download feature is extremely poor. Because the TS CDN is slow, we need parallelized segment downloads so we can fetch more TS files at once.

Stijak commented Jun 27, 2019

This is a really serious issue. For a mere 600 MB (in 8000 chunks) the player needs about 40 minutes to download (over a 300 Mbit network)...
This is only about 15% faster than the action-file based implementation, and more than 10 times slower than our old custom download implementation on the same media...

janeriksamsonsen commented Jun 27, 2019

@Stijak - just measured a download on my side:
Pixel 3 XL
100mbit connection
Around 8700 segments (video + audio + subtitle), each 2 seconds long.
Total downloaded: 4.2GB
Time to finish: 11 minutes

Your numbers sound strange. Can you please share some details about your device and implementation?

I do, however, also see the need for parallelized downloads, since ExoPlayer's download is still quite a bit slower than our third-party player/downloader.

Stijak commented Jun 28, 2019

OK, thanks for the feedback. Judging by your responses, the speed on my test Galaxy S9+ should be high even on ExoPlayer 2.9, and yet it's still slow, and there are no CPU bottlenecks or anything like that.

I even transferred the DASH content from the CDN to my local MacBook's Apache server to rule out possible CDN problems with small chunks.
OK, I guess I'll keep looking on the assumption that the problem is not in ExoPlayer, although I don't understand how my code could influence download speed.

[Two screenshots attached]

Stijak commented Jun 28, 2019

It seems that a significant portion of the time is spent on request initialization...
And the much smaller audio files are sometimes slower...

[Two screenshots attached]

If the results from both a mobile network (to the CDN) and another team in another part of the globe weren't similar, I would blame the local WiFi...

Stijak commented Jul 10, 2019

Update:
I repeated the same tests (local Apache server) with DASH samples from https://dash.itec.aau.at/dash-dataset/

I used the Red Bull sample, since it was the longest and matches the bitrate we aim for with our content, and I used our production app for testing (so it would be an apples-to-apples comparison). I was able to test downloads with 15-second and 1-second segment sizes.

The 15-second-segment download finished in less than 2 minutes, while the 1-second-segment download took about 8 minutes. So there are clear differences in download speed.

At 1 second that's about 5,000 segments, and it's still noticeably faster than our comparable content. Whether it's some other encoding option or DRM (our content uses Widevine) I'm not sure, but I'll try to reach a solution with our content providers...

zourb commented Jul 19, 2019

Are there any updates on this issue? The parallelized downloads feature is really important, especially where segment download speed is poor.

noamtamim (Contributor) commented

@ojw28 is this something you're currently working on? We're looking to move from our current download solution to ExoPlayer's built-in one, and this is a serious blocker.

tonihei (Collaborator) commented Aug 19, 2019

Unfortunately, no progress yet. But the renewed interest is acknowledged, and we should probably take a look soon.

ojw28 assigned tonihei and unassigned ojw28 on Sep 6, 2019
ghost commented Sep 8, 2019

What is the status of this issue?

busylee999 commented
Are there any updates?

ojw28 pushed a commit that referenced this issue Nov 15, 2019
This speeds up downloads where segments have the same URL with different
byte ranges. We limit the merged segments to 20 seconds to ensure the download
progress of demuxed streams is roughly in line with the playable media duration.

Issue: #5978
PiperOrigin-RevId: 280410761
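
A rough sketch of how such a duration cap might gate the merge decision (illustrative only; the constant name, the microsecond units, and the helper are assumptions, not the commit's actual code):

// Cap merged runs so download progress stays roughly in line with the
// playable duration (per the commit message, 20 seconds).
private static final long MAX_MERGED_DURATION_US = 20_000_000;

// Hypothetical check: extend the current merged run with the next segment
// only while the combined duration stays under the cap.
private static boolean canMerge(long mergedRunDurationUs, long nextSegmentDurationUs) {
  return mergedRunDurationUs + nextSegmentDurationUs <= MAX_MERGED_DURATION_US;
}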
ebmalone1234 commented
We still have a lot of interest in this issue. Is anyone actively working on it?

tonihei (Collaborator) commented Feb 20, 2020

The merging part is already done and released from 2.11.0 onwards (see the commit above). The parallelization is also implemented but not yet submitted due to pending reviews; it will likely be in the next major release.

ojw28 (Contributor, Author) commented Feb 20, 2020

Correction: The merging part was not actually included in 2.11.0 (or any tagged release to date). It will probably land together with the parallelization improvement in 2.12.0.

ojw28 added a commit that referenced this issue Feb 25, 2020
Bitrates in SmoothStreaming manifests are average bitrates, as per
the ref'd issue.

Issue: #5978
PiperOrigin-RevId: 296467667
ojw28 added a commit that referenced this issue Feb 25, 2020
Bitrates in the DASH manifest are peak bitrates, as per the ref'd issue.

Issue: #5978
PiperOrigin-RevId: 296478812
ojw28 added a commit that referenced this issue Feb 28, 2020
Issue: #5978
PiperOrigin-RevId: 297876336
kschults commented
@ojw28 Is there a branch where one could test this out prior to the 2.12.0 release becoming available?

sky-filipeuva commented
I second this question ☝️ We are really interested in testing this feature.

ojw28 (Contributor, Author) commented Apr 21, 2020

Changes will appear on the dev-v2 branch as they become available, and will be referenced from this issue (like the ones above).

Note that merging of segment fetches is already committed, for the (ideal) case where consecutive segments use the same URL and adjacent byte ranges. Parallel segment fetching is currently being worked on, and you should see commits appearing over the next two weeks.

sky-filipeuva commented
Thanks very much for the update and transparency!

jruesga pushed a commit to jruesga/ExoPlayer that referenced this issue Apr 21, 2020
This speeds up downloads where segments have the same URL with different
byte ranges. We limit the merged segments to 20 seconds to ensure the download
progress of demuxed streams is roughly in line with the playable media duration.

Issue: google#5978
PiperOrigin-RevId: 280410761
jruesga (Contributor) commented Apr 22, 2020

I developed my own implementation of ExoPlayer segment parallelization here (https://github.com/jruesga/ExoPlayer/commits/parallel-chunks-downloads), just in case anyone wants to test it. It's currently based on r2.10.8, but includes the "merge segments" patch, so I think it's easily portable to the stable and dev branches.

It seems to work quite nicely, and I'm observing systematically lower download times.

[Screenshot attached]

I measured the patch with some DASH and HLS resources from the demo app.

[Screenshot attached]

icbaker pushed a commit that referenced this issue Apr 27, 2020
Issue: #5978
PiperOrigin-RevId: 308076851
icbaker pushed a commit that referenced this issue Apr 27, 2020
- Executor is a superclass of ExecutorService, so this is arguably a little
  more flexible.
- It removes the need to use null for direct execution, because Runnable::run
  is a direct executor that can be trivially used instead.
- Removing the error-prone "cannot be a direct executor" restriction in the
  parallel version of SegmentDownloader requires not relying on the futures
  returned from ExecutorService.submit() anyway.

Issue: #5978
PiperOrigin-RevId: 308586620
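
For context on the Runnable::run point: java.util.concurrent.Executor is a single-method interface, so a method reference to Runnable.run satisfies it and runs each task synchronously on the calling thread. A small self-contained demo (the class and pool size are mine, not ExoPlayer's):

import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class ExecutorDemo {
  public static void main(String[] args) {
    // Direct execution: Runnable::run satisfies Executor and runs each
    // task synchronously on the calling thread.
    Executor directExecutor = Runnable::run;
    directExecutor.execute(() -> System.out.println("runs inline"));

    // Pooled execution: the same call site gains parallelism just by
    // swapping in a different Executor implementation.
    ExecutorService pool = Executors.newFixedThreadPool(4);
    pool.execute(() -> System.out.println("runs on a worker thread"));
    pool.shutdown();
  }
}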
ojw28 added a commit that referenced this issue May 1, 2020
This will make it a bit easier to push manifest loads to an Executor.

Issue: #5978
PiperOrigin-RevId: 308608155
ojw28 added a commit that referenced this issue May 1, 2020
- Stop throwing InterruptedException from CacheUtil. When a CacheUtil operation
  throws or returns, the caller should always check its own state to determine
  whether they canceled the operation. If a caller is trying to catch
  InterruptedException separately to IOException to do something different in
  that case, they're probably doing the wrong thing. So it's simpler, and probably
  less error prone, just to throw an IOException in the case of interruption.
- Throwing InterruptedIOException is also consistent with what our Extractor and
  DataSource implementations do.

Issue: #5978
PiperOrigin-RevId: 309411556
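
The pattern this commit describes boils down to translating thread interruption into an IOException subtype at the blocking call site; a generic sketch (not the actual CacheUtil code):

import java.io.InterruptedIOException;

final class InterruptionUtil {
  // Translates thread interruption into an IOException subtype, so callers
  // can handle all failures via IOException and then check their own
  // canceled state, as the commit message suggests.
  static void throwIfInterrupted() throws InterruptedIOException {
    if (Thread.interrupted()) {
      // Preserve the interrupt status for code further up the stack.
      Thread.currentThread().interrupt();
      throw new InterruptedIOException();
    }
  }
}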
ojw28 added a commit that referenced this issue May 1, 2020
DownloadManager doesn't know enough about the Downloader implementations to
know that interrupting the thread is the right thing to do. In particular,
if a Downloader implementation wants to delegate work to additional worker
threads, then it will probably not want its main thread to be interrupted
on cancelation. Instead, it will want to cancel/interrupt its worker threads,
and keep its main thread blocked until work on those worker threads has
stopped (download() must not return whilst the Downloader is still touching
the cache).

This change moves control over what happens to the individual Downloader
implementations.

Issue: #5978
PiperOrigin-RevId: 309419177
hasija79 commented
It looks like the getManifest function was removed from the SegmentDownloader implementation classes by these changes. We currently override this abstract function to filter the playlist and copy stream keys.

Could you keep getManifest accessible to subclasses as it is now, or remove the final keyword from the getManifest method in SegmentDownloader? Thanks.

protected final M getManifest(DataSource dataSource, DataSpec dataSpec) throws IOException {

ojw28 (Contributor, Author) commented May 13, 2020

I don't really understand what you're doing, or why. Playlist filtering based on the stream keys already happens automatically in SegmentDownloader.

hasija79 commented
We have the video bitrate available before the download starts, not the stream keys. In getManifest, I load the manifest and select suitable stream keys before filtering.

protected HlsPlaylist getManifest(@NonNull DataSource dataSource, @NonNull DataSpec dataSpec) throws IOException {
  HlsPlaylist hlsPlaylist = loadManifest(dataSource, dataSpec);
  List<StreamKey> keys = listener.onFilterPlaylist(embedCode, hlsPlaylist);
  hlsPlaylist = hlsPlaylist.copy(keys);
  return hlsPlaylist;
}

ojw28 (Contributor, Author) commented May 13, 2020

I see. I think the way we'd support that is to let you inject your own HlsPlaylistParser through the constructor, which you could implement to do something like:

@Override
public HlsPlaylist parse(Uri uri, InputStream inputStream) throws IOException {
  HlsPlaylist playlist = normalParser.parse(uri, inputStream);
  return yourCopyingLogic(playlist);
}

It's possible you already have such a HlsPlaylistParser implementation for playback. Does that approach work for your use case?

hasija79 commented
I believe FilteringManifestParser should work in our case with this logic.

Thanks for the prompt reply. Appreciate it.
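
For reference, FilteringManifestParser wraps a delegate parser and filters the parsed manifest down to a set of stream keys via the manifest's copy method. A usage sketch for the HLS case discussed above, assuming ExoPlayer 2.x class locations (the factory class is mine):

import com.google.android.exoplayer2.offline.FilteringManifestParser;
import com.google.android.exoplayer2.offline.StreamKey;
import com.google.android.exoplayer2.source.hls.playlist.HlsPlaylist;
import com.google.android.exoplayer2.source.hls.playlist.HlsPlaylistParser;
import java.util.List;

final class ParserFactory {
  // The delegate parses the playlist; the wrapper then calls
  // playlist.copy(streamKeys) to strip out the unselected streams.
  static FilteringManifestParser<HlsPlaylist> createFilteringParser(List<StreamKey> streamKeys) {
    return new FilteringManifestParser<>(new HlsPlaylistParser(), streamKeys);
  }
}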

tonihei pushed a commit that referenced this issue May 21, 2020
- To make it clear that cache keys are for whole resources
- To explicitly make it clear to implementors that deriving a cache key
  from position and length is invalid. We've seen at least one developer
  trying to do this [1], so it seems worthwhile to be explicit!

[1] #5978 (comment)

Issue: #5978
PiperOrigin-RevId: 312643930
tonihei pushed a commit that referenced this issue May 21, 2020
A subsequent change will rename it to CacheWriter and make it instantiable.

Issue: #5978
PiperOrigin-RevId: 312648623
ojw28 added a commit that referenced this issue May 27, 2020
Sometimes it's useful to be able to block until something on some other thread
"really has finished". This will be needed for moving caching operations onto
the executor in Downloader implementations, since we need to guarantee that
Downloader.download doesn't return until it's no longer modifying the
underlying cache.

One solution to this is of course just to not interrupt the thread that's
blocking on the condition variable, but there are cases where you do need to do
this in case the thread is at some other point in its execution. This is true
for Downloader implementations, where the Download.download thread will also
be blocking on PriorityTaskManager.proceed. Arranging to conditionally
interrupt the thread based on where it's got to is probably possible, but seems
complicated and error prone.

Issue: #5978
PiperOrigin-RevId: 313152413
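
The "block until something really has finished" idea reduces to waiting out interruptions and restoring the interrupt flag afterwards; a generic sketch using a CountDownLatch rather than ExoPlayer's ConditionVariable:

import java.util.concurrent.CountDownLatch;

final class Blocking {
  // Blocks until the latch opens, even if this thread is interrupted,
  // then restores the interrupt status so callers can still observe it.
  static void awaitUninterruptibly(CountDownLatch latch) {
    boolean wasInterrupted = false;
    while (true) {
      try {
        latch.await();
        break;
      } catch (InterruptedException e) {
        // Keep waiting: per the commit, download() must not return while
        // worker threads may still be touching the cache.
        wasInterrupted = true;
      }
    }
    if (wasInterrupted) {
      Thread.currentThread().interrupt();
    }
  }
}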
ojw28 added a commit that referenced this issue May 27, 2020
- The new CacheWriter is simplified somewhat
- Blocking on PriorityTaskManager.proceed is moved out of
  CacheWriter and into the Downloader tasks. This is because
  we want to shift only the caching parts of the Downloaders
  onto their Executors, whilst keeping the blocking parts on
  the main Downloader threads. Else we can end up "using"
  the Executor threads indefinitely whilst they're blocked.

Issue: #5978
PiperOrigin-RevId: 313222923
ojw28 added a commit that referenced this issue May 29, 2020
ojw28 (Contributor, Author) commented May 29, 2020

I haven't looked at locking in SimpleCache yet, but it might actually be relatively straightforward to allow multiple write locks in the case that they have bounded and non-overlapping byte ranges.

@jruesga - We were able to make multiple write locks work. This is merged in 235df09.

Hopefully we'll get in the change that actually parallelises everything next week. It's turned out to be a much bigger task than was initially anticipated.
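
The non-overlapping write-lock idea amounts to an overlap check before granting a lock on a byte range; a simplified illustration (not SimpleCache's actual locking code):

import java.util.ArrayList;
import java.util.List;

final class RangeLocks {
  private static final class Range {
    final long start; // Inclusive.
    final long end;   // Exclusive.
    Range(long start, long end) {
      this.start = start;
      this.end = end;
    }
  }

  private final List<Range> lockedRanges = new ArrayList<>();

  // Grants a write lock on [start, end) only if it overlaps no held lock,
  // which is what lets two downloads write different byte ranges of the
  // same resource concurrently.
  synchronized boolean tryLock(long start, long end) {
    for (Range held : lockedRanges) {
      if (start < held.end && held.start < end) {
        return false; // Ranges overlap.
      }
    }
    lockedRanges.add(new Range(start, end));
    return true;
  }

  synchronized void unlock(long start, long end) {
    lockedRanges.removeIf(r -> r.start == start && r.end == end);
  }
}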


ojw28 added a commit that referenced this issue Jun 2, 2020
…mentations

Issue: #5978
PiperOrigin-RevId: 314175257
ojw28 added a commit that referenced this issue Jun 11, 2020
ojw28 added a commit that referenced this issue Jun 29, 2020
- Deprecate constructors that don't take an executor, to direct
  developers toward the new ones. Callers can trivially pass
  Runnable::run to one of the new ones if they want old behaviour.
- Add comment explaining warning suppression added in the CL that
  added parallelised download support.

Issue: #5978
PiperOrigin-RevId: 318803296
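
In practice this means passing an Executor when constructing a downloader; a sketch based on the 2.12-era API described in these commits (the constructor shape, pool size, and manifest URL are assumptions):

import com.google.android.exoplayer2.MediaItem;
import com.google.android.exoplayer2.source.dash.offline.DashDownloader;
import com.google.android.exoplayer2.upstream.cache.CacheDataSource;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;

final class DownloaderFactory {
  static DashDownloader createParallelDownloader(CacheDataSource.Factory cacheDataSourceFactory) {
    // A thread pool enables parallel segment fetches; passing Runnable::run
    // instead would keep the old direct (serial) behaviour.
    Executor executor = Executors.newFixedThreadPool(6);
    return new DashDownloader(
        MediaItem.fromUri("https://example.com/manifest.mpd"), // Hypothetical manifest URL.
        cacheDataSourceFactory,
        executor);
  }
}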
ojw28 (Contributor, Author) commented Jun 29, 2020

This is fully implemented in dev-v2, and will be part of 2.12.0. Please give it a try.

Note: Additional commit not ref'd above: c9717f6


lavajaw commented Jul 20, 2020

> This is fully implemented in dev-v2, and will be part of 2.12.0. Please give it a try.
>
> Note: Additional commit not ref'd above: c9717f6

Do you have a release date for 2.12.0? Or a rough estimate?

ojw28 (Contributor, Author) commented Jul 20, 2020

We're actively working on burning down pending issues for 2.12. I think late August is probably our best estimate as of now.

More generally, we'd like to avoid having so much work sitting around unreleased for a long time, as has happened with 2.11 relative to the current dev-v2 branch. We're looking at how we can improve our release process going forward (it's not straightforward though, so it remains to be seen how good a job we'll do :)).

ojw28 (Contributor, Author) commented Sep 11, 2020

Closing since this will be released in 2.12 any day now!

ojw28 closed this as completed on Sep 11, 2020
google locked and limited the conversation to collaborators on Nov 11, 2020