Support renderer mode that only renders the first frame (for trick play) #8592

Open
davydovvladimir opened this issue Feb 16, 2021 · 18 comments

@davydovvladimir

Hello dear developers!

First, let me explain what I'm doing.
I'm trying to implement reverse trick mode using the I-frames provided by an #EXT-X-I-FRAME-STREAM-INF track.
The current algorithm is as follows:

  1. Set playback to pause by player.setPlayWhenReady(false)
  2. Select trick-play tracks by SelectionOverride.
  3. Get current position by player.getCurrentPosition()
  4. Set a new position reduced by N seconds relative to the current.
  5. Wait for onRenderedFirstFrame() and perform step 3 again. So, steps 3,4,5 are in the loop.
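
The loop in steps 3 to 5 can be sketched roughly as follows. This is only an illustration: `SeekablePlayer` is a hypothetical stand-in for the player (its two methods mirror `getCurrentPosition()` and `seekTo()`), and `ReverseTrickPlayLoop` and `stepMs` are names invented here. Steps 1 and 2 (pausing and selecting the trick-play track) are assumed to have happened already.

```java
/** Hypothetical stand-in for the player, so the loop logic can run without ExoPlayer. */
interface SeekablePlayer {
    long getCurrentPosition();     // current position in milliseconds
    void seekTo(long positionMs);  // request a seek to the given position
}

/** Drives reverse trick play: one backwards seek per rendered first frame. */
final class ReverseTrickPlayLoop {
    private final SeekablePlayer player;
    private final long stepMs;
    private boolean running;

    ReverseTrickPlayLoop(SeekablePlayer player, long stepMs) {
        this.player = player;
        this.stepMs = stepMs;
    }

    /** Starts the loop by issuing the first backwards seek (steps 3 and 4). */
    void start() {
        running = true;
        seekBackwards();
    }

    /** Stops the loop; later first-frame callbacks are ignored. */
    void stop() {
        running = false;
    }

    /** Call from the first-frame callback (step 5) to trigger the next seek. */
    void onRenderedFirstFrame() {
        if (running) {
            seekBackwards();
        }
    }

    boolean isRunning() {
        return running;
    }

    private void seekBackwards() {
        // Clamp at zero so the loop ends at the start of the stream.
        long target = Math.max(0, player.getCurrentPosition() - stepMs);
        player.seekTo(target);
        if (target == 0) {
            running = false;
        }
    }
}
```

In a real app, `onRenderedFirstFrame()` would be forwarded from the player's listener callback of the same name, so the time between seeks is bounded below by the seek-to-first-frame latency discussed in this issue.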

This algorithm is working for me but has some disadvantages.
The main problem is the time between the seek() call and the onRenderedFirstFrame event.
It takes about 500ms, which seems quite long.
I debugged the code and found that the first frame is rendered only after 4-5 I-frames have been pushed to the decoder.
I found it by additional trace before

.
I also tried to control buffering by overriding DefaultLoadControl.shouldContinueLoading(): if I load only 2 or 3 I-frame chunks after seek(), the onRenderedFirstFrame event never arrives.

In theory, sending one i-frame to the decoder is enough to receive onRenderedFirstFrame.
But maybe MediaCodec or Renderer has some internal buffering and requires multiple frames to display the first.

Could you tell me why this is happening and how to avoid it?

Thanks.

@andrewlewis
Collaborator

In theory, sending one i-frame to the decoder is enough to receive onRenderedFirstFrame.
But maybe MediaCodec or Renderer has some internal buffering and requires multiple frames to display the first.

It is indeed the case that MediaCodecs generally require several frames to be queued before they output the first frame, even if (theoretically) there's enough information in the bitstream to output a frame immediately after it's queued. The way around this would be to queue end-of-stream after the I-frame to cause the decoder to output any pending data, but I don't think there's an easy way to get this behavior with the stock ExoPlayer video renderer (you would need to customize it a bit).

Side note: some information in #6794 (comment) may be interesting, if only to confirm your current solution is reasonable.

@ojw28
Contributor

ojw28 commented Feb 16, 2021

I think it's more likely that the delay is just a result of decoding up to the exact frame corresponding to the requested seek position. You might be able to speed things up by calling ExoPlayer.setSeekParameters, perhaps with SeekParameters.CLOSEST_SYNC, when entering trick play. The default is EXACT, and you'll probably want to restore it when exiting trick play.

To what extent this works may depend on the streaming standard and container format being used. In particular, my suggestion is unlikely to make any difference if you're using HLS (due to #2882 still being open). It should make things better for DASH and most modern progressive container formats (e.g., MP4, MKV).

Ideally the player would have a trick-play mode that would internally do all of the steps required to get the player into a suitable state. Probably including switching the video renderer into a mode where it immediately renders any key-frame it receives. This is tracked by #7171.
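
To illustrate the idea behind sync-point seeking, here is a simplified model that resolves a requested position against a sorted list of keyframe times. It is not ExoPlayer code: `SyncSeekModel` and its methods are hypothetical names, tolerances are ignored, and the tie-breaking rule (prefer the earlier keyframe) is an assumption made for this sketch.

```java
import java.util.Arrays;

/** Simplified model of resolving a seek against known keyframe times (microseconds). */
final class SyncSeekModel {
    private SyncSeekModel() {}

    /** Latest keyframe at or before positionUs, or the first keyframe if none precedes it. */
    static long previousSync(long[] sortedKeyframeTimesUs, long positionUs) {
        requireNonEmpty(sortedKeyframeTimesUs);
        int idx = Arrays.binarySearch(sortedKeyframeTimesUs, positionUs);
        if (idx >= 0) {
            return sortedKeyframeTimesUs[idx];
        }
        int insertion = -idx - 1; // index of the first keyframe after positionUs
        return sortedKeyframeTimesUs[Math.max(0, insertion - 1)];
    }

    /** Earliest keyframe at or after positionUs, or the last keyframe if none follows it. */
    static long nextSync(long[] sortedKeyframeTimesUs, long positionUs) {
        requireNonEmpty(sortedKeyframeTimesUs);
        int idx = Arrays.binarySearch(sortedKeyframeTimesUs, positionUs);
        if (idx >= 0) {
            return sortedKeyframeTimesUs[idx];
        }
        int insertion = -idx - 1;
        return sortedKeyframeTimesUs[Math.min(sortedKeyframeTimesUs.length - 1, insertion)];
    }

    /** Keyframe nearest to positionUs; ties go to the earlier keyframe (an assumption). */
    static long closestSync(long[] sortedKeyframeTimesUs, long positionUs) {
        long before = previousSync(sortedKeyframeTimesUs, positionUs);
        long after = nextSync(sortedKeyframeTimesUs, positionUs);
        return Math.abs(positionUs - before) <= Math.abs(after - positionUs) ? before : after;
    }

    private static void requireNonEmpty(long[] times) {
        if (times.length == 0) {
            throw new IllegalArgumentException("no keyframes");
        }
    }
}
```

With an exact seek, frames between the preceding keyframe and the requested position are decoded and dropped. Once the position is snapped to a keyframe, the first decoded frame is already the one to show.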

@ojw28
Contributor

ojw28 commented Feb 16, 2021

Note that you can easily disambiguate between the potential issues @andrewlewis and I are suggesting by adding debugging to MediaCodecVideoRenderer.processOutputBuffer.

If you're not seeing any calls to this method until multiple frames have been queued, then that indicates the issue described by @andrewlewis where the codec needs multiple frames before it provides any output. Conversely, if you're seeing calls to this method with isDecodeOnlyBuffer == true, then that indicates the issue I describe, where frames are being output but are then being discarded by the renderer because they're earlier than the seek position.

@davydovvladimir
Author

@andrewlewis @ojw28 thanks a lot for the fast answers!
I have performed additional debugging as @ojw28 recommended.
MediaCodecVideoRenderer.processOutputBuffer is called with isDecodeOnlyBuffer == true when the third I-frame is queued.
The next call of this method comes after the fifth I-frame, with isDecodeOnlyBuffer == false, immediately before onRenderedFirstFrame.
So it looks like MediaCodec requires several frames before it outputs the first frame, and the first output frame is discarded by the renderer because it is earlier than the seek position.
I added a call to player.setSeekParameters(SeekParameters.CLOSEST_SYNC); when entering trick play but found no difference in behavior (my stream is HLS with a TS container).

Anyway, thanks for your help. I will investigate further options for customizing the video renderer.

@ojw28
Contributor

ojw28 commented Feb 16, 2021

Regarding the additional debugging you did: I don't think it's as simple as counting the number of frames queued before you see the first frame being dequeued. This is because the MediaCodec instance will be decoding asynchronously, so your observation that multiple frames are queued before you manage to dequeue a frame might just mean that you're queuing multiple frames quicker than the time the decoder needs to decode the first frame. To establish whether the decoder requires multiple frames before the first frame can be dequeued, you'll need to do a test where you only queue a single frame somehow.

MediaCodecVideoRenderer.processOutputBuffer is called with isDecodeOnlyBuffer == true

Note that this is a frame that's being discarded rather than rendered, probably because its timestamp is earlier than the seek position. If you don't care about accuracy during trick-play, which seems likely, then finding a way to render this frame as the first frame would speed things up. You can experiment with this just by flipping isDecodeOnlyBuffer back to false again. In a complete solution this should be done only when in trick-play mode.
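
As a rough model of that experiment, the render decision could look like the sketch below. It is a simplification, not the renderer's actual code: `FirstFrameDecision`, `trickPlayMode`, and the rule that a frame is decode-only when its timestamp is earlier than the seek position are all assumptions for illustration.

```java
/** Simplified model of the decode-only decision discussed above. Not ExoPlayer code. */
final class FirstFrameDecision {
    private FirstFrameDecision() {}

    /** Assumed rule: a frame earlier than the seek position is decoded but not shown. */
    static boolean isDecodeOnly(long bufferTimeUs, long seekPositionUs) {
        return bufferTimeUs < seekPositionUs;
    }

    /**
     * In trick-play mode accuracy is traded for latency, so any output frame is rendered.
     * Outside trick play, decode-only frames are still dropped.
     */
    static boolean shouldRender(long bufferTimeUs, long seekPositionUs, boolean trickPlayMode) {
        return trickPlayMode || !isDecodeOnly(bufferTimeUs, seekPositionUs);
    }
}
```

Gating the override on a trick-play flag keeps normal seeks exact, which matches the note that a complete solution should only do this in trick-play mode.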

@davydovvladimir
Author

davydovvladimir commented Feb 17, 2021

Regarding the additional debugging you did: I don't think it's as simple as counting the number of frames queued before you see the first frame being dequeued. This is because the MediaCodec instance will be decoding asynchronously, so your observation that multiple frames are queued before you manage to dequeue a frame might just mean that you're queuing multiple frames quicker than the time the decoder needs to decode the first frame. To establish whether the decoder requires multiple frames before the first frame can be dequeued, you'll need to do a test where you only queue a single frame somehow.

I have checked this by limiting the amount of data downloaded, overriding DefaultLoadControl.shouldContinueLoading. If I stop downloading after sending three frames to the decoder, I see no calls to MediaCodecVideoRenderer.processOutputBuffer at all, no matter how long I wait.

If you don't care about accuracy during trick-play, which seems likely, then finding a way to render this frame as the first frame would speed things up. You can experiment with this just by flipping isDecodeOnlyBuffer back to false again. In a complete solution this should be done only when in trick-play mode.

I have checked it. It does decrease the delay between seeks from ~500ms to ~300ms. So I will override MediaCodecVideoRenderer.processOutputBuffer and set isDecodeOnlyBuffer to false for the trick-play case. Thanks!

@stevemayhew
Contributor

@davydovvladimir It's on the input side that decode-only is set, so I'm taking the tack of fixing this on the HLS input side (implementing frame-accurate seek, at least for I-frame tracks).

As for the decoder, I've observed exactly what you describe on AmLogic and BRCM platform, so it may be a MediaCodec thing not a platform thing. There is no reason we cannot get < 100ms first frame with reasonable iFrame fetch times.

PS, I'm the author of the i-Frame only support ;-)

@andrewlewis andrewlewis assigned ojw28 and unassigned andrewlewis Jun 11, 2021
stevemayhew added a commit to TiVo/ExoPlayer that referenced this issue Oct 6, 2021
The HLS implementation of `getAdjustedSeekPositionUs()` now completely supports `SeekParameters.CLOSEST_SYNC` and its brethren, assuming the HLS stream indicates that all segments start with an IDR (that is, EXT-X-INDEPENDENT-SEGMENTS is specified).

This fixes issue google#2882 and improves (but does not completely solve) google#8592
stevemayhew added a commit to TiVo/ExoPlayer that referenced this issue Oct 18, 2021
@stevemayhew
Contributor

An update... I have been able to get single frames rendered pretty quickly in pause mode. Here's example logging:

0.0 11-02 18:08:16.799 12000 12000 D EventLogger: seekStarted [eventTime=5176.89, mediaPos=134.63, buffered=5.50, window=0, period=0]
5.0 11-02 18:08:16.804 12000 12000 D EventLogger: positionDiscontinuity [eventTime=5176.89, mediaPos=130.19, buffered=0.00, window=0, period=0, SEEK]
29.0 11-02 18:08:16.828 12000 12083 D SampleQueue: setStartTimeUs() - reset startTimeUs: 132132000
29.0 11-02 18:08:16.828 12000 12083 D SampleQueue: setStartTimeUs() - reset startTimeUs: 132132000
32.0 11-02 18:08:16.831 12000 12000 D EventLogger: loadStarted [eventTime=5176.92, mediaPos=130.19, buffered=0.00, window=0, period=0,  range(o/l): 11198972/244400 uri: http://192.168.5.140/video/1633564800-CCUR_iframe.tsv]
34.0 11-02 18:08:16.833 12000 12000 D EventLogger: positionDiscontinuity [eventTime=5176.92, mediaPos=132.13, buffered=0.00, window=0, period=0, SEEK_ADJUSTMENT]
39.0 11-02 18:08:16.838 12000 12000 D EventLogger: loading [eventTime=5176.93, mediaPos=132.13, buffered=0.00, window=0, period=0, true]
122.0 11-02 18:08:16.921 12000 12000 D EventLogger: bandwidthEstimate [eventTime=5177.01, mediaPos=132.13, buffered=0.00, window=0, period=0, Received BW Estimate.  Loaded Bytes: 376, sample: 0.33422223(Mbps), estimate: 12.842667(Mbps)]
147.0 11-02 18:08:16.946 12000 12100 D SampleQueue: commitSample() - timeUs: 132132000, id:1/8219, queue len:0, flags:1, size:60, mime: application/cea-608, upstreamKeyframeRequired: false
266.0 11-02 18:08:17.065 12000 12100 D SampleQueue: commitSample() - timeUs: 132132000, id:1/27, queue len:0, flags:1, size:239122, mime: video/avc, upstreamKeyframeRequired: false
267.0 11-02 18:08:17.066 12000 12000 D EventLogger: bandwidthEstimate [eventTime=5177.15, mediaPos=132.13, buffered=0.00, window=0, period=0, Received BW Estimate.  Loaded Bytes: 244400, sample: 15.51746(Mbps), estimate: 14.183877(Mbps)]
288.0 11-02 18:08:17.087 12000 12000 D EventLogger: loadCompletedMedia [eventTime=5177.18, mediaPos=132.13, buffered=0.00, window=0, period=0, trackId: 3 load-duration: 238ms codecs: avc1.64001F start(dur): 132132/2002 offset/len: 11198972/244400 uri: http://192.168.5.140/video/1633564800-CCUR_iframe.tsv]
289.0 11-02 18:08:17.088 12000 12000 D EventLogger: loadStarted [eventTime=5177.18, mediaPos=132.13, buffered=0.00, window=0, period=0,  range(o/l): 11443372/226352 uri: http://192.168.5.140/video/1633564800-CCUR_iframe.tsv]
311.0 11-02 18:08:17.110 12000 12100 D SampleQueue: commitSample() - timeUs: 134134000, id:1/8219, queue len:1, flags:1, size:60, mime: application/cea-608, upstreamKeyframeRequired: false
320.0 11-02 18:08:17.119 12000 12000 D EventLogger: renderedFirstFrame [eventTime=5177.21, mediaPos=132.13, buffered=2.00, window=0, period=0, Surface(name=null)/@0x69b57bd]

Most of the 320ms is the load time on the rather large iFrame segments.

The logs above are with code that has:

  1. the fix for the duration issue (see Double to long int microseconds fails in HLS #9575)
  2. a change I'm still working on a pull request for that commits the sample for iFrame-only segments that end with just the IDR NALU
  3. The changes from pull request 9536
  4. lastly a change that sets MediaCodec.BUFFER_FLAG_KEY_FRAME on the MediaCodec.queueInputBuffer() flags argument.

With the second change it is possible to inject other NALUs after the I-frame segment's IDR, to experiment with various codecs.

@andrewlewis any other ideas besides sending end of stream NALU?

@davydovvladimir
Author

Hi @stevemayhew
thank you for the update

  4. lastly a change that sets MediaCodec.BUFFER_FLAG_KEY_FRAME on the MediaCodec.queueInputBuffer() flags argument.

Which change do you mean?

@stevemayhew
Contributor

stevemayhew commented Nov 3, 2021 via email

@davydovvladimir
Author

Some origin servers (including, sadly, Apple’s mediafilesegmenter) add garbage NALUs after the IDR (type 5). These trailing P/B slice NALUs can freeze you on a frame that looks, well, not as good as the reference frame (IDR).

I'm not sure that it is the same issue, but I see some strange behavior in my implementation of trick mode using seek on pause.
If I call seek very quickly (at high trick-mode speeds like 32x), then after some time the picture on the screen freezes for a long time, around 1-3 seconds.
In the debug output, I see that onRenderedFirstFrame is called pretty fast (~300ms), but the picture on the screen does not change.
I thought it was a bug in the platform's MediaCodec implementation, but I see it on two different platforms: HiSilicon and Amlogic (both are set-top boxes). So it looks like the issue you described: "can freeze you on a frame that looks".

Let me recap the details of my implementation:

  1. Set playback paused.
  2. Select I-Frame trick track.
  3. Disable audio.
  4. Call seek to some time (depends on trick speed)
  5. Wait for onRenderedFirstFrame
  6. Go to step 4 again.

@stevemayhew
Contributor

@davydovvladimir What are you using to generate the i-Frame only tracks? If you can make a small VOD example, zip and send it I can have a look.

You can use ffmpeg to show the NALU structure in the i-Frame. Use:

ffmpeg -i <path to file with iframe TS> -c copy -bsf:v trace_headers -f null -
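
For a quick programmatic check of the same thing, the sketch below lists the H.264 NAL unit types in a byte buffer and reports whether a non-IDR slice (type 1) follows the IDR (type 5). It assumes the input is an Annex B elementary stream already extracted from the TS packets, and `NalUnitScanner` is a hypothetical helper name, not part of ExoPlayer or ffmpeg.

```java
import java.util.ArrayList;
import java.util.List;

/** Lists H.264 NAL unit types in an Annex B byte stream (start code 00 00 01). */
final class NalUnitScanner {
    private NalUnitScanner() {}

    /** Returns the nal_unit_type (low 5 bits of the header byte) of each NAL unit found. */
    static List<Integer> nalUnitTypes(byte[] data) {
        List<Integer> types = new ArrayList<>();
        for (int i = 0; i + 3 < data.length; i++) {
            if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
                types.add(data[i + 3] & 0x1F);
                i += 2; // skip the rest of the start code; the loop increment lands on the header
            }
        }
        return types;
    }

    /** True if a non-IDR slice (type 1) appears after the first IDR slice (type 5). */
    static boolean hasNonIdrSliceAfterIdr(List<Integer> types) {
        int idr = types.indexOf(5);
        return idr >= 0 && types.subList(idr + 1, types.size()).contains(1);
    }
}
```

A four-byte start code (00 00 00 01) is also handled, because it ends with the three-byte pattern the scanner looks for.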

@stevemayhew
Contributor

@davydovvladimir I've been testing this extensively on AmLogic-based SoCs (this includes Google's own Chromecast / Google TV and TiVoStream) and it works perfectly (frames queued to the decoder with MediaCodec.queueInputBuffer() are returned to the renderer within tens of ms); see my logging above.

I have reproduced the delay with boxes that have Broadcom SOCs in them (Arris, others). With these boxes it takes 2-3 frames queued before one is emitted.

@ojw28 ojw28 assigned christosts and unassigned ojw28 Jul 18, 2022
psharma676 pushed a commit to psharma676/ExoPlayer that referenced this issue Dec 14, 2023
@christosts christosts assigned tonihei and unassigned christosts Feb 2, 2024
@tonihei
Collaborator

tonihei commented Feb 7, 2024

Just to clarify what this issue is actually tracking:

My understanding is that this issue is about a special mode that can be configured on the video renderer that only renders the first keyframe, followed by an end-of-stream signal to ensure the codec doesn't get stuck. This mode is useful during trick-play/scrubbing-with-keyframes-only because it avoids queuing additional frames on the decoder we don't actually need. The main downside is that we can't just continue playback after this point as we have to decode the same frame again even if we want to continue from the current position.
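
A minimal sketch of such a mode, modeled as a small state machine that decides what to hand to the decoder for each incoming sample. This is only an illustration of the idea described above: `SingleFrameFeeder` and its `Action` values are hypothetical, and no such class exists in ExoPlayer.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Decides what to queue on the decoder in a hypothetical single-frame mode. */
final class SingleFrameFeeder {
    enum Action { QUEUE_FRAME, QUEUE_END_OF_STREAM }

    private boolean keyframeQueued;

    /**
     * The first keyframe after a reset is queued and immediately followed by end-of-stream,
     * so the decoder flushes it out. Every other sample is ignored.
     */
    List<Action> onSample(boolean isKeyframe) {
        if (keyframeQueued || !isKeyframe) {
            return Collections.emptyList();
        }
        keyframeQueued = true;
        return Arrays.asList(Action.QUEUE_FRAME, Action.QUEUE_END_OF_STREAM);
    }

    /** Call after each seek so that the next keyframe is queued again. */
    void reset() {
        keyframeQueued = false;
    }
}
```

Because end-of-stream has been signaled, resuming normal playback from this state would require resetting and decoding the same keyframe again, which is the downside mentioned above.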

@stevemayhew I'm aware this is quite an old issue by now, but does that match your understanding? And did the initial attempts you were referring to above show that this helps to get the first frame faster? In theory, if the codec just outputs the first keyframe as soon as it's ready, this change wouldn't even make a difference I guess (besides the unnecessary decoding work).

@stevemayhew
Contributor

@tonihei I think so, sorry I've been busy with Widevine / MultiDRM issues so I've been neglecting the work to finish off "trick-play".

The AmLogic codecs do this well (that includes Chromecast with Google TV) but the BRCM SoCs do not; most force the use of tunneling mode to even work at higher frame rates. I don't think we are going to get fixes to these codecs, and tracking it as an ExoPlayer issue makes even less sense, so I'd close this as not supported for BRCM.

What remains on my plate to make this all work is (in priority order):

  1. Port over these pull requests to Media3:
  2. Add a setting to the player API to turn on trick-play seek mode (which needs to pass down into the AdaptiveTrackSelection so we can selectively use or ignore the iFrame only tracks)
  3. Small optimization to nearest SEEK logic to consider buffered samples before doing the adjustment (key frame in the buffered samples is better than flushing the SampleQueue and reading)

If you can assign an RFE to me in the Media3 repo I can start carving out time to work on it; it would also give us a good place to debate the design ;-)

@tonihei
Collaborator

tonihei commented Feb 8, 2024

Continuing with the existing PRs (in media3) sounds like a good idea to get them merged. The 3rd bullet (prefer seeking within the sample queue over reloading) also sounds useful as a PR.

The more general trick-play support (point 2) is less straightforward, as it needs more design discussion about what the overall integration looks like. So it might be better to wait until we have a full design in place (which we may be able to share externally to accept PRs).

And I'll re-word this issue to track the specific feature request for the single-frame mode.

@tonihei tonihei changed the title from "Speed up onRenderedFirstFrame getting on paused Seek." to "Support renderer mode that only renders the first frame (for trick play)" Feb 8, 2024
@stevemayhew
Contributor

stevemayhew commented Feb 8, 2024 via email

@tonihei
Collaborator

tonihei commented Feb 8, 2024

Oh, yes, please feel free to share what you have! That's actually quite useful and a good start for a conversation. I was just saying it's much harder to merge pull requests for a larger feature without knowing the bigger picture first (compared to these other changes that all have a clear benefit in isolation). But just to note, we are not very likely to look into this in detail until later this year.
