Native video decoder via `ffmpeg` CLI #7607

emilk · 2024-10-07T06:41:21Z

Part of More native Video support #7298

Send samples to ffmpeg over CLI, read back the results, show the user.

A basic demo of this is in this internal repository: https://github.com/rerun-io/video-experiments

The main approach is fairly simple:

// Assumes the `ffmpeg-sidecar` crate, which provides these types.
use ffmpeg_sidecar::command::FfmpegCommand;
use ffmpeg_sidecar::event::{FfmpegEvent, LogLevel};

let mut ffmpeg = FfmpegCommand::new()
    .hide_banner()
    // Keep in mind that all arguments that concern the input must come before `.input(...)`!
    .format("h264")
    .input("-")
    //.args(&["-c", "copy", "out.mp4"]) // For testing.
    .rawvideo() // Output rgb24 on stdout. (TODO for later: any format we can read directly in re_renderer would be better!)
    .spawn()
    .expect("failed to spawn ffmpeg");

let mut stdin = ffmpeg.take_stdin().unwrap();
std::thread::spawn(move || {
    // Wait for samples to arrive,
    // then write them to ffmpeg's stdin.
    // See `write_sample_to_nalu_stream`.
});

// On the main thread, run the output instance to completion.
ffmpeg.iter().unwrap().for_each(|e| match e {
    FfmpegEvent::Log(LogLevel::Error, e) => println!("Error: {}", e),
    FfmpegEvent::Progress(p) => println!("Progress: {} / 00:00:15", p.time),
    FfmpegEvent::OutputFrame(frame) => println!(
        "Received frame: time {:?} fmt {:?} size {}x{}",
        frame.timestamp, frame.pix_fmt, frame.width, frame.height
    ),
    evt => println!("Event: {evt:?}"),
});
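
The writer thread is only a comment in the snippet above. A minimal sketch of one way to fill it in, assuming samples reach the thread as already-converted Annex B byte chunks over a `std::sync::mpsc` channel. `spawn_sample_writer` is a hypothetical helper name, not something from the demo repository, and the sink is generic so that either ffmpeg's stdin or an in-memory buffer can be plugged in:

```rust
use std::io::Write;
use std::sync::mpsc::{channel, Sender};
use std::thread::JoinHandle;

/// Hypothetical helper: spawns a thread that forwards byte chunks from a channel to `sink`.
///
/// Returns the sending half of the channel and a join handle.
/// The thread exits once every `Sender` has been dropped, and hands the sink back on success.
fn spawn_sample_writer<W: Write + Send + 'static>(
    mut sink: W,
) -> (Sender<Vec<u8>>, JoinHandle<std::io::Result<W>>) {
    let (tx, rx) = channel::<Vec<u8>>();
    let handle = std::thread::spawn(move || -> std::io::Result<W> {
        // `recv` blocks until a chunk arrives and returns `Err` once all senders are gone.
        while let Ok(chunk) = rx.recv() {
            sink.write_all(&chunk)?;
        }
        sink.flush()?;
        Ok(sink)
    });
    (tx, handle)
}
```

With the snippet above, the sink would be the value returned by `ffmpeg.take_stdin().unwrap()`. Dropping the sender ends the thread, and dropping the returned sink closes the pipe, which is what tells ffmpeg that the input has ended.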

The difficult part is that we need to provide ffmpeg with a format that it can stream in on stdin. In this snippet this is done with `format("h264")`, so ffmpeg expects the `.h264` format, which is a stream of NAL units in Annex B format (that is, the NAL length prefixes are replaced with start codes). Also, at every IDR frame, sequence parameter sets (SPS) and picture parameter sets (PPS) need to be inserted:

fn write_sample_to_nalu_stream(
    avc_box: &re_mp4::Avc1Box,
    nalu_stream: &mut dyn std::io::Write,
    sample: &re_mp4::Sample,
    video_track_data: &[u8],
    state: &mut NaluStreamState,
) -> Result<(), Box<dyn std::error::Error>> {
    let avcc = &avc_box.avcc;

    // Append SPS & PPS NAL units whenever encountering an IDR frame, unless the previous frame was an IDR frame.
    // TODO(andreas): Should we detect this from the NALU stream rather than from the samples?
    if sample.is_sync && !state.previous_frame_was_idr {
        for sps in avcc.sequence_parameter_sets.iter() {
            nalu_stream.write_all(&NAL_START_CODE)?;
            nalu_stream.write_all(&sps.bytes)?;
        }
        for pps in avcc.picture_parameter_sets.iter() {
            nalu_stream.write_all(&NAL_START_CODE)?;
            nalu_stream.write_all(&pps.bytes)?;
        }
    }
    // Track whether this sample was an IDR frame, so that a run of consecutive
    // IDR frames emits the parameter sets only once.
    state.previous_frame_was_idr = sample.is_sync;

    // A single sample may consist of multiple NAL units, each of which needs our special treatment.
    // (Most of the time it's 1:1, but there might be extra NAL units for info, especially at the start.)
    let mut buffer_offset = sample.offset as usize;
    let sample_end = buffer_offset + sample.size as usize;
    while buffer_offset < sample_end {
        // Each NAL unit in mp4 is prefixed with a length prefix.
        // In Annex B this doesn't exist.
        let length_prefix_size = avcc.length_size_minus_one as usize + 1;

        // TODO: improve the error handling here.
        let nal_unit_size = match length_prefix_size {
            4 => u32::from_be_bytes(
                video_track_data[buffer_offset..(buffer_offset + 4)]
                    .try_into()
                    .unwrap(),
            ) as usize,
            2 => u16::from_be_bytes(
                video_track_data[buffer_offset..(buffer_offset + 2)]
                    .try_into()
                    .unwrap(),
            ) as usize,
            1 => video_track_data[buffer_offset] as usize,
            _ => panic!("invalid length prefix size"),
        };
        //println!("nal unit size: {}", nal_unit_size);

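        // Check against the end of the sample, so that a later NAL unit in the
        // same sample cannot run past it either.
        if sample_end < buffer_offset + length_prefix_size + nal_unit_size {
            panic!(
                "nal unit of size {nal_unit_size} extends past the end of the sample (sample size {})",
                sample.size
            );
        }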

        nalu_stream.write_all(&NAL_START_CODE)?;
        let data_start = buffer_offset + length_prefix_size; // Skip the size.
        let data_end = buffer_offset + nal_unit_size + length_prefix_size;
        let data = &video_track_data[data_start..data_end];

        // Note that we don't have to insert "emulation prevention bytes", since NAL units in mp4 already contain them.
        // (Unlike the NAL start code, the prevention bytes are part of the NAL spec!)

        nalu_stream.write_all(data)?;

        buffer_offset = data_end;
    }

    Ok(())
}
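
The function above uses `NAL_START_CODE` and `NaluStreamState` without showing their definitions. The sketch below gives plausible definitions. The struct shape is an assumption based on the one field used above. It also includes two hypothetical helpers: `h264_nal_unit_type`, which reads the NAL type from the header byte (relevant to the TODO about detecting IDR frames from the NALU stream), and `avcc_sample_to_annex_b`, a panic-free variant of the length-prefix loop that works on a single sample's bytes:

```rust
/// Annex B start code written before every NAL unit.
/// (The three-byte form 00 00 01 is also valid; the four-byte form is used here.)
const NAL_START_CODE: [u8; 4] = [0x00, 0x00, 0x00, 0x01];

/// State carried across samples. Assumed shape: only this field is used above.
#[derive(Default)]
struct NaluStreamState {
    previous_frame_was_idr: bool,
}

// H.264 `nal_unit_type` values relevant here.
const NAL_TYPE_IDR: u8 = 5;
const NAL_TYPE_SPS: u8 = 7;
const NAL_TYPE_PPS: u8 = 8;

/// Hypothetical helper: the H.264 NAL unit type is the low five bits of the first header byte.
fn h264_nal_unit_type(nal: &[u8]) -> Option<u8> {
    nal.first().map(|b| b & 0x1F)
}

/// Hypothetical helper: rewrites one length-prefixed (mp4-style) sample into Annex B.
/// Returns `None` instead of panicking if the prefix size is unsupported or a NAL unit
/// would run past the end of the sample.
fn avcc_sample_to_annex_b(sample: &[u8], length_prefix_size: usize) -> Option<Vec<u8>> {
    if !matches!(length_prefix_size, 1 | 2 | 4) {
        return None;
    }
    let mut out = Vec::with_capacity(sample.len() + NAL_START_CODE.len());
    let mut offset = 0;
    while offset < sample.len() {
        // Big-endian length prefix of 1, 2, or 4 bytes.
        let prefix = sample.get(offset..offset + length_prefix_size)?;
        let nal_len = prefix.iter().fold(0usize, |acc, &b| (acc << 8) | b as usize);
        let start = offset + length_prefix_size;
        let end = start.checked_add(nal_len)?;
        let nal = sample.get(start..end)?;
        // Replace the length prefix with a start code; the payload is copied unchanged.
        out.extend_from_slice(&NAL_START_CODE);
        out.extend_from_slice(nal);
        offset = end;
    }
    Some(out)
}
```

For H.265 the header layout differs (the NAL unit type is six bits in a two-byte header), so `h264_nal_unit_type` would not carry over unchanged.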

Open questions:

Does this work with H.265 as well? It uses the same formats overall.
What about other codecs?
Does this approach scale well with other formats?
ffmpeg allows arbitrary output formats. Can we be clever about what to pick? E.g. if the decoder internally gives us YUV420 (which is the common case, but not always!), we should pass YUV420 on instead of rgb24 and process it directly.
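
To put a number on the last question, here is a rough sketch comparing the per-frame byte counts of the two raw output formats. It assumes ffmpeg writes `yuv420p` rawvideo as three tightly packed planes (Y, then U, then V) with chroma dimensions rounded up. The `PixFmt` type and its methods are hypothetical names for illustration; the arguments returned by `output_args` are standard ffmpeg CLI flags:

```rust
/// Pixel formats considered in this sketch (names follow ffmpeg's `-pix_fmt` values).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PixFmt {
    Rgb24,
    Yuv420p,
}

impl PixFmt {
    fn ffmpeg_name(self) -> &'static str {
        match self {
            PixFmt::Rgb24 => "rgb24",
            PixFmt::Yuv420p => "yuv420p",
        }
    }

    /// ffmpeg CLI arguments for raw frames of this format on stdout.
    fn output_args(self) -> [&'static str; 5] {
        ["-f", "rawvideo", "-pix_fmt", self.ffmpeg_name(), "-"]
    }

    /// Bytes per frame, assuming tightly packed data without row padding.
    fn frame_size_bytes(self, width: usize, height: usize) -> usize {
        match self {
            // Three bytes per pixel.
            PixFmt::Rgb24 => width * height * 3,
            // Full-resolution luma plane plus two chroma planes at half resolution
            // in each dimension (rounded up for odd sizes).
            PixFmt::Yuv420p => {
                let chroma_w = (width + 1) / 2;
                let chroma_h = (height + 1) / 2;
                width * height + 2 * chroma_w * chroma_h
            }
        }
    }
}

/// Splits one tightly packed yuv420p frame into its Y, U, and V planes.
/// Returns `None` if the buffer length does not match the given dimensions.
fn split_yuv420p(frame: &[u8], width: usize, height: usize) -> Option<(&[u8], &[u8], &[u8])> {
    if frame.len() != PixFmt::Yuv420p.frame_size_bytes(width, height) {
        return None;
    }
    let luma_len = width * height;
    let chroma_len = ((width + 1) / 2) * ((height + 1) / 2);
    let (y, rest) = frame.split_at(luma_len);
    let (u, v) = rest.split_at(chroma_len);
    Some((y, u, v))
}
```

For a 1920x1080 frame this gives 6,220,800 bytes for rgb24 and 3,110,400 bytes for yuv420p, so passing YUV420 through would halve the data read from stdout, at the cost of doing the color conversion on our side.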

