Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add luma & chroma bit depths to Format #491

Merged
merged 15 commits into from
Oct 17, 2023

Conversation

dsparano
Copy link
Contributor

Issue
Videos may be encoded with different bit depths (even between luma and chroma), not just the traditional 8 bpp.

Currently the Format class carries Color metadata, but not bit depths. The capability to decode wide bit depth videos is down to the codec profile support from the device but not always the video profile is associated with a unique bit depth (for ex. not for H266 where Main 10 is for both 8 and 10 bpp). Besides working out the bitdepth from the codec specific profile strings would be cumbersome.

We think it would be useful for e.g. renderers, transformers etc to know the video bit depth. We had this issue internally while working on use cases such as 8bpp base and 10bpp LCEVC enhanced output, but we think it could very well benefit other scenarios.

Proposed solution
Add luma and chroma bit depths to the Format class, set them from the various parsers (see changes).

@microkatz microkatz self-assigned this Jul 3, 2023
@dsparano
Copy link
Contributor Author

dsparano commented Jul 7, 2023

Hi Michael, please let me know if you need any additional info from me for this PR, thanks.

@dsparano
Copy link
Contributor Author

Hi @microkatz any progress on this?

@microkatz
Copy link
Contributor

microkatz commented Jul 19, 2023

@dsparano

Thank you for your patience. I'm still reviewing the CL. I understand that you would like to have this data through H264 and H265 streams. In that regard, would you update the processH264FmtpAttribute and processH265FmtpAttribute methods in RtspMediaTrack.java to include your luma and chroma data in the Format building?

@dsparano
Copy link
Contributor Author

Thanks Michael for reviewing this. I've added setting bit depths for H264 and H265 in RtspMediaTrack.java, I've also rebased against today's tip of main, just in case.

@@ -1237,6 +1247,22 @@ private static void parseVideoSampleEntry(
} else if (childAtomType == Atom.TYPE_av1C) {
ExtractorUtil.checkContainerInput(mimeType == null, /* message= */ null);
mimeType = MimeTypes.VIDEO_AV1;
parent.setPosition(childStartPosition + Atom.HEADER_SIZE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason that you move the parser before checking that you are going to read into it at all? You might as well check the condition before setting the position.

Copy link
Contributor Author

@dsparano dsparano Jul 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, the setPosition could go inside the conditional block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conditional block removed, anyway (see other comment).

int byte2 = parent.readUnsignedByte();
int seqProfile = byte2 >> 5;
int byte3 = parent.readUnsignedByte();
boolean highBitdepth = ((byte3 >> 6) & 0b1) != 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The twelveBit boolean condition is only needed if (seqProfile == 2 && highBitdepth). If you could move this assignment into that segment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AV1 spec is mentioned at line 1258 to describe how to set bitdepthLuma from seqProfile and highBitdepth, however these two are read from AV1CodecConfigurationRecord (from https://aomediacodec.github.io/av1-isobmff/#av1codecconfigurationbox-syntax) which looks like this:

aligned (8) class AV1CodecConfigurationRecord {
  unsigned int (1) marker = 1;
  unsigned int (7) version = 1;
  unsigned int (3) seq_profile;
  unsigned int (5) seq_level_idx_0;
  unsigned int (1) seq_tier_0;
  unsigned int (1) high_bitdepth;
  unsigned int (1) twelve_bit;
  unsigned int (1) monochrome;
  unsigned int (1) chroma_subsampling_x;
  unsigned int (1) chroma_subsampling_y;
  unsigned int (2) chroma_sample_position;
  unsigned int (3) reserved = 0;

  unsigned int (1) initial_presentation_delay_present;
  if (initial_presentation_delay_present) {
    unsigned int (4) initial_presentation_delay_minus_one;
  } else {
    unsigned int (4) reserved = 0;
  }

  unsigned int (8) configOBUs[];
}

so I think the added lines here are correct. I could still add the link to the AV1CodecConfigurationRecord spec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, misunderstood your comment. Yes, twelveBit can be set before is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -1237,6 +1247,22 @@ private static void parseVideoSampleEntry(
} else if (childAtomType == Atom.TYPE_av1C) {
ExtractorUtil.checkContainerInput(mimeType == null, /* message= */ null);
mimeType = MimeTypes.VIDEO_AV1;
parent.setPosition(childStartPosition + Atom.HEADER_SIZE);
if (childAtomSize > Atom.HEADER_SIZE) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason behind using this as the condition to see if an AV1CodecConfigurationBox is included?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comes from a fix we made in our fork, where we found encodes that had missing HEVCDecoderConfigurationRecord so we added that check for all such atoms. However given that the specs, both ISO 14496-15 and the AV1 one, say "sample entry shall contain an ...Configuration Box" it seems such encodes would not be compliant.
Long story short I think you're right, I should remove the check at line 1251.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

@@ -160,6 +162,8 @@ public static final class Builder {
@Nullable private byte[] projectionData;
private @C.StereoMode int stereoMode;
@Nullable private ColorInfo colorInfo;
private int lumaBitdepth;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please instead add this bit depth info to the ColorInfo class?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion on where this value should go, but I figure both the luma and chroma bitDepth should be in the same class (either ColorInfo or Format)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my apology, the comment was to move both into ColorInfo. Could not attach the comment to multiple lines.

Copy link
Contributor Author

@dsparano dsparano Jul 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh it feels slightly wrong to me seeing luma bitdepth in ColorInfo, and since luma & chroma bitdepth are logically in pair I would think they should go in Format, however I'm happy to move them to ColorInfo if you guys prefer so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@microkatz final decision ColorInfo or Format?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you are adding the luma and chroma information to vp9 in the mp4 AtomParsers, if you could please at the parsing logic for vp9 for rtsp in this file as well. It will require parsing the fmtp parameters if they exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, looking into this...

Copy link
Contributor Author

@dsparano dsparano Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm correct from https://www.ietf.org/archive/id/draft-ietf-payload-vp9-16.txt the RTP payload for VP9 is not final yet. In the current draft the fmtp parameters for VP9 seem only optional: "max-fr", "max-fs" and "profile-id". None of them allows setting the bit depths. Incidentally I've played with mediamtx and I see it doesn't add fmtp at all for VP9.
Anyway, I can add a processVP9FmtpAttribute() method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, in more detail: I could add it, but it seems to me we couldn't get anything useful out of it, the only possibly relevant bit is the profile-id however for values 0 and 1 the bit depth is 8, which is already the default in Format.Builder, while for values of 2 and 3 the bit depth can be either 10 or 12, so we couldn't set any value. So @microkatz can we skip this?

@dway123
Copy link
Contributor

dway123 commented Jul 21, 2023

Hello @dsparano , would the Format.ColorInfo's ColorTransfer and ColorSpace values work to identify the bit depth?

My understanding is that 10-bit color is typically used with HDR, whereas 8-bit is typically used with SDR. Therefore, you could identify HDR-ness or 10-bit usage by checking whether the color transfer is COLOR_TRANSFER_ST2084 || COLOR_TRANSER_HLG, or ColorTransfer.isTransferHdr.

ColorSpace may also be able to identify the bit depth, but I'm less certain on that.

@dsparano
Copy link
Contributor Author

Hi @dway123, I think we cannot deny there is a correlation between colour metadata and bit depth, however an SDR content can still be on 10 bit or even more, and also HDR content can be encoded with 10, 12 or more (even 8, depending on how you define "HDR"). So I don't think the transfer characteristics itself, e.g. PQ or HLG, or color primaries should be used to assign a bit depth, not to mention that both H264 and HEVC allow for chroma to have a different bit depth than luma.

@dway123
Copy link
Contributor

dway123 commented Jul 24, 2023

Thanks @dsparano. Yeah, I'm aware that 8-bit HDR content can exist (but admittedly just haven't seen it generated myself yet). Have you seen HDR content with a bit depth other than 10-bit?

I wasn't sure if bit-depth might be a proxy for HDR-ness here, but it seems this isn't the case. I don't think this is duplicating already-signalled info then.

I'm also unsure whether this benefits ExoPlayer/Transformer yet, as I haven't seen a use-case myself (the MediaFormat used to configure MediaCodec doesn't use this), but I'm happy to defer to @microkatz on that.

@dsparano
Copy link
Contributor Author

Hi @microkatz sorry to be nagging, if could kindly reply to the two outstanding questions (all details in the relevant comments) I'll then leave you in peace 😉

  1. ColorInfo or Format?
  2. Skip processVP9FmtpAttribute()?

@microkatz
Copy link
Contributor

microkatz commented Jul 26, 2023

@dsparano
No apology needed. We really appreciate your contribution to the library. I still need to review your PR but I have not forgotten!

As far as answers to your question.

  1. ColorInfo please. I understand your comment about luma not being colorinfo but I think that chromaBitDepth does fit in that class and they should be together. Perhaps if more data is added to that class then it can be renamed in the future as it is an @unstable annotated class.
  2. I read your comments about the fmtp parameters. You can skip the processVP9FmtpAttribute()

@dsparano
Copy link
Contributor Author

Cool, no problem. I've moved the bit depths to ColorInfo, which opens the question of how to consider it "valid", atm I've tried to keep existing behaviour and at the same time allow for bit depths to be still reported in the debug text in case they are available and the colour metadata are not. Please check if you're happy with it.

@microkatz
Copy link
Contributor

Hello @dsparano, I was working on setting this PR up for internal review and merging. Unfortunately it seems that this commit is being made from a topic branch off of a private organization-owned fork? Would it be possible to open a new PR from an individual-owned fork? We can't push changes to organization-owned forks like this one. If that's not possible then we can still merge this PR but it will result in an 'evil' merge. See more info here: https://github.com/androidx/media/blob/release/CONTRIBUTING.md#push-access-to-pr-branches

@dsparano
Copy link
Contributor Author

@microkatz It's been quiet for a while, is anything else needed for this to go in?

@microkatz
Copy link
Contributor

I'm going to send this for internal review now. You may see some more commits being added as I make changes in response to review feedback. Please refrain from pushing any more substantive changes as it will complicate the internal review - thanks!

@dsparano
Copy link
Contributor Author

@microkatz has the internal review completed?

@microkatz
Copy link
Contributor

@dsparano. Not just yet. Much of team is currently at a summit and will return next week. We are still working on it. Thank you for your patience!

@dsparano
Copy link
Contributor Author

@microkatz Hi there, is anything else needed from our side on this PR?

@microkatz
Copy link
Contributor

microkatz commented Oct 12, 2023

I'm still working on it. Your code change breaks quite a number of dependencies using the ColorInfo class. It has required a couple of other linked changes like c12fb67. I appreciate your patience.

@microkatz microkatz force-pushed the dsparano-exo128 branch 5 times, most recently from 1e81dcc to d0b84da Compare October 13, 2023 11:28
@copybara-service copybara-service bot merged commit 009d48a into androidx:main Oct 17, 2023
1 check passed
rohitjoins pushed a commit that referenced this pull request Oct 23, 2023
PiperOrigin-RevId: 574129451
(cherry picked from commit 009d48a)
@androidx androidx locked and limited conversation to collaborators Dec 17, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants