-
Notifications
You must be signed in to change notification settings - Fork 409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add luma & chroma bit depths to Format #491
Conversation
Hi Michael, please let me know if you need any additional info from me for this PR, thanks. |
Hi @microkatz any progress on this? |
Thank you for your patience. I'm still reviewing the CL. I understand that you would like to have this data through H264 and H265 streams. In that regard, would you update the processH264FmtpAttribute and processH265FmtpAttribute methods in RtspMediaTrack.java to include your luma and chroma data in the Format building? |
1175b47
to
5178488
Compare
Thanks Michael for reviewing this. I've added setting bit depths for H264 and H265 in RtspMediaTrack.java, I've also rebased against today's tip of main, just in case. |
@@ -1237,6 +1247,22 @@ private static void parseVideoSampleEntry( | |||
} else if (childAtomType == Atom.TYPE_av1C) { | |||
ExtractorUtil.checkContainerInput(mimeType == null, /* message= */ null); | |||
mimeType = MimeTypes.VIDEO_AV1; | |||
parent.setPosition(childStartPosition + Atom.HEADER_SIZE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason that you move the parser before checking that you are going to read into it at all? You might as well check the condition before setting the position.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, the setPosition could go inside the conditional block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conditional block removed, anyway (see other comment).
int byte2 = parent.readUnsignedByte(); | ||
int seqProfile = byte2 >> 5; | ||
int byte3 = parent.readUnsignedByte(); | ||
boolean highBitdepth = ((byte3 >> 6) & 0b1) != 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The twelveBit boolean condition is only needed if (seqProfile == 2 && highBitdepth). If you could move this assignment into that segment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The AV1 spec is mentioned at line 1258 to describe how to set bitdepthLuma
from seqProfile
and highBitdepth
, however these two are read from AV1CodecConfigurationRecord
(from https://aomediacodec.github.io/av1-isobmff/#av1codecconfigurationbox-syntax) which looks like this:
aligned (8) class AV1CodecConfigurationRecord {
unsigned int (1) marker = 1;
unsigned int (7) version = 1;
unsigned int (3) seq_profile;
unsigned int (5) seq_level_idx_0;
unsigned int (1) seq_tier_0;
unsigned int (1) high_bitdepth;
unsigned int (1) twelve_bit;
unsigned int (1) monochrome;
unsigned int (1) chroma_subsampling_x;
unsigned int (1) chroma_subsampling_y;
unsigned int (2) chroma_sample_position;
unsigned int (3) reserved = 0;
unsigned int (1) initial_presentation_delay_present;
if (initial_presentation_delay_present) {
unsigned int (4) initial_presentation_delay_minus_one;
} else {
unsigned int (4) reserved = 0;
}
unsigned int (8) configOBUs[];
}
so I think the added lines here are correct. I could still add the link to the AV1CodecConfigurationRecord spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, misunderstood your comment. Yes, twelveBit can be set before is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -1237,6 +1247,22 @@ private static void parseVideoSampleEntry( | |||
} else if (childAtomType == Atom.TYPE_av1C) { | |||
ExtractorUtil.checkContainerInput(mimeType == null, /* message= */ null); | |||
mimeType = MimeTypes.VIDEO_AV1; | |||
parent.setPosition(childStartPosition + Atom.HEADER_SIZE); | |||
if (childAtomSize > Atom.HEADER_SIZE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason behind using this as the condition to see if an AV1CodecConfigurationBox is included?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comes from a fix we made in our fork, where we found encodes that had missing HEVCDecoderConfigurationRecord so we added that check for all such atoms. However given that the specs, both ISO 14496-15 and the AV1 one, say "sample entry shall contain an ...Configuration Box" it seems such encodes would not be compliant.
Long story short I think you're right, I should remove the check at line 1251.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
@@ -160,6 +162,8 @@ public static final class Builder { | |||
@Nullable private byte[] projectionData; | |||
private @C.StereoMode int stereoMode; | |||
@Nullable private ColorInfo colorInfo; | |||
private int lumaBitdepth; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please instead add this bit depth info to the ColorInfo class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion on where this value should go, but I figure both the luma and chroma bitDepth should be in the same class (either ColorInfo or Format)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, my apology, the comment was to move both into ColorInfo. Could not attach the comment to multiple lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tbh it feels slightly wrong to me seeing luma bitdepth in ColorInfo, and since luma & chroma bitdepth are logically in pair I would think they should go in Format, however I'm happy to move them to ColorInfo if you guys prefer so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@microkatz final decision ColorInfo or Format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you are adding the luma and chroma information to vp9 in the mp4 AtomParsers, if you could please at the parsing logic for vp9 for rtsp in this file as well. It will require parsing the fmtp parameters if they exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, looking into this...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm correct from https://www.ietf.org/archive/id/draft-ietf-payload-vp9-16.txt the RTP payload for VP9 is not final yet. In the current draft the fmtp parameters for VP9 seem only optional: "max-fr", "max-fs" and "profile-id". None of them allows setting the bit depths. Incidentally I've played with mediamtx and I see it doesn't add fmtp at all for VP9.
Anyway, I can add a processVP9FmtpAttribute()
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, in more detail: I could add it, but it seems to me we couldn't get anything useful out of it, the only possibly relevant bit is the profile-id however for values 0 and 1 the bit depth is 8, which is already the default in Format.Builder, while for values of 2 and 3 the bit depth can be either 10 or 12, so we couldn't set any value. So @microkatz can we skip this?
Hello @dsparano , would the Format.ColorInfo's ColorTransfer and ColorSpace values work to identify the bit depth? My understanding is that 10-bit color is typically used with HDR, whereas 8-bit is typically used with SDR. Therefore, you could identify HDR-ness or 10-bit usage by checking whether the color transfer is COLOR_TRANSFER_ST2084 || COLOR_TRANSER_HLG, or ColorTransfer.isTransferHdr. ColorSpace may also be able to identify the bit depth, but I'm less certain on that. |
Hi @dway123, I think we cannot deny there is a correlation between colour metadata and bit depth, however an SDR content can still be on 10 bit or even more, and also HDR content can be encoded with 10, 12 or more (even 8, depending on how you define "HDR"). So I don't think the transfer characteristics itself, e.g. PQ or HLG, or color primaries should be used to assign a bit depth, not to mention that both H264 and HEVC allow for chroma to have a different bit depth than luma. |
Thanks @dsparano. Yeah, I'm aware that 8-bit HDR content can exist (but admittedly just haven't seen it generated myself yet). Have you seen HDR content with a bit depth other than 10-bit? I wasn't sure if bit-depth might be a proxy for HDR-ness here, but it seems this isn't the case. I don't think this is duplicating already-signalled info then. I'm also unsure whether this benefits ExoPlayer/Transformer yet, as I haven't seen a use-case myself (the |
Hi @microkatz sorry to be nagging, if could kindly reply to the two outstanding questions (all details in the relevant comments) I'll then leave you in peace 😉
|
@dsparano As far as answers to your question.
|
Cool, no problem. I've moved the bit depths to ColorInfo, which opens the question of how to consider it "valid", atm I've tried to keep existing behaviour and at the same time allow for bit depths to be still reported in the debug text in case they are available and the colour metadata are not. Please check if you're happy with it. |
Hello @dsparano, I was working on setting this PR up for internal review and merging. Unfortunately it seems that this commit is being made from a topic branch off of a private organization-owned fork? Would it be possible to open a new PR from an individual-owned fork? We can't push changes to organization-owned forks like this one. If that's not possible then we can still merge this PR but it will result in an 'evil' merge. See more info here: https://github.com/androidx/media/blob/release/CONTRIBUTING.md#push-access-to-pr-branches |
libraries/common/src/main/java/androidx/media3/common/ColorInfo.java
Outdated
Show resolved
Hide resolved
libraries/common/src/main/java/androidx/media3/common/ColorInfo.java
Outdated
Show resolved
Hide resolved
@microkatz It's been quiet for a while, is anything else needed for this to go in? |
0908b72
to
e42635b
Compare
I'm going to send this for internal review now. You may see some more commits being added as I make changes in response to review feedback. Please refrain from pushing any more substantive changes as it will complicate the internal review - thanks! |
@microkatz has the internal review completed? |
@dsparano. Not just yet. Much of team is currently at a summit and will return next week. We are still working on it. Thank you for your patience! |
69a8b07
to
cd4c5b4
Compare
@microkatz Hi there, is anything else needed from our side on this PR? |
I'm still working on it. Your code change breaks quite a number of dependencies using the ColorInfo class. It has required a couple of other linked changes like c12fb67. I appreciate your patience. |
1e81dcc
to
d0b84da
Compare
…ntainers and parameter sets
d0b84da
to
7162407
Compare
PiperOrigin-RevId: 574129451 (cherry picked from commit 009d48a)
Issue
Videos may be encoded with different bit depths (even between luma and chroma), not just the traditional 8 bpp.
Currently the Format class carries Color metadata, but not bit depths. The capability to decode wide bit depth videos is down to the codec profile support from the device but not always the video profile is associated with a unique bit depth (for ex. not for H266 where Main 10 is for both 8 and 10 bpp). Besides working out the bitdepth from the codec specific profile strings would be cumbersome.
We think it would be useful for e.g. renderers, transformers etc to know the video bit depth. We had this issue internally while working on use cases such as 8bpp base and 10bpp LCEVC enhanced output, but we think it could very well benefit other scenarios.
Proposed solution
Add luma and chroma bit depths to the Format class, set them from the various parsers (see changes).