
Implement AudioRecord support for audio track #229

Merged
merged 1 commit on Dec 21, 2022

Conversation

IanDBird
Contributor

I'm hoping to introduce camera support as an input into LiTr. This would allow scenarios where the raw audio/video is not sourced from a static file, but instead comes from a "live" source. I am planning the following separate pull requests:

  • Audio Record Support: Ability to pull raw PCM from an audio source (e.g. microphone) to mux with video from another source
  • External Surface Support: Ability to expose the raw surface that is used by the filter pipeline, allowing some other component (e.g. Camera2, CameraX, 3rd party library) to render directly into this. This will essentially bypass the Demux/Decode steps of the pipeline.
  • Camera2/CameraX Support: Helper classes/source for most common camera source scenarios.

This first PR covers Audio Record Support. It introduces a new AudioRecordMediaSource class, which is responsible for taking raw PCM from AudioRecord, computing the appropriate PTS, and feeding it into the rest of the pipeline via the MediaSource interface. For now, the demo app simply muxes this together with a blank/red video stream. In future updates, this will be combined with a Camera2 integration that merges recorded video and audio into the same file.
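Since AudioRecord delivers raw PCM with no timestamps, the source has to derive each buffer's PTS from how many bytes it has read so far. A minimal sketch of that bookkeeping, with illustrative names and assuming 16-bit PCM (this is not the actual AudioRecordMediaSource code):

```kotlin
// Hypothetical sketch of PTS bookkeeping for raw PCM, not the actual PR code.
// 16-bit PCM carries 2 bytes per sample, so bytes-per-second is
// sampleRate * channelCount * 2; the PTS is total bytes scaled into microseconds.
const val BYTES_PER_SAMPLE = 2

fun presentationTimeUs(totalBytesRead: Long, sampleRate: Int, channelCount: Int): Long {
    val bytesPerSecond = sampleRate.toLong() * channelCount * BYTES_PER_SAMPLE
    return totalBytesRead * 1_000_000L / bytesPerSecond
}
```

For example, 88,200 bytes of 44.1 kHz mono 16-bit audio maps to exactly one second (1,000,000 µs).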

// This demo fragment requires Android M or newer, in order to support reading data from
// AudioRecord in a non-blocking way. Let's double check that the current device supports
// this.
if (Build.VERSION.SDK_INT < Build.VERSION_CODES.M) {
Contributor Author

This requirement could potentially be removed in the future, but that would require additional changes in AudioRecordMediaSource. We would need something like a ring buffer, with a separate thread reading via the older, blocking API and updating the buffer; we would then read from this buffer in a non-blocking way. The downside is obviously additional complexity along with increased memory usage. For now, I thought requiring Android M or newer was the better option.
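For illustration only, the ring-buffer fallback described above could look roughly like this generic sketch (pure Kotlin; the producer thread calling AudioRecord's blocking read() is left out, and all names are hypothetical):

```kotlin
// Hypothetical sketch of the pre-M fallback: a producer thread would call the
// blocking AudioRecord.read() and write bytes in, while the pipeline drains
// whatever is available without blocking. Oldest data is dropped when full.
class PcmBuffer(private val capacity: Int) {
    private val deque = ArrayDeque<Byte>()

    @Synchronized
    fun write(bytes: ByteArray, length: Int) {
        for (i in 0 until length) {
            if (deque.size == capacity) deque.removeFirst() // overwrite oldest
            deque.addLast(bytes[i])
        }
    }

    // Non-blocking read: returns however many bytes are buffered, up to dest.size.
    @Synchronized
    fun read(dest: ByteArray): Int {
        val n = minOf(deque.size, dest.size)
        for (i in 0 until n) dest[i] = deque.removeFirst()
        return n
    }
}
```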

Contributor

No need for anything that complicated, just leave things as is. The demo app is supposed to be a space for easy experimentation and sample code, not a prod-ready app. What you are doing here is exactly what it is for.

// Create a single (synthetic) video track to ensure our output is playable by the demo
// app. We only use 1 second of video (a solid color) so the duration of the video will
// likely depend on the length of the audio recording.
VideoTrackFormat videoTrackFormat = new VideoTrackFormat(0, MimeType.VIDEO_AVC);
Contributor Author

Playing a video-less file in the demo app would have required additional changes, so I went with the simple approach of adding a short, synthetic video stream.

Contributor

That is totally fine.

// need to leave some buffer for the AudioRecord to continue to queue too. We also don't
// want to block waiting for more data, we'll take whatever is available.
val targetSize = (bufferSize / 2).coerceAtMost(buffer.capacity())
val readBytes = audioRecord?.read(buffer, targetSize, AudioRecord.READ_NON_BLOCKING) ?: -1
Contributor
To correctly satisfy the MediaSource contract we should use the offset value here and read into the buffer at that offset: https://developer.android.com/reference/android/media/MediaExtractor#readSampleData(java.nio.ByteBuffer,%20int)
This works because the offset is 0, but it might not always be. This can be done in a follow-up PR.

Contributor

Also, how is using AudioRecord.READ_NON_BLOCKING working? readSampleData is supposed to be a blocking call. 😅

Contributor Author

To correctly satisfy MediaSource contract we should use offset value here and read into buffer at that value

Nice catch. I could probably do something around slicing the buffer at the offset before passing in. Will also need to make sure the targetSize parameter takes into account the potentially non-zero start.
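The slicing idea could look something like this sketch, where a plain fill stands in for audioRecord.read() and the helper name is hypothetical:

```kotlin
import java.nio.ByteBuffer

// Hypothetical sketch of honoring the offset contract: position the buffer at
// `offset` and slice, so writes land at the right place and the target size is
// capped by the remaining capacity rather than the full capacity.
fun fillAtOffset(buffer: ByteBuffer, offset: Int, source: ByteArray): Int {
    buffer.position(offset)
    val target = buffer.slice() // shares storage, starting at `offset`
    val toCopy = minOf(source.size, target.remaining())
    target.put(source, 0, toCopy)
    return toCopy
}
```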

Also, how is using AudioRecord.READ_NON_BLOCKING working?

So, my understanding is that the two parameters mean the following:

  • AudioRecord.READ_NON_BLOCKING: Read as much of the existing recorded data as is available right now.
  • AudioRecord.READ_BLOCKING: Read the recorded data once "enough" is available, where "enough" is some internal target buffer size.

In this primitive scenario, where we have a synthetic video stream, or where you are stitching together with a previously recorded video, it won't really make a difference. However, when using another "live" media source (like the future external surface), we need to avoid blocking for "too long". Since it appears a shared thread is used to read all MediaSources, if one blocks while waiting internally for some buffer to be filled, it could cause the other live source to skip a frame.

Another option could be to introduce separate threading for media sources, but that seems overly complex compared to this.

Contributor

Thank you for the explanation, now this makes sense. So basically we are telling AudioRecord not to wait until it fills its internal buffer, and instead to fill ours with whatever it has at the moment, synchronously. This makes a lot of sense and should work: as long as size is not -1, AudioTrackTranscoder should keep going.

Contributor Author

Yup, exactly.

companion object {
private val TAG = AudioRecordMediaSource::class.simpleName

private const val MEDIA_FORMAT_PCM_KEY = "pcm-encoding"
Contributor

private vals don't really have to be inside a companion object

Contributor Author

Will do. I checked a few other files and noted that you move them before the class definition (e.g. AudioOverlayFilter). I'll copy that convention.

Contributor

Yup. Kotlin doesn't really like static/companion things. Better to use top-level const vals.
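The suggested layout, sketched with placeholder names (the real class has many more members):

```kotlin
// Sketch of the convention discussed above: private constants at file level
// rather than in a companion object. The class name here is illustrative.
private const val MEDIA_FORMAT_PCM_KEY = "pcm-encoding"
private const val BYTES_PER_SAMPLE = 2

class AudioRecordMediaSourceSketch {
    // Top-level private constants are visible anywhere in the same file.
    fun pcmKey(): String = MEDIA_FORMAT_PCM_KEY
}
```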

private const val AUDIO_FORMAT = AudioFormat.ENCODING_PCM_16BIT
private const val BYTES_PER_SAMPLE = 2

const val DEFAULT_AUDIO_SOURCE = MediaRecorder.AudioSource.MIC
Contributor

Can you make these public explicitly? They usually should be for libraries. (I should add that check.)

Contributor Author

I'm happy to add public, but Android Studio will complain (since that's the default visibility). I don't see any other public const val in the code base, but I have no objection to adding it if preferred.

Contributor

OK, then you can keep it as is. I will perhaps add library-style enforcement later.

class AudioRecordMediaSource(
private val audioSource: Int = DEFAULT_AUDIO_SOURCE,
private val sampleRate: Int = DEFAULT_SAMPLE_RATE,
private val channelConfig: Int = DEFAULT_CHANNEL_CONFIG,
Contributor

It feels like it is possible to easily mismatch channelConfig and channelCount, which would cause problems, I presume. Should we just accept channelConfig and derive channelCount from it?

Contributor Author

@izzytwosheds - What if we change it to force either mono or stereo (either an enum, or a boolean like isStereo)? That way, we can internally manage the config vs. count and only allow consumers to configure known/supported configurations.

Contributor Author

Something like this:

class AudioRecordMediaSource(
        private val audioSource: Int = DEFAULT_AUDIO_SOURCE,
        private val sampleRate: Int = DEFAULT_SAMPLE_RATE,
        private val isStereo: Boolean = false
) : MediaSource {

    private val channelCount: Int by lazy { 
        return@lazy if (isStereo) 2 else 1
    }
    
    private val channelConfig: Int by lazy {
        return@lazy if (isStereo) AudioFormat.CHANNEL_IN_STEREO else AudioFormat.CHANNEL_IN_MONO
    }

Contributor

In that case I would probably keep the parameter as an integer channelCount: it masks the internal AudioConfig implementation and is more future-proof (who knows, maybe someday Android devices will record 6-channel sound?)

Contributor Author

Sure, will tweak 👍
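Sketched out, the agreed shape exposes only channelCount and derives the config internally. Strings stand in for the AudioFormat constants so this snippet stays self-contained; the class name is hypothetical:

```kotlin
// Hypothetical sketch: channelCount is the public parameter and channelConfig
// is derived from it. Strings stand in for AudioFormat.CHANNEL_IN_MONO and
// AudioFormat.CHANNEL_IN_STEREO; unsupported counts fail fast at construction.
class ChannelConfigSketch(channelCount: Int = 1) {
    val channelConfig: String = when (channelCount) {
        1 -> "CHANNEL_IN_MONO"
        2 -> "CHANNEL_IN_STEREO"
        else -> throw IllegalArgumentException("Unsupported channel count: $channelCount")
    }
}
```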


private val sampleRate: Int = DEFAULT_SAMPLE_RATE,
private val isStereo: Boolean = false
) : MediaSource {
private val channelCount: Int by lazy {
Contributor

Not sure how much we are saving by using lazy here. There is no complex class initialization going on. I think we can remove all lazy initializations here and below.

Contributor Author

Fair enough, I think I just got carried away :)

Contributor

Been there, done that... many times... :)

@IanDBird IanDBird force-pushed the audio-record branch 3 times, most recently from b160e8c to d93e7e5 Compare December 20, 2022 19:02
}

companion object {
private const val REQUEST_AUDIO_RECORD_PERMISSION = 14
Contributor
nit: we can probably move this out of the companion object as well

}
}

private val channelConfig =
Contributor

we could probably do something like this here:

    private val channelConfig = when (channelCount) {
        1 -> AudioFormat.CHANNEL_IN_MONO
        2 -> AudioFormat.CHANNEL_IN_STEREO
        else -> throw IllegalArgumentException("Unsupported channel count configuration")
    }

This would get rid of init and make AudioFormat selection more explicit and readable.

if (channelCount == 1) AudioFormat.CHANNEL_IN_MONO else AudioFormat.CHANNEL_IN_STEREO

// Compute the appropriate buffer size based upon the audio configuration.
private val bufferSize: Int by lazy {
Contributor

No need for lazy

}

// Compute the number of bytes per microsecond of audio.
private val bytesPerUs: Double by lazy {
Contributor

no need for lazy

const val DEFAULT_SAMPLE_RATE = 44100
const val DEFAULT_CHANNEL_COUNT = 1
}
}
Contributor

nit: add an empty last line

@IanDBird IanDBird force-pushed the audio-record branch 2 times, most recently from ba03025 to 3d130b2 Compare December 20, 2022 21:10