add initial transcription module, set language level to 1.8, add GoogleCloudTranscriptionService #55

nikvaessen · 2017-07-23T10:19:18Z

This PR relies on RawStreamListener from libjitsi


bgrozev · 2017-07-24T16:11:15Z

I would like to make some changes to the names in the API that we added to libjitsi:
RawStreamListener -> ReceiveStreamBufferListener
samplesRead -> handleBuffer
Other suggestions would be welcome, I just really dislike RawStreamListener and samplesRead (which I came up with...) because they are so specific.

Could you open a PR in libjitsi with those changes?

bgrozev · 2017-07-24T16:12:35Z

src/main/java/org/jitsi/jigasi/transcription/GoogleCloudTranscriptionService.java

+import java.util.List;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.function.Consumer;


Star imports please

Should I always use star imports, or only when I have x of the same package?

bgrozev · 2017-07-24T16:15:24Z

src/main/java/org/jitsi/jigasi/transcription/GoogleCloudTranscriptionService.java

+     * transcriptions. CashedThreadPool is chosen because sending
+     * audio chunks are small short-lived async tasks.
+     */
+    private ExecutorService executorService = Executors.newCachedThreadPool();


Do we need to worry about the requests being sent in order?

No, not for sendSingleRequest()
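
For context on the ordering question, here is a minimal sketch using only `java.util.concurrent`. The class `OrderedSendSketch` and the method `runAndRecord` are hypothetical names invented for illustration and are not part of this PR. It records the order in which numbered tasks run: a single-thread executor drains its queue in submission order, while a cached thread pool gives no such guarantee.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the PR: records task execution order.
class OrderedSendSketch
{
    /**
     * Submits {@code count} numbered tasks to the given executor, waits for
     * them to finish and returns the order in which they actually ran.
     */
    static List<Integer> runAndRecord(ExecutorService executor, int count)
    {
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < count; i++)
        {
            final int id = i;
            executor.execute(() -> order.add(id));
        }
        executor.shutdown();
        try
        {
            executor.awaitTermination(5, TimeUnit.SECONDS);
        }
        catch (InterruptedException e)
        {
            // Restore the interrupt flag and return what was recorded so far
            Thread.currentThread().interrupt();
        }
        return order;
    }

    public static void main(String[] args)
    {
        // One worker thread: tasks run in the order they were submitted
        System.out.println(runAndRecord(Executors.newSingleThreadExecutor(), 5));
        // Several worker threads: the recorded order may differ between runs
        System.out.println(runAndRecord(Executors.newCachedThreadPool(), 5));
    }
}
```

If ordering ever matters (for example for streamed audio chunks), a single-thread executor per stream is one simple option; for independent one-shot requests the cached pool is sufficient.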

bgrozev · 2017-07-24T16:18:24Z

src/main/java/org/jitsi/jigasi/transcription/TranscriptionService.java

+     * @throws UnsupportedOperationException when this service does not support
+     * fragmented audio speech-to-text
+     */
+    void sent(TranscriptionRequest request,


Can we rename this to e.g. sendRequest?

I will rename it to sendSingleRequest to emphasise that this method should be used for one complete audio file/fragment.

bgrozev · 2017-07-24T20:58:21Z

src/main/java/org/jitsi/jigasi/transcription/Transcriber.java

+            Participant p = participants.get(ssrc);
+            if(p != null)
+            {
+                participants.get(ssrc).giveBuffer(buffer);


Why call get() a second time when you already have the result in "p"?

Here we give a reference to "buffer" to another thread, and then we return from samplesRead(), which runs in a mixer thread. So two threads have "buffer", which could be a problem (e.g. if the mixer thread decides to re-use the buffer before "our" thread has a chance to run and make a copy). I think the code which makes the copy should run in the mixer thread, and it should submit the task only when it needs to send a request (this will also mean fewer tasks to submit, since not every Buffer results in a request being sent).

> Why call get() a second time when you already have the result in "p"?

Fixed.

> I think the code which makes the copy should run in the mixer thread, and it should submit the task only when it needs to send a request (this will also mean fewer tasks to submit, since not every Buffer results in a request being sent).

Okay, I've moved the local buffering to the mixer thread.
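
The approach discussed in this thread can be sketched roughly as follows. This is not the code from the PR: the class `LocalBufferSketch`, its constructor parameters and the `handleBuffer(byte[], int, int)` signature are assumptions made for illustration, and plain byte arrays stand in for the libjitsi buffer type. The copy happens on the calling (mixer) thread, so the caller is free to re-use its array as soon as the method returns, and a task is submitted only when a full chunk is ready.

```java
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical sketch, not the PR's Transcriber code.
class LocalBufferSketch
{
    private final byte[] local;
    private int filled = 0;
    private final Executor executor;
    private final Consumer<byte[]> sender;

    LocalBufferSketch(int capacity, Executor executor, Consumer<byte[]> sender)
    {
        if (capacity <= 0)
        {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.local = new byte[capacity];
        this.executor = executor;
        this.sender = sender;
    }

    /**
     * Intended to be called on the mixer thread. The incoming bytes are
     * copied before this method returns, so the caller may re-use
     * {@code data} immediately afterwards.
     */
    void handleBuffer(byte[] data, int offset, int length)
    {
        int copied = 0;
        while (copied < length)
        {
            int n = Math.min(length - copied, local.length - filled);
            System.arraycopy(data, offset + copied, local, filled, n);
            filled += n;
            copied += n;

            if (filled == local.length)
            {
                // Private snapshot: the worker thread never sees "local"
                final byte[] chunk = local.clone();
                filled = 0;
                // Only a full chunk results in a submitted task
                executor.execute(() -> sender.accept(chunk));
            }
        }
    }
}
```

With this shape, most incoming buffers only cost an array copy on the mixer thread; the executor is touched once per full chunk.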

bgrozev · 2017-07-24T21:04:35Z

src/main/java/org/jitsi/jigasi/transcription/Transcriber.java

+        /**
+         * Whether we should buffer locally before sending
+         */
+        private final Boolean USE_LOCAL_BUFFER = true;


Perhaps make these static? A little explanation of where the numbers come from would be useful: with webrtc we usually see 20ms opus frames which are decoded to 2 bytes per sample and 48000Hz sampling rate.

Oh, and the aim is to have the buffer store 500ms worth of audio, right?

> Perhaps make these static?

Made them static and moved the Locale up because inner classes can't have static declarations.

> A little explanation of where the numbers come from would be useful: with webrtc we usually see 20ms opus frames which are decoded to 2 bytes per sample and 48000Hz sampling rate.

Added.

> Oh, and the aim is to have the buffer store 500ms worth of audio, right?

The aim is indeed to store 500ms to save bandwidth, although this might have been the reason I was seeing some errors yesterday ("The audio was not being received real-time"). Maybe it should be a bit lower.
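
As a worked example of the numbers mentioned in this thread, the sketch below computes how many bytes a 20ms frame and a 500ms buffer occupy at 48000Hz with 2 bytes per sample. The class and constant names are hypothetical, and a single (mono) channel is an assumption; the thread does not state the channel count.

```java
// Hypothetical helper for illustration; names are not taken from the PR.
class AudioBufferSizing
{
    static final int SAMPLE_RATE_HZ = 48000;   // decoded sampling rate
    static final int BYTES_PER_SAMPLE = 2;     // 16-bit samples
    static final int FRAME_DURATION_MS = 20;   // typical frame length
    static final int CHANNELS = 1;             // assumption: mono audio

    /** Number of bytes needed to hold {@code ms} milliseconds of audio. */
    static int bytesForDuration(int ms)
    {
        return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * ms / 1000;
    }

    /** Number of whole 20ms frames that fit in {@code ms} milliseconds. */
    static int framesForDuration(int ms)
    {
        return ms / FRAME_DURATION_MS;
    }

    public static void main(String[] args)
    {
        System.out.println(bytesForDuration(FRAME_DURATION_MS)); // 1920
        System.out.println(bytesForDuration(500));               // 48000
        System.out.println(framesForDuration(500));              // 25
    }
}
```

Under these assumptions, lowering the target from 500ms to, say, 200ms would mean sending a request every 10 frames (19200 bytes) instead of every 25.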

bgrozev · 2017-07-24T21:05:15Z

src/main/java/org/jitsi/jigasi/transcription/Transcriber.java

+        private long ssrc;
+
+        /**
+         * The streaming


This looks incomplete

bgrozev · 2017-07-24T21:06:22Z

src/main/java/org/jitsi/jigasi/transcription/Transcriber.java

+         */
+        void left()
+        {
+            session.end();


I would add a null check just in case

bgrozev · 2017-07-24T21:14:12Z

Merged the PR in libjitsi. You can update the API and use the new maven version: libjitsi-1.0-20170724.205625-298

bgrozev · 2017-07-24T21:14:57Z

src/main/java/org/jitsi/jigasi/transcription/GoogleCloudTranscriptionService.java

+     *                       TranscriptionResult
+     */
+    @Override
+    public void sentSingleRequest(final TranscriptionRequest request,


Do you mind renaming to "sendSingleRequest" (with a "d")?

bgrozev · 2017-07-24T21:28:17Z

src/main/java/org/jitsi/jigasi/transcription/GoogleCloudTranscriptionService.java

+        @Override
+        public void give(final TranscriptionRequest request)
+        {
+            this.service.submit(() -> requestManager.sentRequest(request));


Can we rename to sendRequest (with a "d")?

I'm not sure sendRequest is the right name either, because we do not necessarily send a request immediately, but I've changed it.

9b03085: add initial transcription module, set language level to 1.8. This commit relies on RawStreamListener from libjitsi

nikvaessen changed the title ~~add initial transcription module, set language level to 1.8~~ add initial transcription module, set language level to 1.8, add GoogleCloudTranscriptionService Jul 23, 2017

nikvaessen added 2 commits July 23, 2017 12:22

95d6388: add an implementation of TranscriptionService using the Google Cloud speech-to-text sdk. This Service is currently restricted to 1 minute

663dcfb: forgot 2 licence headers

nikvaessen mentioned this pull request Jul 23, 2017: Add a TranscriptionGateway and TranscriptionGatewaySession #56 (Merged)

731eddf: dirty fix which creates a new session every minute to circumvent the 1 minute limitation

bgrozev reviewed Jul 24, 2017

nikvaessen added 3 commits July 24, 2017 19:58

5f029ca: renamed sent(), removed an unnecessary thread, used package import

b9d09bb: forgot an import

2b216e1: updated imports

bgrozev reviewed Jul 24, 2017

nikvaessen force-pushed the transcription branch from ac21515 to f8fd336 July 25, 2017 11:56

fdd361d: change to ReceiveStreamBufferListener, moves buffering to mixing thread, addresses small issues

nikvaessen force-pushed the transcription branch from f8fd336 to fdd361d July 25, 2017 12:00

bgrozev merged commit 4863d86 into jitsi:master Jul 25, 2017

nikvaessen deleted the transcription branch July 26, 2017 11:13
