Update Speech samples. #307

Merged (1 commit) on Feb 1, 2017
4 changes: 2 additions & 2 deletions package.json
@@ -81,8 +81,8 @@
"@google-cloud/monitoring": "0.1.4",
"@google-cloud/pubsub": "0.7.0",
"@google-cloud/resource": "0.5.1",
- "@google-cloud/speech": "0.5.0",
- "@google-cloud/storage": "0.6.0",
+ "@google-cloud/speech": "0.6.0",
+ "@google-cloud/storage": "0.6.1",
"@google-cloud/translate": "0.6.0",
"@google-cloud/vision": "0.7.0",
"@google/cloud-debug": "0.9.1",
24 changes: 13 additions & 11 deletions speech/README.md
@@ -2,12 +2,9 @@

# Google Cloud Speech API Node.js Samples

-[Sign up for the Alpha][speech_signup].
-
The [Cloud Speech API][speech_docs] enables easy integration of Google speech
recognition technologies into developer applications.

-[speech_signup]: https://services.google.com/fb/forms/speech-api-alpha/
[speech_docs]: https://cloud.google.com/speech/

## Table of Contents
@@ -36,18 +33,23 @@ __Usage:__ `node recognize.js --help`

```
Commands:
-  sync <filename>    Detects speech in an audio file.
-  async <filename>   Creates a job to detect speech in an audio file, and waits for the job to complete.
-  stream <filename>  Detects speech in an audio file by streaming it to the Speech API.
-  listen             Detects speech in a microphone input stream.
+  sync <filename>    Detects speech in a local audio file.
+  sync-gcs <gcsUri>  Detects speech in an audio file located in a Google Cloud Storage bucket.
+  async <filename>   Creates a job to detect speech in a local audio file, and waits for the job to complete.
+  async-gcs <gcsUri> Creates a job to detect speech in an audio file located in a Google Cloud Storage bucket, and
+                     waits for the job to complete.
+  stream <filename>  Detects speech in a local audio file by streaming it to the Speech API.
+  listen             Detects speech in a microphone input stream.

Options:
-  --help Show help [boolean]
+  --help Show help [boolean]
+  --encoding, -e [string] [default: "LINEAR16"]
+  --sampleRate, -r [number] [default: 16000]

Examples:
-  node recognize.js sync ./resources/audio.raw
-  node recognize.js async ./resources/audio.raw
-  node recognize.js stream ./resources/audio.raw
+  node recognize.js sync ./resources/audio.raw -e LINEAR16 -r 16000
+  node recognize.js async-gcs gs://my-bucket/audio.raw -e LINEAR16 -r 16000
+  node recognize.js stream ./resources/audio.raw -e LINEAR16 -r 16000
node recognize.js listen

For more information, see https://cloud.google.com/speech/docs
3 changes: 2 additions & 1 deletion speech/package.json
@@ -8,7 +8,8 @@
"test": "cd ..; npm run st -- --verbose speech/system-test/*.test.js"
},
"dependencies": {
- "@google-cloud/speech": "0.5.0",
+ "@google-cloud/speech": "0.6.0",
+ "@google-cloud/storage": "0.6.1",
"node-record-lpcm16": "0.2.0",
"yargs": "6.6.0"
},
244 changes: 188 additions & 56 deletions speech/recognize.js
@@ -23,143 +23,275 @@

'use strict';

const Speech = require('@google-cloud/speech');
function syncRecognize (filename, encoding, sampleRate) {
// [START speech_sync_recognize]
// Imports the Google Cloud client library
const Speech = require('@google-cloud/speech');

// [START speech_sync_recognize]
function syncRecognize (filename) {
// Instantiates a client
const speech = Speech();

const config = {
// Configure these settings based on the audio you're transcribing
encoding: 'LINEAR16',
sampleRate: 16000
// The path to the local file on which to perform speech recognition, e.g. /path/to/audio.raw
// const filename = '/path/to/audio.raw';

// The encoding of the audio file, e.g. 'LINEAR16'
// const encoding = 'LINEAR16';

// The sample rate of the audio file, e.g. 16000
// const sampleRate = 16000;

const request = {
encoding: encoding,
sampleRate: sampleRate
};

// Detects speech in the audio file, e.g. "./resources/audio.raw"
return speech.recognize(filename, config)
// Detects speech in the audio file
speech.recognize(filename, request)
.then((results) => {
const transcription = results[0];

console.log(`Transcription: ${transcription}`);
});
// [END speech_sync_recognize]
}

Contributor:

FYI. Missing JSDoc for the functions - could be nice to have @param blocks with suggested values, e.g. '16000' for sampleRate, 'LINEAR16' for encoding. Feel free to ignore.

Member Author:

Yeah, we don't really do those. If we were writing library code then we certainly would add them.
function syncRecognizeGCS (gcsUri, encoding, sampleRate) {
// [START speech_sync_recognize_gcs]
// Imports the Google Cloud client library
const Speech = require('@google-cloud/speech');

// Instantiates a client
const speech = Speech();

// The Google Cloud Storage URI of the file on which to perform speech recognition, e.g. gs://my-bucket/audio.raw
// const gcsUri = 'gs://my-bucket/audio.raw';

// The encoding of the audio file, e.g. 'LINEAR16'
// const encoding = 'LINEAR16';

// The sample rate of the audio file, e.g. 16000
// const sampleRate = 16000;

const request = {
encoding: encoding,
sampleRate: sampleRate
};

// Detects speech in the audio file
speech.recognize(gcsUri, request)
.then((results) => {
const transcription = results[0];

console.log(`Transcription: ${transcription}`);
});
// [END speech_sync_recognize_gcs]
}
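In both sync variants, the recognize call resolves with an array whose first element is the transcription. That unpacking can be exercised without the network using a stand-in (fakeRecognize is hypothetical, not part of the client library):

```javascript
// Hypothetical stand-in for speech.recognize(filename, request) -- resolves
// with the same shape: an array whose first element is the transcription
function fakeRecognize () {
  return Promise.resolve(['hello world']);
}

function logTranscription () {
  return fakeRecognize()
    .then((results) => {
      const transcription = results[0];
      console.log(`Transcription: ${transcription}`);
      return transcription;
    });
}

logTranscription();
```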

function asyncRecognize (filename, encoding, sampleRate) {
// [START speech_async_recognize]
// Imports the Google Cloud client library
const Speech = require('@google-cloud/speech');

// Instantiates a client
const speech = Speech();

// The path to the local file on which to perform speech recognition, e.g. /path/to/audio.raw
// const filename = '/path/to/audio.raw';

// The encoding of the audio file, e.g. 'LINEAR16'
// const encoding = 'LINEAR16';

// The sample rate of the audio file, e.g. 16000
// const sampleRate = 16000;

const request = {
encoding: encoding,
sampleRate: sampleRate
};

// Detects speech in the audio file. This creates a recognition job that you
// can wait for now, or get its result later.
speech.startRecognition(filename, request)
.then((results) => {
const operation = results[0];
// Get a Promise representation of the final result of the job
return operation.promise();
})
.then((transcription) => {
console.log(`Transcription: ${transcription}`);
return transcription;
});
// [END speech_async_recognize]
}
// [END speech_sync_recognize]

// [START speech_async_recognize]
function asyncRecognize (filename) {
function asyncRecognizeGCS (gcsUri, encoding, sampleRate) {
// [START speech_async_recognize_gcs]
// Imports the Google Cloud client library
const Speech = require('@google-cloud/speech');

// Instantiates a client
const speech = Speech();

const config = {
// Configure these settings based on the audio you're transcribing
encoding: 'LINEAR16',
sampleRate: 16000
// The Google Cloud Storage URI of the file on which to perform speech recognition, e.g. gs://my-bucket/audio.raw
// const gcsUri = 'gs://my-bucket/audio.raw';

// The encoding of the audio file, e.g. 'LINEAR16'
// const encoding = 'LINEAR16';

// The sample rate of the audio file, e.g. 16000
// const sampleRate = 16000;

const request = {
encoding: encoding,
sampleRate: sampleRate
};

// Detects speech in the audio file, e.g. "./resources/audio.raw"
// This creates a recognition job that you can wait for now, or get its result
// later.
return speech.startRecognition(filename, config)
// Detects speech in the audio file. This creates a recognition job that you
// can wait for now, or get its result later.
speech.startRecognition(gcsUri, request)
.then((results) => {
const operation = results[0];
// Get a Promise representation the final result of the job
// Get a Promise representation of the final result of the job
return operation.promise();
})
.then((transcription) => {
console.log(`Transcription: ${transcription}`);
return transcription;
});
// [END speech_async_recognize_gcs]
}
// [END speech_async_recognize]
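The async variants follow the long-running operation pattern: startRecognition resolves with an operation, and operation.promise() resolves with the final transcription. The chaining can be sketched with a fake operation object (hypothetical, no network involved):

```javascript
// Hypothetical stand-in for the operation returned by speech.startRecognition()
const fakeOperation = {
  promise: () => Promise.resolve('hello world')
};

function waitForTranscription (operation) {
  // Get a Promise representation of the final result of the job
  return operation.promise()
    .then((transcription) => {
      console.log(`Transcription: ${transcription}`);
      return transcription;
    });
}

waitForTranscription(fakeOperation);
```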

// [START speech_streaming_recognize]
const fs = require('fs');
function streamingRecognize (filename, encoding, sampleRate) {
// [START speech_streaming_recognize]
const fs = require('fs');

// Imports the Google Cloud client library
const Speech = require('@google-cloud/speech');

function streamingRecognize (filename, callback) {
// Instantiates a client
const speech = Speech();

const options = {
// The path to the local file on which to perform speech recognition, e.g. /path/to/audio.raw
// const filename = '/path/to/audio.raw';

// The encoding of the audio file, e.g. 'LINEAR16'
// const encoding = 'LINEAR16';

// The sample rate of the audio file, e.g. 16000
// const sampleRate = 16000;

const request = {
config: {
// Configure these settings based on the audio you're transcribing
encoding: 'LINEAR16',
sampleRate: 16000
encoding: encoding,
sampleRate: sampleRate
}
};

// Create a recognize stream
const recognizeStream = speech.createRecognizeStream(options)
.on('error', callback)
// Stream the audio to the Google Cloud Speech API
const recognizeStream = speech.createRecognizeStream(request)
.on('error', console.error)
.on('data', (data) => {
console.log('Data received: %j', data);
callback();
});

// Stream an audio file from disk to the Speech API, e.g. "./resources/audio.raw"
fs.createReadStream(filename).pipe(recognizeStream);
// [END speech_streaming_recognize]
}
// [END speech_streaming_recognize]

// [START speech_streaming_mic_recognize]
const record = require('node-record-lpcm16');
function streamingMicRecognize (encoding, sampleRate) {
// [START speech_streaming_mic_recognize]
const record = require('node-record-lpcm16');

// Imports the Google Cloud client library
const Speech = require('@google-cloud/speech');

function streamingMicRecognize () {
// Instantiates a client
const speech = Speech();

const options = {
// The encoding of the audio file, e.g. 'LINEAR16'
// const encoding = 'LINEAR16';

// The sample rate of the audio file, e.g. 16000
// const sampleRate = 16000;

const request = {
config: {
// Configure these settings based on the audio you're transcribing
encoding: 'LINEAR16',
sampleRate: 16000
encoding: encoding,
sampleRate: sampleRate
}
};

// Create a recognize stream
const recognizeStream = speech.createRecognizeStream(options)
const recognizeStream = speech.createRecognizeStream(request)
.on('error', console.error)
.on('data', (data) => process.stdout.write(data.results));

// Start recording and send the microphone input to the Speech API
record.start({
sampleRate: 16000,
sampleRate: sampleRate,
threshold: 0
}).pipe(recognizeStream);

console.log('Listening, press Ctrl+C to stop.');
// [END speech_streaming_mic_recognize]
}
// [END speech_streaming_mic_recognize]

require(`yargs`)
.demand(1)
.command(
`sync <filename>`,
`Detects speech in an audio file.`,
`Detects speech in a local audio file.`,
{},
(opts) => syncRecognize(opts.filename)
(opts) => syncRecognize(opts.filename, opts.encoding, opts.sampleRate)
)
.command(
`sync-gcs <gcsUri>`,
`Detects speech in an audio file located in a Google Cloud Storage bucket.`,
{},
(opts) => syncRecognizeGCS(opts.gcsUri, opts.encoding, opts.sampleRate)
)
.command(
`async <filename>`,
`Creates a job to detect speech in an audio file, and waits for the job to complete.`,
`Creates a job to detect speech in a local audio file, and waits for the job to complete.`,
{},
(opts) => asyncRecognize(opts.filename)
(opts) => asyncRecognize(opts.filename, opts.encoding, opts.sampleRate)
)
.command(
`async-gcs <gcsUri>`,
`Creates a job to detect speech in an audio file located in a Google Cloud Storage bucket, and waits for the job to complete.`,
{},
(opts) => asyncRecognizeGCS(opts.gcsUri, opts.encoding, opts.sampleRate)
)
.command(
`stream <filename>`,
`Detects speech in an audio file by streaming it to the Speech API.`,
`Detects speech in a local audio file by streaming it to the Speech API.`,
{},
(opts) => streamingRecognize(opts.filename, () => {})
(opts) => streamingRecognize(opts.filename, opts.encoding, opts.sampleRate)
)
.command(
`listen`,
`Detects speech in a microphone input stream.`,
{},
streamingMicRecognize
(opts) => streamingMicRecognize(opts.encoding, opts.sampleRate)
)
.example(`node $0 sync ./resources/audio.raw`)
.example(`node $0 async ./resources/audio.raw`)
.example(`node $0 stream ./resources/audio.raw`)
.options({
encoding: {
alias: 'e',
default: 'LINEAR16',
global: true,
requiresArg: true,
type: 'string'
},
sampleRate: {
alias: 'r',
default: 16000,
global: true,
requiresArg: true,
type: 'number'
}
})
.example(`node $0 sync ./resources/audio.raw -e LINEAR16 -r 16000`)
.example(`node $0 async-gcs gs://my-bucket/audio.raw -e LINEAR16 -r 16000`)
.example(`node $0 stream ./resources/audio.raw -e LINEAR16 -r 16000`)
.example(`node $0 listen`)
.wrap(120)
.recommendCommands()
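The new --encoding/-e and --sampleRate/-r flags are global options with defaults, so every command receives them. The defaulting behaviour can be sketched without yargs (parseFlags is a hypothetical helper, not part of the sample):

```javascript
// Hypothetical hand-rolled version of the global yargs options above:
// fall back to the sample's defaults when a flag is omitted
function parseFlags (argv) {
  const opts = { encoding: 'LINEAR16', sampleRate: 16000 };
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] === '-e' || argv[i] === '--encoding') {
      opts.encoding = argv[++i];
    } else if (argv[i] === '-r' || argv[i] === '--sampleRate') {
      opts.sampleRate = Number(argv[++i]);
    }
  }
  return opts;
}

console.log(parseFlags(['-e', 'FLAC', '-r', '44100']));
// { encoding: 'FLAC', sampleRate: 44100 }
```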