Commit 1c73c6d: Bring BigQuery samples up to standard.

jmdobry committed Apr 25, 2017
1 parent ce94259 commit 1c73c6d
Showing 8 changed files with 452 additions and 327 deletions.
79 changes: 41 additions & 38 deletions bigquery/README.md
@@ -42,21 +42,21 @@ __Usage:__ `node datasets --help`

 ```
 Commands:
-  create <datasetId>            Creates a new dataset.
-  delete <datasetId>            Deletes a dataset.
-  list [projectId]              Lists all datasets in the specified project or the current project.
-  size <datasetId> [projectId]  Calculates the size of a dataset.
+  create <datasetId>  Creates a new dataset.
+  delete <datasetId>  Deletes a dataset.
+  list                Lists datasets.
 Options:
-  --help  Show help  [boolean]
+  --projectId, -p  The Project ID to use. Defaults to the value of the GCLOUD_PROJECT or GOOGLE_CLOUD_PROJECT
+                   environment variables.  [string]
+  --help           Show help  [boolean]
 Examples:
-  node datasets create my_dataset                       Creates a new dataset named "my_dataset".
-  node datasets delete my_dataset                       Deletes a dataset named "my_dataset".
-  node datasets list                                    Lists all datasets in the current project.
-  node datasets list bigquery-public-data               Lists all datasets in the "bigquery-public-data" project.
-  node datasets size my_dataset                         Calculates the size of "my_dataset" in the current project.
-  node datasets size hacker_news bigquery-public-data   Calculates the size of "bigquery-public-data:hacker_news".
+  node datasets.js create my_dataset                        Creates a new dataset named "my_dataset".
+  node datasets.js delete my_dataset                        Deletes a dataset named "my_dataset".
+  node datasets.js list                                     Lists all datasets in the project specified by the
+                                                            GCLOUD_PROJECT or GOOGLE_CLOUD_PROJECT environment variables.
+  node datasets.js list --projectId=bigquery-public-data    Lists all datasets in the "bigquery-public-data" project.
 For more information, see https://cloud.google.com/bigquery/docs
 ```
@@ -77,14 +77,16 @@ Commands:
   shakespeare  Queries a public Shakespeare dataset.
 Options:
-  --help  Show help  [boolean]
+  --projectId, -p  The Project ID to use. Defaults to the value of the GCLOUD_PROJECT or GOOGLE_CLOUD_PROJECT
+                   environment variables.  [string]
+  --help           Show help  [boolean]
 Examples:
-  node queries sync "SELECT * FROM publicdata.samples.natality   Synchronously queries the natality dataset.
-  LIMIT 5;"
-  node queries async "SELECT * FROM                              Queries the natality dataset as a job.
+  node queries.js sync "SELECT * FROM                            Synchronously queries the natality dataset.
   publicdata.samples.natality LIMIT 5;"
-  node queries shakespeare                                       Queries a public Shakespeare dataset.
+  node queries.js async "SELECT * FROM                           Queries the natality dataset as a job.
+  publicdata.samples.natality LIMIT 5;"
+  node queries.js shakespeare                                    Queries a public Shakespeare dataset.
 For more information, see https://cloud.google.com/bigquery/docs
 ```
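queries.js itself is not expanded in this view, so the following is only a rough sketch of what the `sync` command plausibly wraps: the promise-based `bigquery.query()` call from this era of the client library. The `timeoutMs` option and the function body are assumptions, not the committed source.

```js
// Hypothetical sketch of a synchronous query; not the committed queries.js.
const BigQuery = require('@google-cloud/bigquery');

function syncQuery (sqlQuery, projectId) {
  // Instantiates a client
  const bigquery = BigQuery({
    projectId: projectId
  });

  // Query options; wait up to 10 seconds for the results (assumed value)
  const options = {
    query: sqlQuery, // e.g. "SELECT * FROM publicdata.samples.natality LIMIT 5;"
    timeoutMs: 10000
  };

  // Runs the query and prints each row
  bigquery.query(options)
    .then((results) => {
      const rows = results[0];
      console.log('Rows:');
      rows.forEach((row) => console.log(row));
    })
    .catch((err) => {
      console.error('ERROR:', err);
    });
}

syncQuery(`SELECT * FROM publicdata.samples.natality LIMIT 5;`, process.env.GCLOUD_PROJECT);
```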
@@ -100,41 +102,42 @@ __Usage:__ `node tables --help`

 ```
 Commands:
-  create <datasetId> <tableId> <schema> [projectId]           Creates a new table.
-  list <datasetId> [projectId]                                Lists all tables in a dataset.
-  delete <datasetId> <tableId> [projectId]                    Deletes a table.
+  create <datasetId> <tableId> <schema>                       Creates a new table.
+  list <datasetId>                                            Lists all tables in a dataset.
+  delete <datasetId> <tableId>                                Deletes a table.
   copy <srcDatasetId> <srcTableId> <destDatasetId>            Makes a copy of a table.
-  <destTableId> [projectId]
-  browse <datasetId> <tableId> [projectId]                    Lists rows in a table.
-  import <datasetId> <tableId> <fileName> [projectId]         Imports data from a local file into a table.
+  <destTableId>
+  browse <datasetId> <tableId>                                Lists rows in a table.
+  import <datasetId> <tableId> <fileName>                     Imports data from a local file into a table.
   import-gcs <datasetId> <tableId> <bucketName> <fileName>    Imports data from a Google Cloud Storage file into a
-  [projectId]                                                 table.
+                                                              table.
   export <datasetId> <tableId> <bucketName> <fileName>        Export a table from BigQuery to Google Cloud Storage.
-  [projectId]
-  insert <datasetId> <tableId> <json_or_file> [projectId]     Insert a JSON array (as a string or newline-delimited
+  insert <datasetId> <tableId> <json_or_file>                 Insert a JSON array (as a string or newline-delimited
                                                               file) into a BigQuery table.
 Options:
-  --help  Show help  [boolean]
+  --projectId, -p  The Project ID to use. Defaults to the value of the GCLOUD_PROJECT or GOOGLE_CLOUD_PROJECT
+                   environment variables.  [string]
+  --help           Show help  [boolean]
 Examples:
-  node tables create my_dataset my_table "Name:string,         Createss a new table named "my_table" in "my_dataset".
+  node tables.js create my_dataset my_table "Name:string,      Creates a new table named "my_table" in "my_dataset".
   Age:integer, Weight:float, IsMagic:boolean"
-  node tables list my_dataset                                  Lists tables in "my_dataset".
-  node tables browse my_dataset my_table                       Displays rows from "my_table" in "my_dataset".
-  node tables delete my_dataset my_table                       Deletes "my_table" from "my_dataset".
-  node tables import my_dataset my_table ./data.csv            Imports a local file into a table.
-  node tables import-gcs my_dataset my_table my-bucket         Imports a GCS file into a table.
+  node tables.js list my_dataset                               Lists tables in "my_dataset".
+  node tables.js browse my_dataset my_table                    Displays rows from "my_table" in "my_dataset".
+  node tables.js delete my_dataset my_table                    Deletes "my_table" from "my_dataset".
+  node tables.js import my_dataset my_table ./data.csv         Imports a local file into a table.
+  node tables.js import-gcs my_dataset my_table my-bucket      Imports a GCS file into a table.
   data.csv
-  node tables export my_dataset my_table my-bucket my-file     Exports my_dataset:my_table to gcs://my-bucket/my-file
+  node tables.js export my_dataset my_table my-bucket my-file  Exports my_dataset:my_table to gcs://my-bucket/my-file
   as raw CSV.
-  node tables export my_dataset my_table my-bucket my-file -f  Exports my_dataset:my_table to gcs://my-bucket/my-file
-  JSON --gzip                                                  as gzipped JSON.
-  node tables insert my_dataset my_table json_string           Inserts the JSON array represented by json_string into
+  node tables.js export my_dataset my_table my-bucket my-file  Exports my_dataset:my_table to gcs://my-bucket/my-file
+  -f JSON --gzip                                               as gzipped JSON.
+  node tables.js insert my_dataset my_table json_string        Inserts the JSON array represented by json_string into
   my_dataset:my_table.
-  node tables insert my_dataset my_table json_file             Inserts the JSON objects contained in json_file (one per
+  node tables.js insert my_dataset my_table json_file          Inserts the JSON objects contained in json_file (one per
   line) into my_dataset:my_table.
-  node tables copy src_dataset src_table dest_dataset          Copies src_dataset:src_table to dest_dataset:dest_table.
+  node tables.js copy src_dataset src_table dest_dataset       Copies src_dataset:src_table to dest_dataset:dest_table.
   dest_table
 For more information, see https://cloud.google.com/bigquery/docs
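tables.js is not expanded in this view either. Purely as an illustration of what the `insert` command described above might wrap, here is a sketch using the client library's streaming `table.insert()`, with rows shaped like the schema from the create example; the function body and row values are invented, not the committed source.

```js
// Rough sketch (not the actual tables.js source): inserting a JSON array of
// rows into an existing table with the 0.x @google-cloud/bigquery client.
const BigQuery = require('@google-cloud/bigquery');

function insertRows (datasetId, tableId, projectId) {
  // Instantiates a client
  const bigquery = BigQuery({
    projectId: projectId
  });

  // Invented rows matching the "Name:string, Age:integer, Weight:float,
  // IsMagic:boolean" schema from the README's create example
  const rows = [
    { Name: 'Tom', Age: 30, Weight: 60.5, IsMagic: false },
    { Name: 'Nym', Age: 25, Weight: 55.1, IsMagic: true }
  ];

  // Streams the rows into the table
  bigquery
    .dataset(datasetId)
    .table(tableId)
    .insert(rows)
    .then(() => {
      console.log(`Inserted ${rows.length} rows.`);
    })
    .catch((err) => {
      console.error('ERROR:', err);
    });
}
```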
165 changes: 84 additions & 81 deletions bigquery/datasets.js
@@ -15,123 +15,126 @@

 'use strict';
 
-const BigQuery = require('@google-cloud/bigquery');
-
-// [START bigquery_create_dataset]
-function createDataset (datasetId) {
+function createDataset (datasetId, projectId) {
+  // [START bigquery_create_dataset]
+  // Imports the Google Cloud client library
+  const BigQuery = require('@google-cloud/bigquery');
+
+  // The project ID to use, e.g. "your-project-id"
+  // const projectId = "your-project-id";
+
   // Instantiates a client
-  const bigquery = BigQuery();
+  const bigquery = BigQuery({
+    projectId: projectId
+  });
 
-  // Creates a new dataset, e.g. "my_new_dataset"
-  return bigquery.createDataset(datasetId)
+  // The ID for the new dataset, e.g. "my_new_dataset"
+  // const datasetId = "my_new_dataset";
+
+  // Creates a new dataset
+  bigquery.createDataset(datasetId)
     .then((results) => {
       const dataset = results[0];
       console.log(`Dataset ${dataset.id} created.`);
-      return dataset;
     })
+    .catch((err) => {
+      console.error('ERROR:', err);
+    });
+  // [END bigquery_create_dataset]
 }
-// [END bigquery_create_dataset]
 
-// [START bigquery_delete_dataset]
-function deleteDataset (datasetId) {
+function deleteDataset (datasetId, projectId) {
+  // [START bigquery_delete_dataset]
+  // Imports the Google Cloud client library
+  const BigQuery = require('@google-cloud/bigquery');
+
+  // The project ID to use, e.g. "your-project-id"
+  // const projectId = "your-project-id";
+
   // Instantiates a client
-  const bigquery = BigQuery();
+  const bigquery = BigQuery({
+    projectId: projectId
+  });
 
-  // References an existing dataset, e.g. "my_dataset"
+  // The ID of the dataset to delete, e.g. "my_new_dataset"
+  // const datasetId = "my_new_dataset";
+
+  // Creates a reference to the existing dataset
   const dataset = bigquery.dataset(datasetId);
 
   // Deletes the dataset
-  return dataset.delete()
+  dataset.delete()
     .then(() => {
       console.log(`Dataset ${dataset.id} deleted.`);
     })
+    .catch((err) => {
+      console.error('ERROR:', err);
+    });
+  // [END bigquery_delete_dataset]
 }
-// [END bigquery_delete_dataset]
 
-// [START bigquery_list_datasets]
 function listDatasets (projectId) {
+  // [START bigquery_list_datasets]
+  // Imports the Google Cloud client library
+  const BigQuery = require('@google-cloud/bigquery');
+
+  // The project ID to use, e.g. "your-project-id"
+  // const projectId = "your-project-id";
+
   // Instantiates a client
   const bigquery = BigQuery({
     projectId: projectId
   });
 
   // Lists all datasets in the specified project
-  return bigquery.getDatasets()
+  bigquery.getDatasets()
     .then((results) => {
       const datasets = results[0];
       console.log('Datasets:');
       datasets.forEach((dataset) => console.log(dataset.id));
-      return datasets;
     })
+    .catch((err) => {
+      console.error('ERROR:', err);
+    });
+  // [END bigquery_list_datasets]
 }
-// [END bigquery_list_datasets]
 
-// [START bigquery_get_dataset_size]
-function getDatasetSize (datasetId, projectId) {
-  // Instantiate a client
-  const bigquery = BigQuery({
-    projectId: projectId
-  });
-
-  // References an existing dataset, e.g. "my_dataset"
-  const dataset = bigquery.dataset(datasetId);
-
-  // Lists all tables in the dataset
-  return dataset.getTables()
-    .then((results) => results[0])
-    // Retrieve the metadata for each table
-    .then((tables) => Promise.all(tables.map((table) => table.get())))
-    .then((results) => results.map((result) => result[0]))
-    // Select the size of each table
-    .then((tables) => tables.map((table) => (parseInt(table.metadata.numBytes, 10) / 1000) / 1000))
-    // Sum up the sizes
-    .then((sizes) => sizes.reduce((cur, prev) => cur + prev, 0))
-    // Print and return the size
-    .then((sum) => {
-      console.log(`Size of ${dataset.id}: ${sum} MB`);
-      return sum;
-    });
-}
-// [END bigquery_get_dataset_size]
-
-// The command-line program
-const cli = require(`yargs`);
-
-const program = module.exports = {
-  createDataset: createDataset,
-  deleteDataset: deleteDataset,
-  listDatasets: listDatasets,
-  getDatasetSize: getDatasetSize,
-  main: (args) => {
-    // Run the command-line program
-    cli.help().strict().parse(args).argv; // eslint-disable-line
-  }
-};
-
-cli
+require(`yargs`) // eslint-disable-line
   .demand(1)
-  .command(`create <datasetId>`, `Creates a new dataset.`, {}, (opts) => {
-    program.createDataset(opts.datasetId);
-  })
-  .command(`delete <datasetId>`, `Deletes a dataset.`, {}, (opts) => {
-    program.deleteDataset(opts.datasetId);
-  })
-  .command(`list [projectId]`, `Lists all datasets in the specified project or the current project.`, {}, (opts) => {
-    program.listDatasets(opts.projectId || process.env.GCLOUD_PROJECT);
-  })
-  .command(`size <datasetId> [projectId]`, `Calculates the size of a dataset.`, {}, (opts) => {
-    program.getDatasetSize(opts.datasetId, opts.projectId || process.env.GCLOUD_PROJECT);
-  })
+  .options({
+    projectId: {
+      alias: 'p',
+      default: process.env.GCLOUD_PROJECT || process.env.GOOGLE_CLOUD_PROJECT,
+      description: 'The Project ID to use. Defaults to the value of the GCLOUD_PROJECT or GOOGLE_CLOUD_PROJECT environment variables.',
+      requiresArg: true,
+      type: 'string'
+    }
+  })
+  .command(
+    `create <datasetId>`,
+    `Creates a new dataset.`,
+    {},
+    (opts) => createDataset(opts.datasetId, opts.projectId)
+  )
+  .command(
+    `delete <datasetId>`,
+    `Deletes a dataset.`,
+    {},
+    (opts) => deleteDataset(opts.datasetId, opts.projectId)
+  )
+  .command(
+    `list`,
+    `Lists datasets.`,
+    {},
+    (opts) => listDatasets(opts.projectId)
+  )
   .example(`node $0 create my_dataset`, `Creates a new dataset named "my_dataset".`)
   .example(`node $0 delete my_dataset`, `Deletes a dataset named "my_dataset".`)
-  .example(`node $0 list`, `Lists all datasets in the current project.`)
-  .example(`node $0 list bigquery-public-data`, `Lists all datasets in the "bigquery-public-data" project.`)
-  .example(`node $0 size my_dataset`, `Calculates the size of "my_dataset" in the current project.`)
-  .example(`node $0 size hacker_news bigquery-public-data`, `Calculates the size of "bigquery-public-data:hacker_news".`)
+  .example(`node $0 list`, `Lists all datasets in the project specified by the GCLOUD_PROJECT or GOOGLE_CLOUD_PROJECT environment variables.`)
+  .example(`node $0 list --projectId=bigquery-public-data`, `Lists all datasets in the "bigquery-public-data" project.`)
   .wrap(120)
   .recommendCommands()
-  .epilogue(`For more information, see https://cloud.google.com/bigquery/docs`);
-
-if (module === require.main) {
-  program.main(process.argv.slice(2));
-}
+  .epilogue(`For more information, see https://cloud.google.com/bigquery/docs`)
+  .help()
+  .strict()
+  .argv;
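The shape of the refactor is easier to read outside the interleaved diff: each sample function is now self-contained, requiring the client library itself, building a client from an explicit projectId, and handling its own errors, instead of sharing a module-level client and a `program` wrapper. A minimal distillation of the new pattern, taken from the added lines above (the fallback project ID is a placeholder):

```js
'use strict';

// Self-contained sample in the new style: require, instantiate, run, catch.
function listDatasets (projectId) {
  // Imports the Google Cloud client library
  const BigQuery = require('@google-cloud/bigquery');

  // Instantiates a client for the given project
  const bigquery = BigQuery({
    projectId: projectId
  });

  // Lists all datasets in the specified project
  bigquery.getDatasets()
    .then((results) => {
      const datasets = results[0];
      console.log('Datasets:');
      datasets.forEach((dataset) => console.log(dataset.id));
    })
    .catch((err) => {
      console.error('ERROR:', err);
    });
}

// "my-project-id" is a placeholder
listDatasets(process.env.GCLOUD_PROJECT || 'my-project-id');
```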
2 changes: 1 addition & 1 deletion bigquery/package.json
@@ -27,7 +27,7 @@
"yargs": "7.1.0"
},
"devDependencies": {
"@google-cloud/nodejs-repo-tools": "1.3.1",
"@google-cloud/nodejs-repo-tools": "1.3.2",
"ava": "0.19.1",
"proxyquire": "1.7.11",
"sinon": "2.1.0",