This is a fork of the GoogleCloudPlatform/protoc-gen-bq-schema repository that incorporates these PRs:
- #12 Use comments as field description (from the master branch in this repository)
- #14 Add support for message-level extra_fields
- #17 Added enumsasints plugin parameter – since v1.5
- Add support for recursive messages – since v1.6
The default branch of this repository is develop.
Two satellite repositories:
- protoc-gen-bq-schema-example-proto – example of input Protobuf files
- protoc-gen-bq-schema-example-bq – example of generated BigQuery schema JSON files
protoc-gen-bq-schema is a plugin for the Protocol Buffers compiler (protoc).
It converts messages written in .proto format into JSON schema files for BigQuery,
so you can reuse existing .proto data definitions in BigQuery.
go get github.com/chuhlomin/protoc-gen-bq-schema
protoc --bq-schema_out=path/to/outdir foo.proto
The protoc and protoc-gen-bq-schema commands must be found in $PATH.
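A quick way to confirm that both commands are reachable before invoking the compiler is a small shell check. This is a minimal sketch; the helper name require_cmd is hypothetical and not part of the plugin.

```shell
#!/bin/sh
# Hypothetical helper (not part of protoc-gen-bq-schema): succeeds only if
# the named command can be resolved through $PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing command: $1" >&2
  return 1
}

# Report which of the two required commands are available.
for cmd in protoc protoc-gen-bq-schema; do
  if require_cmd "$cmd"; then
    echo "found: $cmd"
  fi
done
```

If protoc-gen-bq-schema is reported missing after go get, the Go binary directory (commonly $GOPATH/bin) is probably not in $PATH.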
The generated JSON schema files are suffixed with .schema, and their base names are derived from their package names and bq_table_name options.
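The naming rule can be sketched as a small shell function. The function name schema_path is hypothetical, and the sketch assumes that each dot in a multi-segment package name becomes a directory separator, which this README does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helper: predicts the output path for a proto package ($1)
# and a table name ($2).
# Assumption: dots in the package name are turned into directory separators.
schema_path() {
  pkg_dir=$(printf '%s' "$1" | tr '.' '/')
  printf '%s/%s.schema\n' "$pkg_dir" "$2"
}

schema_path foo bar_table   # prints foo/bar_table.schema
```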
If you do not already have the standard Google Protobuf libraries in your proto_path, you'll need to specify them directly on the command line (and potentially need to copy bq_schema.proto into a proto_path directory as well), like this:
protoc --bq-schema_out=path/to/out/dir foo.proto --proto_path=. --proto_path=<path_to_google_proto_folder>/src
Suppose that we have the following foo.proto.
syntax = "proto3";
package foo;
import "google/type/date.proto";
import "bq/bq_table.proto";
import "bq/bq_field.proto";

message Bar {
  option (gen_bq_schema.bigquery_opts).table_name = "bar_table";
  option (gen_bq_schema.bigquery_opts).extra_fields = "f:INTEGER";
  option (gen_bq_schema.bigquery_opts).extra_fields = "g:RECORD:Baz";

  message Nested {
    repeated int32 a = 1;
  }

  enum EnumAllowingAlias {
    option allow_alias = true;
    UNKNOWN = 0;
    STARTED = 1;
    RUNNING = 1;
  }

  int32 a = 1; // field comment
  Nested b = 2;
  string c = 3;
  bool d = 4 [(gen_bq_schema.bigquery).ignore = true];
  uint64 e = 5 [
    (gen_bq_schema.bigquery) = {
      require: true
      type_override: 'TIMESTAMP'
    }
  ];
  google.type.Date date = 6 [(gen_bq_schema.bigquery).type_override = "DATE"];
  EnumAllowingAlias status = 8;
}

message Baz {
  int32 a = 1;
}
Running protoc --bq-schema_out=. foo.proto will generate a file named foo/bar_table.schema.
The message foo.Baz is ignored because it doesn't have the gen_bq_schema.bigquery_opts option.
The plugin parameter enumsasints=true will marshal all enums into integers instead of strings:
protoc --bq-schema_out=enumsasints=true:. foo.proto
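The parameter is passed using the usual protoc convention of placing plugin parameters before the output directory, separated by a colon. A minimal sketch of assembling that argument follows; the helper name bq_schema_out_arg is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: builds the --bq-schema_out argument.
# $1 = output directory, $2 = optional plugin parameters (e.g. enumsasints=true)
bq_schema_out_arg() {
  if [ -n "${2:-}" ]; then
    # Parameters go first, then a colon, then the output directory.
    printf '%s=%s:%s\n' '--bq-schema_out' "$2" "$1"
  else
    printf '%s=%s\n' '--bq-schema_out' "$1"
  fi
}

bq_schema_out_arg . enumsasints=true   # prints --bq-schema_out=enumsasints=true:.
bq_schema_out_arg path/to/outdir       # prints --bq-schema_out=path/to/outdir
```

The result can then be used as protoc "$(bq_schema_out_arg . enumsasints=true)" foo.proto.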
You can use the chuhlomin/protoc-gen-bq-schema image on Docker Hub.
Example Docker run:
mkdir bq_schema
docker run -i -t -v $(pwd):/workdir \
  chuhlomin/protoc-gen-bq-schema:1.6 \
  -I/workdir \
  -I/workdir/bq \
  --bq-schema_out=/workdir/bq_schema \
  /workdir/foo.proto
Example Drone step (.drone.yml):
- name: build
  image: chuhlomin/protoc-gen-bq-schema:1.6
  commands:
    - mkdir bq_schema
    - protoc -I/protobuf/ -I. -Ibq --bq-schema_out=bq_schema foo.proto
To test and build binaries inside an isolated Docker container (recommended):
docker run -i -t -v $(pwd):/workdir golang:1.12.14-alpine3.10 /bin/sh
apk add --no-cache make git gcc libc-dev protobuf
cd /workdir
make clean test install
make examples
exit
To test and build the plugin binary on your machine, run the following commands:
make clean test install
# (optionally) build a Docker image
docker build -t protoc-gen-bq-schema:local .
protoc-gen-bq-schema is licensed under the Apache License version 2.0.
This is not an official Google product.