This repository is a fork. Check out the original project here
Create your DDL statements for your database based on your JSON Schema.
Postgres and Redshift are supported.
Install the latest version with pip install jsonschema2ddl
.
JSON Schema is a widely used tool. With jsonschema2ddl
you can take your existing schemas defined with json schema and deploy them in Postgresql and Redshift.
Let's start with a jsonschema:
{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "Example schema",
"comment": "the root of everything",
"definitions": {
"address": {
"type": "object",
"properties": {
"street_address": {
"type": "string"
},
"city": {
"type": "string"
},
"state": {
"type": "string"
}
}
}
},
"type": "object",
"properties": {
"UserId": {
"type": "integer"
},
"UserName": {
"type": "string"
},
"Age": {
"type": "integer"
},
"Address": {
"$ref": "#/definitions/address"
}
}
}
Load the schema as a python dictionary to use it:
import json
with open('test/schema.json') as f:
schema = json.load(f)
You will also need a connection object to the database:
pg_uri = 'postgresql://localhost:5432/my_db'
# With psycopg2
import psycopg2
conn = psycopg2.connect(pg_uri)
# OR with sqlalchemy
from sqlalchemy import create_engine
conn = create_engine(pg_uri).raw_connection()
Now you can initialize your translators and deploy the tables:
from jsonschema2ddl import JSONSchemaToDatabase
translator = JSONSchemaToDatabase(
schema,
schema_name='my_schema',
root_table_name='my_table',
)
translator.create_tables(conn)
translator.create_links(conn)
translator.analyze(conn)
conn.comit()
- A
root
table is created for the schema with the correct types - Custom PostgreSQL schema to create the tables. The PostgreSQL schema can be drop before deploying the tables.
- Create subtables using the
definitions
sections in the jsonchema document. The name of these tables will be the one defined in the jsonschema. The relationships between the tables can be created as well with thecreate_links()
function. This also requires to have some column as the primary key. You can specify this column with thepk
field in the definition of the schema. If there are nopk
column defined, a defaultid
column will be created for the table. - Following types are supported:
boolean
-> converted to typebool
.number
-> converted to typefloat
.string
-> converted to typevarchar({size})
. The size defaults to 256, and can be specified using themaxLength
field in the definition.- If the format for the string type is defined to
date-time
, it is converted totimestamptz
type. - If the format for the string type is defined to
date
, it is converted todate
type.
- If the format for the string type is defined to
timestamp
-> converted to typetimestamptz
.enum
-> converted to typetext
.integer
-> converted to typebigint
.date
-> converted to typedate
.link
-> converted to typeinteger
.object
-> converted to typejson
.id
-> converted to typeserial
in postgresql. For redshift, it is converted toint identity(1, 1) not null
.
- Schema itself is validated against the
$schema
definition uri.
- Specify constraints in the jsonschema.
- Specify indexes in the jsonschema.
- Create enums in postgresql.
- Support pattern constrints
- Support redirects when validating the
$schema
- Support nested refs in schema
- Parse array in type for ['string', 'null'] cases