Create a data dictionary / schema for your data using simple spreadsheets - no coding required.
Schemasheets is a framework for managing your schema using spreadsheets (Google Sheets, Excel). It works by compiling down to LinkML, which can itself be compiled to a variety of formalisms, or used for different purposes like data validation
See the Schema Sheets Manual
pip install schemasheets
You should then be able to run the following commands:
- sheets2linkml - Convert schemasheets to a LinkML schema
- linkml2sheets - Convert a LinkML schema to schemasheets
- sheets2project - Generate an entire set of schema files (JSON-Schema, SHACL, SQL, ...) from Schemasheets
As an example, take a look at the different tabs in the google sheet with ID 1wVoaiFg47aT9YWNeRfTZ8tYHN8s8PAuDx5i2HUcDpvQ
The personinfo tab contains the bulk of the metadata elements:
record | field | key | multiplicity | range | desc | schema.org |
---|---|---|---|---|---|---|
> class |
slot | identifier | cardinality | range | description | exact_mappings: {curie_prefix: sdo} |
> |
||||||
id | yes | 1 | string | any identifier | identifier | |
description | no | 0..1 | string | a textual description | description | |
Person | n/a | n/a | n/a | a person,living or dead | Person | |
Person | id | yes | 1 | string | identifier for a person | identifier |
Person, Organization | name | no | 1 | string | full name | name |
Person | age | no | 0..1 | decimal | age in years | |
Person | gender | no | 0..1 | decimal | age in years | |
Person | has medical history | no | 0..* | MedicalEvent | medical history | |
Event | grouping class for events | |||||
MedicalEvent | n/a | n/a | n/a | a medical encounter | ||
ForProfit | ||||||
NonProfit |
This demonstrator schema contains both record types (e.g Person, MedicalEvent) as well as fields (e.g. id, age, gender)
You can convert this like this:
sheets2linkml --gsheet-id 1wVoaiFg47aT9YWNeRfTZ8tYHN8s8PAuDx5i2HUcDpvQ personinfo types prefixes -o personinfo.yaml
This will generate a LinkML YAML file personinfo.yaml
from 3 of the tabs in the google sheet
You can also work directly with TSVs:
wget https://raw.githubusercontent.com/linkml/schemasheets/main/tests/input/personinfo.tsv
sheets2linkml personinfo.tsv -o personinfo.yaml
We recommend using COGS to synchronize your google sheets with local files using a git-like mechanism
- Schema Sheets Manual
- linkml/schemasheets code repo
- linkml/linkml main LinkML repo