chop-dbhi/data-models-service

Data Models Service

Service for consuming files in the data models format. The service is publicly hosted here: http://data-models-service.research.chop.edu

Build

Dependencies

Run

make install build

This will put the data-models binary in your $GOPATH/bin directory. The examples below assume $GOPATH/bin has been added to your $PATH.

Usage

Once the binary is built, running the binary without any options will start the service. The service clones the data-models repository from GitHub into a data-models directory in your working directory. It is recommended to print the usage message to see the available options.

data-models -help

Docker

Use the pre-built image on Docker Hub.

docker run -it -p 8123:8123 dbhi/data-models-service

Or build the image locally.

docker build -t dbhi/data-models-service .

Top-level Resources

  • Model Specifications - /models - A specification of each data model version is available at a /models/<data model>/<version> endpoint (e.g., /models/omop/5.0.0).

  • Repos - /repos - This endpoint shows the Git repository or repositories being served by this service.
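The endpoint patterns above can be assembled programmatically. The following is a minimal sketch, not part of the service itself: `modelURL` and `baseURL` are hypothetical names introduced for illustration, and the base URL is the publicly hosted instance mentioned at the top of this document.

```go
package main

import (
	"fmt"
	"net/url"
)

// baseURL is the publicly hosted instance named in this README.
// Swap in your own host (e.g. a local Docker container) as needed.
const baseURL = "http://data-models-service.research.chop.edu"

// modelURL is a hypothetical helper that builds the
// /models/<data model>/<version> URL for one model version.
func modelURL(base, model, version string) string {
	return base + "/models/" + url.PathEscape(model) + "/" + url.PathEscape(version)
}

func main() {
	// Specification of OMOP 5.0.0, as in the example above.
	fmt.Println(modelURL(baseURL, "omop", "5.0.0"))
	// The repos endpoint takes no further path segments.
	fmt.Println(baseURL + "/repos")
}
```

Escaping each path segment keeps the URL valid if a model or version name ever contains characters that are not URL-safe.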

Differences Between Models

Diffs between two versions of a model or between two related models can be viewed at a /compare/<data model 1>/<version 1>/<data model 2>/<version 2> endpoint (e.g., /compare/omop/5.0.0/pedsnet/2.2.0).
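As a rough sketch of how a client might build a comparison path, the snippet below joins two model/version pairs in the order the endpoint expects. `ModelRef` and `comparePath` are hypothetical names used only for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// ModelRef is a hypothetical helper type identifying one version of a data model.
type ModelRef struct {
	Name    string
	Version string
}

// comparePath builds the
// /compare/<data model 1>/<version 1>/<data model 2>/<version 2> path.
func comparePath(a, b ModelRef) string {
	parts := []string{"compare", a.Name, a.Version, b.Name, b.Version}
	return "/" + strings.Join(parts, "/")
}

func main() {
	omop := ModelRef{Name: "omop", Version: "5.0.0"}
	pedsnet := ModelRef{Name: "pedsnet", Version: "2.2.0"}
	// Prints the example path given above.
	fmt.Println(comparePath(omop, pedsnet))
}
```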

Content Negotiation

The service supports representing each resource in various formats using simple content negotiation. The supported formats are:

  • HTML - text/html
  • Markdown - text/markdown
  • JSON - application/json

The desired format can be requested either by setting the Accept header to the corresponding MIME type or by adding a format parameter to the URL. For example, the OMOP v5 specification resource (/models/omop/5.0.0) can be requested in each of these formats.

Representations are tailored to the clients that are expected to use the resource, as described below. Note that some resources do not support all formats. HTML is the default format when neither method of content negotiation is used.
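The two negotiation methods can be sketched with Go's standard net/http package. The example only builds the requests, so it runs without network access. `withAccept` and `withFormatParam` are hypothetical helpers, and the value passed to the format parameter ("json" here) is an assumption, since the accepted values are not listed in this document.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// specURL is the OMOP v5 specification on the publicly hosted instance.
const specURL = "http://data-models-service.research.chop.edu/models/omop/5.0.0"

// withAccept builds a GET request that asks for a format via the Accept header.
func withAccept(rawURL, mimeType string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", mimeType)
	return req, nil
}

// withFormatParam builds a GET request that asks for a format via the
// "format" query parameter. The accepted values are an assumption.
func withFormatParam(rawURL, format string) (*http.Request, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	q := u.Query()
	q.Set("format", format)
	u.RawQuery = q.Encode()
	return http.NewRequest(http.MethodGet, u.String(), nil)
}

func main() {
	byHeader, err := withAccept(specURL, "application/json")
	if err != nil {
		panic(err)
	}
	byParam, err := withFormatParam(specURL, "json")
	if err != nil {
		panic(err)
	}
	fmt.Println(byHeader.URL, "Accept:", byHeader.Header.Get("Accept"))
	fmt.Println(byParam.URL)
	// To actually send a request: resp, err := http.DefaultClient.Do(byHeader)
}
```

The Accept header is usually the better fit for programmatic clients, while the URL parameter is convenient when a link has to be shared or opened in a browser.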

Model Specification Resources

HTML

The HTML format (e.g., OMOP v5) is intended as a very simple proof of concept for displaying the data model specification in a web client for review by data model and/or data users. As such, it begins with the data model version id and a reference URL, followed by a list of tables (which serves as a linked table of contents). Each table section includes the table description and a list of fields (again, a linked table of contents). For each field, "refers to" information, if it exists, is followed by the description and any schema specifications. A table of mappings and a table of inbound references are also provided, if that information is found. This content represents an aggregation of information about the data model which we think would be useful for data model and/or data users.

Markdown

The Markdown format (e.g., OMOP v5) provides the same information as the HTML format. In fact, the HTML format is derived directly from the Markdown. The specific choices about header levels and organization can be seen at the actual endpoints linked above. This is intended as an API of sorts from which use-case-specific clients can retrieve, process, and display aggregated data model specification information as they wish.

JSON

The JSON format (e.g., OMOP v5), unlike the previously described formats, is intended for technical implementation clients and therefore presents a readily machine-processable and exhaustive representation of the data model specification. The top-level object contains the data model name, version, and reference URL as well as an array of tables. Each object in the tables array contains the table name and description, an array of fields, and the model name and model version, to unambiguously identify the model to which the table belongs. Each object in the fields array contains the field name, description, type, and required status (as per governance), as well as the default (which defaults to ""), length, precision, and scale (which all default to 0). Each field object also contains the table name. This format should be useful in dynamically creating all sorts of data model operations, from schema creation to annotation to transformations.
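The structure described above maps naturally onto Go structs. The sketch below is illustrative only: the JSON key names in the struct tags, the boolean type of the required status, and the `sample` document are all assumptions made for this example, so check them against a real response before relying on them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Field mirrors one object in a table's fields array.
// All JSON key names here are assumed, not confirmed by the service.
type Field struct {
	Name        string `json:"name"`
	Table       string `json:"table"`
	Description string `json:"description"`
	Type        string `json:"type"`
	Required    bool   `json:"required"` // assumed to be a boolean
	Default     string `json:"default"`
	Length      int    `json:"length"`
	Precision   int    `json:"precision"`
	Scale       int    `json:"scale"`
}

// Table mirrors one object in the top-level tables array.
type Table struct {
	Name         string  `json:"name"`
	Description  string  `json:"description"`
	Model        string  `json:"model"`
	ModelVersion string  `json:"version"`
	Fields       []Field `json:"fields"`
}

// Model mirrors the top-level object.
type Model struct {
	Name    string  `json:"name"`
	Version string  `json:"version"`
	URL     string  `json:"url"`
	Tables  []Table `json:"tables"`
}

// sample is a made-up document shaped like the description above.
const sample = `{
  "name": "omop", "version": "5.0.0", "url": "",
  "tables": [{
    "name": "person", "description": "", "model": "omop", "version": "5.0.0",
    "fields": [{"name": "person_id", "table": "person", "type": "integer",
                "required": true, "default": "", "length": 0,
                "precision": 0, "scale": 0}]
  }]
}`

// decodeModel parses a JSON specification into a Model.
func decodeModel(data []byte) (Model, error) {
	var m Model
	err := json.Unmarshal(data, &m)
	return m, err
}

func main() {
	m, err := decodeModel([]byte(sample))
	if err != nil {
		panic(err)
	}
	// Walk every field, e.g. as a starting point for schema generation.
	for _, t := range m.Tables {
		for _, f := range t.Fields {
			fmt.Printf("%s.%s type=%s required=%t\n", t.Name, f.Name, f.Type, f.Required)
		}
	}
}
```

Because absent numeric keys decode to 0 and absent strings to "", the Go zero values line up with the defaults described above.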