update design decision record
* GA4GH Inherent Properties

* IRIs over CURIEs

* VRS identifier syntax and versioning

---------

Co-authored-by: Alex H. Wagner, PhD <Alex.Wagner@nationwidechildrens.org>
larrybabb and ahwagner authored Dec 16, 2024
1 parent ab026e1 commit de029e5
Showing 4 changed files with 96 additions and 4 deletions.
91 changes: 91 additions & 0 deletions docs/source/appendices/design_decisions.rst
@@ -0,0 +1,91 @@
.. _design_decisions:

Design Decisions
!!!!!!!!!!!!!!!!

The following design decisions were made in the development of the VRS:

GA4GH Inherent Properties over Value Objects
--------------------------------------------

In VRS 1.0 we operated under the principle that all identifiable objects in VRS (e.g. Allele, SequenceLocation, etc.)
would be *value objects*. This meant that they should be immutable and contain only the required fields that are
necessary to uniquely identify the object. This approach simplified digest generation by
allowing the digest to be computed over the entire object. An exception was made for properties with a
leading underscore (namely, the *_id* property), which were removed from the object before a digest was calculated.

In VRS 2.0 we extended the principle of excepting designated attributes by explicitly defining *inherent properties*,
the properties used to compute an object digest. This makes VRS more expressive: implementations can pass
common, descriptive metadata as part of identifiable objects without sacrificing
the VRS 1.3 ability to create globally unique, federated identifiers.

As a result, we had to introduce a new field in the digest model called *ga4gh.inherent* which is described in detail
in the section on :ref:`ga4gh-inherent-properties`.
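
The effect of digesting only inherent properties can be illustrated with a minimal Python sketch. It is not the
normative VRS digest procedure: the ``INHERENT_KEYS`` table, the metadata fields ``name`` and ``description``, the
placeholder location value, and the simplified JSON canonicalization are all assumptions made for illustration.

.. code-block:: python

   import base64
   import hashlib
   import json

   # Assumption for illustration: which Allele properties are treated as inherent.
   INHERENT_KEYS = {"Allele": ["location", "state", "type"]}


   def sha512t24u(blob: bytes) -> str:
       """Truncated digest: SHA-512, first 24 bytes, base64url-encoded."""
       digest = hashlib.sha512(blob).digest()[:24]
       return base64.urlsafe_b64encode(digest).decode("ascii")


   def inherent_digest(obj: dict) -> str:
       """Digest only the inherent properties; every other field is ignored."""
       inherent = {k: obj[k] for k in INHERENT_KEYS[obj["type"]] if k in obj}
       # Simplified canonical form: sorted keys, no insignificant whitespace.
       blob = json.dumps(inherent, sort_keys=True, separators=(",", ":"))
       return sha512t24u(blob.encode("utf-8"))


   bare = {
       "type": "Allele",
       "location": "placeholder-location",  # stand-in value, not a real digest
       "state": {"type": "LiteralSequenceExpression", "sequence": "T"},
   }
   # Hypothetical descriptive metadata that is not part of the object's identity.
   annotated = {**bare, "name": "example allele", "description": "free text"}

   print(inherent_digest(bare) == inherent_digest(annotated))

Under these assumptions the two objects share one digest, because the added metadata never enters the serialized
blob, whereas a change to ``state`` or ``location`` would produce a different digest.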

IRIs over CURIEs
----------------

In VRS 2.0 we moved away from the use of CURIEs in favor of :ref:`iriReference`. Several factors played a role in
this decision.

JSON Schema, the default data model for GKS specifications, does not allow for encoding of CURIE namespaces as is done
in other frameworks such as JSON-LD or XML. As a result, namespaces must be captured from custom data structures, API
endpoints, or documentation that may not persist as messages are exchanged between systems. To address this, references
in GKS specs now use IRIs to reference objects explicitly.
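
The dependency on out-of-band namespace information can be sketched as follows. The prefix map is an assumption
inferred from the identifier examples in the identifier syntax section of this document, and ``expand_curie`` is a
hypothetical helper, not part of any GKS library.

.. code-block:: python

   # Assumed prefix map; a receiver that never obtained it cannot expand the CURIE.
   PREFIX_MAP = {"ga4gh": "https://w3id.org/ga4gh/vrs/"}


   def expand_curie(curie: str, prefix_map: dict) -> str:
       """Expand a CURIE to an IRI using an externally supplied prefix map."""
       prefix, _, local_id = curie.partition(":")
       if prefix not in prefix_map:
           raise KeyError(f"no namespace registered for prefix {prefix!r}")
       return prefix_map[prefix] + local_id


   curie = "ga4gh:VA.Oop4kjdTtKcg1kiZjIJAAR3bp7qi4aNT"
   print(expand_curie(curie, PREFIX_MAP))

   try:
       expand_curie(curie, {})  # the prefix map was lost in transit
   except KeyError as err:
       print("cannot resolve:", err)

An IRI carried in the message needs no such lookup, which is the property the decision above relies on.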

IRI-References over IRIs
------------------------
We opted for the general use of IRI-References as a way to provide a more flexible approach to the use of IRIs
in most GKS message structures. IRI-References (relative IRIs) benefit users by allowing a compact representation
of concepts that are accessible within a system (e.g. a directory structure or web API).
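
As a rough illustration, a relative reference can be resolved against a base with Python's standard
``urllib.parse.urljoin``. The base URL and the ``alleles/`` path are hypothetical, and ``urljoin`` implements URI
rather than full IRI resolution, which is sufficient for the ASCII-only example here.

.. code-block:: python

   from urllib.parse import urljoin

   # Hypothetical base of a web API that serves VRS objects.
   BASE = "https://example.org/api/v1/"


   def resolve(reference: str, base: str = BASE) -> str:
       """Resolve an IRI-Reference: relative forms are joined to the base,
       absolute forms are returned unchanged."""
       return urljoin(base, reference)


   print(resolve("alleles/VA.Oop4kjdTtKcg1kiZjIJAAR3bp7qi4aNT"))
   print(resolve("https://w3id.org/ga4gh/vrs/VA.Oop4kjdTtKcg1kiZjIJAAR3bp7qi4aNT"))

The first call yields an address under the hypothetical base; the second returns the absolute identifier untouched,
so a single field can carry either form.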

VRS identifier syntax and versioning
------------------------------------

The :ref:`versioning` section describes the versioning and release naming conventions for the VRS product.
Approved releases are identified by the version number alone, but connect, ballot, and snapshot releases
include the context term and date in addition to the target version number.
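
The naming convention can be sketched in Python as follows. The ``release_name`` helper is hypothetical, and both
the ``<version>-<context>.<date>`` layout and the ``YYYY-MM`` date format are assumptions modeled on the SemVer
pre-release syntax used by the maturity model.

.. code-block:: python

   from typing import Optional

   # Context terms named in the text above.
   PRE_RELEASE_CONTEXTS = {"connect", "ballot", "snapshot"}


   def release_name(version: str, context: Optional[str] = None,
                    date: Optional[str] = None) -> str:
       """Build a release name: the bare version for approved releases,
       version plus context term and date for pre-releases."""
       if context is None:
           return version
       if context not in PRE_RELEASE_CONTEXTS:
           raise ValueError(f"unknown release context: {context!r}")
       if not date:
           raise ValueError("pre-releases require a date")
       return f"{version}-{context}.{date}"


   print(release_name("2.0.0"))
   print(release_name("2.0.0", "connect", "2024-04"))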

During the GA4GH Connect April 2023 meeting the maturity model was discussed at length and the following
proposal was presented for instance and class GKS identifiers.

.. image:: ../images/2023-connect-gks-identifier-proposal.png
   :alt: GKS Identifiers Proposal from 2023 April Connect Session
   :align: center

As an example, the GitHub JSON Schema URL ($id) for the VRS 2.0.0 Allele is:

.. code-block:: json

   {
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "$id": "https://w3id.org/ga4gh/schema/vrs/2.0.0/json/Allele",
      ...
   }

During the **release and versioning** discussion at the GA4GH Connect April 2023 meeting, the proposal
explored the idea of including the major version number in the VRS identifier itself. Proponents of
this approach cited concern about digests (and their derived identifiers) changing between major
versions of the same VRS object; including the major version would make such changes clearly
visible in the identifier itself.

Opponents of this approach argued that new identifiers would be required for every type of VRS object
at every major version release: even if a given type of object had no change that would
result in a new digest, a new identifier would still be required for the new major version.

After much discussion, the decision was made to NOT include the major version number in the VRS identifier
itself. Therefore, the :ref:`identifier-construction` does NOT contain the version number, resulting in
the following syntax:

**CURIE namespace resolution**

.. code-block::

   ga4gh:VA.Oop4kjdTtKcg1kiZjIJAAR3bp7qi4aNT

**URI Syntax**

.. code-block::

   https://w3id.org/ga4gh/vrs/VA.Oop4kjdTtKcg1kiZjIJAAR3bp7qi4aNT
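
The version-free structure can be checked with a short Python sketch. The regular expression is an assumption
modeled on the example identifier above (an uppercase type prefix, a dot, and a 32-character base64url digest), not
a normative pattern, and ``parse_identifier`` is a hypothetical helper.

.. code-block:: python

   import re
   from typing import Optional

   # Assumed pattern: "ga4gh:" + type prefix + "." + 32-character digest.
   GA4GH_ID = re.compile(r"^ga4gh:(?P<type>[A-Z]+)\.(?P<digest>[0-9A-Za-z_-]{32})$")


   def parse_identifier(identifier: str) -> Optional[dict]:
       """Split an identifier into type prefix and digest, or return None."""
       match = GA4GH_ID.match(identifier)
       return match.groupdict() if match else None


   parts = parse_identifier("ga4gh:VA.Oop4kjdTtKcg1kiZjIJAAR3bp7qi4aNT")
   print(parts)

The parsed result holds only a type prefix and a digest; the identifier has no slot for a major version, so an
object whose digest is unchanged keeps the same identifier across releases.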
1 change: 1 addition & 0 deletions docs/source/appendices/index.rst
@@ -9,4 +9,5 @@ Appendices
ga4gh_identifiers
resource_identifiers
truncated_digest_collision_analysis
design_decisions
glossary
8 changes: 4 additions & 4 deletions docs/source/appendices/maturity_model.rst
@@ -131,7 +131,7 @@ Product Versioning and Releases

Versions are used to identify releases of the entire specification, not to individual product features.
Technical specification development is intrinsically linked to policy surrounding major and minor version
identification, which follow [semantic versioning v2](https://semver.org) practices for API versioning.
identification, which follow `semantic versioning v2 <https://semver.org>`__ practices for API versioning.

Versioning examples
###################
Expand Down Expand Up @@ -167,10 +167,10 @@ $$$$$$$$$$$$$$$$$$$$$$$
- Addition of implementation guidance, tests, or other supporting product features that do not directly
affect data compatibility

Versioning of approved GA4GH standards additionally follow the procedures for [GA4GH Product Updates](https://www.ga4gh.org/our-products/development-and-approval-process/#section_7).
Versioning of approved GA4GH standards additionally follow the procedures for `GA4GH Product Updates <https://www.ga4gh.org/our-products/development-and-approval-process/#section_7>`__.
Specifically, advancement of data classes to the trial use or normative levels must be accompanied by a
minor release increment, and therefore may only be included in a release following an appropriate community
and PRC consultation process ([GA4GH Product Development 32](https://www.ga4gh.org/our-products/development-and-approval-process/#section_7:~:text=32.%20Public%20comment,reduced%20or%20omitted.)).
and PRC consultation process (`GA4GH Product Development 32 <https://www.ga4gh.org/our-products/development-and-approval-process/#section_7:~:text=32.%20Public%20comment,reduced%20or%20omitted.>`__).

Releases
########
@@ -196,7 +196,7 @@ These pre-release labels are appended to the major, minor, and patch components
a pre-release version following the SemVer <MAJOR>.<MINOR>.<PATCH>-<LABEL> syntax. For example,
a pre-release of VRS 2.0 for discussion at Spring 2024 Connect would have a version identifier
like 2.0.0-connect.2024-04. Releases and pre-releases should use GitHub Releases for release
packaging and tracking (see [VRS releases](https://github.com/ga4gh/vrs/releases)).
packaging and tracking (see `VRS releases <https://github.com/ga4gh/vrs/releases>`__).

Decision-maker roles
@@@@@@@@@@@@@@@@@@@@