Unit Vocabulary Submission Guidelines

steveray edited this page Jun 25, 2024 · 32 revisions

This document describes the required procedures to submit a conforming unit to an existing unit vocabulary file, as well as the required metadata for submitting a new unit vocabulary file (such as a new domain-specific vocabulary). Please follow our git best practices.

Adding a unit to an existing vocabulary file (Recommended method)

Adding a new unit vocabulary file


Adding a unit to an existing vocabulary file (Recommended method)

To add a new unit to the existing vocabulary, add the code either directly to the Units vocabulary file, vocab/unit/VOCAB_QUDT-UNITS-ALL-v2.1.ttl (preferred), or to a staging file in the /submissions folder (if you are unsure which, please ask). Example entries are given below. The following rules, numbered 0 through 6, apply:

Qname naming rules

Of course, each qname must be unique in the unit: namespace.

  • Rule 0: Use the common symbol for the qname, if it does not conflict with an even more common use for that symbol.
      Example:
      unit:A for Ampere, and unit:ANGSTROM for Angstrom
  • Rule 1: Underscore: concept qualifiers will be separated from the unit and other qualifiers with underscores. Also note that while units are always specified in UPPERCASE, qualifiers may be either UPPERCASE or TitleCase.
      Example:
      GAL_IMP means "Imperial Gallon"
  • Rule 2: Concept qualifier type ordering

  • Rule 2.1: Dimensional qualifiers will be first to follow the unit. The default interpretation is mass.

      Example:
      LB_M is not required for pound of mass; LB alone suffices (however, the explicit version, LB_M, is also in the vocabulary, related to LB by qudt:exactMatch)
      OZ_VOL for volume (but also see rule 2.4 about jurisdictions)
      LB_F for pound of force
  • Rule 2.2: Context qualifiers will follow Dimensional qualifiers, and default to empty
      Example:
      GM_Carbon for grams of carbon
  • Rule 2.3: System qualifiers (e.g., _Metric, _Imperial, _SI) will follow Context qualifiers and default to _SI
      Example:
      TON_Metric for metric mass ton
  • Rule 2.4: Jurisdiction qualifiers (e.g., _UK, _US, _IT) follow System qualifiers
      Example:
      TON_US for US mass ton
      OZ_VOL_US for U.S. liquid ounce
  • Rule 2.5: Everything else is added to the end of the qualifier list, in alphabetical order, and defaults to empty

  • Rule 3: Numeric multipliers/prefixes aren't normally separated from the unit. However, note the example shown in Rule 6. The prefixes supported are those found here. Note that they are specified in TitleCase.

    Example:
    MilliSEC
    KiloGM
    
  • Rule 4: Exponents (power)

    A number directly after a unit denotes that unit raised to the power of that number (the number cannot be negative)

  Example:
  M3 means "Cubic Metre"
  • Rule 5: Hyphens

    a) - separates units that should be multiplied together

    b) Units separated by hyphens should appear in alphabetical order as a general rule. This rule is, however, relaxed when the common usage of a term is not alphabetical (such as Newton-Metre, unit:N-M, instead of Metre-Newton, unit:M-N). This is done for ease of use.

    c) -PER- separates numerator units from denominator units

    Example:
    K-M-PER-W means "Kelvin * Metre / Watt"

    d) There can be only 1 -PER- per qname

    (For a discussion using a contributor's example, see Issue #129)

  • Rule 6: Order of Operations

The rules above should be applied (and interpreted) according to the following precedence:

  1. Qualifiers

  2. Multipliers/Prefixes

  3. Exponents (power)

  4. Hyphens

    Example: KiloM3 means (Kilometre)**3, or cubic kilometres, not Kilo(M**3), or 1000 cubic metres, because prefixes precede exponents. Thus, the conversionMultiplier to cubic metres is 10**9 rather than 10**3. To represent a unit defined as thousands of cubic metres, we would define a "pseudo unit" CubicM to be used only when in combination with a prefix. The URI would thus be KiloCubicM. (Currently, the only requested example of this is KiloCubicFT, meaning thousands of cubic feet).
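As a rough illustration of this precedence, a hypothetical entry for KiloM3 might carry the conversion values below. This is a sketch only: the label, dimension vector, and quantity kind are assumptions chosen to follow the conventions in this document, not a copy of an existing entry, and other required properties are omitted for brevity.

```turtle
@prefix unit: <http://qudt.org/vocab/unit/> .
@prefix qudt: <http://qudt.org/schema/qudt/> .
@prefix qkdv: <http://qudt.org/vocab/dimensionvector/> .
@prefix quantitykind: <http://qudt.org/vocab/quantitykind/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# The prefix binds before the exponent: KiloM3 = (1000 m)^3 = 1.0E9 m^3
unit:KiloM3
  a qudt:Unit ;
  rdfs:label "Cubic Kilometre"@en ;                    # assumed label
  qudt:conversionMultiplier 1000000000.0 ;             # 10**9, not 10**3
  qudt:conversionMultiplierSN 1.0E9 ;
  qudt:hasDimensionVector qkdv:A0E0L3I0M0H0T0D0 ;      # assumed: length cubed
  qudt:hasQuantityKind quantitykind:Volume ;           # assumed quantity kind
.
```

A unit meaning "thousands of cubic metres" would instead use the pseudo-unit form (KiloCubicM) with a multiplier of 1000.0.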

Required unit properties

The absolute minimum set of required properties is shown in the following example:

unit:MilliM
  rdf:type qudt:Unit ;
  rdfs:label "Millimetre"@en ;
  qudt:conversionMultiplier 0.001 ;
  qudt:conversionMultiplierSN 1.0E-3 ;
  qudt:conversionOffset 0.0 ; # mandatory for interval scales only
  qudt:hasDimensionVector qkdv:A0E0L1I0M0H0T0D0 ;
  qudt:plainTextDescription "The millimetre (International spelling as used by the International Bureau of Weights and Measures) or millimeter (American spelling) (SI unit symbol mm) is a unit of length in the metric system, equal to one thousandth of a metre, which is the SI base unit of length. It is equal to 1000 micrometres or 1000000 nanometres. A millimetre is equal to exactly 5/127 (approximately 0.039370) of an inch." ;
  qudt:hasQuantityKind quantitykind:Length ; 
  rdfs:isDefinedBy <http://qudt.org/2.1/vocab/unit> ;
.

In other words,

  • rdf:type . qudt:Unit, and optionally any other qudt class names in the Unit class hierarchy
  • rdfs:label . A human-readable ASCII label, with a language tag
  • qudt:conversionMultiplier . The multiplier to convert quantities using this unit to quantities using the SI unit for this quantity kind, expressed as an xsd:decimal
  • qudt:conversionMultiplierSN . The multiplier expressed in scientific notation
  • qudt:conversionOffset (optional) . The offset used to convert units for interval scales to the SI unit of this quantity kind
  • qudt:hasDimensionVector . The dimension vector associated with this unit
  • qudt:plainTextDescription . Some sort of description. (No language tags here, please, unlike rdfs:label, described below).
  • qudt:hasQuantityKind . The quantity kind(s) for this unit. They should all be of the same dimensionality. Also, see appropriateUnit for a description of the consequences of populating this property
  • rdfs:isDefinedBy . The graph this unit is defined in (which by convention is uniquely associated with the containing file)
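For an interval scale, where qudt:conversionOffset is mandatory, an entry might look like the following sketch for degrees Celsius. The label, description text, dimension vector, and quantity kind are assumptions made for illustration and should be checked against the vocabulary file before reuse.

```turtle
@prefix unit: <http://qudt.org/vocab/unit/> .
@prefix qudt: <http://qudt.org/schema/qudt/> .
@prefix qkdv: <http://qudt.org/vocab/dimensionvector/> .
@prefix quantitykind: <http://qudt.org/vocab/quantitykind/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

unit:DEG_C
  a qudt:Unit ;
  rdfs:label "Degree Celsius"@en ;                     # assumed label
  qudt:conversionMultiplier 1.0 ;                      # one degree Celsius spans one kelvin
  qudt:conversionMultiplierSN 1.0E0 ;
  qudt:conversionOffset 273.15 ;                       # kelvin = degrees Celsius + 273.15
  qudt:hasDimensionVector qkdv:A0E0L0I0M0H1T0D0 ;      # assumed: temperature only
  qudt:plainTextDescription "The degree Celsius is a unit of temperature on the Celsius scale. A temperature in kelvin equals the Celsius value plus 273.15." ;
  qudt:hasQuantityKind quantitykind:Temperature ;      # assumed quantity kind
  rdfs:isDefinedBy <http://qudt.org/2.1/vocab/unit> ;
.
```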

Conventions for rdfs:label

A few more comments regarding labeling conventions:

Title case basically means to capitalize every word with the exception of articles (a, an, the), coordinating conjunctions (and, or, but,...), and (short) prepositions (in, on, for, up,...).

  • Note that we are using international spelling for labels, with the @en language tag. When the U.S. spelling differs, we add a second label instance with the @en-us tag, and additional label instances for each additional language.

  • skos:altLabel with a language tag is appropriate for alternative ways of referring to that entity other than a direct translation of the label into another language (e.g. unit:MilliIN has rdfs:label "Milli-inch"@en, skos:altLabel "mil"@en-us and skos:altLabel "thou"@en-gb).

  • For units raised to some power, we use the terms "Square xyz", "Cubic xyz", "Quartic xyz"... rather than "xyz Squared"

  • For reciprocal units, we use the term "Reciprocal xyz" rather than "Per xyz", e.g. "Reciprocal Hour". But note Rule 5 regarding the separation of numerator and denominator in compound units, using "-PER-"

  • Case - In QUDT, Titlecase means that "per" will not be capitalized

    • Microgray
    • Picocoulomb
    • Becquerel Second per Cubic Metre
    • Erg per Gram Second
    • Gray per Second
    • Microradian
    • Minute
  • Plural vs. Singular - Everything will be singular

    • Millimole per Kilogram
    • Femtomole per Litre
  • Abbreviations - Terms will be written out fully, with the abbreviation in parentheses. Note that we record the symbol explicitly using qudt:symbol, which may be different from the abbreviation.

    • British Thermal Unit (BTU)
  • Parentheses - It is also worth noting that qualifiers (denoted with an underscore in the URI) normally appear in the label within parentheses.

  • Hyphens - Just use Titlecase, but with a hyphen when it is commonly used in natural language

    • Astronomical Unit (AU)
    • Light-year
    • Atomic Number
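The labeling conventions above might be applied as in the following sketch. Only the label-related triples are shown. The unit:MilliIN labels come from the text above, while the qnames unit:GRAY-PER-SEC and unit:PER-HR are assumptions derived from the naming rules and may not match the vocabulary exactly.

```turtle
@prefix unit: <http://qudt.org/vocab/unit/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# International spelling under @en, U.S. spelling as a second label
unit:MilliM
  rdfs:label "Millimetre"@en ;
  rdfs:label "Millimeter"@en-us ;
.

# Alternative names that are not translations go in skos:altLabel
unit:MilliIN
  rdfs:label "Milli-inch"@en ;
  skos:altLabel "mil"@en-us ;
  skos:altLabel "thou"@en-gb ;
.

# Compound unit: singular, with a lowercase "per" (qname is an assumption)
unit:GRAY-PER-SEC
  rdfs:label "Gray per Second"@en ;
.

# Reciprocal unit: "Reciprocal xyz" rather than "Per xyz" (qname is an assumption)
unit:PER-HR
  rdfs:label "Reciprocal Hour"@en ;
.
```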

These conventions are not perfect, nor are they comprehensive, underscoring the reason we have paid a lot of attention to removing ambiguity in naming URIs.

Finally, since assigning these labels doesn't easily lend itself to automation, we look to you, the open source community, to help us with crowd-sourced fixes! We encourage you to submit pull requests!

There are additional recommended properties, shown in this excerpt for the Metre. Please note that we are in the process of migrating all the values of qudt:symbol to Unicode, so please do that for any new submissions. (Note that some of these are subject to revision in future versions)

unit:M
  a qudt:BaseUnit ;
  a qudt:LengthUnit ;
  a qudt:MKS-Unit ;
  qudt:abbreviation "m" ;
  qudt:code "1090" ;
  qudt:longDescription "The metric and SI base unit of distance.  The 17th General Conference on Weights and Measures in 1983 defined the metre as that distance that makes the speed of light in a vacuum equal to exactly 299 792 458 metres per second. The speed of light in a vacuum, $c$, is one of the fundamental constants of nature. The metre is equal to approximately 1.093 613 3 yards, 3.280 840 feet, or 39.370 079 inches." ;
  qudt:symbol "m" ;
  qudt:ucumCode "m"^^qudt:UCUMcs-term ;
  qudt:uneceCommonCode "MTR" ;
  qudt:exactMatch <http://dbpedia.org/resource/Metre> ;
  prov:wasInfluencedBy <http://en.wikipedia.org/wiki/Metre?oldid=495145797> ;
.

Synonyms

Sometimes units with different common names are really the same in magnitude and dimension, but there is not agreement on which is the "main" name. AMU (Atomic Mass Unit) and Da (Dalton) are examples. In these cases, rather than populating a skos:altLabel relation, it might be more appropriate to define both units, with each one pointing to the other using qudt:exactMatch.
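A minimal sketch of this pattern, assuming the qnames unit:AMU and unit:Da and showing only the triples relevant to the synonym relationship (labels are illustrative):

```turtle
@prefix unit: <http://qudt.org/vocab/unit/> .
@prefix qudt: <http://qudt.org/schema/qudt/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Each unit is defined in its own right and points to the other
unit:AMU
  a qudt:Unit ;
  rdfs:label "Atomic Mass Unit"@en ;   # illustrative label
  qudt:exactMatch unit:Da ;
.

unit:Da
  a qudt:Unit ;
  rdfs:label "Dalton"@en ;             # illustrative label
  qudt:exactMatch unit:AMU ;
.
```

Each entry would still need the full set of required properties, with identical conversion multipliers and dimension vectors.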

Final note

All possible properties that could be included can be found in the QUDT Schema file /schema/SCHEMA_QUDT-v2.1.ttl in this repository.

Adding a new unit vocabulary

Adding an entirely new vocabulary involves some additional work: defining an additional ontology, along with the metadata for the ontology and the catalog entries needed to have it appear on the QUDT website. The template vocabulary file found here can be used as a starting point. After making a copy of this template file, change all occurrences of:

  • "Template" to your vocabulary name
  • "2019-01-01" to the date of creation, publication, etc.
  • "9999" to an appropriate code number, if applicable
  • "Your name" to your name

One sample unit (with qname unit:ExampleTemplateUnit) is included in the template file, which can be used for reference and then deleted. Remember to change the file name according to the content being added. Below is the file naming convention used in QUDT:

  • VOCAB_QUDT-UNITS-<TYPE>-v2.1.ttl

Alternatively, you could copy an existing unit graph and modify it, making sure to follow the naming and property rules.

Once your new vocabulary file has been created, you should submit a pull request as documented in our Git Best Practices.

(File last modified 2024-06-25 by Steve Ray)