DocFX supports different document processors to handle different kinds of input. Currently, even a small change to the data model requires a new document processor, even though most of the work done by the processors is the same.
DocFX Document Schema (abbreviated to THIS schema below) is introduced to address this problem. This schema is a JSON media type for defining the structure of a DocFX document. This schema is intended to annotate, validate and interpret the document data.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
DocFX Document Schema is in JSON format. It borrows most syntax from JSON Schema, while it also introduces some other syntax to manipulate the data.
THIS schema is a JSON based format for the structure of a DocFX document.
JSON schema validation already defines many keywords. This schema starts by supporting a limited set of keywords, such as type and properties.
Besides annotating and validating the input document model, THIS schema also defines multiple interpretations for each property of the document model.
For example, if a property named summary contains a value in Markdown format, THIS schema can define a markup interpretation for the summary property, so that the property can be marked up using DFM syntax.
- THIS schema leverages the JSON schema definition. That is to say, keywords defined in JSON schema keep their meaning in THIS schema when they are supported by THIS schema.
The files describing DocFX document model in accordance with the DocFX document schema specification are represented as JSON objects and conform to the JSON standards. YAML, being a superset of JSON, can be used as well to represent a DocFX document schema specification file.
All field names in the specification are case sensitive.
This schema exposes two types of fields. Fixed fields, which have a declared name, and Patterned fields, which declare a regex pattern for the field name. Patterned fields can have multiple occurrences as long as each has a unique name.
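As a minimal sketch of this distinction, the Python snippet below splits the keys of a root schema object into fixed and patterned fields. The fixed names are the ones this specification lists for the root object, and the pattern is the `^x-` extension pattern. The helper name `classify_fields` is hypothetical and not part of DocFX.

```python
import re

# Fixed field names of the root schema object, as listed in this specification.
FIXED_FIELDS = {"$schema", "version", "id", "title", "description",
                "type", "properties", "metadata"}

# Patterned fields: extension field names MUST begin with "x-".
EXTENSION_PATTERN = re.compile(r"^x-")


def classify_fields(schema):
    """Split the keys of a schema object into fixed, patterned and unknown names."""
    result = {"fixed": [], "patterned": [], "unknown": []}
    for name in schema:
        if name in FIXED_FIELDS:  # field names are case sensitive
            result["fixed"].append(name)
        elif EXTENSION_PATTERN.search(name):
            result["patterned"].append(name)
        else:
            result["unknown"].append(name)
    return result


sample = {"type": "object", "Title": "LandingPage", "x-internal-id": 42}
print(classify_fields(sample))
```

Because field names are case sensitive, `Title` in the sample is reported as unknown rather than as the fixed field `title`.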
By convention, the schema file is suffixed with .schema.json.
Primitive data types in THIS schema are based on JSON schema Draft 6, section 4.2 (Instance).
For a given field, * as the starting character in the Description cell stands for required.
This is the root document object for THIS schema.
Field Name | Type | Description |
---|---|---|
$schema | string | * The version of the schema specification, for example, https://github.com/dotnet/docfx/v1.0/schema# . |
version | string | * The version of current schema object. |
id | string | It is best practice to include an id property as a unique identifier for each schema. |
title | string | The title of current schema, for example, LandingPage. In DocFX, this value can be used to determine which kind of documents this schema applies to. If not specified, the part of the file name before .schema.json is used. Note that . is not allowed. |
description | string | A short description of current schema. |
type | string | * The type of the root document model MUST be object . |
properties | Property Definitions Object | An object to hold the schema of all the properties. |
metadata | string | A JSON pointer referencing the metadata object, in the json-pointer format defined in http://json-schema.org/latest/json-schema-validation.html#rfc.section.8.3.9. The JSON pointer syntax itself is defined by https://tools.ietf.org/html/rfc6901. The metadata object defines the metadata for the current document, and can also be set through globalMetadata or fileMetadata in DocFX. The default value for metadata is empty, which stands for the root object. |
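
To illustrate how a metadata value could be resolved, here is a minimal Python sketch of RFC 6901 pointer resolution against an already parsed document. The function name `resolve_pointer` and the sample document are assumptions made for this example. They do not describe how DocFX implements the lookup.

```python
def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON pointer against a parsed JSON or YAML document."""
    if pointer == "":
        return document  # the empty pointer references the whole (root) document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON pointer must start with '/'")
    current = document
    for token in pointer[1:].split("/"):
        # RFC 6901 escapes: "~1" stands for "/" and "~0" stands for "~".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]  # array elements are addressed by index
        else:
            current = current[token]
    return current


page = {
    "title": "Web Apps Documentation",
    "metadata": {"author": "apexprodleads", "ms.topic": "landing-page"},
}
print(resolve_pointer(page, "/metadata"))         # the nested metadata object
print(resolve_pointer(page, "") is page)          # empty pointer: the root object
```

With the default (empty) value, the root object itself is treated as the metadata object, which matches the last sentence of the metadata description.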
Field Name | Type | Description |
---|---|---|
^x- | Any | Allows extensions to THIS schema. The field name MUST begin with x-, for example, x-internal-id. The value can be null, a primitive, an array or an object. |
It is an object where each key is the name of a property and each value is a schema to describe that property.
Field Name | Type | Description |
---|---|---|
{name} | Property Object | The schema object for the {name} property |
An object to describe the schema of the value of the property.
Field Name | Type | Description |
---|---|---|
title | string | The title of the property. |
description | string | A lengthy explanation about the purpose of the data described by the schema. |
default | what type defined | The default value for current field. |
type | string | The type of the property value. Refer to type keyword for detailed description. |
properties | Property Definitions Object | An object to hold the schema of all the properties if type for the model is object . Omitting this keyword has the same behavior as an empty object. |
items | Property Object | An object to hold the schema of the items if type for the model is array . Omitting this keyword has the same behavior as an empty schema. |
reference | string | Defines whether current property is a reference to the actual value of the property. Refer to reference for detailed explanation. |
contentType | string | Defines the content type of the property. Refer to contentType for detailed explanation. |
tags | array | Defines the tags of the property. Refer to tags for detailed explanation. |
mergeType | string | Defines how to merge the property. Omitting this keyword has the same behavior as merge . Refer to mergeType for detailed explanation. |
xrefProperties | array | Defines the properties of current object when it is cross referenced by others. Each item is the name of the property in the instance. |
Field Name | Type | Description |
---|---|---|
^x- | Any | Allows extensions to THIS schema. The field name MUST begin with x-, for example, x-internal-id. The value can be null, a primitive, an array or an object. |
Same as in JSON schema: http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.25
The value of this keyword MUST be either a string or an array. If it is an array, elements of the array MUST be strings and MUST be unique.
String values MUST be one of the six primitive types ("null", "boolean", "object", "array", "number", or "string"), or "integer" which matches any number with a zero fractional part.
An instance validates if and only if the instance is in any of the sets listed for this keyword.
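The rule can be sketched in Python as follows. This is only an illustration of the wording above, mapping the JSON type names onto Python types. The function names `matches_type` and `_is_type` are hypothetical.

```python
def _is_type(instance, name):
    """Check a single JSON type name against a Python value."""
    if name == "null":
        return instance is None
    if name == "boolean":
        return isinstance(instance, bool)
    if name == "object":
        return isinstance(instance, dict)
    if name == "array":
        return isinstance(instance, list)
    if name == "string":
        return isinstance(instance, str)
    # bool is a subclass of int in Python, so exclude it from the numeric types.
    is_number = isinstance(instance, (int, float)) and not isinstance(instance, bool)
    if name == "number":
        return is_number
    if name == "integer":
        # "integer" matches any number with a zero fractional part, e.g. 1.0.
        return is_number and (isinstance(instance, int) or instance.is_integer())
    raise ValueError(f"unknown type name: {name!r}")


def matches_type(instance, type_keyword):
    """Return True if the instance is in any of the sets listed for the keyword."""
    names = [type_keyword] if isinstance(type_keyword, str) else list(type_keyword)
    return any(_is_type(instance, name) for name in names)


print(matches_type(1.0, "integer"))             # True
print(matches_type(None, ["string", "null"]))   # True
print(matches_type("text", ["number", "array"]))  # False
```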
It defines whether current property is a reference to the actual value of the property. The values MUST be one of the following:
Value | Description |
---|---|
none | It means the property is not a reference. |
file | It means current property stands for a file path that contains content to be included. |
It defines how applications interpret the property. If not defined, the property is treated as if the value were default. The values MUST be one of the following:
Value | Description |
---|---|
default | It means that no interpretation will be done to the property. |
uid | type MUST be string. With this value, the property name MUST be uid. It means the property defines a unique identifier inside current document model. |
href | type MUST be string. It means the property defines a file link inside current document model. Applications MAY validate that the linked file exists, and update the file link if the linked file changes its output path. |
xref | type MUST be string. It means the property defines a UID link inside current document model. Applications MAY validate that the linked UID exists, and resolve the UID link to the corresponding file output link. |
file | type MUST be string. It means the property defines a file path inside current document model. Applications MAY validate that the linked file exists, and resolve the path to the corresponding file output path. The difference between file and href is that href is always URL encoded while file is not. |
markdown | type MUST be string. It means the property is in DocFX flavored Markdown syntax. Applications MAY transform it into HTML format. |
The value of this keyword MUST be an array. Elements of the array MUST be strings and MUST be unique. It provides hints for applications to decide how to interpret the property. For example, the localizable tag can help a localization team to interpret the property as localizable.
The value of this keyword MUST be a string. It specifies how to merge two values of the given property. One use scenario is how DocFX uses overwrite files to overwrite existing values. In the table below, source and target stand for the two values being merged.
The value MUST be one of the following:
Value | Description |
---|---|
key | If the key of source equals the key of target, the two values are ready to merge. |
merge | The default behavior. For array, items in the list are merged by the key of each item. For string or any value type, target replaces source. For object, each property is merged according to its own mergeType value. |
replace | target replaces source. |
ignore | source is not allowed to be merged. |
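
The table leaves some details open, so the following Python sketch is only one possible reading of it, not the DocFX implementation. It assumes that a property marked key simply takes the target value once two items are matched, that array items in target with no matching key are appended, that arrays without a key property fall back to replace, and that key values are hashable scalars. The function names are hypothetical.

```python
def _key_property(item_schema):
    """Return the name of the item property whose mergeType is "key", if any."""
    for name, prop in item_schema.get("properties", {}).items():
        if prop.get("mergeType") == "key":
            return name
    return None


def _merge_arrays(source, target, item_schema):
    key = _key_property(item_schema)
    if key is None:
        return target  # assumption: without a key property, target replaces source
    merged = list(source)
    index = {item[key]: i for i, item in enumerate(merged)
             if isinstance(item, dict) and key in item}
    for item in target:
        item_key = item.get(key) if isinstance(item, dict) else None
        if item_key is not None and item_key in index:
            i = index[item_key]
            merged[i] = merge(merged[i], item, item_schema)
        else:
            merged.append(item)  # assumption: unmatched target items are appended
    return merged


def merge(source, target, schema):
    """Merge target into source, guided by mergeType in the property schema."""
    merge_type = schema.get("mergeType", "merge")  # omitted keyword behaves as merge
    if merge_type == "ignore":
        return source  # source is left untouched
    if merge_type in ("replace", "key"):
        return target
    if isinstance(source, dict) and isinstance(target, dict):
        props = schema.get("properties", {})
        merged = dict(source)
        for name, value in target.items():
            if name in source:
                merged[name] = merge(source[name], value, props.get(name, {}))
            else:
                merged[name] = value
        return merged
    if isinstance(source, list) and isinstance(target, list):
        return _merge_arrays(source, target, schema.get("items", {}))
    return target  # string or any other value type: target replaces source


schema = {"type": "object", "properties": {"sections": {"type": "array", "items": {
    "type": "object",
    "properties": {"title": {"type": "string", "mergeType": "key"}}}}}}
source = {"title": "Web Apps Documentation", "sections": [
    {"title": "5-Minute Quickstarts", "summary": "old"},
    {"title": "Step-by-Step Tutorials"}]}
target = {"sections": [{"title": "5-Minute Quickstarts", "summary": "new"}]}
print(merge(source, target, schema))
```

In the usage example, the summary property is made up for illustration. Only the section whose title matches is updated, while the other section and the root title are kept from source.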
Here's a sample of the schema. Assume we have the following YAML file:
```yaml
### YamlMime:LandingPage
title: Web Apps Documentation
metadata:
  title: Azure Web Apps Documentation - Tutorials, API Reference
  meta.description: Learn how to use App Service Web Apps to build and host websites and web applications.
  services: app-service
  author: apexprodleads
  manager: carolz
  ms.service: app-service
  ms.tgt_pltfrm: na
  ms.devlang: na
  ms.topic: landing-page
  ms.date: 01/23/2017
  ms.author: carolz
sections:
- title: 5-Minute Quickstarts
  children:
  - text: .NET
    href: app-service-web-get-started-dotnet.md
  - text: Node.js
    href: app-service-web-get-started-nodejs.md
  - text: PHP
    href: app-service-web-get-started-php.md
  - text: Java
    href: app-service-web-get-started-java.md
  - text: Python
    href: app-service-web-get-started-python.md
  - text: HTML
    href: app-service-web-get-started-html.md
- title: Step-by-Step Tutorials
  children:
  - content: "Create an application using [.NET with Azure SQL DB](app-service-web-tutorial-dotnet-sqldatabase.md) or [Node.js with MongoDB](app-service-web-tutorial-nodejs-mongodb-app.md)"
  - content: "[Map an existing custom domain to your application](app-service-web-tutorial-custom-domain.md)"
  - content: "[Bind an existing SSL certificate to your application](app-service-web-tutorial-custom-SSL.md)"
```
In this sample, we want to use the JSON schema to describe the overall model structure. Furthermore, the href property is a file link. It needs to be resolved from the relative path to the final href. The content property needs to be marked up as a Markdown string. The metadata property needs to be tagged for further custom operations. We want to use each section's title as the key when overwriting the sections array.
Here's the schema to describe these operations:
```json
{
  "$schema": "https://dotnet.github.io/docfx/schemas/v1.0/schema.json#",
  "version": "1.0.0",
  "id": "https://github.com/dotnet/docfx/schemas/landingpage.schema.json",
  "title": "LandingPage",
  "description": "The schema for landing page",
  "type": "object",
  "properties": {
    "metadata": {
      "type": "object",
      "tags": [ "metadata" ]
    },
    "sections": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "children": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "href": {
                  "type": "string",
                  "contentType": "href"
                },
                "text": {
                  "type": "string",
                  "tags": [ "localizable" ]
                },
                "content": {
                  "type": "string",
                  "contentType": "markdown"
                }
              }
            }
          },
          "title": {
            "type": "string",
            "mergeType": "key"
          }
        }
      }
    },
    "title": {
      "type": "string"
    }
  }
}
```
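To see which properties carry an interpretation, a tool can walk the schema and list every contentType, tags, mergeType and reference keyword it finds. The Python sketch below does this for an abbreviated copy of the landing page schema. The function name `collect_annotations` and the path notation (with `*` standing for any array item) are assumptions made for this illustration.

```python
def collect_annotations(schema, path=""):
    """Yield (path, keyword, value) for each DocFX-specific keyword in the schema."""
    for keyword in ("contentType", "tags", "mergeType", "reference"):
        if keyword in schema:
            yield (path or "/", keyword, schema[keyword])
    # Recurse into object properties, then into the array item schema.
    for name, prop in schema.get("properties", {}).items():
        yield from collect_annotations(prop, f"{path}/{name}")
    if "items" in schema:
        yield from collect_annotations(schema["items"], f"{path}/*")


# Abbreviated copy of the landing page schema shown above.
landing_page = {
    "type": "object",
    "properties": {
        "metadata": {"type": "object", "tags": ["metadata"]},
        "sections": {"type": "array", "items": {
            "type": "object",
            "properties": {
                "children": {"type": "array", "items": {
                    "type": "object",
                    "properties": {
                        "href": {"type": "string", "contentType": "href"},
                        "content": {"type": "string", "contentType": "markdown"},
                    }}},
                "title": {"type": "string", "mergeType": "key"},
            }}},
    },
}

for entry in collect_annotations(landing_page):
    print(entry)
```

An application could use such a listing to decide, for instance, which values to resolve as file links and which to pass through the Markdown engine.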
- DocFX fills _global metadata into the processed data model. Should the schema reflect this behavior?
  - If YES:
    - Pros:
      - Users are aware of the existence of _global metadata, so they can overwrite the property if they want.
      - Template writers are aware of it, so they can completely rely on the schema to write the template.
    - Cons:
      - Schema writers need to be aware of the existence of _global metadata, and it should always exist for any schema. (Should we introduce the concept of a base schema?)
  - Decision: NOT include. This schema is for general purpose; use documents to describe the changes introduced by DocFX.
- Is it necessary to prefix d- to every field that DocFX introduces?
  - If keep d-:
    - Pros:
      - d- makes it straightforward that these keywords are introduced by DocFX.
      - Keywords that DocFX introduces will never duplicate the ones reserved by JSON schema.
    - Cons:
      - The d- prefix provides a hint that these keywords are not first class keywords.
      - There is little chance that keywords DocFX defines duplicate what JSON schema defines; after all, JSON schema defines a finite set of reserved keywords.
      - For example, the Swagger spec is also based on JSON schema, and the fields it introduces have no prefix.
  - Decision: Remove the d- prefix.
- What is the remaining work to apply the schema to complex data models, for example, ManagedReference or UniversalReference?
  - OPS plugin framework, to insert metadata into the data model, for example, git commit id, git contributors. Solution: A TagInterpreter plugin framework to insert metadata if the property contains the metadata tag.
  - The schema is able to support complex JSON schema syntax, such as the definition reference #/definitions/commonobject.
  - Support complex syntax in <xref> to specify the HTML content to be rendered. Currently, xref always renders to <a/> if the uid can be resolved. Idea: support a syntax that lets the template writer specify the template used to render xref.
  - Support overwriting the object with a given uid. Challenge: The schema can define multiple uids inside one document.
  - Support incremental build.