Schema Descriptors Framework #288

Merged
merged 10 commits into from
May 7, 2018
12 changes: 12 additions & 0 deletions CONTRIBUTING.md
@@ -351,6 +351,17 @@ The third schema is `third.schema.json`, it extends both `second`, and transitiv
}
```

### Schema Descriptors

Schema descriptors are an extensible mechanism for providing additional metadata about an XDM schema. For example, schema descriptors can be used to define relationships between schemas or to annotate schema properties with additional metadata. Schema descriptors may be used when certain properties of a schema are not static (which could usually be described in the schema directly) but may vary from usage to usage.

Details on using and defining schema descriptors may be found in the section [Schema Descriptors](./docs/descriptors.md) of the specification.

Schema descriptors are extensible, and new descriptors may be created by defining a new URI value and using it in
the `@type` property of the descriptor object. Readers should ignore descriptors they do not understand.

Schema descriptors are defined in XDM using the `SchemaDescriptor` schema.

### Structuring Schemas - Nesting versus Namespaces

The use of JSON-LD namespaces in XDM means that schema definitions are organized around two axes. The first is the structure of the JSON, which may be nested to an arbitrary depth. The second is the orthogonal layer created by each independent namespace. While both organizing axes are available, it is important to use each for its intended purpose.
@@ -374,6 +385,7 @@ XDM is using a couple of custom keywords that are not part of the JSON Schema st

* `meta:extensible`: see above, to describe schemas that allow custom properties
* `meta:auditable`: for schemas that have created and last modified dates
* `meta:descriptors`: to annotate schemas with additional metadata (see Schema Descriptors above)
* `meta:enum`: for known values in enums, strings, and as property keys

## Writing Styleguides
232 changes: 232 additions & 0 deletions docs/descriptors.md
@@ -0,0 +1,232 @@
# Schema Descriptors

## Overview

XDM allows for additional metadata about a schema to be described using a "schema descriptor". A schema descriptor is applied to a schema using the "meta:descriptors" property. Descriptors may be embedded directly in the schema document, or may be described as independent external entities. The ability to define descriptors outside of a schema is useful for, among other things, annotating schemas that are not under an application's direct control with additional metadata. See examples below.

Schema descriptors are extensible, and new descriptors may be created by defining a new URI value and using it in
the `@type` property of the descriptor object. Readers should ignore descriptors they do not understand.

Schema descriptors are defined in XDM using the `SchemaDescriptor` schema.
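
A reader that follows the "ignore what you do not understand" rule could filter descriptors by `@type` before processing them. The sketch below is only an illustration and is not part of XDM: `KNOWN_TYPES` and `understood_descriptors` are hypothetical names, and the set of recognized types is an assumption chosen for the example.

```python
# Hypothetical set of descriptor types this particular reader understands.
KNOWN_TYPES = {"xdm:oneToOne", "xdm:oneToMany", "xdm:manyToMany"}


def understood_descriptors(descriptors, known_types=KNOWN_TYPES):
    """Return only the descriptors whose @type is recognized.

    Descriptors with an unknown or missing @type are skipped rather than
    treated as errors, so new descriptor types do not break older readers.
    """
    return [d for d in descriptors if d.get("@type") in known_types]


descriptors = [
    {
        "@type": "xdm:oneToMany",
        "xdm:destinationSchema": "https://ns.adobe.com/xdm/example/child",
        "xdm:destinationProperty": "xdm:parent",
    },
    # A custom descriptor type this reader has never heard of.
    {"@type": "https://ns.example.com/descriptors/inuse", "xdm:usage": "production"},
]

kept = understood_descriptors(descriptors)
```

Skipping silently, instead of raising, is what lets a third party add descriptor types without coordinating with every consumer of the schema.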

## Defining Schema Relationships

While schema descriptors can be used to define metadata about a single schema, they are also commonly used to describe relationships between schemas. This mechanism can be used to link schemas together at the property level, defining the equivalent of "foreign key" relationships in a relational database.

The following relationship types are defined by XDM:

* `xdm:oneToOne`: describes a 1:1 relationship between a source schema and a destination schema
* `xdm:oneToMany`: describes a 1:m relationship between a source schema and a destination schema
* `xdm:manyToMany`: describes an m:n relationship between a source schema and a destination schema

These relationships are defined in XDM using the `RelationshipDescriptor` schema.

## Update Policies

Data described by an XDM schema may change over time, so a data object may be an update of a previous instance of that object. An update can be handled in several ways, and the appropriate way depends on both the nature of the data and the specific application using it.

XDM defines a schema descriptor of type `xdm:descriptorUpdatePolicy`, which describes several common methods of handling an update:
Collaborator

Is this really about schema updating? seems more about the data described in a schema...

Collaborator

Yes, it is about how the data described by the schema is updated. How the update occurs can be described on a property-by-property basis. The update policies might be different for different datasets that share a schema, which is why we need to put this in a descriptor versus just annotating the schema in the repo.

Collaborator

but isn't this outside the scope of XDM and the schemas? This seems like a business thing, perhaps even specific to XC?


* `xdm:updateMerge`: the data in the new object should be merged into the existing object; the method by which a merge is applied is defined by the application
* `xdm:updateReplace`: the new data object should replace the existing data object
* `xdm:updateTimeSeries`: the data is time series data, and the new object should be logged/collected without changing any existing data

Update policies are defined using the `UpdatePolicyDescriptor` schema.
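
To make the three policies concrete, here is a minimal sketch of how an application might act on them. It rests on several assumptions: `apply_update` is a hypothetical helper, the policy is passed in directly as a type string (how `UpdatePolicyDescriptor` carries it is not shown here), and the merge is a shallow "new values win" merge, which is only one of many methods an application could choose.

```python
import copy


def apply_update(existing, new, policy, history=None):
    """Apply `new` to `existing` under the given update policy.

    Returns a (stored_object, history) pair. `history` is the list of
    logged objects used by the time series policy.
    """
    history = list(history or [])
    if policy == "xdm:updateReplace":
        # The new object replaces the existing one entirely.
        return copy.deepcopy(new), history
    if policy == "xdm:updateMerge":
        # Assumption: shallow merge where values from `new` win.
        # XDM leaves the actual merge method to the application.
        return {**existing, **new}, history
    if policy == "xdm:updateTimeSeries":
        # Existing data is left unchanged; the new object is only logged.
        history.append(copy.deepcopy(new))
        return existing, history
    raise ValueError(f"unknown update policy: {policy}")


old = {"name": "Ann", "city": "Oslo"}
new = {"city": "Bergen"}

replaced, _ = apply_update(old, new, "xdm:updateReplace")
merged, _ = apply_update(old, new, "xdm:updateMerge")
stored, log = apply_update(old, new, "xdm:updateTimeSeries")
```

Because the policy lives in a descriptor and not in the schema, two datasets that share a schema could pass different `policy` values to the same code path.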

## Other Supported Schema Descriptors

A number of additional schema descriptors are defined by XDM:

* `xdm:identityContext`: allows a property in a schema to be used as an [Identity](https://github.com/adobe/xdm/blob/master/docs/reference/context/identity.schema.md), even if it does not conform to the Identity schema.
Collaborator

these don't seem like good ideas to me! I would remove this whole section.

Collaborator

I think @trieloff intends on having a review for each of these. Probably easier to leave the doc text in for now and updating it if/when those PRs are merged.

Contributor Author

@lrosenthol, @kstreeter – yes, that was my intention. I can remove the entire contentious section from the doc, we just have to remember putting them back in in the individual PRs.

* `xdm:primaryKey`: allows a property other than `@id` to be flagged as the primary key for a schema
* `xdm:instantiable`: allows a schema to be flagged as 'instantiable', which may be used to differentiate schemas that define primary business objects versus supporting schemas intended to be embedded in another schema.

## Embedding Schema Descriptors in a Schema

The `SchemaDescriptor` schema is designed such that a descriptor can be fully defined as a standalone entity, or embedded in the schema it is describing. When embedded, a schema descriptor may be placed at the root of the schema (which is appropriate for a descriptor that applies to the whole schema) or placed on the sub-schema for a specific property (which is appropriate when the descriptor applies to a property).

In some cases, a descriptor may describe a symmetric relationship. For example, an `xdm:oneToOne` relationship is true for both the source and the destination properties. In this case, it is recommended that descriptors be placed on both the source and the destination.

Examples for each of these cases are shown below.
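
The two embedding positions (schema root and property sub-schema) can be read back with a small traversal. This is a sketch under stated assumptions: `collect_embedded_descriptors` is a hypothetical helper, it only looks at the root and at top-level `properties`, and the root-level descriptor in the sample schema is made up for illustration.

```python
def collect_embedded_descriptors(schema):
    """Return a list of (property_name, descriptor) pairs.

    `property_name` is None for descriptors placed at the schema root,
    and the property key for descriptors placed on a property sub-schema.
    Nested properties are not traversed in this sketch.
    """
    found = [(None, d) for d in schema.get("meta:descriptors", [])]
    for name, subschema in schema.get("properties", {}).items():
        for descriptor in subschema.get("meta:descriptors", []):
            found.append((name, descriptor))
    return found


# Illustrative schema: one root-level and one property-level descriptor.
schema = {
    "$id": "https://ns.adobe.com/xdm/example/parent",
    "type": "object",
    "meta:descriptors": [
        {"@type": "https://ns.example.com/descriptors/inuse", "xdm:usage": "stage"}
    ],
    "properties": {
        "@id": {
            "type": "string",
            "meta:descriptors": [
                {
                    "@type": "xdm:oneToMany",
                    "xdm:destinationSchema": "https://ns.adobe.com/xdm/example/child",
                    "xdm:destinationProperty": "xdm:parent",
                }
            ],
        }
    },
}

pairs = collect_embedded_descriptors(schema)
```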

## Schema Descriptor Examples

### Example Relationship Descriptor

We have two schemas, which form a parent/child relationship. The first is parent.json:

```json
{
  "$schema": "http://json-schema.org/draft-06/schema#",
  "$id": "https://ns.adobe.com/xdm/example/parent",
  "title": "Parent",
  "type": "object",
  "properties": {
    "@id": {
      "meta:descriptors": [
        {
          "@type": "xdm:oneToMany",
          "xdm:destinationSchema": "https://ns.adobe.com/xdm/example/child",
          "xdm:destinationProperty": "xdm:parent"
        }
      ],
      "type": "string"
    }
  }
}
```

The second is child.json:

```json
{
  "$schema": "http://json-schema.org/draft-06/schema#",
  "$id": "https://ns.adobe.com/xdm/example/child",
  "title": "Child",
  "type": "object",
  "properties": {
    "@id": { "type": "string" },
    "xdm:parent": {
      "meta:descriptors": [
        {
          "@type": "xdm:manyToOne",
          "xdm:destinationSchema": "https://ns.adobe.com/xdm/example/parent",
          "xdm:destinationProperty": "@id"
        }
      ],
      "type": "string",
      "format": "uri"
    }
  }
}
```

The source schema in this example is Parent, which contains a single relationship descriptor describing a one-to-many relationship between objects of schema Parent and objects of schema Child.

The above example shows how a descriptor may be embedded in the schema being described, directly on the property where it applies. The example also shows the reciprocal relationship between the parent and child entities. If we were to define this as a stand-alone descriptor, it would look like this:

```json
{
  "@id": "https://example.com/descriptors/1",
  "@type": "xdm:oneToMany",
  "xdm:sourceSchema": "https://ns.adobe.com/xdm/example/parent",
  "xdm:sourceProperty": "@id",
  "xdm:destinationSchema": "https://ns.adobe.com/xdm/example/child",
  "xdm:destinationProperty": "xdm:parent"
}
```

This highlights the ability to use schema descriptors both directly in schemas and also as independent entities.
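
The step from the embedded form to the stand-alone form is mechanical: the source schema and property, which are implicit when a descriptor is embedded, are written out explicitly. A minimal sketch follows, assuming the key names `xdm:sourceSchema` and `xdm:sourceProperty` (the latter spelled as in `schemadescriptor.example.1.json`); `to_standalone` is a hypothetical helper, not an XDM API.

```python
def to_standalone(embedded, source_schema, source_property=None, descriptor_id=None):
    """Build a stand-alone descriptor from an embedded one.

    `source_schema` is the $id of the schema the descriptor was embedded in.
    `source_property` is the property it was attached to, or None if the
    descriptor sat at the schema root. The input dict is not modified.
    """
    standalone = {}
    if descriptor_id is not None:
        standalone["@id"] = descriptor_id
    standalone["@type"] = embedded["@type"]
    standalone["xdm:sourceSchema"] = source_schema
    if source_property is not None:
        standalone["xdm:sourceProperty"] = source_property
    # Carry over every remaining field of the embedded descriptor unchanged.
    for key, value in embedded.items():
        if key != "@type":
            standalone[key] = value
    return standalone


embedded = {
    "@type": "xdm:oneToMany",
    "xdm:destinationSchema": "https://ns.adobe.com/xdm/example/child",
    "xdm:destinationProperty": "xdm:parent",
}

standalone = to_standalone(
    embedded,
    source_schema="https://ns.adobe.com/xdm/example/parent",
    source_property="@id",
    descriptor_id="https://example.com/descriptors/1",
)
```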

### Example Identity Descriptor

We have a schema that describes a customer record, which contains a customer ID as a property:

```json
{
  "$schema": "http://json-schema.org/draft-06/schema#",
  "$id": "https://ns.example.com/xdm/customerrecord",
  "title": "CustomerRecord",
  "type": "object",
  "properties": {
    "@id": { "type": "string" },
    "https://ns.example.com/xdm/customerID": {
      "meta:descriptors": [
        {
          "@type": "xdm:identityContext",
          "xdm:namespace": "https://id-server.adobe.com/1234",
          "xdm:property": "code"
        }
      ],
      "type": "string"
    }
  }
}
```

The customer ID is present, but does not contain other information needed to ensure the identity is fully described, such as the ID namespace, or whether this value represents the application's native ID for this customer or is an ID given by some external system.
Collaborator

why is this specific to identity and IDs? Wouldn't this concept apply to other things?

Collaborator

It could: I am sure we could define descriptors to transform or augment any data to match an XDM definition. This descriptor isn't meant to be that general though. This is used to solve a specific (and common) problem we have when ingesting customer data: pulling out the identifiers so that we can link or join the customer-provided data to data that we generate that contains identifiers.

Collaborator

so just as with the other case, this seems very specific to your business and not to XDM in general

Collaborator

The concept of end user identity is a key aspect of both how XDM models data about end-user interactions (ExperienceEvents) and how that is translated to an actionable view of the end user (Profile). The Identity and Namespace schemas describe the framework used to manage that end user identity. All of the schemas mentioned are part of XDM.

This descriptor solves a part of that overall problem: sometimes we need to connect data modeled as XDM (like EE and Profile) back to data that isn't modeled in XDM (for example, a legacy customer management system). It is likely that anyone using EE or Profile would have this problem, and need this descriptor.

Collaborator

Yes, and EEs are very tied to your business - and that's fine. But they have nothing to do with mine or that of CC. As such, identity is not a core piece of our XDM strategy.
I am perfectly fine with this being an XC-specific extension to XDM - but as is, it is not something that belongs in the "core" (IMO).


We can use an identity descriptor to provide the additional details. The descriptor signals the namespace the ID is managed under (in this case, a fictitious service at id-server.adobe.com), and also signals that the value is a "code", meaning it is the externally managed handle for some ID managed by the namespace.

### Example Primary Key Descriptor

We have a schema that describes a sales order taken from an external sales management system. As this schema is directly transcribed from the external system's data schema, it does not follow the XDM best practice of using `@id` as the primary key:

```json
{
  "$schema": "http://json-schema.org/draft-06/schema#",
  "$id": "https://ns.example.com/xdm/salesorder",
  "title": "SalesOrder",
  "type": "object",
  "properties": {
    "https://ns.example.com/xdm/txID": {
      "meta:descriptors": [
        {
          "@type": "xdm:primaryKey"
        }
      ],
      "type": "string"
    },
    "https://ns.example.com/xdm/confirmationNum": { "type": "string" },
    "https://ns.example.com/xdm/customerID": { "type": "string" },
    "https://ns.example.com/xdm/productID": { "type": "string" }
  }
}
```

It is not obvious which field is best suited to be the primary key for this data. The descriptor signals that the transaction identifier at 'txID' is the appropriate key to be used for this data.
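
A consumer could resolve the key property with a lookup like the one sketched below. The helper name `primary_key_property` is hypothetical, and two behaviors are assumptions of this sketch and not stated rules: falling back to `@id` when no property is flagged, and rejecting schemas that flag more than one property.

```python
def primary_key_property(schema):
    """Return the name of the property that acts as the primary key.

    A property carrying an `xdm:primaryKey` descriptor wins. Otherwise the
    sketch falls back to `@id` if the schema declares it, else None.
    """
    properties = schema.get("properties", {})
    flagged = [
        name
        for name, subschema in properties.items()
        if any(
            d.get("@type") == "xdm:primaryKey"
            for d in subschema.get("meta:descriptors", [])
        )
    ]
    if len(flagged) > 1:
        # Assumption: composite keys are out of scope for this sketch.
        raise ValueError(f"more than one primary key flagged: {flagged}")
    if flagged:
        return flagged[0]
    return "@id" if "@id" in properties else None


sales_order = {
    "$id": "https://ns.example.com/xdm/salesorder",
    "type": "object",
    "properties": {
        "https://ns.example.com/xdm/txID": {
            "meta:descriptors": [{"@type": "xdm:primaryKey"}],
            "type": "string",
        },
        "https://ns.example.com/xdm/confirmationNum": {"type": "string"},
    },
}

key = primary_key_property(sales_order)
```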

### Example of Defining a New Schema Descriptor

Let's say Example.com would like to annotate their schemas with information on whether they are actively being used in their application, a cloud service. They'd like to know if the schema is being used in production, in staging, or is unused.

They need to do two things to define the new descriptor. First, they create a new URI to define the type of the descriptor: 'https://ns.example.com/descriptors/inuse'.

Next, they define an extension to `SchemaDescriptor` containing the in-use flag:

Collaborator

I think this example is wrong, esp in terms of namespacing

```json
{
  "$id": "https://ns.example.com/xdm/inusedescriptor",
  "$schema": "http://json-schema.org/draft-06/schema#",
  "title": "In Use Descriptor",
  "meta:extends": [
    "https://ns.adobe.com/xdm/common/schemadescriptor#/definitions/descriptor"
  ],
  "meta:abstract": false,
  "type": "object",
  "description": "where is this schema being used?",
  "definitions": {
    "inusedescriptor": {
      "properties": {
        "xdm:usage": {
          "title": "Usage",
          "type": "string",
          "description": "the usage state of the schema",
          "enum": ["production", "stage", "none"]
        }
      },
      "required": ["xdm:usage"]
    }
  },
  "allOf": [
    {
      "$ref":
        "https://ns.adobe.com/xdm/common/schemadescriptor#/definitions/descriptor"
    },
    {
      "$ref": "#/definitions/inusedescriptor"
    }
  ]
}
```

Applying this descriptor might look like:

```json
{
  "@id": "https://example.com/descriptors/4",
  "@type": "https://ns.example.com/descriptors/inuse",
  "xdm:sourceSchema": "https://ns.example.com/xdm/salesorder",
  "xdm:usage": "production"
}
```
8 changes: 8 additions & 0 deletions meta.schema.json
@@ -71,6 +71,14 @@
}
]
},
"meta:descriptors": {
"type": "array",
"items": {
"type": "object",
"$ref":
"https://ns.adobe.com/xdm/common/descriptors/schemadescriptor#/definitions/descriptor"
}
},
"type": {
"type": "string",
"const": "object"
2 changes: 1 addition & 1 deletion package.json
@@ -7,7 +7,7 @@
"aem_user": "packageUser",
"aem_password": "override me securely",
"markdown-importer-version": "0.0.4",
"schemas": 137
"schemas": 139
},
"scripts": {
"clean": "rm -rf docs/reference",
3 changes: 3 additions & 0 deletions schemas/common/descriptors/itemselector.description.md
@@ -0,0 +1,3 @@
Describes how to select or match to a specific item from an array of values described by an XDM schema.

Matching may be done based on array index, `@id`, `@type`, or schema URI.
3 changes: 3 additions & 0 deletions schemas/common/descriptors/itemselector.example.1.json
@@ -0,0 +1,3 @@
{
  "xdm:id": "https://example.com/objects/12345"
}
3 changes: 3 additions & 0 deletions schemas/common/descriptors/itemselector.example.2.json
@@ -0,0 +1,3 @@
{
  "xdm:type": "https://ns.adobe.com/experience/mcid"
}
3 changes: 3 additions & 0 deletions schemas/common/descriptors/itemselector.example.3.json
@@ -0,0 +1,3 @@
{
  "xdm:index": 0
}
3 changes: 3 additions & 0 deletions schemas/common/descriptors/itemselector.example.4.json
@@ -0,0 +1,3 @@
{
  "xdm:schema": "https://ns.adobe.com/xdm/context/identity"
}
75 changes: 75 additions & 0 deletions schemas/common/descriptors/itemselector.schema.json
@@ -0,0 +1,75 @@
{
  "meta:license": [
    "Copyright 2018 Adobe Systems Incorporated. All rights reserved.",
    "This work is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license",
    "you may not use this file except in compliance with the License. You may obtain a copy",
    "of the License at https://creativecommons.org/licenses/by/4.0/"
  ],
  "$id": "https://ns.adobe.com/xdm/common/descriptors/itemselector",
  "$schema": "http://json-schema.org/draft-06/schema#",
  "title": "Item Selector",
  "meta:extensible": false,
  "meta:abstract": false,
  "type": "object",
  "description":
    "Describes how to select or match to a specific item from an array of values described by an XDM schema. Matching may be done based on array index, @id, @type, or schema URI.",
  "definitions": {
    "selector": {
      "oneOf": [
        {
          "properties": {
            "xdm:index": {
              "title": "Index",
              "type": "integer",
              "description":
                "When present, indicates the item at this array index should be selected.",
              "minimum": 0
            }
          },
          "required": ["xdm:index"]
        },
        {
          "properties": {
            "xdm:id": {
              "title": "ID",
              "type": "string",
              "format": "uri",
              "description":
                "When present, indicates the item with this @id value should be selected."
            }
          },
          "required": ["xdm:id"]
        },
        {
          "properties": {
            "xdm:type": {
              "title": "Type",
              "type": "string",
              "format": "uri",
              "description":
                "When present, indicates the item with this @type value should be selected."
            }
          },
          "required": ["xdm:type"]
        },
        {
          "properties": {
            "xdm:schema": {
              "title": "Schema",
              "type": "string",
              "format": "uri",
              "description":
                "When present, indicates the item which conforms to this schema URI should be selected."
            }
          },
          "required": ["xdm:schema"]
        }
      ]
    }
  },
  "allOf": [
    {
      "$ref": "#/definitions/selector"
    }
  ]
}
9 changes: 9 additions & 0 deletions schemas/common/descriptors/schemadescriptor.example.1.json
@@ -0,0 +1,9 @@
{
  "@id": "https://example.com/descriptors/1",
  "@type": "xdm:descriptorPrimaryKey",
  "xdm:source": "https://ns.adobe.com/xdm/context/profile",
  "xdm:sourceProperty": "xdm:identities",
  "xdm:sourceItem": {
    "xdm:type": "https://ns.adobe.com/experience/mcid"
  }
}