Compile a Schema including all of its referenced Schemas into one final Schema? #361

Open
nathan-fiscaletti opened this issue Mar 7, 2022 · 5 comments

Comments

@nathan-fiscaletti

Is there a way to create an object from a schema?

I'm effectively trying to compare two schemas. These might have references to other schemas as they are stored in two completely different locations. But my goal is to ensure that the schema generated via one method is exactly identical to a schema generated by another method.

I'm using this for input/output validation on a JSON-related network event. I want to make sure that the output of one endpoint matches the input of another. I currently have an "output" schema for one endpoint and an "input" schema for another. So long as these match (using lodash's isEqual function), I consider it fine. However, when they rely on other schemas referenced by ID, they might not match if the two separate places generating the schemas don't use the same identifiers for those schemas.

So I'm looking for either a way to generate a "final schema" (basically replacing any schema IDs with the actual schema) or a way to generate an object from the schema so that I can compare the resulting objects.

@awwright
Collaborator

awwright commented Mar 8, 2022

So you want to know if two schemas describe the same set of valid JSON documents? That the sets of JSON documents that are valid according to the schemas are identical?

If the two schemas specify the same set of instances with different keywords, this might be impossible to answer in general (the existence of the "not" keyword complicates things considerably). As long as all the keywords in use can be expressed as a finite-state automaton (which is most of them), you would have to compile each JSON Schema into a finite-state automaton and then test the resulting DFAs for equality.

If the keywords are the same but the IDs are different, then this is merely a graph equality problem. While not impossible, it is potentially O(n!) complexity, with n being the number of IDs in use.

As far as I'm aware, there's no library to test for equality besides the most trivial of cases (e.g. this library just does JSON.stringify on the schemas).

What do you mean by "Create an object from a schema"?

@nathan-fiscaletti
Author

nathan-fiscaletti commented Mar 8, 2022

@awwright I think schema equality would theoretically be possible if you could compile a schema from the base schema and all of its referenced schemas into a single JSON object describing the entire schema (basically just replace the references to other schemas with the schemas themselves).

What do you mean by "Create an object from a schema"?

As far as this goes, I'm asking if there's a way to take a schema and generate an object that would be valid for that particular schema.

For example, given the following schema:

{
    type: 'object',
    properties: {
        name: { type: 'string' },
        message: { type: 'string' }
    },
    required: [ 'name' ]
}

A function named validObjectsForSchema(), when passed this schema, would return an array containing the following:

{
    "name": ""
}
{
    "name": "",
    "message": ""
}

(Values would obviously not be considered when generating the "valid objects".)
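
A minimal sketch of what such a function could look like, assuming a flat object schema whose properties only use primitive "type" keywords (no $ref, no nesting). validObjectsForSchema is the hypothetical name used above and placeholderFor is an invented helper; neither is part of the jsonschema library.

```javascript
// Hypothetical helper: a placeholder value for a primitive "type" keyword.
// Unknown or missing types fall back to null, which may not be valid.
function placeholderFor(propertySchema) {
    switch (propertySchema.type) {
        case 'string': return '';
        case 'number':
        case 'integer': return 0;
        case 'boolean': return false;
        default: return null;
    }
}

// Hypothetical: return one object per combination of optional properties.
// Only "properties" and "required" are considered.
function validObjectsForSchema(schema) {
    const properties = schema.properties || {};
    const required = new Set(schema.required || []);
    const names = Object.keys(properties);
    const optional = names.filter(name => !required.has(name));

    const results = [];
    // Each bit of `mask` decides whether one optional property is included.
    for (let mask = 0; mask < (1 << optional.length); mask++) {
        const obj = {};
        names.forEach(name => {
            const included = required.has(name) || (mask & (1 << optional.indexOf(name)));
            if (included) {
                obj[name] = placeholderFor(properties[name]);
            }
        });
        results.push(obj);
    }
    return results;
}

const objects = validObjectsForSchema({
    type: 'object',
    properties: {
        name: { type: 'string' },
        message: { type: 'string' }
    },
    required: [ 'name' ]
});
console.log(JSON.stringify(objects)); // [{"name":""},{"name":"","message":""}]
```

The number of results doubles with every optional property, so this only seems practical for small schemas.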

@awwright
Collaborator

awwright commented Mar 8, 2022

if you could compile a schema from the base schema and all of its referenced schemas into a single JSON object describing the entire schema

Something like this would work, but you would need to detect recursion.

You would also want to ignore some annotation-only keywords like "description".
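
As a rough sketch of those two points, the following inlines references while tracking which IDs are currently being expanded, and drops annotation keywords along the way. inlineRefs, ANNOTATIONS, NAME_MAPS and the registry shape (an object mapping IDs to schemas) are all assumptions made for illustration, not part of this library, and the annotation list is a guess at a reasonable starting set.

```javascript
// Annotation-only keywords to drop before comparing (assumed list).
const ANNOTATIONS = new Set(['title', 'description', 'examples', '$comment']);
// Keywords whose children are keyed by user-chosen names, not by keywords.
const NAME_MAPS = new Set(['properties', 'patternProperties', 'definitions', '$defs']);

// Hypothetical helper: inline $ref targets found in `registry` (id -> schema).
// `stack` holds the ids currently being expanded so cycles can be detected.
function inlineRefs(node, registry, stack = [], isNameMap = false) {
    if (Array.isArray(node)) {
        return node.map(item => inlineRefs(item, registry, stack));
    }
    if (node === null || typeof node !== 'object') {
        return node;
    }
    const ref = node.$ref;
    if (!isNameMap && typeof ref === 'string' &&
            Object.prototype.hasOwnProperty.call(registry, ref)) {
        if (stack.includes(ref)) {
            // Recursive reference: keep the $ref instead of expanding forever.
            return { $ref: ref };
        }
        return inlineRefs(registry[ref], registry, [...stack, ref]);
    }
    const out = {};
    for (const [key, value] of Object.entries(node)) {
        const isKeyword = !isNameMap;
        if (isKeyword && (ANNOTATIONS.has(key) || key === 'id' || key === '$id')) {
            continue; // skip annotations and identifiers
        }
        out[key] = inlineRefs(value, registry, stack, isKeyword && NAME_MAPS.has(key));
    }
    return out;
}

const registry = {
    '/Node': {
        id: '/Node',
        description: 'A tree node',
        type: 'object',
        properties: {
            value: { type: 'number' },
            children: { type: 'array', items: { $ref: '/Node' } }
        }
    }
};
console.log(JSON.stringify(inlineRefs({ $ref: '/Node' }, registry), null, 4));
```

Because a recursive reference is left in place with its original ID, two recursive schemas that differ only in their IDs would still compare as different. This sketch also treats values under keywords such as "enum" or "default" as if they were schemas, which a fuller version would need to avoid.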

I'm asking if there's a way to take a schema and generate an object that would be valid for that particular schema.

This is supposed to be the purpose of the "default" keyword, although technically the default doesn't need to be valid against its own schema. Maybe this is OK.

There's also "coercion" where a minimal number of changes are made to turn an invalid instance into a valid one. For example, given a schema {type: "number"}, turning the string "40" into the number 40. Unfortunately, there are so many ways to do this that I'm not aware of a library that does it in a standard fashion.
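
To make the idea concrete, here is one possible coercion rule set for primitive types. The coerce function and its rules are purely illustrative assumptions; other libraries and applications may reasonably choose different conversions.

```javascript
// Hypothetical helper: coerce a value toward a primitive "type" keyword.
// Returns the original value unchanged when no safe conversion is known.
function coerce(value, schema) {
    if (schema.type === 'number' && typeof value === 'string' && value.trim() !== '') {
        const n = Number(value);
        // Reject NaN and Infinity, which are not valid JSON numbers.
        return Number.isFinite(n) ? n : value;
    }
    if (schema.type === 'string' && (typeof value === 'number' || typeof value === 'boolean')) {
        return String(value);
    }
    if (schema.type === 'boolean' && (value === 'true' || value === 'false')) {
        return value === 'true';
    }
    return value;
}

console.log(coerce('40', { type: 'number' }));  // 40
console.log(coerce('abc', { type: 'number' })); // abc (left as a string)
```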

@awwright
Collaborator

awwright commented Mar 8, 2022

However, see https://github.com/tdegrunt/jsonschema#pre-property-validation-hook for some additional guidance on this.

@nathan-fiscaletti
Author

Something like this would work, but you would need to detect recursion.

What I've come up with so far for this is as follows:

class SchemaCompiler {
    constructor(rootSchema) {
        this.rootSchema = rootSchema;
        // Plain object used as an id -> schema map (ids are strings, not array indices).
        this.schemas = {};
    }

    addSchema(schema, id) {
        id = id || schema.$id || schema.id;
        if (id === undefined) {
            throw new Error('missing schema id');
        }
        this.schemas[id] = schema;
    }

    // Note: circular references are not detected yet and would recurse forever.
    compile() {
        const _compile = (schema) => {
            const resolveFailures = [];
            // Shallow copy so the caller's schema objects are not mutated.
            const compiled = { ...schema };
            Object.entries(schema).forEach(([key, val]) => {
                // Only plain objects are traversed; null, arrays and primitives are kept as-is.
                if (val !== null && typeof val === 'object' && !Array.isArray(val)) {
                    let target = val;
                    if (val.$ref !== undefined) {
                        if (this.schemas[val.$ref]) {
                            // Known reference: inline the referenced schema.
                            target = this.schemas[val.$ref];
                        } else {
                            // Unknown reference: keep the $ref and record a warning.
                            resolveFailures.push(val.$ref);
                        }
                    }
                    const {
                        compiled: _compiled,
                        resolveFailures: _resolveFailures
                    } = _compile(target);
                    resolveFailures.push(..._resolveFailures);
                    compiled[key] = _compiled;
                }
            });

            // Drop identifiers so schemas that differ only by id compare equal.
            delete compiled.id;
            delete compiled.$id;
            return { compiled, resolveFailures };
        };

        const { compiled, resolveFailures } = _compile(this.rootSchema);
        return {
            compiled,
            warnings: [...new Set(resolveFailures)].map(
                id => new Error(`Failed to resolve schema with ID ${id}.`)
            )
        };
    }
}

Using the following schemas:

var personAttributesSchema = {
    id: '/PersonAttributes',
    type: 'object',
    properties: {
        location: { type: 'string' },
        age: { type: 'number' }
    },
    required: [ 'location', 'age' ]
};

var personSchema = {
    id: '/Person',
    type: 'object',
    properties: {
        name: { type: 'string' },
        attributes: {
            $ref: '/PersonAttributes'
        }
    },
    required: [ 'name', 'attributes' ]
};

var teamSchema = {
    id: '/Team',
    type: 'array',
    items: {
        $ref: '/Person'
    }
};

The following:

const sc = new SchemaCompiler(teamSchema);
sc.addSchema(personAttributesSchema);
sc.addSchema(personSchema);

const { compiled, warnings } = sc.compile();

console.log(JSON.stringify(compiled, null, 4));
console.log(warnings);

Produces this output:

{
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "name": {
                "type": "string"
            },
            "attributes": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string"
                    },
                    "age": {
                        "type": "number"
                    }
                },
                "required": [
                    "location",
                    "age"
                ]
            }
        },
        "required": [
            "name",
            "attributes"
        ]
    }
}
[]

If I comment out one of the schemas, like below:

const sc = new SchemaCompiler(teamSchema);
//sc.addSchema(personAttributesSchema);
sc.addSchema(personSchema);

const { compiled, warnings } = sc.compile();

console.log(JSON.stringify(compiled, null, 4));
console.log(warnings);

I will get the following output:

{
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "name": {
                "type": "string"
            },
            "attributes": {
                "$ref": "/PersonAttributes"
            }
        },
        "required": [
            "name",
            "attributes"
        ]
    }
}
[
  Error: Failed to resolve schema with ID /PersonAttributes.
      at C:\Users\Nathan\git-repos\personal\schema-equals\index.js:57:23
      at Array.map (<anonymous>)
      at SchemaCompiler.compile (C:\Users\Nathan\git-repos\personal\schema-equals\index.js:56:54)
      at Object.<anonymous> (C:\Users\Nathan\git-repos\personal\schema-equals\index.js:97:35)
      at Module._compile (internal/modules/cjs/loader.js:999:30)
      at Object.Module._extensions..js (internal/modules/cjs/loader.js:1027:10)
      at Module.load (internal/modules/cjs/loader.js:863:32)
      at Function.Module._load (internal/modules/cjs/loader.js:708:14)
      at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:60:12)
      at internal/main/run_main_module.js:17:47
]

I'm sure I'm missing quite a bit here. But I feel like this is at least a starting point?
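
For the comparison step itself, a small sketch of a structural check on two already-compiled schemas follows. It swaps lodash's isEqual for Node's built-in util.isDeepStrictEqual to stay dependency-free, and additionally sorts "required" arrays, since their order carries no meaning. normalize and schemasMatch are hypothetical names, and this remains a structural comparison, not a test that two schemas accept the same documents.

```javascript
const { isDeepStrictEqual } = require('util');

// Hypothetical helper: sort "required" arrays so their order is ignored.
function normalize(node) {
    if (Array.isArray(node)) {
        return node.map(normalize);
    }
    if (node === null || typeof node !== 'object') {
        return node;
    }
    const out = {};
    for (const [key, value] of Object.entries(node)) {
        out[key] = (key === 'required' && Array.isArray(value))
            ? [...value].sort()
            : normalize(value);
    }
    return out;
}

// Hypothetical: structural equality of two compiled schemas.
// Object key order is ignored by isDeepStrictEqual; array order is not.
function schemasMatch(a, b) {
    return isDeepStrictEqual(normalize(a), normalize(b));
}

const outputSchema = {
    type: 'object',
    properties: { name: { type: 'string' }, age: { type: 'number' } },
    required: [ 'name', 'age' ]
};
const inputSchema = {
    required: [ 'age', 'name' ],
    properties: { age: { type: 'number' }, name: { type: 'string' } },
    type: 'object'
};

console.log(schemasMatch(outputSchema, inputSchema)); // true
console.log(schemasMatch(outputSchema, { ...inputSchema, required: [ 'name' ] })); // false
```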

@nathan-fiscaletti changed the title from "Create an object from a schema?" to "Compile a Schema including all of its referenced Schemas into one final Schema?" on Mar 8, 2022