Self-hosting gen of the schema-schema. #62

warpfork · 2020-07-27T20:26:08Z

Self-hosting gen of the schema-schema!

... Mostly. I've taken the opportunity to make a few tweaks to naming as I went, and there are a few other minor divergences:

- A few cases use keyed unions when they should be kinded; this is a significant todo.
- A few cases use keyed unions where the schema-schema declaration says they should use inline representations!
  - In these, I've come to believe the schema-schema made a mistake; we probably will update it.
- We're still making little hacks to dodge the current placeholder typeinfo's lack of support for inline defns.
  - ...but this is purely a problem of the placeholder typeinfo structures, and can disappear the instant we replace them.
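
To make the keyed-versus-kinded distinction concrete, here is a minimal sketch in Go using only `encoding/json`. The `Member` type, the `name` and `inline` keys, and both decode functions are hypothetical illustrations; none of this is the generated code or the go-ipld-prime API. A keyed union wraps the value in a single-entry map whose key names the member, while a kinded union lets the kind of the value itself (string versus map) pick the member.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Member is a hypothetical two-member union: either a type name (a string)
// or an inline definition (a map). Exactly one field is set after decoding.
type Member struct {
	Name   *string
	Inline map[string]interface{}
}

// decodeKeyed reads the keyed form: a single-entry map whose key names the member.
func decodeKeyed(data []byte) (Member, error) {
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(data, &raw); err != nil {
		return Member{}, err
	}
	if len(raw) != 1 {
		return Member{}, errors.New("keyed union must have exactly one entry")
	}
	for key, val := range raw {
		switch key {
		case "name":
			var s string
			if err := json.Unmarshal(val, &s); err != nil {
				return Member{}, err
			}
			return Member{Name: &s}, nil
		case "inline":
			var m map[string]interface{}
			if err := json.Unmarshal(val, &m); err != nil {
				return Member{}, err
			}
			return Member{Inline: m}, nil
		default:
			return Member{}, fmt.Errorf("unknown union key %q", key)
		}
	}
	return Member{}, errors.New("unreachable: map had one entry")
}

// decodeKinded reads the kinded form: the kind of the value selects the member.
func decodeKinded(data []byte) (Member, error) {
	var v interface{}
	if err := json.Unmarshal(data, &v); err != nil {
		return Member{}, err
	}
	switch t := v.(type) {
	case string:
		return Member{Name: &t}, nil
	case map[string]interface{}:
		return Member{Inline: t}, nil
	default:
		return Member{}, fmt.Errorf("no union member for kind %T", v)
	}
}

func main() {
	keyed, err := decodeKeyed([]byte(`{"name": "Foo"}`))
	if err != nil {
		panic(err)
	}
	kinded, err := decodeKinded([]byte(`"Foo"`))
	if err != nil {
		panic(err)
	}
	// Both serial forms carry the same member; only the wrapping differs.
	fmt.Println(*keyed.Name, *kinded.Name)
}
```

The kinded form is terser on the wire, but it only works when every member has a distinct kind, which is why it needs its own gen support.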

And the whole specification is still written in code: the 'SpawnFoo' placeholder methods are in heavy use. (The beginning of the end for them might be nigh, though!)

The overall structure, however, is not significantly diverged. This is a description of the schema-schema, and we're generating code for it.

...!

Which means... we can soon turn around and start using this to build up tooling which actually uses schema JSON as a config mechanism. Which will then bring us quite a bit closer to being able to make free-standing usable CLI tools for working with further codegen.

There's a few other bits to go. For starters, right now, this is just generating output into a demo output dir. I've made no attempt in this commit to rig it up as a proper snake-eating-its-tail by replacing the 'SpawnFoo' methods and placeholder type info; that'll come in due time. (And I think we may still have fun choices coming up with that, incidentally; the distinction between string type names and reified pointers is still looming, and we need to figure out what the story is for gen outputs containing their own type descriptions (which may touch on the same interface design choices); etc.) I'll probably move somewhat cautiously with this, and only cut over after polishing the gen outputs some more... but it's now near in reach.
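
One way to picture the "string type names versus reified pointers" choice is the rough Go sketch below. Every name in it (`FieldByName`, `Type`, `Field`, `reify`) is hypothetical and invented for illustration; it is an assumption about the shape of the problem, not the placeholder typeinfo or any decided design. The by-name form is easy to write down and serialize, while the reified form needs a resolution pass against a registry but lets callers follow pointers directly.

```go
package main

import "fmt"

// FieldByName is a hypothetical field description that refers to its
// type only by name, as a serialized schema would.
type FieldByName struct {
	Name     string
	TypeName string
}

// Type is a hypothetical reified type description whose fields point
// directly at other Type values.
type Type struct {
	Name   string
	Fields []Field
}

// Field is a hypothetical reified field: the type is a pointer, not a name.
type Field struct {
	Name string
	Type *Type
}

// reify resolves by-name field references against a registry of known types.
// It fails if any field names a type the registry does not contain.
func reify(typeName string, fields []FieldByName, registry map[string]*Type) (*Type, error) {
	t := &Type{Name: typeName}
	for _, f := range fields {
		target, ok := registry[f.TypeName]
		if !ok {
			return nil, fmt.Errorf("field %q refers to unknown type %q", f.Name, f.TypeName)
		}
		t.Fields = append(t.Fields, Field{Name: f.Name, Type: target})
	}
	return t, nil
}

func main() {
	registry := map[string]*Type{"String": {Name: "String"}}
	t, err := reify("Person", []FieldByName{{Name: "name", TypeName: "String"}}, registry)
	if err != nil {
		panic(err)
	}
	fmt.Println(t.Name, t.Fields[0].Name, t.Fields[0].Type.Name)
}
```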

The size of the generated output is also very likely to need work. We're looking at something on the order of 1.6MB of generated output. (It's highly redundant: if you gzip it, it's 95kb.) Mind: I've made no effort whatsoever to bring this down. So, it's probably safe to assume we'll find some low-hanging fruit when we actually look into it. (I'm not yet sure what the bar will be for satisfaction with this: I regard the current number as vaguely "seems rather high", but it's also for a fairly sizable schema and for a lot of features provided, so maybe some size trades are just what we're going to face in golang.)

That's it. There will probably be some PRs to the schema-schema documents in the specs repo shortly. Other than those, this should also be about ready to line up with and parse JSON output created by the other IPLD Schema DSL->JSON parsers we already have, which could start unlocking some really neat stuff. 🎉


warpfork · 2020-07-27T20:28:18Z

If you want to peek at the actual outputs but can't be bothered to check out the repo and run it yourself, there's a commit that has the full output checked in: 1f5a5cb

(It's a temporary commit, though, on a branch not destined to merge -- it will probably disappear at some point.)

warpfork · 2020-07-27T20:29:36Z

@rvagg, this might be interesting to you. Or might not -- you can wait until I make schema-schema PRs in the specs repo to talk about this stuff, rather than sift through the gory "Spawn" method slew, if you like.

rvagg · 2020-07-29T03:25:15Z

Ok, I think I see what you might be getting at with the inline representations and inline types; they appear to be coupled problems.

And you should really get "unit" into the schema-schema and docs, it's a very particular concept we need to socialise.

Also yay!

I started down the path of writing a schema validator for JS, but while doing it I felt like I was manually writing a schema-schema representation in code, and it was just wrong. I couldn't expend the brainpower at the time to figure out how to make it right, or whether I was chasing a rabbit down a hole that I'd come to regret once I got too far, so I stopped! I need to revisit some of that work at some point and figure out the relationship of schema validation to the schema-schema, and how to reuse as much of the schema-schema as possible for the task of validating that a schema takes a proper structure.

Re next steps, I assume at some point soon you might check in a copy of the codegen'd schema-schema so it can be linked against by other tools (like the schema parser), but right now it's just too darn big?

warpfork · 2020-07-29T04:27:27Z

Yep yep and yep.

I think it might still remain entirely possible to check in the generated code, even at this size, but... it would be nice if the size goes down, yeah. And either way I think I'd like to knock around with it a bit first and try to get a feel for if the ergonomics are at all right, what other methods might be needed, etc. (Which might involve a few other demos, too.)

warpfork · 2020-07-30T11:22:06Z

Merginate'n, todos and all. I'll be working on relevant missing features (e.g. kinded union gen) on master and continuing to update this going forward as those pieces become available.

warpfork merged commit ac99bd4 into master Jul 30, 2020

warpfork deleted the self-hosting-gen-of-schema branch July 30, 2020 11:22

aschmahmann mentioned this pull request Feb 18, 2021

Release v0.8.0 ipfs/kubo#7707

