Self-hosting gen of the schema-schema. #62
... With a few minor alterations. I've taken the opportunity to make a few tweaks to naming as I went. A few cases use keyed unions when they should be kinded; this is a significant todo. A few cases use keyed unions where the schema-schema declaration says they should use inline representations! In these, I've come to believe the schema-schema made a mistake; we probably will update it.

The overall structure, however, is not significantly diverged. The whole specification is still written in code: the 'SpawnFoo' placeholder methods are in heavy use. (This might herald the beginning of the end, for them, though!) (We're also still making little hacks to dodge the current placeholder typeinfo's lack of support for inline defns; but this is purely a problem of the placeholder typeinfo structures, and can disappear the instant we replace them.)

If you run this generation, the emitted code is (aside from those caveats listed above) suitable for parsing schema declarations.

...! Which means... we can soon turn around and start using this to build up tooling which actually uses schema JSON as a config mechanism. Which will then bring us quite a bit closer to being able to make free-standing usable CLI tools for working with further codegen.

There's a few other bits to go. For starters, right now, this is just generating output into a demo output dir. I've made no attempt in this commit to rig it up as a proper snake-eating-its-tail by replacing the 'SpawnFoo' methods and placeholder type info; that'll come in due time. (And I think we may still have fun choices coming up with that, incidentally; the distinction between string type names and reified pointers is still looming, and we need to figure out what the story is for gen outputs containing their own type descriptions (which may touch on the same interface design choices); etc.) I'll probably move somewhat cautiously with this, and only cut over after polishing the gen outputs some more... but it's now near in reach.
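For readers less familiar with the union representations mentioned in this description, here is a minimal sketch of how keyed, inline, and kinded unions differ on the wire. It is plain `encoding/json`, not the generated code from this PR. The JSON shapes, the `kind` discriminant key, and the function names `decodeKeyed`, `decodeInline`, and `kindOf` are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// decodeKeyed reads a keyed-union representation: a map with exactly one
// entry, whose key is the discriminant and whose value is the member body.
func decodeKeyed(data []byte) (string, json.RawMessage, error) {
	var m map[string]json.RawMessage
	if err := json.Unmarshal(data, &m); err != nil {
		return "", nil, err
	}
	if len(m) != 1 {
		return "", nil, errors.New("keyed union must have exactly one key")
	}
	for k, v := range m {
		return k, v, nil
	}
	return "", nil, errors.New("unreachable")
}

// decodeInline reads an inline-union representation: the discriminant sits
// in a field (named by discriminantKey) next to the member's own fields.
// The returned map holds the remaining fields with the discriminant removed.
func decodeInline(data []byte, discriminantKey string) (string, map[string]json.RawMessage, error) {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(data, &fields); err != nil {
		return "", nil, err
	}
	raw, ok := fields[discriminantKey]
	if !ok {
		return "", nil, fmt.Errorf("missing discriminant key %q", discriminantKey)
	}
	var kind string
	if err := json.Unmarshal(raw, &kind); err != nil {
		return "", nil, err
	}
	delete(fields, discriminantKey)
	return kind, fields, nil
}

// kindOf reports the data-model kind of a JSON value. A kinded union uses
// this alone as its discriminant, so no extra key is needed in the data.
func kindOf(data []byte) (string, error) {
	var v interface{}
	if err := json.Unmarshal(data, &v); err != nil {
		return "", err
	}
	switch v.(type) {
	case string:
		return "string", nil
	case map[string]interface{}:
		return "map", nil
	case []interface{}:
		return "list", nil
	default:
		return "other", nil
	}
}

func main() {
	// Keyed: the member name wraps the body.
	k, body, err := decodeKeyed([]byte(`{"struct": {"fields": {}}}`))
	fmt.Println(k, string(body), err)

	// Inline: the member name is a sibling of the body's fields.
	k, rest, err := decodeInline([]byte(`{"kind": "struct", "fields": {}}`), "kind")
	fmt.Println(k, len(rest), err)

	// Kinded: a bare string and a map are told apart by kind alone.
	for _, doc := range []string{`"TypeName"`, `{"fields": {}}`} {
		kd, err := kindOf([]byte(doc))
		fmt.Println(kd, err)
	}
}
```

The practical difference is where the discriminant lives: keyed adds a wrapping map, inline adds a sibling field, and kinded adds nothing to the data at all, which is why swapping one for another changes the JSON a parser has to accept.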
The size of the generated output is also very likely to need work. We're looking at something on the order of 1.6MB of generated output. (It's *highly* redundant: if you gzip it, it's 95kb.) Mind: I've made *no* effort whatsoever to bring this down. So, it's probably safe to assume we'll find some low-hanging fruit when we actually look into it. (I'm not yet sure what the bar will be for satisfaction with this: I regard the current number as vaguely "seems rather high", but it's also for a fairly sizable schema and for a lot of features provided, so maybe some size trades are just what we're going to face in golang.)

That's it. There will probably be some PRs to the schema-schema documents in the specs repo shortly. Other than those, this should also be about ready to line up with and parse JSON output created by the other IPLD Schema DSL->JSON parsers we already have, which could start unlocking some really neat stuff. 🎉
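As a rough way to reproduce the "raw size versus gzipped size" comparison above, here is a small sketch using only the Go standard library. The repeated method body is a made-up stand-in for generated code, not actual output from this PR, and `gzippedSize` is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"strings"
)

// gzippedSize returns how many bytes data occupies after gzip compression
// at the default compression level.
func gzippedSize(data []byte) (int, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return 0, err
	}
	// Close flushes any buffered data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

func main() {
	// Stand-in for generated code: one boilerplate method repeated many times.
	src := []byte(strings.Repeat("func (n *Node) Kind() Kind { return Kind_Map }\n", 10000))

	n, err := gzippedSize(src)
	if err != nil {
		panic(err)
	}
	fmt.Printf("raw=%d bytes, gzipped=%d bytes\n", len(src), n)
}
```

To measure real output, the `src` bytes could instead be read from the generated files with `os.ReadFile`. A large gap between the two numbers suggests repetition that shared helpers might remove.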
If you want to peek at the actual outputs, but can't be bothered to check out the repo and run it yourself: there's a commit here that has the full output checked in: 1f5a5cb (It's a temporary commit, though, on a branch not destined to merge -- it will probably disappear at some point.)
@rvagg, this might be interesting to you. Or might not -- you can wait until I make schema-schema PRs in the specs repo to talk about this stuff, rather than sift through the gory "Spawn" method slew, if you like.
Ok, I think I see what you might be getting at with the inline representations and inline types; they appear to be coupled problems. And you should really get "unit" into the schema-schema and docs, it's a very particular concept we need to socialise.

Also yay! I started down a path of writing a schema validator for JS, but while doing it I felt like I was manually writing a schema-schema representation in code, and it was just wrong. I couldn't expend the brainpower at the time to figure out how to make it right, or whether I was chasing a rabbit down a hole that I'd come to regret once I got too far, so I stopped! I need to revisit some of that work at some point and figure out the relationship of schema validation to schema-schema, and how to reuse as much of the schema-schema as possible for the task of validating that a schema takes a proper structure.

Re next steps, I assume at some point soon you might check in a copy of the codegen'd schema-schema so it can be linked against by other tools (like the schema parser), but right now it's just too darn big?
Yep yep and yep. I think it might still remain entirely possible to check in the generated code, even at this size, but... it would be nice if the size goes down, yeah. And either way I think I'd like to knock around with it a bit first and try to get a feel for if the ergonomics are at all right, what other methods might be needed, etc. (Which might involve a few other demos, too.)
Merginate'n, todos and all. I'll be working on relevant missing features (e.g. kinded union gen) on master and continuing to update this going forward as those pieces become available.
Self-hosting gen of the schema-schema!
... Mostly. I've taken the opportunity to make a few tweaks to naming as I went, and there are a few other minor divergences:
And the whole specification is still written in code: the 'SpawnFoo' placeholder methods are in heavy use. (The beginning of the end for them might be nigh, though!)
The overall structure, however, is not significantly diverged. This is a description of the schema-schema, and we're generating code for it.
...!
Which means... we can soon turn around and start using this to build up tooling which actually uses schema JSON as a config mechanism. Which will then bring us quite a bit closer to being able to make free-standing usable CLI tools for working with further codegen.
There's a few other bits to go. For starters, right now, this is just generating output into a demo output dir. I've made no attempt in this commit to rig it up as a proper snake-eating-its-tail by replacing the 'SpawnFoo' methods and placeholder type info; that'll come in due time. (And I think we may still have fun choices coming up with that, incidentally; the distinction between string type names and reified pointers is still looming, and we need to figure out what the story is for gen outputs containing their own type descriptions (which may touch on the same interface design choices); etc.) I'll probably move somewhat cautiously with this, and only cut over after polishing the gen outputs some more... but it's now near in reach.
The size of the generated output is also very likely to need work. We're looking at something on the order of 1.6MB of generated output. (It's highly redundant: if you gzip it, it's 95kb.) Mind: I've made no effort whatsoever to bring this down. So, it's probably safe to assume we'll find some low-hanging fruit when we actually look into it. (I'm not yet sure what the bar will be for satisfaction with this: I regard the current number as vaguely "seems rather high", but it's also for a fairly sizable schema and for a lot of features provided, so maybe some size trades are just what we're going to face in golang.)
That's it. There will probably be some PRs to the schema-schema documents in the specs repo shortly. Other than those, this should also be about ready to line up with and parse JSON output created by the other IPLD Schema DSL->JSON parsers we already have, which could start unlocking some really neat stuff. 🎉