Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(tasks/ast_codegen): prototype for codegen AST related code #3815

Merged
merged 1 commit into from
Jun 25, 2024

Conversation

rzvxa
Copy link
Contributor

@rzvxa rzvxa commented Jun 21, 2024

Part of #3819

Copy link

graphite-app bot commented Jun 21, 2024

Your org has enabled the Graphite merge queue for merging into main

Add the label “merge” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge. Or use the label “hotfix” to add to the merge queue as a hot fix.

You must have a Graphite account and log in to Graphite in order to use the merge queue. Sign up using this link.

@github-actions github-actions bot added the A-ast Area - AST label Jun 21, 2024
@rzvxa rzvxa changed the title feat(ast_gen): add ast code gen. feat(tasks/ast_codegen): add ast code gen. Jun 21, 2024
Copy link

codspeed-hq bot commented Jun 21, 2024

CodSpeed Performance Report

Merging #3815 will not alter performance

Comparing 06-22-feat_ast_gen_add_ast_code_gen (f6c4ec4) with main (92c21b2)

Summary

✅ 28 untouched benchmarks

@rzvxa rzvxa force-pushed the 06-22-feat_ast_gen_add_ast_code_gen branch 2 times, most recently from 8d45b85 to e203bc9 Compare June 22, 2024 12:11
@overlookmotel
Copy link
Contributor

overlookmotel commented Jun 22, 2024

What I had in mind for the "schema" is more abstracted, and would not use syn's types (kind of an IR). e.g.:

// `TypeId` is an index into `Types`
type TypeId = usize;
struct Types(Vec<TypeDef>);

enum TypeDef {
  Struct(StructDef),
  Enum(EnumDef),
  Box(BoxDef),
  Vec(VecDef),
  Option(OptionDef),
  Cell(CellDef),
  Primitive(PrimitiveDef),
  StrRef(StrRefDef),
  // I think that's all of them?
}

struct StructDef {
  name: String,
  fields: Vec<StructFieldDef>,
  has_lifetime: bool,
}

struct StructFieldDef {
  name: String,
  type_id: TypeId,
}

struct EnumDef {
  name: String,
  variants: Vec<EnumVariantDef>,
  // For `@inherits` inherited enum variants
  inherits: Vec<TypeId>,
  has_lifetime: bool,
}

struct EnumVariantDef {
  name: String,
  type_id: Option<TypeId>, // `None` for fieldless variants
  discriminant: Option<u8>,
}

// ... and so on ...

My thought was codegen proceeds in 2 steps:

  1. Parse .rs files and build Types.
  2. Generate code from Types (without needing original syn AST).

The types above are similar to what layout_inspect uses (and is what AST transfer will need): https://github.com/overlookmotel/layout_inspect/blob/master/layout_inspect/src/defs.rs

StructDef etc can have more fields added later on if we need them, but I think initially this is all that's needed.

Types could alternatively be a hashmap keyed by type name, but I thought a Vec which is indexed into would be simpler. It may come in useful later for every AST type to have a simple integer ID.

Does this make any sense? What do you think?

@rzvxa
Copy link
Contributor Author

rzvxa commented Jun 22, 2024

What I had in mind for the "schema" is more abstracted, and would not use syn's types (kind of an IR). e.g.:

// `TypeId` is an index into `Types`
type TypeId = usize;
struct Types(Vec<TypeDef>);

enum TypeDef {
  Struct(StructDef),
  Enum(EnumDef),
  Box(BoxDef),
  Vec(VecDef),
  Option(OptionDef),
  Cell(CellDef),
  Primitive(PrimitiveDef),
  StrRef(StrRefDef),
  // I think that's all of them?
}

struct StructDef {
  name: String,
  fields: Vec<StructFieldDef>,
  has_lifetime: bool,
}

struct StructFieldDef {
  name: String,
  type_id: TypeId,
}

struct EnumDef {
  name: String,
  variants: Vec<EnumVariantDef>,
  // For `@inherits` inherited enum variants
  inherits: Vec<TypeId>,
  has_lifetime: bool,
}

struct EnumVariantDef {
  name: String,
  type_id: Option<TypeId>, // `None` for fieldless variants
  discriminant: Option<u8>,
}

// ... and so on ...

My thought was codegen proceeds in 2 steps:

1. Parse `.rs` files and build `Types`.

2. Generate code from `Types` (without needing original `syn` AST).

The types above are similar to what layout_inspect uses (and is what AST transfer will need): https://github.com/overlookmotel/layout_inspect/blob/master/layout_inspect/src/defs.rs

StructDef etc can have more fields added later on if we need them, but I think initially this is all that's needed.

Types could alternatively be a hashmap keyed by type name, but I thought a Vec which is indexed into would be simpler. It may come in useful later for every AST type to have a simple integer ID.

Does this make any sense? What do you think?

I'll add the simple output, I keep the syn type in the building process so we can expand macros and generate AST in the future if needed.

you are looking at the Module sturct not the Schema.
I basically drop the syn types and keep their meta after the build as the schema which then can be outputed as either json or binary.

@overlookmotel
Copy link
Contributor

you are looking at the Module sturct not the Schema.
I basically drop the syn types and keep their meta after the build as the schema which then can be outputed as either json or binary.

Sorry if I misunderstood. I don't quite understand what you're doing, but no matter. I'm sure you have a plan!

@rzvxa
Copy link
Contributor Author

rzvxa commented Jun 23, 2024

you are looking at the Module sturct not the Schema.
I basically drop the syn types and keep their meta after the build as the schema which then can be outputed as either json or binary.

Sorry if I misunderstood. I don't quite understand what you're doing, but no matter. I'm sure you have a plan!

It is totally on me, Currently, the code is full of unused types and dirty prototypes, I'm focused on getting the whole pipeline to work and then going back to clean it up.

I also opted for interior mutability and cloning where needed since there aren't many performance concerns in this crate. I thought it would make development easier if we could be a little careless with how we pass around stuff, After all this script only runs once in a while and the runtime is minimal so not having to fight with borrow checker constantly is much more favorable than having a well-optimized build script.

tasks/ast_codegen/schema.json Outdated Show resolved Hide resolved
@rzvxa rzvxa force-pushed the 06-22-feat_ast_gen_add_ast_code_gen branch 3 times, most recently from 8922486 to 2f96e2a Compare June 23, 2024 09:12
@rzvxa rzvxa changed the base branch from main to 06-23-chore_ast_move_the_rest_of_ast_implementations_to_ast_impl June 23, 2024 09:12
@Boshen Boshen force-pushed the 06-23-chore_ast_move_the_rest_of_ast_implementations_to_ast_impl branch from 89d63cc to f029273 Compare June 23, 2024 09:56
@Boshen Boshen changed the base branch from 06-23-chore_ast_move_the_rest_of_ast_implementations_to_ast_impl to main June 23, 2024 10:01
@Boshen Boshen force-pushed the 06-22-feat_ast_gen_add_ast_code_gen branch from 37b34fc to 50fdc91 Compare June 23, 2024 10:01
@rzvxa rzvxa force-pushed the 06-22-feat_ast_gen_add_ast_code_gen branch from 50fdc91 to ee085f6 Compare June 23, 2024 11:34
@rzvxa
Copy link
Contributor Author

rzvxa commented Jun 23, 2024

@Boshen @overlookmotel This is almost ready, You may want to give it a spin.

@rzvxa rzvxa force-pushed the 06-22-feat_ast_gen_add_ast_code_gen branch from 2f010c1 to 835e59c Compare June 23, 2024 15:32
@rzvxa rzvxa changed the base branch from main to 06-23-refactor_ast_add_span_field_to_the_bindingpattern_type June 23, 2024 15:32
@Boshen Boshen force-pushed the 06-23-refactor_ast_add_span_field_to_the_bindingpattern_type branch from 7c0eb38 to 363d3d5 Compare June 23, 2024 16:00
@rzvxa rzvxa requested a review from Boshen June 24, 2024 14:59
@Boshen
Copy link
Member

Boshen commented Jun 24, 2024

@overlookmotel can you help me review and merge this one? This is beyond me 😅

I'll chime in when the edit - save workflow is up and running.

@rzvxa
Copy link
Contributor Author

rzvxa commented Jun 24, 2024

@overlookmotel I suggest you only review #3852 there is some stuff I've changed after this PR and I was too lazy to cherry-pick and merge them here.

@rzvxa
Copy link
Contributor Author

rzvxa commented Jun 24, 2024

@overlookmotel Just to make your life easier, I'm adding comments to help with the review process, I'll mention you in up stack PR as soon as I submit my docs commit.

@Boshen Boshen changed the title feat(tasks/ast_codegen): add ast code gen. feat(tasks/ast_codegen): add ast codegen Jun 24, 2024
@Boshen Boshen changed the title feat(tasks/ast_codegen): add ast codegen feat(tasks/ast_codegen): prototype for codegen AST related code Jun 24, 2024
@rzvxa rzvxa force-pushed the 06-22-feat_ast_gen_add_ast_code_gen branch from 8c3a170 to f014d23 Compare June 24, 2024 15:29
@rzvxa rzvxa marked this pull request as draft June 25, 2024 08:59
@rzvxa rzvxa force-pushed the 06-22-feat_ast_gen_add_ast_code_gen branch 2 times, most recently from d22b9d0 to b6e7c2b Compare June 25, 2024 09:30
@rzvxa rzvxa changed the base branch from main to remove_span_from_binding_pattern June 25, 2024 09:31
@rzvxa rzvxa marked this pull request as ready for review June 25, 2024 09:32
@Boshen Boshen force-pushed the remove_span_from_binding_pattern branch from 0599a1d to 1f85f1a Compare June 25, 2024 09:43
@Boshen Boshen changed the base branch from remove_span_from_binding_pattern to main June 25, 2024 09:47
@Boshen Boshen force-pushed the 06-22-feat_ast_gen_add_ast_code_gen branch from b6e7c2b to 47b89bd Compare June 25, 2024 09:47
@rzvxa rzvxa force-pushed the 06-22-feat_ast_gen_add_ast_code_gen branch from 47b89bd to e7b0225 Compare June 25, 2024 10:36
Copy link

graphite-app bot commented Jun 25, 2024

Merge activity

@Boshen Boshen force-pushed the 06-22-feat_ast_gen_add_ast_code_gen branch from e7b0225 to f6c4ec4 Compare June 25, 2024 13:55
@graphite-app graphite-app bot merged commit f6c4ec4 into main Jun 25, 2024
24 checks passed
@graphite-app graphite-app bot deleted the 06-22-feat_ast_gen_add_ast_code_gen branch June 25, 2024 13:59
overlookmotel pushed a commit that referenced this pull request Jul 20, 2024
Mark everything mentioned in #3815 (comment) as AST.

We now error on the occurrence of non-ast items in the source of truth. It doesn't make sure that all fields and variants are `#[ast]` and therefore `repr_stable` but there are only a handful of non-AST types used here(mainly Atom and Span). Since we don't have access to the external types we can't make sure of it unless we find a way to const assert it.

The best we can do until then is to check all field/variant types to be either `#[ast]` or in a white list. I can add this check to the codegen in an upcoming PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ast Area - AST
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants