Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hir #860

Merged
merged 67 commits into from
Apr 20, 2023
Merged

Hir #860

merged 67 commits into from
Apr 20, 2023

Conversation

Y-Nak
Copy link
Member

@Y-Nak Y-Nak commented Mar 16, 2023

Define HIR, implement lowering from AST, and add helper structs to evaluate span lazily.

Architecture

Hir items

The hir_def module is a collection of HIR node definitions.
All items that correspond to the Rust items are defined as a salsa tracked struct so that they can work as a granularity of analysis.
Below is all item's definition.

pub enum ItemKind {
    TopMod(TopLevelMod),
    Mod(Mod),
    Func(Func),
    ExternFunc(ExternFunc),
    Struct(Struct),
    Contract(Contract),
    Enum(Enum),
    TypeAlias(TypeAlias),
    Impl(Impl),
    Trait(Trait),
    ImplTrait(ImplTrait),
    Const(Const),
    Use(Use),
    /// Body is not an `Item`, but this makes it easier for analyzers to handle
    /// it.
    Body(Body),
  • TopMod is the top-level module that corresponds to each file.
  • Mod is an internal module that is defined inside a TopMod by mod my_mod {...}.
  • ExternFunc is a desugared function that is defined inside extern block. extern block doesn't create a scope, so the HIR item definition doesn't contain extern block explicitly.
  • Body represents a function body or nameless body that appears in a constant expression context, e.g., 10 in [u32; 10] or LEN in [1; LEN]. Expressions, statements, and patterns are always stored inside the body's arena, not in the salsa database. Thus a body is a minimum granularity for analysis of the expressions, statements, and patterns.

Three different databases

HirDb

HirDb is the core database for analysis and integration with LSP implementation.
HirDb works mainly as a container for Hir nodes. But it also contains some private tracked functions for lowering.
HirDb can be considered a completely span-agnostic database from the perspective of external crates. In reality, the tracked structs stored in HirDb do contain span information, but they are designed so that they cannot be accessed through HirDb (all fields dependent on spans have private visibility). Furthermore, since lowering from AST (or source file) to HIR can also be considered span-dependent, the visibility of tracked functions that perform lowering is defined as private. However, functions that depend on HirDb may internally call the lowering function. These calls are unavoidable for HIR construction and do not unnecessarily invalidate the cache. Additionally, from external crates, the fact that these internal functions are being called is completely hidden.
It is possible to embed span information in diagnostics without directly depending on spans by using types that implement the LazySpan trait, which will be described later. This allows for each analysis phase to be span-agnostic.

LowerHirDb

LowerHirDb is a marker trait used for lowering source files or ASTs to HIR. All public functions related to lowering take LowerHirDb as an argument. All analysis passes must NOT depend on the db.

SpannedHieDb

SpannedHirDb is a marker trait used for evaluating the LazySpan trait lazily, which will be described later. All public functions related to span information take SpannedHirDb as an argument. All analysis passes must NOT depend on the db.

LazySpan

Types implementing the LazySpan trait provide the ability to extract span information lazily, but types themselves don't directly depend on it. These types are implemented under hir/span and basically correspond one-to-one with each HIR node. For example, LazyFuncSpan corresponds to the Func item. Please see the tests in span/item.rs for more examples. To construct these types, there is no need to depend on SpannedHirDb, but when "evaluating" the types to extract the actual span, you need to use SpannedHirDb. Types implementing LazySpan internally hold a SpanTransitionChain. This chain has a span-independent starting point of the chain and a transition function from the starting point. To evaluate the chain, SpannedHirDb first converts the starting point to a span-dependent structure, then applies the transition function to extract the specific span information.

@Y-Nak Y-Nak force-pushed the hir branch 2 times, most recently from f61edbd to e11d401 Compare March 21, 2023 02:08
@Y-Nak Y-Nak force-pushed the hir branch 2 times, most recently from d168f1f to 689720d Compare April 7, 2023 21:33
@Y-Nak Y-Nak marked this pull request as ready for review April 7, 2023 21:34
@Y-Nak Y-Nak force-pushed the hir branch 2 times, most recently from 0881acf to c983488 Compare April 8, 2023 17:21
@Y-Nak
Copy link
Member Author

Y-Nak commented Apr 8, 2023

A potential issue with types implementing the LazySpan is that they are not coupled with the items defined in hir_def. For example, the following code is clearly redundant:

// Perform name resolution for function parameters
for (i, param) in func.params(..).enumerate() {
    if !resolve_type(param.type()) {
        // Construct the corresponding LazySpan
        let span = func.lazy_span().params().param(i).ty();
        Diagnostics::new(span, ...)
    }
}

To solve this issue, we could create a thin wrapper like the following that includes both the Hir Item and LazySpan:

pub struct SpannedItem<Item, Span> {
    item: Item,
    span: Span,
}
impl SpannedItem {
    pub fn name(db: &dyn HirDb) -> SpanendItem<ItentId, LazySpanAtom> {
        ...
    }

    pub fn  params(db: &dyn HirDb) -> Spanneditem<FnParamListId, LazyFnParamList> {
        ...
    }
}
impl<Item, Span: LazySpan> LazySpan for Spanneditem<Item, LazySpan> {
    fn resolve(&self, db: &dyn SpannedHirDb) -> Span {
        self.span.resolve(db)
    }
}

However, at the moment, it is unclear whether SpannedItem is actually needed, so it has not been implemented. Also, regarding the spans of Expr, Stmt, and Pat, BodySourceMap directly manages them, so this redundancy wouldn't occur. Therefore, I plan to add it once I'm confident that it is a common pattern while writing some analyses.

@Y-Nak Y-Nak force-pushed the hir branch 2 times, most recently from 744195e to c74bb85 Compare April 9, 2023 10:11
/// The root node of the tree is the top level module, which corresponds to the
/// `module_tree::TopLevelModule`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ItemTree {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ItemTree will be replaced with ScopeGraph in the next PR.

@Y-Nak
Copy link
Member Author

Y-Nak commented Apr 19, 2023

@sbillig
I have rewritten the implementation of LazySpan. The change is that I replaced the implementation of TransitionFn from an Fn trait object to:

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub(crate) struct LazyTransitionFn {
    pub(super) f: fn(ResolvedOrigin, LazyArg) -> ResolvedOrigin,
    pub(super) arg: LazyArg,
}

By explicitly holding a function pointer and captured variables as LazyArg, it becomes possible to implement Clone and Eq. Since the captured variables are very limited in the implementation of LazySpan, the codebase should not become bloated even if we hold them as LazyArg explicitly.

Furthermore, I defined DynLazySpan(SpanTransitionChain) as a type corresponding to dyn LazySpan. Since conversions from all types implementing LazySpan to this type are implemented, it should be able to be integrated with salsa more flexibly as an alternative to dyn LazySpan.

@Y-Nak
Copy link
Member Author

Y-Nak commented Apr 19, 2023

Also, I added more precise support to track the node by changing the argument of LazyTransitionFn from SyntaxNode to ResolvedOrigin.
Please see the tests in hir/span/stmt.rs and hir/span/expr.rs as examples of how lowered HIR node can be remapped to the source location.

Copy link
Collaborator

@sbillig sbillig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice. I haven't verified that everything is correct, but I suspect that any issues will become apparent as the analyzer is implemented. The lazy span stuff is clever, if perhaps a bit complicated (though I can't think of a way to avoid such complication of course).

@Y-Nak Y-Nak merged commit 95b2c99 into ethereum:fe-v2 Apr 20, 2023
@Y-Nak Y-Nak deleted the hir branch April 24, 2023 08:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants