Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup the code a bit (a lot) #1595

Merged
merged 8 commits into from
Sep 18, 2023
Merged

Cleanup the code a bit (a lot) #1595

merged 8 commits into from
Sep 18, 2023

Conversation

gtsiam
Copy link
Contributor

@gtsiam gtsiam commented Sep 11, 2023

I was going to work on simplifying the ImportResolver trait, but I got very sidetracked with fixing various paper cuts in the code base. So I thought I'd make a pr with only those changes now so I don't go to merge conflict hell later on.

Don't let the huge diff scare you, the only meaningful change is the first one where I removed the explicit state from the traverse trait. Rust closures can capture state, so it shouldn't be necessary. That said, I had to do some copy-pasting to make it work, as the RecordRows::try_map_state family of methods you had before are incompatible with this. I might (if I find the time) rework the relevant datastructures to use Vecs underneath at some point, which should also drastically bring down the allocations.

The rest is all formatting plus a few Default impls to make clippy happy.

I kept running into this while going through the codebase,
so I thought I'd fix it once and for all.

And no, I did not do this by hand. So I might have missed some cases.
Mainly includes wrapping of comments to the 100 column mark,
as well as some more formatting on macros and strings.
@gtsiam gtsiam changed the title Cleanup the code a ~~bit~~ lot Cleanup the code a bit (a lot) Sep 11, 2023
Copy link
Member

@Radvendii Radvendii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the work! TL;DR let's move the Traverse changes to a separate PR and discuss more.

Formatting

The formatting changes look good.

It seems like a good number of them should be getting caught by cargo fmt. I'll put that on my list of things to look into.

It would be nice also if the comments would get reflowed in the pre-commit-hook / CI. @vkleen mentioned that unstable versions of cargo fmt do that, so maybe we should switch to that.

Traverse

With regard to the Traverse changes, there's definitely something worth fixing here. You've removed the ugliness and code smell from a bunch of places, but not being able to reuse try_map_state is pretty painful (and introduces its own ugliness).

Why don't we merge this PR quickly without the Traverse changes, so we don't build up merge conflicts. Then open a new PR for the Traverse changes and we can discuss there and see if there's a way to make everything nice. I have a couple ideas of how this might work, but I haven't tested them out yet.

We're also having a hacking session tomorrow at 15:15 UTC. I think we'll take a look at the Traverse stuff then; let me know if you want to join for that.

core/src/eval/tests.rs Show resolved Hide resolved
@Radvendii
Copy link
Member

Actually I tested out my first idea and it seemed to work. Take a look at gtsiam#1

@gtsiam
Copy link
Contributor Author

gtsiam commented Sep 13, 2023

We're also having a hacking session tomorrow at 15:15 UTC. I think we'll take a look at the Traverse stuff then; let me know if you want to join for that.

I can't, unfortunately. But I'd say traverse is fine-ish for now. It might be a good idea to convert it to a Visitor similar to the one in syn at some point, but I don't know enough about the code base to say that.

If you want something to look at, I'd say look at RecordRowsF / EnumRowsF. I gave fixing that code a shot, but I didn't know much about some of the code that interacted with (say the UnifRecordRows stuff). The diff became to large and intricate, so I just gave up. Specifically:

pub enum RecordRowsF<Ty, RRows> {
    Empty,
    Extend { row: RecordRowF<Ty>, tail: RRows },
    TailVar(LocIdent),
    TailDyn,
}

becomes:

pub enum RecordRows {
    rows: Vec<RecordRow>,
    tail: Option<LocIdent>,
}

and the same applies to all similar linked list-y code. Unfortunately, Rust isn't Haskell. It has functional aspects, but still.

To quote Learning Rust With Entirely Too Many Linked Lists:

99% of the time you should just use a Vec (array stack), and 99% of the other 1% of the time you should be using a VecDeque (array deque)

Not bashing linked lists in functional languages, but in systems languages they're just a bad idea (at the very least from a memory layout perspective).

It was technically only used in the PartialEq implmentation of Thunk.
Still, I don't think that had any effect.
@gtsiam
Copy link
Contributor Author

gtsiam commented Sep 13, 2023

Actually, I forgot: After making the pr, I found that IdentKind is not used anywhere, so I removed it. I'll push this commit here too while I'm at it.

@gtsiam
Copy link
Contributor Author

gtsiam commented Sep 13, 2023

And I completely forgot to address some of the things in your comment, so here goes:

It seems like a good number of them should be getting caught by cargo fmt. I'll put that on my list of things to look into.

This is largely match_sharedterm!'s fault.

It would be nice also if the comments would get reflowed in the pre-commit-hook / CI. @vkleen mentioned that unstable versions of cargo fmt do that, so maybe we should switch to that.

Didn't know nightly cargo fmt did that! TIL. I use the Rewrap extension.

Why don't we merge this PR quickly without the Traverse changes, ...

I'll consider this nicely solved by your change :)

@yannham
Copy link
Member

yannham commented Sep 13, 2023

Hey @gtsiam, many thanks for contributing! I haven't looked at the code yet (I was out sick), but I'll give it a review today. First things first, I would like to remind this passage in the preamble of CONTRIBUTING.MD:

Before contributing any non-trivial change to this repository, please first check for the existence of related work such as issues or open pull requests. If there are none, it's better to discuss the change you wish to make via an issue, by email, or using any other method with the maintainers of this repository (listed below) before actually submitting something.

Sometimes things are what they are for non obvious technical reasons (and sometimes for no good reason though, beside historical ones, so it's good to kick the anthill 🙂 !). The easiest is probably to join us on the our Matrix (on the Developer chan) for example.

About the linked list representation of row types, it has been done this way because it's more amenable to unification during type inference. You can think of them as a form of difference list (although the concatenation is not really O(1), because with Rust's borrowing it's probably annoying to keep a direct pointer to the tail, but it could be). So we need to keep some form of linked-listness at least, so that we can unify the tail with something new. In this case the right representation isn't totally obvious to me: if you create a lot of one-element lists that you unify together one by one (which is what the current algorithm is doing when unifying two record rows), then you might end up with just a linked list but where elements are one-element Vec<T> instead of just being a T. We could have different representation for Type and UnifType but now this duplicates quite a bit of code.

That being said, the current setting is far from being the only, or even the best solution: we could probably do something better with vectors. But it's just to say that changing the representation to Vec without considering the rest isn't guaranteed to be automatically better (it can, but this has to be measured). It might make the code simpler as well though, so it's worth to investigating. Maybe in a separate PR?

Copy link
Member

@yannham yannham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for cleaning that up 🙂 ! To be frank, it was quite hard to digest. I would suggest, next time, to split things up - here for example separate the traverse change, the removal of IdentKind, and the cosmetic changes. The reason is that not only a big diff is harder to review, but it's even harder when most of it is mechanical but some of it is not only formatting but more semantically meaningful.

That being said, I think it's better for the most part. I left a few comments but I don't see any blocker otherwise.

core/src/error.rs Outdated Show resolved Hide resolved
core/src/error.rs Show resolved Hide resolved
core/src/repl/mod.rs Show resolved Hide resolved
core/src/term/mod.rs Show resolved Hide resolved
@gtsiam
Copy link
Contributor Author

gtsiam commented Sep 17, 2023

Sorry, I've been away from my computer for a few days.

About the linked list representation of row types, it has been done this way because it's more amenable to unification during type inference. ...

Thanks for the explanation! Still, I do think I'll try to kick that particular ant hill with some code sooner or later. Definitely in another pr.

gtsiam and others added 2 commits September 17, 2023 20:15
Co-authored-by: Yann Hamdaoui <yann.hamdaoui@gmail.com>
@yannham
Copy link
Member

yannham commented Sep 18, 2023

@gtsiam got it! I took the freedom to fix the merge conflict in the meantime, so that it's in a state to be merged.

@yannham yannham added this pull request to the merge queue Sep 18, 2023
Merged via the queue into tweag:master with commit 365e14d Sep 18, 2023
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants