[RFC007] Bytecode interpreter #2045

Status: Open. Wants to merge 20 commits into base: master.


Member

@yannham yannham commented Sep 17, 2024

Although the name is a bit pompous, the goal of this RFC is mostly to be a working document for designing a more compact and efficient run-time representation for Nickel expressions.

While this is something that won't be user-facing (at least in a direct way), and thus can be changed later without breaking backward-compatibility, I think the technical scope of this effort is such that I find it better to discuss it formally here before going for a first implementation.

Contributor

github-actions bot commented Sep 17, 2024

🐰 Bencher Report

Branch: 2045/merge
Testbed: ubuntu-latest

⚠️ WARNING: The Latency measure does not have a Threshold. Without a Threshold, no Alerts will ever be generated!

| Benchmark | Latency (ns) |
|---|---|
| fibonacci 10 | 485,840.00 |
| foldl arrays 50 | 1,805,200.00 |
| foldl arrays 500 | 6,850,000.00 |
| foldr strings 50 | 7,166,100.00 |
| foldr strings 500 | 62,580,000.00 |
| generate normal 250 | 45,525,000.00 |
| generate normal 50 | 2,089,000.00 |
| generate normal unchecked 1000 | 3,370,000.00 |
| generate normal unchecked 200 | 746,970.00 |
| pidigits 100 | 3,209,200.00 |
| pipe normal 20 | 1,495,000.00 |
| pipe normal 200 | 10,120,000.00 |
| product 30 | 827,560.00 |
| scalar 10 | 1,509,600.00 |
| sum 30 | 826,840.00 |

@yannham yannham marked this pull request as ready for review October 15, 2024 13:06
Member Author

yannham commented Oct 15, 2024

Some parts might need refinement, but I think it's in a good shape for a first round of reviews.

rfcs/007-bytecode-interpreter.md: two outdated review threads (resolved)
Member

@aspiwack aspiwack left a comment

Some random comments.

Comment on lines +209 to +211
The following notes on the memory representation apply to the native code
backend's representation. I'm not sure how closures are represented in the Zinc
Abstract Machine.
Member

I'm not sure it ought to be called Zinc anymore, but anyway: I'm pretty sure that the representation of values (including closures) is the same in native code and bytecode. It must be so, at least to some degree, for the sake of the FFI, where OCaml values can be manipulated.

Comment on lines +238 to +239
no argument (the tag byte then doesn't store the actual constructor's tag but has
the same value as for a boxed `int`). For a variant with parameters, the tag
Member

This is not quite it. A constructor without argument is represented as an unboxed integer value. It doesn't point to a block, so there is no tag involved. Constructors with arguments are pointers, and point to a structure as above.
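
To make the distinction concrete, here is a minimal Rust model of an OCaml-style uniform word. It is only a sketch: `Word`, `Heap`, `Block` and the `index * 8` address scheme are hypothetical names and choices made for illustration, not OCaml's runtime API. The one fact it relies on is that immediates (integers and argument-less constructors) have their low bit set, while block pointers are aligned and therefore even.

```rust
/// A machine word in an OCaml-style uniform representation (model only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Word(pub usize);

impl Word {
    /// Encode an immediate: an integer `n`, or the `n`-th argument-less
    /// constructor. The payload is shifted left and the low bit is set.
    pub fn immediate(n: usize) -> Word {
        Word((n << 1) | 1)
    }

    /// True when the word is an immediate, i.e. NOT a pointer to a block.
    pub fn is_immediate(self) -> bool {
        (self.0 & 1) == 1
    }

    /// Decode an immediate, or `None` if the word is a block pointer.
    pub fn as_immediate(self) -> Option<usize> {
        if self.is_immediate() {
            Some(self.0 >> 1)
        } else {
            None
        }
    }
}

/// A heap block: only constructors WITH arguments get one, and only blocks
/// carry a tag in their header.
pub struct Block {
    pub tag: u8,
    pub fields: Vec<Word>,
}

/// Toy heap (assumption): block `i` lives at the even address `i * 8`.
pub struct Heap {
    blocks: Vec<Block>,
}

impl Heap {
    pub fn new() -> Heap {
        Heap { blocks: Vec::new() }
    }

    /// Allocate a block and return an (even) pointer word to it.
    pub fn alloc(&mut self, tag: u8, fields: Vec<Word>) -> Word {
        let addr = self.blocks.len() * 8;
        self.blocks.push(Block { tag, fields });
        Word(addr)
    }

    /// Follow a pointer word; immediates have no block behind them.
    pub fn block(&self, w: Word) -> Option<&Block> {
        if w.is_immediate() {
            None
        } else {
            self.blocks.get(w.0 / 8)
        }
    }
}
```

For a type such as `Empty | Dot | Circle of int`, `Empty` and `Dot` would be `Word::immediate(0)` and `Word::immediate(1)`, with no block and no tag, while `Circle 5` would be `heap.alloc(0, vec![Word::immediate(5)])`.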

Member Author

You're right, I don't know where I got this idea

Comment on lines +299 to +300
Despite not being advertised, Haskell has an interpreter as well, which is used
mostly for the GHCi REPL. This section describes what we know of the actual the
Member

And Template Haskell.

Comment on lines +316 to +317
code), such that thunk access is uniform: it's an unconditional jump to the
corresponding code.
Member

In fact, when pointing to an info table, you actually point directly to the code pointer. The metadata in the info table is accessed backwards by subtracting from the pointer. This way entering a thunk is an indirect jump, which many processors support in a single instruction.
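
The layout described in this comment can be modelled in safe Rust with a flat vector standing in for the code segment. This is a sketch under assumptions: `CodeSegment`, `Slot`, `InfoPtr`, the two metadata words and `example_entry` are all hypothetical, and a real implementation works on raw machine addresses rather than vector indices. What it shows is that the info pointer designates the entry code itself, and the metadata is found by stepping backwards from it.

```rust
/// One word of the modelled code segment: metadata or entry code.
#[derive(Clone, Copy)]
pub enum Slot {
    Meta(u32),
    Code(fn() -> i64),
}

/// Model of a code segment where each info table sits right before its code.
pub struct CodeSegment {
    slots: Vec<Slot>,
}

/// An info pointer: the index of the ENTRY CODE, not of the table start.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct InfoPtr(usize);

impl CodeSegment {
    pub fn new() -> Self {
        CodeSegment { slots: Vec::new() }
    }

    /// Emit two (hypothetical) metadata words, then the entry code, and
    /// return a pointer to the code.
    pub fn emit(&mut self, closure_type: u32, ptr_fields: u32, entry: fn() -> i64) -> InfoPtr {
        self.slots.push(Slot::Meta(closure_type)); // at info - 2
        self.slots.push(Slot::Meta(ptr_fields)); // at info - 1
        self.slots.push(Slot::Code(entry)); // at info
        InfoPtr(self.slots.len() - 1)
    }

    /// Entering a closure needs no offset arithmetic: jump through the pointer.
    pub fn enter(&self, info: InfoPtr) -> i64 {
        match self.slots[info.0] {
            Slot::Code(entry) => entry(),
            Slot::Meta(_) => panic!("an info pointer must point at entry code"),
        }
    }

    /// Metadata is read at a negative offset from the info pointer.
    fn meta(&self, info: InfoPtr, back: usize) -> u32 {
        match self.slots[info.0 - back] {
            Slot::Meta(word) => word,
            Slot::Code(_) => panic!("expected a metadata word"),
        }
    }

    pub fn ptr_fields(&self, info: InfoPtr) -> u32 {
        self.meta(info, 1)
    }

    pub fn closure_type(&self, info: InfoPtr) -> u32 {
        self.meta(info, 2)
    }
}

/// Stand-in for compiled thunk code.
pub fn example_entry() -> i64 {
    42
}
```

The trade-off the model illustrates is that the hot path (`enter`) is a plain indirect jump, while the colder paths that need metadata, such as the garbage collector, pay for the subtraction.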

Member Author

Ah, right; I think it's mentioned in the STG paper that this was difficult to do with the approach proposed there going through ANSI C, but easy to do with a custom native code generation backend, which I suppose is what GHC does today.

Comment on lines +324 to +327
The STG paper argues that this uniform thunk representation (with "self-handled
update") simplifies the compilation process and gives room for some specific
optimizations (vectored return for pattern matching, for example) that should be
beneficial to Haskell programs.
Member

It's worth noting that since the STG paper was written, things have become a little more complex. GHC uses pointer tagging to mark whether a thunk has already been forced, by recording in the low bits of the pointer which of the 3 (on 32 bits) or 7 (on 64 bits) constructors of the data type it evaluated to. Pattern matching therefore checks these bits before entering the closure, for efficiency.
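
The check described here can be sketched in Rust for the 64-bit case. The names (`tag_pointer`, `scrutinize`, `Scrutiny`) are made up for illustration and this is not GHC's code; the sketch assumes 8-byte-aligned closures, so that the low 3 bits are free, and assumes that a tag of 0 means "not known to be evaluated" while tags 1 to 7 name a constructor.

```rust
/// Number of low pointer bits available for the tag on a 64-bit target.
pub const TAG_BITS: u32 = 3;
/// Mask selecting the tag bits (7 on 64 bits).
pub const TAG_MASK: usize = (1 << TAG_BITS) - 1;

/// Attach a constructor tag (1..=7) to an aligned closure address.
pub fn tag_pointer(addr: usize, constructor: usize) -> usize {
    assert!((addr & TAG_MASK) == 0, "closure addresses must be aligned");
    assert!(constructor >= 1 && constructor <= TAG_MASK, "tag does not fit");
    addr | constructor
}

/// Read the tag bits of a possibly tagged pointer.
pub fn pointer_tag(p: usize) -> usize {
    p & TAG_MASK
}

/// Strip the tag bits to recover the real address.
pub fn untag(p: usize) -> usize {
    p & !TAG_MASK
}

/// What a pattern match learns from the pointer alone.
#[derive(Debug, PartialEq, Eq)]
pub enum Scrutiny {
    /// Already evaluated: branch on the constructor without entering.
    Evaluated { constructor: usize, addr: usize },
    /// Unknown state: the closure has to be entered (and possibly forced).
    MustEnter { addr: usize },
}

/// Check the tag bits before deciding whether to enter the closure.
pub fn scrutinize(p: usize) -> Scrutiny {
    match pointer_tag(p) {
        0 => Scrutiny::MustEnter { addr: untag(p) },
        constructor => Scrutiny::Evaluated { constructor, addr: untag(p) },
    }
}
```

With this scheme, matching on an already evaluated value avoids the indirect jump into the closure entirely, which is the efficiency gain mentioned in the comment.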

environment in the case of closures). As each constructor usage potentially
generates very similar code, GHC is smart enough to share common constructors
instead of generating them again and again (typically the one for an empty
list).
Member

An interesting difference between GHC's memory representation and OCaml's is the way unboxed and boxed values are distinguished. This is only a concern for the GC, so maybe it doesn't matter too much for this discussion.

  • In OCaml, if a value, viewed as an unsigned int, is odd, then it's unboxed (and the GC doesn't follow it); when it's even, it's a pointer to a boxed value. This way it's always evident, when looking at a value, whether it's a pointer or not. The cost is that integers lose one bit, and that it becomes impossible to unbox floats in most cases.
  • In Haskell, pointers and non-pointer values are indistinguishable, so the GC asks the info table which of the fields are pointers. In practice, this is just a number: all the non-pointer fields are stored first, and all the pointer fields after (or maybe the other way around, I don't remember).
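
The two strategies in the list above can be contrasted from the garbage collector's point of view with a small Rust sketch. All names (`ocaml_int`, `ocaml_style_pointers`, `Layout`, `ghc_style_pointers`) are hypothetical. The field order in the second function (pointers first, then non-pointers) is an assumption of this sketch, since the comment itself leaves the order open.

```rust
/// OCaml-style integer encoding: the low bit marks "not a pointer",
/// which is where the integer loses one bit of range.
pub fn ocaml_int(n: usize) -> usize {
    (n << 1) | 1
}

/// OCaml-style scan: every field is self-describing, so the GC inspects
/// the low bit of each word and only follows the even ones.
pub fn ocaml_style_pointers(fields: &[usize]) -> Vec<usize> {
    fields.iter().copied().filter(|w| (w & 1) == 0).collect()
}

/// What a GHC-style info table tells the GC about a closure's payload.
pub struct Layout {
    pub pointer_fields: usize,
    pub non_pointer_fields: usize,
}

/// GHC-style scan: words are NOT self-describing, so the GC relies on the
/// counts from the info table. Assumption: pointer fields are stored first.
pub fn ghc_style_pointers(layout: &Layout, payload: &[usize]) -> Vec<usize> {
    assert_eq!(
        payload.len(),
        layout.pointer_fields + layout.non_pointer_fields,
        "payload size must match the info table"
    );
    payload[..layout.pointer_fields].to_vec()
}
```

In the second scheme, a raw unboxed word such as `42` can sit in the payload untouched, at full width, because the GC never has to look at its bits to decide whether to follow it.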

machine. Those machines still have a number of differences on how they handle
fundamental operations such as function application. We take inspiration from
the
[call-by-push-value](https://www.cs.bham.ac.uk/~pbl/papers/thesisqmwphd.pdf)(CBPV)
Member

😊

Member Author

@yannham yannham Oct 30, 2024

It's funny because I clearly remember you telling me something like 4 years ago that if Nickel had a VM it should probably be a CBPV one. At that time I knew CBPV rather well but probably didn't have enough fresh VM or native compiler knowledge to really connect the two or really understand what it meant. I might have said something like "ah, ok", but it stuck as an (EDIT: un)evaluated thunk in the back of my head, so to speak.

Well, it took me a bit of time but I can now finally make sense of it 🙂

Member

Maybe I said that, I don't know. But I think my broader point was that I was thinking, at the time, of metadata (doc, default, …) as being attached to thunks. And CBPV helps us understand what's going on, because of thunking being an explicit operation in the language, so you can change what's going on there.

That being said, I always think of any evaluation model in terms of CBPV/polarised system L. Even when I don't present it that way, it's almost certainly a translation of the CBPV I have in my head. And I'm keen to embrace it in abstract machines as well.
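
To illustrate what "thunking being an explicit operation" can look like, here is a deliberately tiny CBPV-flavoured evaluator in Rust. It is a sketch and not Nickel's representation: `Value`, `Comp`, `eval`, `doc_of` and the single `doc` metadata field are hypothetical, and variables, functions and sharing of forced results are left out. The point is only that values and computations are separate syntactic classes, so a thunk is the one place where metadata can be attached and inspected without running anything.

```rust
use std::rc::Rc;

/// Values: inert data. A thunk is a value that suspends a computation,
/// and it is where metadata such as documentation gets attached.
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Num(i64),
    Thunk { doc: Option<String>, body: Rc<Comp> },
}

/// Computations: things that run.
#[derive(Debug, PartialEq)]
pub enum Comp {
    /// Return a value without evaluating anything inside it.
    Return(Value),
    /// Explicitly run a suspended computation.
    Force(Value),
    /// Run both sides, then add the resulting numbers.
    Add(Box<Comp>, Box<Comp>),
}

/// Evaluate a computation to a value.
pub fn eval(c: &Comp) -> Result<Value, String> {
    match c {
        Comp::Return(v) => Ok(v.clone()),
        Comp::Force(Value::Thunk { body, .. }) => eval(body),
        Comp::Force(Value::Num(_)) => Err("cannot force a number".to_string()),
        Comp::Add(left, right) => match (eval(left)?, eval(right)?) {
            (Value::Num(a), Value::Num(b)) => Ok(Value::Num(a + b)),
            _ => Err("can only add numbers".to_string()),
        },
    }
}

/// Read a thunk's metadata without forcing it.
pub fn doc_of(v: &Value) -> Option<&str> {
    match v {
        Value::Thunk { doc, .. } => doc.as_deref(),
        Value::Num(_) => None,
    }
}
```

Returning a thunk leaves its body untouched, while `Comp::Force` is the only construct that runs it, so changing what happens at thunk creation or at forcing is a local change to one constructor each.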

Member

Speaking of, you can think of system L as representing an abstract machine with a structured stack. Implemented directly, it's not as cache friendly as an actual stack machine. But maybe it's a middle ground to consider, as you don't have to come up with a complex linearised continuation representation for pattern matching, in particular.

On the other hand, maybe it's too tree-like for this proposal. I don't know, I haven't thought about it. I just said the name, and this idea popped up in response. Do whatever you want to do with it.

3 participants