fix(gnovm): assign to deref pointers #1919

Closed
wants to merge 18 commits into from

Conversation

@deelawn deelawn commented Apr 11, 2024

Closes #1326.

Some context about the problem this solves

This PR solves an issue where values don't get persisted after assignments to pointer values, where the LHS of the assign statement is a star expression. For example, this statement could cause an issue: *i = 7. This kind of assignment is only a problem when the pointer was previously assigned an address using new or a composite literal, so the assignment above would only be an issue if i was initialized with i = new(int) or, for a composite value, with something like s = &S{}.

Now that the cases where this can be an issue have been laid out, let's look at the underlying cause of the issue. Here is the definition of a pointer value:

type PointerValue struct {
	TV    *TypedValue // escape val if pointer to var.
	Base  Value       // array/struct/block.
	Index int         // list/fields/values index, or -1 or -2 (see below).
	Key   *TypedValue `json:",omitempty"` // for maps.
}

In the case outlined above with pointer assignment using new or a composite literal, the Base of the pointer's underlying value is nil. Let's look at another example:

package asdf

var (
  intPtr *int
  intValue int
  concreteIntPtr *int
)

func Assign() {
  intPtr = new(int)
  concreteIntPtr = &intValue
}

After this package is created and Assign is called, this will be the composition of both intPtr and concreteIntPtr:

intPtr
------
PointerValue {
	TV: TypedValue {
		T: PointerType{}, // abbreviated
		V: PointerValue {
			TV: TypedValue {
				T: PrimitiveType, // approximate
				V: Value, // some int value
			},
			Base: nil,
		},
	},
	Base: Block{}, // abbreviated
	Index: 2, // approximate
}

concreteIntPtr
---------------
PointerValue {
	TV: TypedValue {
		T: PointerType{}, // abbreviated
		V: PointerValue {
			TV: TypedValue {
				T: PrimitiveType, // approximate
				V: Value, // some int value
			},
			Base: Block{}, // abbreviated; same block as outer pointer
			Index: 3, // Index of the int it points to
		},
	},
	Base: Block{}, // abbreviated
	Index: 4, // approximate
}

Notice that the pointer that points to a concrete value has a base and the pointer that points to a value created dynamically using new does not have a base. This is fine, but causes an issue with persisting a value after assignment.
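
To make the distinction concrete, here is a minimal sketch that models the two kinds of pointers. The types are simplified stand-ins, not the real gnolang definitions, and hasBase, newInt, and addrOf are hypothetical helpers introduced only for this illustration.

```go
package main

import "fmt"

// Simplified stand-ins for the VM types (assumptions, not the real definitions).
type Value interface{}

type TypedValue struct {
	V Value
}

type Block struct {
	Values []TypedValue
}

type PointerValue struct {
	TV    *TypedValue
	Base  Value
	Index int
}

// hasBase reports whether the pointer refers to a slot inside a container.
func hasBase(pv PointerValue) bool {
	return pv.Base != nil
}

// newInt models new(int): the value lives on its own, with no container.
func newInt() PointerValue {
	return PointerValue{TV: &TypedValue{V: 0}, Base: nil}
}

// addrOf models taking the address of a variable stored in a block.
func addrOf(b *Block, i int) PointerValue {
	return PointerValue{TV: &b.Values[i], Base: b, Index: i}
}

func main() {
	pkg := &Block{Values: make([]TypedValue, 3)}
	pkg.Values[1] = TypedValue{V: 0} // plays the role of intValue

	intPtr := newInt()
	concreteIntPtr := addrOf(pkg, 1)

	fmt.Println("intPtr has base:", hasBase(intPtr))
	fmt.Println("concreteIntPtr has base:", hasBase(concreteIntPtr))
}
```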

Keeping that in mind, let's next look at how star expressions are resolved to values. Here is the star expression definition:

type StarExpr struct { // *X
	Attributes
	X Expr // operand
}

So when we see a star expression, we will evaluate the expression the star expression is dereferencing; the evaluation op will push this value to the machine stack.

case *StarExpr:
	// evaluate X (a reference)
	m.PushExpr(lx.X)
	m.PushOp(OpEval)

All (I think?) assignment statements will call the machine's PopAsPointer method to retrieve a pointer of the LHS value. In our case with the star expression on the LHS, this will be the value pushed after the evaluation above. The value is popped here:

case *StarExpr:
	ptr := m.PopValue().V.(PointerValue)
	return ptr

This value that is popped is the value of the "inner" PointerValue (see values of intPtr and concreteIntPtr after assignment above). Therefore, if the LHS pointer value was assigned using new or a composite literal, the Base will be nil. This is the problem and reason why assignments with these properties won't end up persisting any values. As an example, consider the standard assignment statement *i = 5, setting one value equal to another. Assignment pops the pointer to the LHS value and sends it and the RHS value to Assign2:

func (m *Machine) doOpAssign() {
	s := m.PopStmt().(*AssignStmt)
	// Assign each value evaluated for Lhs.
	// NOTE: PopValues() returns a slice in
	// forward order, not the usual reverse.
	rvs := m.PopValues(len(s.Lhs))
	for i := len(s.Lhs) - 1; 0 <= i; i-- {
		// Pop lhs value and desired type.
		lv := m.PopAsPointer(s.Lhs[i])
		// XXX HACK (until value persistence impl'd)
		if m.ReadOnly {
			if oo, ok := lv.Base.(Object); ok {
				if oo.GetIsReal() {
					panic("readonly violation")
				}
			}
		}
		lv.Assign2(m.Alloc, m.Store, m.Realm, rvs[i], true)
	}
}

The problem is that Assign2 requires a realm and a target pointer value with a non-nil base in order to do the assignment and mark the values to be persisted:

if rlm != nil && pv.Base != nil {
	oo1 := pv.TV.GetFirstObject(store)
	pv.TV.Assign(alloc, tv2, cu)
	oo2 := pv.TV.GetFirstObject(store)
	rlm.DidUpdate(pv.Base.(Object), oo1, oo2)
} else {

In our case, the pointer value, pv, has no base defined for reasons outlined earlier. This means that, while pv.TV.Assign is still called to do the assignment, DidUpdate is never called. DidUpdate needs to be called so that the parent object (base) is marked as dirty so the realm finalization step knows to check its descendent objects for other dirty and new values to persist.
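
The effect of that branch can be reproduced with a small toy model. This is only a sketch of the shape of the check: Object, Realm, and assign here are simplified, hypothetical stand-ins, and the real Assign2 and DidUpdate take more arguments.

```go
package main

import "fmt"

// Object is a stand-in for a persistable container (assumption).
type Object struct {
	Name  string
	Dirty bool
}

// Realm is a stand-in for the realm that tracks dirty objects (assumption).
type Realm struct{}

// DidUpdate marks the container dirty so that finalization would visit it.
func (r *Realm) DidUpdate(base *Object) { base.Dirty = true }

type PointerValue struct {
	Target *int
	Base   *Object
}

// assign always writes the value, but only marks a container dirty
// when both the realm and the pointer's base are non-nil.
// It returns whether a container was marked.
func assign(rlm *Realm, pv PointerValue, v int) bool {
	*pv.Target = v
	if rlm != nil && pv.Base != nil {
		rlm.DidUpdate(pv.Base)
		return true
	}
	return false
}

func main() {
	rlm := &Realm{}
	block := &Object{Name: "package block"}
	a, b := 0, 0

	withBase := PointerValue{Target: &a, Base: block}
	baseless := PointerValue{Target: &b} // like a pointer from new(int)

	fmt.Println(assign(rlm, withBase, 5), a, block.Dirty) // written and marked
	fmt.Println(assign(rlm, baseless, 5), b)              // written, nothing marked
}
```

In this model both writes succeed in memory, but only the first leaves a dirty container behind for a finalization step to find.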

The Solution

What are the criteria for this problem to occur?

  1. There must be an assignment statement with a star expression on the LHS
  2. The address of the underlying value on the LHS of the statement must have been most recently assigned using new or a composite literal

We can check the first criterion in the preprocessor. When transcribing a star expression, set a flag IsLHS if this transcription step was called from an LHS assignment step.

Next, when calling PushForPointer as part of an assign statement evaluation, check if the star expression is on the LHS. If it is, we want to evaluate a reference to the expression. This means that the value pushed to the stack will not be the "inner pointer", but the full pointer value (see the value of intPtr above). This is important because now, during assignment, we have the outer pointer's Base value at our disposal.

During the assignment operation, PopAsPointer is called; this will pop the reference value that was just evaluated. At this point, the pointer value popped actually refers to intPtr rather than *intPtr. So it checks if this star expression is on the LHS. If it is, it knows that the value needs to be dereferenced. But first it does one more check -- if the base of the dereferenced pointer is nil, assign it the value of the base of the parent pointer. This ensures that the value will be persisted after assignment.
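
The last step described above can be sketched as follows. This is a toy model under stated assumptions: Slot, Object, and derefForAssign are hypothetical names, and the real PopAsPointer works on the machine stack rather than on arguments.

```go
package main

import "fmt"

// Object is a stand-in for a persistable container (assumption).
type Object struct{ Dirty bool }

// PointerValue points at a slot; Base is the container holding that slot.
type PointerValue struct {
	Target *Slot
	Base   *Object
}

// Slot is a stand-in for a TypedValue; it holds an int or a pointer.
type Slot struct {
	Int int
	Ptr *PointerValue
}

// derefForAssign takes the outer pointer (a reference to the variable that
// holds the pointer) and returns the inner pointer. If the inner pointer has
// no base, it inherits the base of the outer pointer.
func derefForAssign(outer PointerValue) PointerValue {
	inner := *outer.Target.Ptr
	if inner.Base == nil {
		inner.Base = outer.Base
	}
	return inner
}

func main() {
	pkgBlock := &Object{}

	// i = new(int): the int slot has no container of its own.
	intSlot := &Slot{}
	iSlot := &Slot{Ptr: &PointerValue{Target: intSlot}}

	// Reference to the variable i itself, which lives in the package block.
	outer := PointerValue{Target: iSlot, Base: pkgBlock}

	lv := derefForAssign(outer)
	lv.Target.Int = 7 // *i = 7
	if lv.Base != nil {
		lv.Base.Dirty = true // stands in for DidUpdate
	}

	fmt.Println(intSlot.Int, pkgBlock.Dirty)
}
```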

Contributors' checklist...
  • Added new tests, or not needed, or not feasible
  • Provided an example (e.g. screenshot) to aid review or the PR is self-explanatory
  • Updated the official documentation or not needed
  • No breaking changes were made, or a BREAKING CHANGE: xxx message was included in the description
  • Added references to related issues and PRs
  • Provided any useful hints for running manual tests
  • Added new benchmarks to generated graphs, if any. More info here.

@github-actions github-actions bot added the 📦 🤖 gnovm Issues or PRs gnovm related label Apr 11, 2024

codecov bot commented Apr 11, 2024

Codecov Report

Attention: Patch coverage is 73.33333%, with 4 lines in your changes missing coverage. Please review.

Project coverage is 58.73%. Comparing base (b2f12a9) to head (9fc4800).
Report is 132 commits behind head on master.

Files Patch % Lines
gnovm/pkg/gnolang/machine.go 73.33% 4 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #1919       +/-   ##
===========================================
+ Coverage   48.25%   58.73%   +10.47%     
===========================================
  Files         408      436       +28     
  Lines       62338    68638     +6300     
===========================================
+ Hits        30081    40314    +10233     
+ Misses      29749    25313     -4436     
- Partials     2508     3011      +503     
Flag Coverage Δ
gno.land 61.64% <ø> (?)
gnovm 60.00% <73.33%> (?)


@github-actions github-actions bot added the 📦 ⛰️ gno.land Issues or PRs gno.land package related label Apr 23, 2024
@deelawn deelawn marked this pull request as ready for review April 23, 2024 16:41
@leohhhn leohhhn changed the title fix: assign to deref pointers fix(gnovm): assign to deref pointers May 1, 2024
@deelawn deelawn requested review from zivkovicmilos and a team as code owners May 31, 2024 20:05
// The star expression is on the LHS, so evaluate the expression as
// a reference. This ensures the value that is pushed is a pointer to
// the pointer value represented by the lx.X expression. This will be
// helpful if the underlying pointer value dos not have a base;

does not have a base

jaekwon commented May 31, 2024

looking at this now.

jaekwon commented Jun 2, 2024

See #2255 for fix.
There are two problems with this PR.

  1. The Base is saved as if modified, even when it is not.
var i = new(int)
func() {
    *i = 2
}

In the above example there are two Blocks: the global block and the inner func block. The line *i = 2 would, according to this PR, require a re-persistence of the global block, but this should not be the case, because the global block's i is still referring to the same pointer, even though its referenced value was modified.

  2. It's not addressing the underlying issue; new(xxx) and &xxx{} type values should be values inside some container object, but the current VM does not give them any container. There are several issues with this, which become apparent when reviewing #2255 (fix: (gnovm) star expr assign for #1919); for example, a PointerValue with nil base persists the PointerValue.Value with copy, so when multiple pointers refer to the same value (as in var j *int = i in the above example) there would now be multiple copies of the same referenced integer value instead of a shared common one.
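
The aliasing concern in the second point can be illustrated with a small model. This is a sketch only: persistedPtr, saveByCopy, and load are hypothetical helpers that imitate "persist the pointee by copy", not the VM's actual serialization.

```go
package main

import "fmt"

// persistedPtr models the stored form of a baseless pointer:
// it embeds a copy of the pointee instead of referring to a shared object.
type persistedPtr struct{ ValueCopy int }

// saveByCopy models persisting a pointer whose base is nil.
func saveByCopy(p *int) persistedPtr { return persistedPtr{ValueCopy: *p} }

// load rebuilds an in-memory pointer; each call allocates a fresh int.
func load(s persistedPtr) *int {
	v := s.ValueCopy
	return &v
}

func main() {
	i := new(int)
	j := i // i and j share one int before persistence

	si, sj := saveByCopy(i), saveByCopy(j)
	i2, j2 := load(si), load(sj)

	*i2 = 2
	fmt.Println(*i2, *j2, i2 == j2) // prints "2 0 false": the sharing is lost
}
```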

deelawn commented Jun 3, 2024

Closing in favor of #2255

@deelawn deelawn closed this Jun 3, 2024
jaekwon added a commit that referenced this pull request Jun 20, 2024
This is a complete solution, alternative to #1919, and (I think) closes
#1326.
It creates a new container for "baseless" (floating) values constructed
via `new(xxx)` or `&struct{}`, which currently do not have a base
containing object for that value, and are currently represented as
PointerValues with .Base set to nil.

The containing object is like a Block but minimal -- it only contains
one Value, and has no Source or Parent. The modifications to realm.go
allow for proper ref-counting so that even when there are multiple
references to the baseless value, and even when the value is primitive,
gc and ref-counting works (since the containing HeapItemValue is
ref-counted). PointerValue.Base should now never be nil.

See also
#1919 (comment) for why
the previous solution doesn't work.

A better optimization than the one mentioned in the comment above, is to
always store the HeapItemValue along with the Value, since the Value's
refcount should always be 1. This is left for the future, after first
checking that this invariant is true.

---------

Co-authored-by: deelawn <dboltz03@gmail.com>
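
As a rough illustration of the container idea described in the commit message above, here is a minimal sketch. The HeapItemValue here is a simplified stand-in with a hypothetical RefCount and Dirty field, and newValue, share, and assign are helpers invented for this example; the actual implementation lives in the gnovm sources and realm.go.

```go
package main

import "fmt"

type TypedValue struct{ V int }

// HeapItemValue is a minimal container holding exactly one value.
// RefCount and Dirty are assumptions made for this sketch.
type HeapItemValue struct {
	Value    TypedValue
	RefCount int
	Dirty    bool
}

// PointerValue always has a base in this model.
type PointerValue struct {
	TV   *TypedValue
	Base *HeapItemValue
}

// newValue models new(xxx) or &xxx{}: the value is wrapped in its own container.
func newValue(v int) PointerValue {
	hiv := &HeapItemValue{Value: TypedValue{V: v}, RefCount: 1}
	return PointerValue{TV: &hiv.Value, Base: hiv}
}

// share models a second pointer to the same value (as in j = i).
func share(pv PointerValue) PointerValue {
	pv.Base.RefCount++
	return pv
}

// assign writes through the pointer and marks the container dirty.
func assign(pv PointerValue, v int) {
	pv.TV.V = v
	pv.Base.Dirty = true
}

func main() {
	i := newValue(0)
	j := share(i)
	assign(i, 2)
	fmt.Println(j.TV.V, i.Base.RefCount, i.Base.Dirty, i.Base == j.Base)
}
```

Because both pointers share one container, a write through either is visible through the other, and only that container (not the enclosing block) needs to be marked.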
gfanton pushed a commit to gfanton/gno that referenced this pull request Jul 23, 2024
Labels
📦 ⛰️ gno.land Issues or PRs gno.land package related 📦 🤖 gnovm Issues or PRs gnovm related
Projects
Status: Done
Status: No status
Development

Successfully merging this pull request may close these issues.

unexpected unreal object when assigning a local variable to a global variable (pointer)
5 participants