-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: proposal: Modularise VM Ops. #794
Conversation
Example showing a proposal to modularise VM functionality into nodes and offload complexity from the VM itself. Signed-off-by: Antonio Navarro Perez <antnavper@gmail.com>
//---------------------------------------- | ||
// main run loop. | ||
|
||
func (m *Machine) doExec(op Op) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method will eventually disappear. It is made to make it compatible with old behavior.
My initial impression is that this approach could increase cognitive complexity. While the complete plan may have merit, the current partial implementation seems inadequate. Could you provide more information about your vision for this pull request and how it will look when fully implemented? Any example of how much a future change will be simplified? |
Navigating through a complex switch statement is exactly what I want to avoid :). These are all the switch cases removed just after refactoring one statement: https://github.com/gnolang/gno/pull/794/files#diff-0830c8c26a4042165bbd447eb4955f11a0789f85ddfe85b794f6db2f26115aa5L1249-L1287 Also this switch cases will disappear and they will integrated with the new structure called by Exec method: https://github.com/gnolang/gno/blob/master/gnovm/pkg/gnolang/op_exec.go#L435-L482
Actual Machine.Run method: Detailsfunc (m *Machine) Run() {
for {
op := m.PopOp()
// TODO: this can be optimized manually, even into tiers.
switch op {
/* Control operators */
case OpHalt:
m.incrCPU(OpCPUHalt)
return
case OpNoop:
m.incrCPU(OpCPUNoop)
continue
case OpExec:
m.incrCPU(OpCPUExec)
m.doOpExec(op)
case OpPrecall:
m.incrCPU(OpCPUPrecall)
m.doOpPrecall()
case OpCall:
m.incrCPU(OpCPUCall)
m.doOpCall()
case OpCallNativeBody:
m.incrCPU(OpCPUCallNativeBody)
m.doOpCallNativeBody()
case OpReturn:
m.incrCPU(OpCPUReturn)
m.doOpReturn()
case OpReturnFromBlock:
m.incrCPU(OpCPUReturnFromBlock)
m.doOpReturnFromBlock()
case OpReturnToBlock:
m.incrCPU(OpCPUReturnToBlock)
m.doOpReturnToBlock()
case OpDefer:
m.incrCPU(OpCPUDefer)
m.doOpDefer()
case OpPanic1:
m.incrCPU(OpCPUPanic1)
m.doOpPanic1()
case OpPanic2:
m.incrCPU(OpCPUPanic2)
m.doOpPanic2()
case OpCallDeferNativeBody:
m.incrCPU(OpCPUCallDeferNativeBody)
m.doOpCallDeferNativeBody()
case OpGo:
m.incrCPU(OpCPUGo)
panic("not yet implemented")
case OpSelect:
m.incrCPU(OpCPUSelect)
panic("not yet implemented")
case OpSwitchClause:
m.incrCPU(OpCPUSwitchClause)
m.doOpSwitchClause()
case OpSwitchClauseCase:
m.incrCPU(OpCPUSwitchClauseCase)
m.doOpSwitchClauseCase()
case OpTypeSwitch:
m.incrCPU(OpCPUTypeSwitch)
m.doOpTypeSwitch()
case OpIfCond:
m.incrCPU(OpCPUIfCond)
m.doOpIfCond()
case OpPopValue:
m.incrCPU(OpCPUPopValue)
m.PopValue()
case OpPopResults:
m.incrCPU(OpCPUPopResults)
m.PopResults()
case OpPopBlock:
m.incrCPU(OpCPUPopBlock)
m.PopBlock()
case OpPopFrameAndReset:
m.incrCPU(OpCPUPopFrameAndReset)
m.PopFrameAndReset()
/* Unary operators */
case OpUpos:
m.incrCPU(OpCPUUpos)
m.doOpUpos()
case OpUneg:
m.incrCPU(OpCPUUneg)
m.doOpUneg()
case OpUnot:
m.incrCPU(OpCPUUnot)
m.doOpUnot()
case OpUxor:
m.incrCPU(OpCPUUxor)
m.doOpUxor()
case OpUrecv:
m.incrCPU(OpCPUUrecv)
m.doOpUrecv()
/* Binary operators */
case OpLor:
m.incrCPU(OpCPULor)
m.doOpLor()
case OpLand:
m.incrCPU(OpCPULand)
m.doOpLand()
case OpEql:
m.incrCPU(OpCPUEql)
m.doOpEql()
case OpNeq:
m.incrCPU(OpCPUNeq)
m.doOpNeq()
case OpLss:
m.incrCPU(OpCPULss)
m.doOpLss()
case OpLeq:
m.incrCPU(OpCPULeq)
m.doOpLeq()
case OpGtr:
m.incrCPU(OpCPUGtr)
m.doOpGtr()
case OpGeq:
m.incrCPU(OpCPUGeq)
m.doOpGeq()
case OpAdd:
m.incrCPU(OpCPUAdd)
m.doOpAdd()
case OpSub:
m.incrCPU(OpCPUSub)
m.doOpSub()
case OpBor:
m.incrCPU(OpCPUBor)
m.doOpBor()
case OpXor:
m.incrCPU(OpCPUXor)
m.doOpXor()
case OpMul:
m.incrCPU(OpCPUMul)
m.doOpMul()
case OpQuo:
m.incrCPU(OpCPUQuo)
m.doOpQuo()
case OpRem:
m.incrCPU(OpCPURem)
m.doOpRem()
case OpShl:
m.incrCPU(OpCPUShl)
m.doOpShl()
case OpShr:
m.incrCPU(OpCPUShr)
m.doOpShr()
case OpBand:
m.incrCPU(OpCPUBand)
m.doOpBand()
case OpBandn:
m.incrCPU(OpCPUBandn)
m.doOpBandn()
/* Expression operators */
case OpEval:
m.incrCPU(OpCPUEval)
m.doOpEval()
case OpBinary1:
m.incrCPU(OpCPUBinary1)
m.doOpBinary1()
case OpIndex1:
m.incrCPU(OpCPUIndex1)
m.doOpIndex1()
case OpIndex2:
m.incrCPU(OpCPUIndex2)
m.doOpIndex2()
case OpSelector:
m.incrCPU(OpCPUSelector)
m.doOpSelector()
case OpSlice:
m.incrCPU(OpCPUSlice)
m.doOpSlice()
case OpStar:
m.incrCPU(OpCPUStar)
m.doOpStar()
case OpRef:
m.incrCPU(OpCPURef)
m.doOpRef()
case OpTypeAssert1:
m.incrCPU(OpCPUTypeAssert1)
m.doOpTypeAssert1()
case OpTypeAssert2:
m.incrCPU(OpCPUTypeAssert2)
m.doOpTypeAssert2()
case OpStaticTypeOf:
m.incrCPU(OpCPUStaticTypeOf)
m.doOpStaticTypeOf()
case OpCompositeLit:
m.incrCPU(OpCPUCompositeLit)
m.doOpCompositeLit()
case OpArrayLit:
m.incrCPU(OpCPUArrayLit)
m.doOpArrayLit()
case OpSliceLit:
m.incrCPU(OpCPUSliceLit)
m.doOpSliceLit()
case OpSliceLit2:
m.incrCPU(OpCPUSliceLit2)
m.doOpSliceLit2()
case OpFuncLit:
m.incrCPU(OpCPUFuncLit)
m.doOpFuncLit()
case OpMapLit:
m.incrCPU(OpCPUMapLit)
m.doOpMapLit()
case OpStructLit:
m.incrCPU(OpCPUStructLit)
m.doOpStructLit()
case OpConvert:
m.incrCPU(OpCPUConvert)
m.doOpConvert()
/* GoNative Operators */
case OpArrayLitGoNative:
m.incrCPU(OpCPUArrayLitGoNative)
m.doOpArrayLitGoNative()
case OpSliceLitGoNative:
m.incrCPU(OpCPUSliceLitGoNative)
m.doOpSliceLitGoNative()
case OpStructLitGoNative:
m.incrCPU(OpCPUStructLitGoNative)
m.doOpStructLitGoNative()
case OpCallGoNative:
m.incrCPU(OpCPUCallGoNative)
m.doOpCallGoNative()
/* Type operators */
case OpFieldType:
m.incrCPU(OpCPUFieldType)
m.doOpFieldType()
case OpArrayType:
m.incrCPU(OpCPUArrayType)
m.doOpArrayType()
case OpSliceType:
m.incrCPU(OpCPUSliceType)
m.doOpSliceType()
case OpChanType:
m.incrCPU(OpCPUChanType)
m.doOpChanType()
case OpFuncType:
m.incrCPU(OpCPUFuncType)
m.doOpFuncType()
case OpMapType:
m.incrCPU(OpCPUMapType)
m.doOpMapType()
case OpStructType:
m.incrCPU(OpCPUStructType)
m.doOpStructType()
case OpInterfaceType:
m.incrCPU(OpCPUInterfaceType)
m.doOpInterfaceType()
case OpMaybeNativeType:
m.incrCPU(OpCPUMaybeNativeType)
m.doOpMaybeNativeType()
/* Statement operators */
case OpAssign:
m.incrCPU(OpCPUAssign)
m.doOpAssign()
case OpAddAssign:
m.incrCPU(OpCPUAddAssign)
m.doOpAddAssign()
case OpSubAssign:
m.incrCPU(OpCPUSubAssign)
m.doOpSubAssign()
case OpMulAssign:
m.incrCPU(OpCPUMulAssign)
m.doOpMulAssign()
case OpQuoAssign:
m.incrCPU(OpCPUQuoAssign)
m.doOpQuoAssign()
case OpRemAssign:
m.incrCPU(OpCPURemAssign)
m.doOpRemAssign()
case OpBandAssign:
m.incrCPU(OpCPUBandAssign)
m.doOpBandAssign()
case OpBandnAssign:
m.incrCPU(OpCPUBandnAssign)
m.doOpBandnAssign()
case OpBorAssign:
m.incrCPU(OpCPUBorAssign)
m.doOpBorAssign()
case OpXorAssign:
m.incrCPU(OpCPUXorAssign)
m.doOpXorAssign()
case OpShlAssign:
m.incrCPU(OpCPUShlAssign)
m.doOpShlAssign()
case OpShrAssign:
m.incrCPU(OpCPUShrAssign)
m.doOpShrAssign()
case OpDefine:
m.incrCPU(OpCPUDefine)
m.doOpDefine()
case OpInc:
m.incrCPU(OpCPUInc)
m.doOpInc()
case OpDec:
m.incrCPU(OpCPUDec)
m.doOpDec()
/* Decl operators */
case OpValueDecl:
m.incrCPU(OpCPUValueDecl)
m.doOpValueDecl()
case OpTypeDecl:
m.incrCPU(OpCPUTypeDecl)
m.doOpTypeDecl()
/* Loop (sticky) operators */
case OpBody:
m.incrCPU(OpCPUBody)
m.doOpExec(op)
case OpForLoop:
m.incrCPU(OpCPUForLoop)
m.doOpExec(op)
case OpRangeIter:
m.incrCPU(OpCPURangeIter)
m.doOpExec(op)
case OpRangeIterArrayPtr:
m.incrCPU(OpCPURangeIterArrayPtr)
m.doOpExec(op)
case OpRangeIterString:
m.incrCPU(OpCPURangeIterString)
m.doOpExec(op)
case OpRangeIterMap:
m.incrCPU(OpCPURangeIterMap)
m.doOpExec(op)
case OpReturnCallDefers:
m.incrCPU(OpCPUReturnCallDefers)
m.doOpReturnCallDefers()
default:
panic(fmt.Sprintf("unexpected opcode %s", op.String()))
}
}
} After the changes: Detailsfunc (m *Machine) Run() {
for {
op := m.PopOp()
// TODO: this can be optimized manually, even into tiers.
switch op {
/* Control operators */
case OpHalt:
m.incrCPU(OpCPUHalt)
return
case OpNoop:
m.incrCPU(OpCPUNoop)
continue
}
cycles := m.PopNode().Exec(m, op)
m.incrCPU(cycles)
}
}
|
|
||
var ErrOpNotSupported = errors.New("operation not supported") | ||
|
||
var assignOps map[Op]func(s *AssignStmt, m *Machine) (cpuCycles int64) = map[Op]func(s *AssignStmt, m *Machine) int64{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maps are expensive and Op
is a uint8
, so change to map to [256]func(...) ...
?
@mvertes, interesting topic for you here. |
As @moul gently asked, here is my view on this proposal. It's sure a good way for me to learn more on this project that I'm just starting to discover. At first, I was reluctant to this change. Why? What is expected then (outside the scope of this PR) is that all the opcode processing functions are so trivial and simple that the compiler would inline them in And this is where I changed my mind. It would work with a real byte code architecture, where all instructions are atomic, or close to. But the instruction set here is not of a real or virtual CPU. It maps the AST instead, with all the control flow constructs ( Probably this PR improves slightly the situation. I am new to this code and was confused to see a VM processing function as a virtual CPU, not an AST processor. Or maybe the refactoring should clarify this ? I'm not sure. |
As we delayed the effort needed to tackle this, I will share some slides I had to help with the explanation in future meetings. You can have a look here: https://hackmd.io/@ajnavarro/BJDfoONL2 |
Description
This is a proposal to refactor the VM and modularize code on the nodes themselves, offloading complexity from the VM.
Why?
IMHO the
Machine
struct is too complex and will benefit from a refactoring modularizing code into its own nodes.This modularization will allow to evolve the VM much faster when adding new language expressions, tests, or fixing bugs.
Also, after implementing this PoC, I discovered that a lot of duplicated code can be avoided.
Another benefit will be that Operations, Nodes using these operations, and costs will all be in the same place.
How?
We can define a new method on the Node interface:
This method will be in charge of analyzing the operation, adding or removing elements into the stack, and returning the CPU usage depending on the executed instructions. This method is temporal, and eventually, instead of receiving the
*Machine
pointer, it will use a context containing the program stack (that can be serialized and mutated), the store (that can be queried), GC, Allocator..., making the VM stateless:Also, all specific code for each node when calling OpExec can ve moved to its specific Node implementation too, simplifying
func (m *Machine) doOpExec(op Op)
method.I think that this change, instead of having a list of Statements and Expressions, will also allow to have everything on the same list of Nodes. We can iterate over that list of nodes and call
Exec
method on each, simplifyingRun()
method.On this PR you can see an example of only refactoring Assign Statements, to let you have an idea about how it will look.
Some cases are not fully covered yet (the main one is related to some logic on
PopStmt()
method), that's why some tests are still failing.I would love to know what you think about this.
Checklist (for contributors)