The introduction of Generators in ES6 has greatly changed the way JavaScript programmers view iterators and provided a new way to solve callback hell
.
The iterator pattern is a commonly used design pattern, but when the iteration rules are complex, maintaining the state within the iterator can be cumbersome. This is where generators come in. What are generators?
Generators: a better way to build Iterators.
With the help of the yield
keyword, we can implement the Fibonacci sequence more elegantly.
function* fibonacci() {
let a = 0, b = 1;
while(true) {
yield a;
[a, b] = [b, a + b];
}
}
yield
can pause the execution flow, which provides the possibility of changing the execution flow. This is similar to Python's coroutine.
The reason why generators can be used to control code flow is that they use yield
to switch the execution paths of two or more generators. This switching is at the statement level, not the function call level. Its essence is CPS transformation.
After yield
, the current call actually ends, and the control has actually been transferred to the function that called the next
method of the generator externally, accompanied by a change in state. So if the external function does not continue to call the next
method, the function where yield
is located is equivalent to stopping at yield
. So to complete the asynchronous operation and continue the function execution, just call the next
method of the generator at the appropriate place, just like the function is executing after pausing.
The processing of generator functions and the yield
keyword is in parser.cc
. Let's take a look at the AST parsing function: Parser::ParseEagerFunctionBody()
3928 ZoneList<Statement*>* Parser::ParseEagerFunctionBody(
3929 const AstRawString* function_name, int pos, Variable* fvar,
3930 Token::Value fvar_init_op, FunctionKind kind, bool* ok) {
3931 .....
3954 // For generators, allocate and yield an iterator on function entry.
3955 if (IsGeneratorFunction(kind)) {
3956 ZoneList<Expression*>* arguments =
3957 new(zone()) ZoneList<Expression*>(0, zone());
3958 CallRuntime* allocation = factory()->NewCallRuntime(
3959 ast_value_factory()->empty_string(),
3960 Runtime::FunctionForId(Runtime::kCreateJSGeneratorObject), arguments,
3961 pos);
3962 VariableProxy* init_proxy = factory()->NewVariableProxy(
3963 function_state_->generator_object_variable());
3964 Assignment* assignment = factory()->NewAssignment(
3965 Token::INIT_VAR, init_proxy, allocation, RelocInfo::kNoPosition);
3966 VariableProxy* get_proxy = factory()->NewVariableProxy(
3967 function_state_->generator_object_variable());
3968 Yield* yield = factory()->NewYield(
3969 get_proxy, assignment, Yield::kInitial, RelocInfo::kNoPosition);
3970 body->Add(factory()->NewExpressionStatement(
3971 yield, RelocInfo::kNoPosition), zone());
3972 }
3973
3974 ParseStatementList(body, Token::RBRACE, false, NULL, CHECK_OK);
3975
3976 if (IsGeneratorFunction(kind)) {
3977 VariableProxy* get_proxy = factory()->NewVariableProxy(
3978 function_state_->generator_object_variable());
3979 Expression* undefined =
3980 factory()->NewUndefinedLiteral(RelocInfo::kNoPosition);
3981 Yield* yield = factory()->NewYield(get_proxy, undefined, Yield::kFinal,
3982 RelocInfo::kNoPosition);
3983 body->Add(factory()->NewExpressionStatement(
3984 yield, RelocInfo::kNoPosition), zone());
3985 }
3986 ...
L3955 determines whether it is a generator function. ParseStatementList
parses the function body. Note that a generator function is also a function, and in V8, it is also represented by JSFunction
.
In the two if function bodies, Yield::kInitial
and Yield::kFinal
two Yield AST nodes are created.
Yield states are:
enum Kind {
kInitial, // The initial yield that returns the unboxed generator object.
The machine code generation (x64 platform) mainly focuses on runtime-generator.cc
and full-codegen-x64.cc
.
runtime-generator.cc
provides stub code segments such as Create
, Suspend
, Resume
, Close
, etc.,
which are used by full-codegen for inline use to generate assembly code.
Let's take a look at RUNTIME_FUNCTION(Runtime_CreateJSGeneratorObject)
,
14 RUNTIME_FUNCTION(Runtime_CreateJSGeneratorObject) {
15 HandleScope scope(isolate);
16 DCHECK(args.length() == 0);
17
18 JavaScriptFrameIterator it(isolate);
19 JavaScriptFrame* frame = it.frame();
20 Handle<JSFunction> function(frame->function());
21 RUNTIME_ASSERT(function->shared()->is_generator());
22
23 Handle<JSGeneratorObject> generator;
24 if (frame->IsConstructor()) {
25 generator = handle(JSGeneratorObject::cast(frame->receiver()));
26 } else {
27 generator = isolate->factory()->NewJSGeneratorObject(function);
28 }
29 generator->set_function(*function);
30 generator->set_context(Context::cast(frame->context()));
31 generator->set_receiver(frame->receiver());
32 generator->set_continuation(0);
33 generator->set_operand_stack(isolate->heap()->empty_fixed_array());
34 generator->set_stack_handler_index(-1);
35
36 return *generator;
37 }
The function creates a JSGeneratorObject
object to store JSFunction
, Context
, and pc pointer based on the current Frame,
and sets the operand stack to empty.
After yield
, the current execution environment is actually saved. L74 saves the current operand stack and saves it to the JSGeneratorObject object.
40 RUNTIME_FUNCTION(Runtime_SuspendJSGeneratorObject) {
41 HandleScope handle_scope(isolate);
42 DCHECK(args.length() == 1);
43 CONVERT_ARG_HANDLE_CHECKED(JSGeneratorObject, generator_object, 0);
44
45 JavaScriptFrameIterator stack_iterator(isolate);
46 JavaScriptFrame* frame = stack_iterator.frame();
47 RUNTIME_ASSERT(frame->function()->shared()->is_generator());
48 DCHECK_EQ(frame->function(), generator_object->function());
49
50 // The caller should have saved the context and continuation already.
51 DCHECK_EQ(generator_object->context(), Context::cast(frame->context()));
52 DCHECK_LT(0, generator_object->continuation());
53
54 // We expect there to be at least two values on the operand stack: the return
55 // value of the yield expression, and the argument to this runtime call.
56 // Neither of those should be saved.
57 int operands_count = frame->ComputeOperandsCount();
58 DCHECK_GE(operands_count, 2);
59 operands_count -= 2;
60
61 if (operands_count == 0) {
62 // Although it's semantically harmless to call this function with an
63 // operands_count of zero, it is also unnecessary.
64 DCHECK_EQ(generator_object->operand_stack(),
65 isolate->heap()->empty_fixed_array());
66 DCHECK_EQ(generator_object->stack_handler_index(), -1);
67 // If there are no operands on the stack, there shouldn't be a handler
68 // active either.
69 DCHECK(!frame->HasHandler());
70 } else {
71 int stack_handler_index = -1;
72 Handle<FixedArray> operand_stack =
73 isolate->factory()->NewFixedArray(operands_count);
74 frame->SaveOperandStack(*operand_stack, &stack_handler_index);
75 generator_object->set_operand_stack(*operand_stack);
76 generator_object->set_stack_handler_index(stack_handler_index);
77 }
78
79 return isolate->heap()->undefined_value();
80 }
L108 sets the PC offset of the current frame. L118 restores the operand stack, and L126-L130 returns the value based on the restored mode.
90 RUNTIME_FUNCTION(Runtime_ResumeJSGeneratorObject) {
91 SealHandleScope shs(isolate);
92 DCHECK(args.length() == 3);
93 CONVERT_ARG_CHECKED(JSGeneratorObject, generator_object, 0);
94 CONVERT_ARG_CHECKED(Object, value, 1);
95 CONVERT_SMI_ARG_CHECKED(resume_mode_int, 2);
96 JavaScriptFrameIterator stack_iterator(isolate);
97 JavaScriptFrame* frame = stack_iterator.frame();
98
99 DCHECK_EQ(frame->function(), generator_object->function());
100 DCHECK(frame->function()->is_compiled());
101
102 STATIC_ASSERT(JSGeneratorObject::kGeneratorExecuting < 0);
103 STATIC_ASSERT(JSGeneratorObject::kGeneratorClosed == 0);
104
105 Address pc = generator_object->function()->code()->instruction_start();
106 int offset = generator_object->continuation();
107 DCHECK(offset > 0);
108 frame->set_pc(pc + offset);
109 ...
113 generator_object->set_continuation(JSGeneratorObject::kGeneratorExecuting);
114
115 FixedArray* operand_stack = generator_object->operand_stack();
116 int operands_count = operand_stack->length();
117 if (operands_count != 0) {
118 frame->RestoreOperandStack(operand_stack,
119 generator_object->stack_handler_index());
120 generator_object->set_operand_stack(isolate->heap()->empty_fixed_array());
121 generator_object->set_stack_handler_index(-1);
122 }
123
124 JSGeneratorObject::ResumeMode resume_mode =
125 static_cast<JSGeneratorObject::ResumeMode>(resume_mode_int);
126 switch (resume_mode) {
127 case JSGeneratorObject::NEXT:
128 return value;
129 case JSGeneratorObject::THROW:
130 return isolate->Throw(value);
131 }
132 ...
133 }
这边我们关注下 args 参数, args[0]是JSGeneratorObject
对象generator_object
, args[1]是Object 对象
value
, 也就是 next
的返回值,args[2]是表示 resume 模式的值。
对应的我们看到 FullCodeGenerator::EmitGeneratorResume()
中的这几行代码:
2296 __ Push(rbx);
2297 __ Push(result_register());
2298 __ Push(Smi::FromInt(resume_mode));
2299 __ CallRuntime(Runtime::kResumeJSGeneratorObject, 3);
L2297从 result 寄存器中取出 value, L2299调用 RUNTIME_FUNCTION(Runtime_ResumeJSGeneratorObject)
。
这样,从 yield value 到 g.next() 取出 value, 相信大家有了一个大概的认知了。
我们看到node.js依托 v8层面实现了协程,有兴趣的同学可以关心下 fibjs, 它是用 C库实现了协程,遇到异步调用就 "yield" 放弃 CPU, 交由协程调度,也解决了 callback hell 的问题。 本质思想上两种方案没本质区别:
- Generator是利用yield特殊关键字来暂停执行,而fibers是利用Fiber.yield()暂停
- Generator是利用函数返回的Generator句柄来控制函数的继续执行,而fibers是在异步回调中利用Fiber.current.run()继续执行。