# Node.js C++ codebase
Hi! 👋 You've found the C++ code backing Node.js. This README aims to help you
get started working on it and document some idioms you may encounter while
doing so.
## Coding style
Node.js has a document detailing its [C++ coding style][]
that can be helpful as a reference for stylistic issues.
## V8 API documentation
A lot of the Node.js codebase is built around what the underlying JavaScript engine,
V8, provides through its API for embedders. Knowledge of this API can also be
useful when working with native addons for Node.js written in C++, although for
new projects [N-API][] is typically the better alternative.
V8 does not provide much public API documentation beyond what is
available in its C++ header files, most importantly `v8.h`, which can be
accessed online in the following locations:
* On GitHub: [`v8.h` in Node.js][]
* On GitHub: [`v8.h` in V8][]
* On the Chromium project's Code Search application: [`v8.h` in Code Search][]
V8 also provides an [introduction for V8 embedders][],
which can be useful for understanding some of the concepts it uses in its
embedder API.
Important concepts when using V8 are the ones of [`Isolate`][]s and
[JavaScript value handles][].
V8 supports [fast API calls][], which can be useful for improving the
performance in certain cases.
## libuv API documentation
The other major dependency of Node.js is [libuv][], providing
the [event loop][] and other operating system abstractions to Node.js.
There is a [reference documentation for the libuv API][].
## File structure
The Node.js C++ files follow this structure:
The `.h` header files contain declarations, and sometimes definitions that don't
require including other headers (e.g. getters, setters, etc.). They should only
include other `.h` header files and nothing else.
The `-inl.h` header files contain definitions of inline functions from the
corresponding `.h` header file (e.g. functions marked `inline` in the
declaration or `template` functions). They always include the corresponding
`.h` header file, and can include other `.h` and `-inl.h` header files as
needed. It is not mandatory to split out the definitions from the `.h` file
into an `-inl.h` file, but it becomes necessary when there are multiple
definitions and contents of other `-inl.h` files start being used. Therefore, it
is recommended to split a `-inl.h` file when inline functions become longer than
a few lines to keep the corresponding `.h` file readable and clean. All visible
definitions from the `-inl.h` file should be declared in the corresponding `.h`
header file.
The `.cc` files contain definitions of non-inline functions from the
corresponding `.h` header file. They always include the corresponding `.h`
header file, and can include other `.h` and `-inl.h` header files as needed.
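As a rough illustration of the split described above, the following listing shows what would normally be three separate files. The `Counter` class and the file names `counter.h`, `counter-inl.h` and `counter.cc` are hypothetical and do not exist in the Node.js tree; the three parts are concatenated here only so that the sketch compiles on its own.

```cpp
// ---- counter.h (hypothetical): declarations and trivial definitions ----
#include <string>

class Counter {
 public:
  // Trivial getter, short enough to be defined directly in the header.
  int value() const { return value_; }
  // Declared `inline` here; the definition lives in counter-inl.h.
  inline void Increment(int by);
  // Non-inline function; the definition lives in counter.cc.
  std::string ToString() const;

 private:
  int value_ = 0;
};

// ---- counter-inl.h (hypothetical): would #include "counter.h" ----
#include <cassert>

inline void Counter::Increment(int by) {
  // Negative increments are treated as a programming error in this sketch.
  assert(by >= 0);
  value_ += by;
}

// ---- counter.cc (hypothetical): would #include "counter.h" and
// "counter-inl.h" ----
std::string Counter::ToString() const {
  return "Counter(" + std::to_string(value_) + ")";
}
```

Code that only needs the declaration of `Counter` would include `counter.h`, while code that actually calls `Increment()` would include `counter-inl.h`.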
## Helpful concepts
A number of concepts are involved in putting together Node.js on top of V8 and
libuv. This section aims to explain some of them and how they work together.
<a id="isolate"></a>
### `Isolate`
The `v8::Isolate` class represents a single JavaScript engine instance, in
particular a set of JavaScript objects that can refer to each other
(the “heap”).
The `v8::Isolate` is often passed to other V8 API functions, and provides some
APIs for managing the behaviour of the JavaScript engine or querying its
current state or statistics such as memory usage.
V8 APIs are not thread-safe unless explicitly specified. In a typical Node.js
application, the main thread and any `Worker` threads each have one `Isolate`,
and JavaScript objects from one `Isolate` cannot refer to objects from
another `Isolate`.
Garbage collection, as well as other operations that affect the entire heap,
happen on a per-`Isolate` basis.
Typical ways of accessing the current `Isolate` in the Node.js code are:
* Given a `FunctionCallbackInfo` for a [binding function][],
using `args.GetIsolate()`.
* Given a [`Context`][], using `context->GetIsolate()`.
* Given an [`Environment`][], using `env->isolate()`.
### V8 JavaScript values
V8 provides classes that mostly correspond to JavaScript types; for example,
`v8::Value` is a class representing any kind of JavaScript value, with
subclasses such as `v8::Number` (which in turn has subclasses like `v8::Int32`),
`v8::Boolean` or `v8::Object`. Most types are represented by subclasses
of `v8::Object`, e.g. `v8::Uint8Array` or `v8::Date`.
<a id="internal-fields"></a>
### Internal fields
V8 provides the ability to store data in so-called “internal fields” inside
`v8::Object`s that were created as instances of C++-backed classes. The number
of fields needs to be defined when creating that class.
Both JavaScript values and `void*` pointers may be stored in such fields.
In most native Node.js objects, the first internal field is used to store a
pointer to a [`BaseObject`][] subclass, which then contains all relevant
information associated with the JavaScript object.
Typical ways of working with internal fields are:
* `obj->InternalFieldCount()` to look up the number of internal fields for an
object (`0` for regular JavaScript objects).
* `obj->GetInternalField(i)` to get a JavaScript value from an internal field.
* `obj->SetInternalField(i, v)` to store a JavaScript value in an
internal field.
* `obj->GetAlignedPointerFromInternalField(i)` to get a `void*` pointer from an
internal field.
* `obj->SetAlignedPointerInInternalField(i, p)` to store a `void*` pointer in an
internal field.
[`Context`][]s provide the same feature under the name “embedder data”.
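The pattern of keeping a native pointer in the first internal field can be modelled without V8. The sketch below is a toy stand-in, not V8 code: `ToyObject`, `ToyWrap`, `Attach()` and `Unwrap()` are hypothetical names, and only the two pointer accessors borrow their names from the V8 methods listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a v8::Object created from a C++-backed class.
// This is NOT a V8 type; it only models the pointer slots.
class ToyObject {
 public:
  explicit ToyObject(std::size_t internal_field_count)
      : fields_(internal_field_count, nullptr) {}

  std::size_t InternalFieldCount() const { return fields_.size(); }
  void SetAlignedPointerInInternalField(std::size_t i, void* p) {
    fields_.at(i) = p;
  }
  void* GetAlignedPointerFromInternalField(std::size_t i) const {
    return fields_.at(i);
  }

 private:
  std::vector<void*> fields_;
};

// Toy stand-in for a BaseObject subclass that holds the native state.
struct ToyWrap {
  int fd = -1;
};

// Slot 0 holds the pointer back to the native object.
constexpr std::size_t kSlot = 0;

void Attach(ToyObject* obj, ToyWrap* wrap) {
  assert(obj->InternalFieldCount() > kSlot);
  obj->SetAlignedPointerInInternalField(kSlot, wrap);
}

// Returns nullptr for objects without internal fields, which corresponds to
// regular JavaScript objects in the description above.
ToyWrap* Unwrap(const ToyObject& obj) {
  if (obj.InternalFieldCount() <= kSlot) return nullptr;
  return static_cast<ToyWrap*>(obj.GetAlignedPointerFromInternalField(kSlot));
}
```

Checking the field count before reading the slot mirrors the fact that regular JavaScript objects report `0` internal fields.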
<a id="js-handles"></a>
### JavaScript value handles
All JavaScript values are accessed in the V8 API through so-called handles,
of which there are two types: [`Local`][]s and [`Global`][]s.
<a id="local-handles"></a>
#### `Local` handles
A `v8::Local` handle is a temporary pointer to a JavaScript object, where
“temporary” usually means that it is no longer needed after the current function
is done executing. `Local` handles can only be allocated on the C++ stack.
Most of the V8 API uses `Local` handles to work with JavaScript values or return
them from functions.
Whenever a `Local` handle is created, a `v8::HandleScope` or
`v8::EscapableHandleScope` object must exist on the stack. The `Local` is then
added to that scope and deleted along with it.
When inside a [binding function][], a `HandleScope` already exists outside of
it, so there is no need to explicitly create one.
`EscapableHandleScope`s can be used to allow a single `Local` handle to be
passed to the outer scope. This is useful when a function returns a `Local`.
The following JavaScript and C++ functions are mostly equivalent:
```js
function getFoo(obj) {
  return obj.foo;
}
```
```cpp
v8::Local<v8::Value> GetFoo(v8::Local<v8::Context> context,
                            v8::Local<v8::Object> obj) {
  v8::Isolate* isolate = context->GetIsolate();
  v8::EscapableHandleScope handle_scope(isolate);
  // The 'foo_string' handle cannot be returned from this function because
  // it is not “escaped” with `.Escape()`.
  v8::Local<v8::String> foo_string =
      v8::String::NewFromUtf8(isolate, "foo").ToLocalChecked();
  v8::Local<v8::Value> return_value;
  if (obj->Get(context, foo_string).ToLocal(&return_value)) {
    return handle_scope.Escape(return_value);
  } else {
    // There was a JS exception! Handle it somehow.
    return v8::Local<v8::Value>();
  }
}
```
See [exception handling][] for more information about the usage of `.To()`,
`.ToLocalChecked()`, `v8::Maybe` and `v8::MaybeLocal`.
##### Casting local handles
If it is known that a `Local<Value>` refers to a more specific type, it can
be cast to that type using `.As<...>()`:
```cpp
v8::Local<v8::Value> some_value;
// CHECK() is a Node.js utility that works similarly to assert().
CHECK(some_value->IsUint8Array());
v8::Local<v8::Uint8Array> as_uint8 = some_value.As<v8::Uint8Array>();
```
Generally, using `val.As<v8::X>()` is only valid if `val->IsX()` is true, and
failing to follow that rule may lead to crashes.
##### Detecting handle leaks
If it is expected that no `Local` handles should be created within a given
scope unless explicitly within a `HandleScope`, a `SealHandleScope` can be used.
For example, there is a `SealHandleScope` around the event loop, forcing
any functions that are called from the event loop and want to run or access
JavaScript code to create `HandleScope`s.
<a id="global-handles"></a>
#### `Global` handles
A `v8::Global` handle (sometimes also referred to by the name of its parent
class `Persistent`, although use of that is discouraged in Node.js) is a
reference to a JavaScript object that can remain active as long as the engine
instance is active.
Global handles can be either strong or weak. Strong global handles are so-called
“GC roots”, meaning that they will keep the JavaScript object they refer to
alive even if no other objects refer to them. Weak global handles do not do
that, and instead optionally call a callback when the object they refer to
is garbage-collected.
```cpp
v8::Global<v8::Object> reference;
void StoreReference(v8::Isolate* isolate, v8::Local<v8::Object> obj) {
  // Create a strong reference to `obj`.
  reference.Reset(isolate, obj);
}
// Must be called with a HandleScope around it.
v8::Local<v8::Object> LoadReference(v8::Isolate* isolate) {
  return reference.Get(isolate);
}
```
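For readers more familiar with standard C++, the difference between strong and weak references is loosely comparable to `std::shared_ptr` versus `std::weak_ptr`. The sketch below is only an analogy under that assumption and uses no V8 types: V8 reclaims objects through garbage collection rather than reference counting, and `Payload` and `WeakExpiresAfterStrongReset()` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Payload {
  std::string name;
};

// Returns true if the weak reference has expired after the only strong
// reference was dropped.
bool WeakExpiresAfterStrongReset() {
  // The strong reference plays the role of a GC root in this analogy.
  std::shared_ptr<Payload> strong = std::make_shared<Payload>(Payload{"foo"});
  // The weak reference observes the object without keeping it alive.
  std::weak_ptr<Payload> weak = strong;
  assert(!weak.expired());  // Still alive: a strong reference exists.
  strong.reset();           // Drop the only strong reference.
  return weak.expired();    // The object is gone; only the observer remains.
}
```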
##### `Eternal` handles
`v8::Eternal` handles are a special kind of handle similar to `v8::Global`
handles, with the exception that the values they point to are never
garbage-collected while the JavaScript engine instance is alive, even if
the `v8::Eternal` itself is destroyed at some point. This type of handle
is rarely used.
<a id="context"></a>
### `Context`
JavaScript allows multiple global objects and sets of built-in JavaScript
objects (like the `Object` or `Array` functions) to coexist inside the same
heap. Node.js exposes this ability through the [`vm` module][].
V8 refers to each of these global objects and their associated builtins as a
`Context`.
Currently, in Node.js there is one main `Context` associated with an
[`Environment`][] instance, and most Node.js features will only work inside
that context. (The only exceptions at the time of writing are
[`MessagePort`][] objects.) This restriction is not inherent to the design of
Node.js, and a sufficiently committed person could restructure Node.js to
provide built-in modules inside of `vm.Context`s.
Often, the `Context` is passed around for [exception handling][].
Typical ways of accessing the current `Context` in the Node.js code are:
* Given an [`Isolate`][], using `isolate->GetCurrentContext()`.
* Given an [`Environment`][], using `env->context()` to get the `Environment`'s
main context.
<a id="event-loop"></a>
### Event loop
The main abstraction for an event loop inside Node.js is the `uv_loop_t` struct.
Typically, there is one event loop per thread. This includes not only the main
thread and Workers, but also helper threads that may occasionally be spawned
in the course of running a Node.js program.
The current event loop can be accessed using `env->event_loop()` given an
[`Environment`][] instance. The restriction of using a single event loop
is not inherent to the design of Node.js, and a sufficiently committed person
could restructure Node.js to provide e.g. the ability to run parts of Node.js
inside an event loop separate from the active thread's event loop.
<a id="environment"></a>
### `Environment`
Node.js instances are represented by the `Environment` class.
Currently, every `Environment` instance is associated with:
* One [event loop][]
* One [`Isolate`][]
* One main [`Context`][]
The `Environment` class contains a large number of different fields for
different Node.js modules, for example a libuv timer for `setTimeout()` or
the memory for a `Float64Array` that the `fs` module uses for storing data
returned from an `fs.stat()` call.
It also provides [cleanup hooks][] and maintains a list of [`BaseObject`][]
instances.
Typical ways of accessing the current `Environment` in the Node.js code are:
* Given a `FunctionCallbackInfo` for a [binding function][],
using `Environment::GetCurrent(args)`.
* Given a [`BaseObject`][], using `env()` or `self->env()`.
* Given a [`Context`][], using `Environment::GetCurrent(context)`.
This requires that `context` has been associated with the `Environment`
instance, e.g. is the main `Context` for the `Environment` or one of its
`vm.Context`s.
* Given an [`Isolate`][], using `Environment::GetCurrent(isolate)`. This looks
up the current [`Context`][] and then uses that.
<a id="isolate-data"></a>
### `IsolateData`
Every Node.js instance ([`Environment`][]) is associated with one `IsolateData`
instance that contains information about or associated with a given
[`Isolate`][].
#### String table
`IsolateData` contains a list of strings that can be quickly accessed
inside Node.js code, e.g. given an `Environment` instance `env` the JavaScript
string “name” can be accessed through `env->name_string()` without actually
creating a new JavaScript string.
### Platform
Every process that uses V8 has a `v8::Platform` instance that provides some
functionalities to V8, most importantly the ability to schedule work on
background threads.
Node.js provides a `NodePlatform` class that implements the `v8::Platform`
interface and uses libuv for providing background threading abilities.
The platform can be accessed through `isolate_data->platform()` given an
[`IsolateData`][] instance, although that only works when:
* The current Node.js instance was not started by an embedder; or
* The current Node.js instance was started by an embedder whose `v8::Platform`
implementation also implements the `node::MultiIsolatePlatform` interface
and who passed this to Node.js.
<a id="binding-functions"></a>
### Binding functions
C++ functions exposed to JS follow a specific signature. The following example
is from `node_util.cc`:
```cpp
void ArrayBufferViewHasBuffer(const FunctionCallbackInfo<Value>& args) {
  CHECK(args[0]->IsArrayBufferView());
  args.GetReturnValue().Set(args[0].As<ArrayBufferView>()->HasBuffer());
}
```
(Namespaces are usually omitted through the use of `using` statements in the
Node.js source code.)
`args[n]` is a `Local<Value>` that represents the n-th argument passed to the
function. `args.This()` is the `this` value inside this function call.
`args.Holder()` is equivalent to `args.This()` in all use cases inside of
Node.js.
`args.GetReturnValue()` is a placeholder for the return value of the function,
and provides a `.Set()` method that can be called with a boolean, integer,
floating-point number or a `Local<Value>` to set the return value.
Node.js provides various helpers for building JS classes in C++ and/or attaching
C++ functions to the exports of a built-in module:
```cpp
void Initialize(Local<Object> target,
                Local<Value> unused,
                Local<Context> context,
                void* priv) {
  Environment* env = Environment::GetCurrent(context);
  SetMethod(context, target, "getaddrinfo", GetAddrInfo);
  SetMethod(context, target, "getnameinfo", GetNameInfo);
  // 'SetMethodNoSideEffect' means that debuggers can safely execute this
  // function for e.g. previews.
  SetMethodNoSideEffect(context, target, "canonicalizeIP", CanonicalizeIP);
  // ... more code ...
  Isolate* isolate = env->isolate();
  // Building the `ChannelWrap` class for JS:
  Local<FunctionTemplate> channel_wrap =
      NewFunctionTemplate(isolate, ChannelWrap::New);
  // Allow for 1 internal field, see `BaseObject` for details on this:
  channel_wrap->InstanceTemplate()->SetInternalFieldCount(1);
  channel_wrap->Inherit(AsyncWrap::GetConstructorTemplate(env));
  // Set various methods on the class (i.e. on the prototype):
  SetProtoMethod(isolate, channel_wrap, "queryAny", Query<QueryAnyWrap>);
  SetProtoMethod(isolate, channel_wrap, "queryA", Query<QueryAWrap>);
  // ...
  SetProtoMethod(isolate, channel_wrap, "querySoa", Query<QuerySoaWrap>);
  SetProtoMethod(
      isolate, channel_wrap, "getHostByAddr", Query<GetHostByAddrWrap>);
  SetProtoMethodNoSideEffect(isolate, channel_wrap, "getServers", GetServers);
  SetConstructorFunction(context, target, "ChannelWrap", channel_wrap);
}
// Run the `Initialize` function when loading this binding through
// `internalBinding('cares_wrap')` in Node.js's built-in JavaScript code:
NODE_BINDING_CONTEXT_AWARE_INTERNAL(cares_wrap, Initialize)
```
If the C++ binding is loaded during bootstrap, it needs to be registered
with the utilities in `node_external_reference.h`, like this:
```cpp
namespace node {
namespace util {
void RegisterExternalReferences(ExternalReferenceRegistry* registry) {
  registry->Register(GetHiddenValue);
  registry->Register(SetHiddenValue);
  // ... register all C++ functions used to create FunctionTemplates.
}
}  // namespace util
}  // namespace node
// The first argument passed to `NODE_BINDING_EXTERNAL_REFERENCE`,
// which is `util` here, needs to be added to the
// `EXTERNAL_REFERENCE_BINDING_LIST_BASE` list in node_external_reference.h
NODE_BINDING_EXTERNAL_REFERENCE(util, node::util::RegisterExternalReferences)
```
Otherwise, you might see an error message like this when building the
executables:
```console
FAILED: gen/node_snapshot.cc
cd ../../; out/Release/node_mksnapshot out/Release/gen/node_snapshot.cc
Unknown external reference 0x107769200.
<unresolved>
/bin/sh: line 1: 6963 Illegal instruction: 4 out/Release/node_mksnapshot out/Release/gen/node_snapshot.cc
```
You can try using a debugger to symbolicate the external reference. For example,
with lldb's `image lookup --address` command (with gdb it's `info symbol`):
```console
$ lldb -- out/Release/node_mksnapshot out/Release/gen/node_snapshot.cc
(lldb) run
Process 7012 launched: '/Users/joyee/projects/node/out/Release/node_mksnapshot' (x86_64)
Unknown external reference 0x1004c8200.
<unresolved>
Process 7012 stopped
(lldb) image lookup --address 0x1004c8200
Address: node_mksnapshot[0x00000001004c8200] (node_mksnapshot.__TEXT.__text + 5009920)
Summary: node_mksnapshot`node::util::GetHiddenValue(v8::FunctionCallbackInfo<v8::Value> const&) at node_util.cc:159
```
This shows that the unregistered external reference is
`node::util::GetHiddenValue` defined in `node_util.cc`.
<a id="per-binding-state"></a>
#### Per-binding state
Some internal bindings, such as the HTTP parser, maintain internal state that
only affects that particular binding. In that case, one common way to store
that state is through the use of `Realm::AddBindingData`, which gives
binding functions access to an object for storing such state.
That object is always a [`BaseObject`][].
In the binding, call `SET_BINDING_ID()` with an identifier for the binding
type. For example, for `http_parser::BindingData`, the identifier can be
`http_parser_binding_data`.
If the binding should be supported in a snapshot, the id and the
fully-specified class name should be added to the `SERIALIZABLE_BINDING_TYPES`
list in `base_object_types.h`, and the class should implement the serialization
and deserialization methods. See the comments of `SnapshotableObject` on how to
implement them. Otherwise, add the id and the class name to the
`UNSERIALIZABLE_BINDING_TYPES` list instead.
```cpp
// In base_object_types.h, add the binding to either
// UNSERIALIZABLE_BINDING_TYPES or SERIALIZABLE_BINDING_TYPES.
// The second parameter is a descriptive name of the class, which is
// usually the fully-specified class name.
#define UNSERIALIZABLE_BINDING_TYPES(V) \
  V(http_parser_binding_data, http_parser::BindingData)
// In the HTTP parser source code file:
class BindingData : public BaseObject {
 public:
  BindingData(Environment* env, Local<Object> obj) : BaseObject(env, obj) {}
  SET_BINDING_ID(http_parser_binding_data)
  std::vector<char> parser_buffer;
  bool parser_buffer_in_use = false;
  // ...
};
// Available for binding functions, e.g. the HTTP Parser constructor:
static void New(const FunctionCallbackInfo<Value>& args) {
  BindingData* binding_data = Realm::GetBindingData<BindingData>(args);
  new Parser(binding_data, args.This());
}
// ... because the initialization function told the Realm to store the
// BindingData object:
void InitializeHttpParser(Local<Object> target,
                          Local<Value> unused,
                          Local<Context> context,
                          void* priv) {
  Realm* realm = Realm::GetCurrent(context);
  BindingData* const binding_data =
      realm->AddBindingData<BindingData>(context, target);
  if (binding_data == nullptr) return;
  Local<FunctionTemplate> t =
      NewFunctionTemplate(realm->isolate(), Parser::New);
  // ...
}
```
<a id="exception-handling"></a>
### Exception handling
The V8 engine provides multiple features to work with JavaScript exceptions,
as C++ exceptions are disabled inside of Node.js:
#### Maybe types
V8 provides the `v8::Maybe<T>` and `v8::MaybeLocal<T>` types, typically used
as return values from API functions that can run JavaScript code and therefore
can throw exceptions.
Conceptually, the idea is that every `v8::Maybe<T>` is either empty (checked
through `.IsNothing()`) or holds a value of type `T` (checked through
`.IsJust()`). If the `Maybe` is empty, then a JavaScript exception is pending.
A typical way of accessing the value is using the `.To()` function, which
returns a boolean indicating success of the operation (i.e. the `Maybe` not
being empty) and taking a pointer to a `T` to store the value if there is one.
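The contract described above can be sketched with a small stand-in type. `ToyMaybe<T>` below is not the real `v8::Maybe<T>`; it only mimics the three methods named in this section on top of `std::optional` (C++17). `ParseDigits()` and `ParseOrDefault()` are hypothetical helpers, and the parser does no overflow checking.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Toy stand-in for v8::Maybe<T>; NOT the real V8 class.
template <typename T>
class ToyMaybe {
 public:
  ToyMaybe() = default;                                     // Empty.
  explicit ToyMaybe(T value) : value_(std::move(value)) {}  // Holds a value.

  bool IsNothing() const { return !value_.has_value(); }
  bool IsJust() const { return value_.has_value(); }

  // Copies the value into *out and returns true, or returns false and leaves
  // *out untouched when the maybe is empty.
  bool To(T* out) const {
    assert(out != nullptr);
    if (IsNothing()) return false;
    *out = *value_;
    return true;
  }

 private:
  std::optional<T> value_;
};

// Hypothetical operation that can fail: parse a non-negative decimal integer.
ToyMaybe<int> ParseDigits(const std::string& text) {
  if (text.empty()) return ToyMaybe<int>();
  int result = 0;
  for (char c : text) {
    if (c < '0' || c > '9') return ToyMaybe<int>();
    result = result * 10 + (c - '0');
  }
  return ToyMaybe<int>(result);
}

// Typical caller: check the boolean from To() and bail out early on failure.
int ParseOrDefault(const std::string& text, int fallback) {
  int parsed = 0;
  if (!ParseDigits(text).To(&parsed)) return fallback;
  return parsed;
}
```

In real V8 code an empty `Maybe` additionally means that a JavaScript exception is pending, which this toy type does not model.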
##### Checked conversion
`maybe.Check()` can be used to assert that the maybe is not empty, i.e. crash
the process otherwise. `maybe.FromJust()` (aka `maybe.ToChecked()`) can be used
to access the value and crash the process if it is not set.
This should only be done if it is certain that the operation has
not failed. A lot of the Node.js source code does **not** follow this rule, and
can be made to crash as a result.
In particular, it is often not safe to assume that an operation does not throw
an exception, even if it seems like it would not do that.
The most common reasons for this are:
* Calls to functions like `object->Get(...)` or `object->Set(...)` may fail on
most objects, if the `Object.prototype` object has been modified from userland
code that added getters or setters.
* Calls that invoke _any_ JavaScript code, including JavaScript code that is
provided from Node.js internals or V8 internals, will fail when JavaScript
execution is being terminated. This typically happens inside Workers when
`worker.terminate()` is called, but it can also affect the main thread when
e.g. Node.js is used as an embedded library. These exceptions can happen at
any point.
It is not always obvious whether a V8 call will enter JavaScript. In addition
to unexpected getters and setters, accessing some types of built-in objects
like `Map`s and `Set`s can also run V8-internal JavaScript code.
##### MaybeLocal
`v8::MaybeLocal<T>` is a variant of `v8::Maybe<T>` that is either empty or
holds a value of type `Local<T>`. It has methods that perform the same
operations as the methods of `v8::Maybe`, but with different names:
| `Maybe` | `MaybeLocal` |
| -------------------- | ------------------------------ |
| `maybe.IsNothing()` | `maybe_local.IsEmpty()` |
| `maybe.IsJust()` | `!maybe_local.IsEmpty()` |
| `maybe.To(&value)` | `maybe_local.ToLocal(&local)` |
| `maybe.ToChecked()` | `maybe_local.ToLocalChecked()` |
| `maybe.FromJust()` | `maybe_local.ToLocalChecked()` |
| `maybe.Check()` | – |
| `v8::Nothing<T>()` | `v8::MaybeLocal<T>()` |
| `v8::Just<T>(value)` | `v8::MaybeLocal<T>(value)` |
##### Handling empty `Maybe`s
Usually, the best approach to encountering an empty `Maybe` is to just return
from the current function as soon as possible, and let execution in JavaScript
land resume. If the empty `Maybe` is encountered inside a nested function,
it may be a good idea to use a `Maybe` or `MaybeLocal` for the return type
of that function and pass information about pending JavaScript exceptions along
that way.
Generally, when an empty `Maybe` is encountered, it is not valid to attempt
to perform further calls to APIs that return `Maybe`s.
A typical pattern for dealing with APIs that return `Maybe` and `MaybeLocal` is
using `.ToLocal()` and `.To()` and returning early in case there is an error:
```cpp
// This could also return a v8::MaybeLocal<v8::Number>, for example.
v8::Maybe<double> SumNumbers(v8::Local<v8::Context> context,
                             v8::Local<v8::Array> array_of_integers) {
  v8::Isolate* isolate = context->GetIsolate();
  v8::HandleScope handle_scope(isolate);
  double sum = 0;
  for (uint32_t i = 0; i < array_of_integers->Length(); i++) {
    v8::Local<v8::Value> entry;
    if (!array_of_integers->Get(context, i).ToLocal(&entry)) {
      // Oops, we might have hit a getter that throws an exception!
      // It's better to not continue and return an empty (“nothing”) Maybe.
      return v8::Nothing<double>();
    }
    if (!entry->IsNumber()) {
      // Let's just skip any non-numbers. It would also be reasonable to throw
      // an exception here, e.g. using the error system in src/node_errors.h,
      // and then to return an empty Maybe again.
      continue;
    }
    // This cast is valid, because we've made sure it's really a number.
    v8::Local<v8::Number> entry_as_number = entry.As<v8::Number>();
    sum += entry_as_number->Value();
  }
  return v8::Just(sum);
}
// Function that is exposed to JS:
void SumNumbers(const v8::FunctionCallbackInfo<v8::Value>& args) {
  // This will crash if the first argument is not an array. Let's assume we
  // have performed type checking in a JavaScript wrapper function.
  CHECK(args[0]->IsArray());
  double sum;
  if (!SumNumbers(args.GetIsolate()->GetCurrentContext(),
                  args[0].As<v8::Array>()).To(&sum)) {
    // Nothing to do, we can just return directly to JavaScript.
    return;
  }
  args.GetReturnValue().Set(sum);
}
```
#### TryCatch
If there is a need to catch JavaScript exceptions in C++, V8 provides the
`v8::TryCatch` type for doing so, which we wrap into our own
`node::errors::TryCatchScope` in Node.js. The latter has the additional feature
of providing the ability to shut down the program in the typical Node.js way
(printing the exception + stack trace) if an exception is caught.
A `TryCatch` will catch regular JavaScript exceptions, as well as termination
exceptions such as the ones thrown by `worker.terminate()` calls.
In the latter case, the `try_catch.HasTerminated()` function will return `true`,
and the exception object will not be a meaningful JavaScript value.
`try_catch.ReThrow()` should not be used in this case.
<a id="libuv-handles-and-requests"></a>
### libuv handles and requests
Two central concepts when working with libuv are handles and requests.
Handles are subclasses of the `uv_handle_t` “class”, and generally refer to
long-lived objects that can emit events multiple times, such as network sockets
or file system watchers.
In Node.js, handles are often managed through a [`HandleWrap`][] subclass.
Requests are one-time asynchronous function calls on the event loop, such as
file system requests or network write operations, that either succeed or fail.
In Node.js, requests are often managed through a [`ReqWrap`][] subclass.
### Environment cleanup
When a Node.js [`Environment`][] is destroyed, it generally needs to clean up
any resources owned by it, e.g. memory or libuv requests/handles.
<a id="cleanup-hooks"></a>
#### Cleanup hooks
Cleanup hooks are provided that run before the [`Environment`][]
is destroyed. They can be added and removed by using
`env->AddCleanupHook(callback, hint);` and
`env->RemoveCleanupHook(callback, hint);`, where `callback` takes a `void* hint`
argument.
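The `(callback, hint)` pairing can be illustrated with a toy registry. `ToyEnvironment` below is not the real `Environment` class; it only borrows the two method names from above. `RunCleanup()` and `CloseFlag()` are hypothetical, and running the hooks most-recently-added first is an assumption of this sketch rather than something this document specifies.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Toy stand-in for the cleanup-hook part of Environment; NOT the real class.
class ToyEnvironment {
 public:
  using Callback = void (*)(void* hint);

  void AddCleanupHook(Callback callback, void* hint) {
    assert(callback != nullptr);
    hooks_.emplace_back(callback, hint);
  }

  // Removes the hook registered with the same (callback, hint) pair, if any.
  void RemoveCleanupHook(Callback callback, void* hint) {
    auto it = std::find(hooks_.begin(), hooks_.end(),
                        std::make_pair(callback, hint));
    if (it != hooks_.end()) hooks_.erase(it);
  }

  // Assumption of this sketch: hooks run most-recently-added first.
  void RunCleanup() {
    while (!hooks_.empty()) {
      auto [callback, hint] = hooks_.back();
      hooks_.pop_back();
      callback(hint);
    }
  }

  // Hooks run before the toy environment is destroyed.
  ~ToyEnvironment() { RunCleanup(); }

 private:
  std::vector<std::pair<Callback, void*>> hooks_;
};

// Example hook: the hint points at the resource that needs cleaning up.
void CloseFlag(void* hint) {
  *static_cast<bool*>(hint) = true;
}
```

Because a hook is identified by both its callback and its hint, the same function can be registered several times for different resources and removed individually.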
Inside these cleanup hooks, new asynchronous operations _may_ be started on the
event loop, although ideally that is avoided as much as possible.
Every [`BaseObject`][] has its own cleanup hook that deletes it. For
[`ReqWrap`][] and [`HandleWrap`][] instances, cleanup of the associated libuv
objects is performed automatically, i.e. handles are closed and requests
are cancelled if possible.
#### Closing libuv handles
If a libuv handle is not managed through a [`HandleWrap`][] instance,
it needs to be closed explicitly. Do not use `uv_close()` for that, but rather
`env->CloseHandle()`, which works the same way but keeps track of the number
of handles that are still closing.
#### Closing libuv requests
There is no way to abort libuv requests in general. If a libuv request is not
managed through a [`ReqWrap`][] instance, the
`env->IncreaseWaitingRequestCounter()` and
`env->DecreaseWaitingRequestCounter()` functions need to be used to keep track
of the number of active libuv requests.
#### Calling into JavaScript
Calling into JavaScript is not allowed during cleanup. Worker threads explicitly
forbid this during their shutdown sequence, but the main thread does not for
backwards compatibility reasons.
When calling into JavaScript without using [`MakeCallback()`][], check the
`env->can_call_into_js()` flag and do not proceed if it is set to `false`.
## Classes associated with JavaScript objects
### `MemoryRetainer`
A large number of classes in the Node.js C++ codebase refer to other objects.
The `MemoryRetainer` class is a helper for annotating C++ classes with
information that can be used by the heap snapshot builder in V8, so that
memory retained by C++ can be tracked in V8 heap snapshots captured in
Node.js applications.
Inheriting from the `MemoryRetainer` class enables objects (both from JavaScript
and C++) to refer to instances of that class, and in turn enables that class
to point to other objects as well, including native C++ types
such as `std::string` and track their memory usage.
This can be useful for debugging memory leaks.
The [`memory_tracker.h`][] header file explains how to use this class.
<a id="baseobject"></a>
### `BaseObject`
A frequently recurring situation is that a JavaScript object and a C++ object
need to be tied together. `BaseObject` is the main abstraction for that in
Node.js, and most classes that are associated with JavaScript objects are
subclasses of it. It is defined in [`base_object.h`][].
Every `BaseObject` is associated with one [`Environment`][] and one
`v8::Object`. The `v8::Object` needs to have at least one [internal field][]
that is used for storing the pointer to the C++ object. In order to ensure this,
the V8 `SetInternalFieldCount()` function is usually used when setting up the
class from C++.
The JavaScript object can be accessed as a `v8::Local<v8::Object>` by using
`self->object()`, given a `BaseObject` named `self`.
Accessing a `BaseObject` from a `v8::Local<v8::Object>` (frequently that is
`args.This()` or `args.Holder()` in a [binding function][]) can be done using
the `Unwrap<T>(obj)` function, where `T` is a subclass of `BaseObject`.
A helper for this is the `ASSIGN_OR_RETURN_UNWRAP` macro that returns from the
current function if unwrapping fails (typically that means that the `BaseObject`
has been deleted earlier).
```cpp
void Http2Session::Request(const FunctionCallbackInfo<Value>& args) {
  Http2Session* session;
  ASSIGN_OR_RETURN_UNWRAP(&session, args.Holder());
  Environment* env = session->env();
  Local<Context> context = env->context();
  Isolate* isolate = env->isolate();
  // ...
  // The actual function body, which can now use the `session` object.
  // ...
}
```
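The internal-field mechanism can be modelled without V8 to show why unwrapping
may fail. The sketch below is a simplified stand-in, not the real `BaseObject`:
`FakeJsObject`, `FakeBaseObject`, `FakeSession`, `FakeUnwrap()` and `Request()`
are all hypothetical, and clearing the field in the destructor is an assumption
made so that a deleted C++ object becomes observable.

```cpp
#include <cassert>

// Hypothetical stand-in for a v8::Object with a single internal field.
struct FakeJsObject {
  void* internal_field = nullptr;
};

// Hypothetical stand-in for BaseObject: stores a pointer to itself in the
// internal field of the associated object.
class FakeBaseObject {
 public:
  explicit FakeBaseObject(FakeJsObject* object) : object_(object) {
    assert(object != nullptr);
    object_->internal_field = this;
  }
  virtual ~FakeBaseObject() { object_->internal_field = nullptr; }

  FakeJsObject* object() const { return object_; }

 private:
  FakeJsObject* object_;
};

class FakeSession : public FakeBaseObject {
 public:
  using FakeBaseObject::FakeBaseObject;
  int requests = 0;
};

// Reads the C++ object back out of the internal field; yields nullptr if the
// C++ object has already been deleted.
template <typename T>
T* FakeUnwrap(FakeJsObject* object) {
  return static_cast<T*>(static_cast<FakeBaseObject*>(object->internal_field));
}

// Models a binding function that returns early when unwrapping fails.
bool Request(FakeJsObject* holder) {
  FakeSession* session = FakeUnwrap<FakeSession>(holder);
  if (session == nullptr) return false;
  session->requests++;
  return true;
}
```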
#### Lifetime management
The `BaseObject` class comes with a set of features that allow managing the
lifetime of its instances, either associating it with the lifetime of the
corresponding JavaScript object or untying the two.
The `BaseObject::MakeWeak()` method turns the underlying [`Global`][] handle
into a weak one, and makes it so that the `BaseObject::OnGCCollect()` virtual
method is called when the JavaScript object is garbage collected. By default,
that method deletes the `BaseObject` instance.
`BaseObject::ClearWeak()` undoes this effect.
It generally makes sense to call `MakeWeak()` in the constructor of a
`BaseObject` subclass, unless that subclass is referred to by e.g. the event
loop, as is the case for the [`HandleWrap`][] and [`ReqWrap`][] classes.
In addition, there are two kinds of smart pointers that can be used to refer
to `BaseObject`s.
`BaseObjectWeakPtr<T>` is similar to `std::weak_ptr<T>`, but holds on to
an object of a `BaseObject` subclass `T` and integrates with the lifetime
management of the former. When the `BaseObject` no longer exists, e.g. when
it was garbage collected, accessing it through `weak_ptr.get()` will return
`nullptr`.
`BaseObjectPtr<T>` is similar to `std::shared_ptr<T>`, but also holds on to
objects of a `BaseObject` subclass `T`. While there are `BaseObjectPtr`s
pointing to a given object, the `BaseObject` will always maintain a strong
reference to its associated JavaScript object. This can be useful when one
`BaseObject` refers to another `BaseObject` and wants to make sure it stays
alive during the lifetime of that reference.
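For readers less familiar with the standard library types these are compared
to, the sketch below shows the `std::shared_ptr`/`std::weak_ptr` behavior that
the analogy relies on. It uses only the standard library; `SessionSketch` is a
hypothetical type, and the `BaseObject` smart pointers differ in detail (for
example, `std::weak_ptr` is queried through `lock()` rather than `get()`).

```cpp
#include <cassert>
#include <memory>

// Hypothetical object standing in for a BaseObject subclass.
struct SessionSketch {
  int id = 0;
};

// Returns true if the object was reachable through the weak pointer while a
// strong reference existed, and unreachable once that reference was dropped.
bool WeakPointerObservesDestruction() {
  std::shared_ptr<SessionSketch> strong = std::make_shared<SessionSketch>();
  std::weak_ptr<SessionSketch> weak = strong;
  assert(strong.use_count() == 1);  // The weak pointer does not add an owner.

  bool alive_while_owned = (weak.lock() != nullptr);
  strong.reset();  // Drop the last strong reference; the object is destroyed.
  bool gone_after_release = (weak.lock() == nullptr);

  return alive_while_owned && gone_after_release;
}
```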
A `BaseObject` can be “detached” through the `BaseObject::Detach()` method.
In this case, it will be deleted once the last `BaseObjectPtr` referring to
it is destroyed. There must be at least one such pointer when `Detach()` is
called. This can be useful when one `BaseObject` fully owns another
`BaseObject`.
<a id="asyncwrap"></a>
### `AsyncWrap`
`AsyncWrap` is a subclass of `BaseObject` that additionally provides tracking
functions for asynchronous calls. It is commonly used for classes whose methods
make calls into JavaScript without any JavaScript stack below, i.e. more or less
directly from the event loop. It is defined in [`async_wrap.h`][].
Every `AsyncWrap` subclass has a “provider type”. A list of provider types is
maintained in `src/async_wrap.h`.
Every `AsyncWrap` instance is associated with two numbers, the “async id”
and the “async trigger id”. The “async id” is generally unique per `AsyncWrap`
instance, and only changes when the object is re-used in some way.
See the [`async_hooks` module][] documentation for more information about how
this information is provided to async tracking tools.
<a id="makecallback"></a>
#### `MakeCallback`
The `AsyncWrap` class has a set of methods called `MakeCallback()`, with the
intention of the naming being that it is used to “make calls back into
JavaScript” from the event loop, rather than making callbacks in some way.
(As the name has made its way into the Node.js public API, fixing it is not
worth the breakage.)
`MakeCallback()` generally calls a method on the JavaScript object associated
with the current `AsyncWrap`, and informs async tracking code about these calls
as well as takes care of running the `process.nextTick()` and `Promise` task
queues once it returns.
Before calling `MakeCallback()`, it is typically necessary to enter both a
`HandleScope` and a `Context::Scope`.
```cpp
void StatWatcher::Callback(uv_fs_poll_t* handle,
                           int status,
                           const uv_stat_t* prev,
                           const uv_stat_t* curr) {
  // Get the StatWatcher instance associated with this call from libuv,
  // StatWatcher is a subclass of AsyncWrap.
  StatWatcher* wrap = ContainerOf(&StatWatcher::watcher_, handle);
  Environment* env = wrap->env();
  HandleScope handle_scope(env->isolate());
  Context::Scope context_scope(env->context());
  // Transform 'prev' and 'curr' into an array:
  Local<Value> arr = ...;
  Local<Value> argv[] = { Integer::New(env->isolate(), status), arr };
  wrap->MakeCallback(env->onchange_string(), arraysize(argv), argv);
}
```
See [Callback scopes][] for more information.
<a id="handlewrap"></a>
### `HandleWrap`
`HandleWrap` is a subclass of `AsyncWrap` specifically designed to make working
with [libuv handles][] easier. It provides the `.ref()`, `.unref()` and
`.hasRef()` methods as well as `.close()` to enable easier lifetime management
from JavaScript. It is defined in [`handle_wrap.h`][].
`HandleWrap` instances are [cleaned up][cleanup hooks] automatically when the
current Node.js [`Environment`][] is destroyed, e.g. when a Worker thread stops.
`HandleWrap` also provides facilities for diagnostic tooling to get an
overview of libuv handles managed by Node.js.
<a id="reqwrap"></a>
### `ReqWrap`
`ReqWrap` is a subclass of `AsyncWrap` specifically designed to make working
with [libuv requests][] easier. It is defined in [`req_wrap.h`][].
In particular, its `Dispatch()` method is designed to avoid the need to keep
track of the current count of active libuv requests.
`ReqWrap` also provides facilities for diagnostic tooling to get an
overview of libuv requests managed by Node.js.
<a id="callback-scopes"></a>
### Callback scopes
The public `CallbackScope` and the internally used `InternalCallbackScope`
classes provide the same facilities as [`MakeCallback()`][], namely:
* Emitting the `'before'` event for async tracking when entering the scope
* Setting the current async IDs to the ones passed to the constructor
* Emitting the `'after'` event for async tracking when leaving the scope
* Running the `process.nextTick()` queue
* Running microtasks, in particular `Promise` callbacks and async/await
functions
Usually, using `AsyncWrap::MakeCallback()` or using the constructor taking
an `AsyncWrap*` argument (i.e. used as
`InternalCallbackScope callback_scope(this);`) suffices inside of the Node.js
C++ codebase.
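The steps listed above follow the usual RAII shape: work done on entry happens
in the constructor, work done on exit in the destructor. The following is a
standalone model of that shape only. `FakeEnv` and `FakeCallbackScope` are
hypothetical, the string log stands in for the real async tracking events, and
the exact ordering on exit is an assumption taken from the order of the list.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

using TaskQueue = std::queue<std::function<void()>>;

// Hypothetical environment that records what happens and holds pending tasks.
struct FakeEnv {
  std::vector<std::string> log;
  TaskQueue tick_queue;  // Models the process.nextTick() queue.
  TaskQueue microtasks;  // Models Promise callbacks.
  double current_async_id = 0;
};

class FakeCallbackScope {
 public:
  FakeCallbackScope(FakeEnv* env, double async_id) : env_(env) {
    assert(env != nullptr);
    previous_async_id_ = env_->current_async_id;
    env_->current_async_id = async_id;
    env_->log.push_back("before");
  }

  ~FakeCallbackScope() {
    env_->log.push_back("after");
    env_->current_async_id = previous_async_id_;
    Drain(&env_->tick_queue);
    Drain(&env_->microtasks);
  }

  FakeCallbackScope(const FakeCallbackScope&) = delete;
  FakeCallbackScope& operator=(const FakeCallbackScope&) = delete;

 private:
  // Runs tasks until the queue is empty, including tasks added while draining.
  static void Drain(TaskQueue* queue) {
    while (!queue->empty()) {
      std::function<void()> task = std::move(queue->front());
      queue->pop();
      task();
    }
  }

  FakeEnv* env_;
  double previous_async_id_ = 0;
};

// Enters a scope, "calls into JavaScript", schedules follow-up work, and
// returns the recorded sequence of events.
std::vector<std::string> RunScopeExample() {
  FakeEnv env;
  {
    FakeCallbackScope scope(&env, 42);
    env.log.push_back("callback");
    env.tick_queue.push([&env] { env.log.push_back("nextTick"); });
    env.microtasks.push([&env] { env.log.push_back("microtask"); });
  }
  return env.log;
}
```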
## C++ utilities
Node.js uses a few custom C++ utilities, mostly defined in [`util.h`][].
### Memory allocation
Node.js provides `Malloc()`, `Realloc()` and `Calloc()` functions that work
like their C stdlib counterparts, but crash if memory cannot be allocated.
(As V8 does not handle out-of-memory situations gracefully, it does not make
sense for Node.js to attempt to do so in all cases.)
The `UncheckedMalloc()`, `UncheckedRealloc()` and `UncheckedCalloc()` functions
return `nullptr` in these cases (or when `size == 0`).
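The difference between the two families can be sketched as follows. This is an
illustration of the contract described above rather than the code in `util.h`:
`UncheckedMallocSketch()`, `MallocSketch()` and `SumOfSquares()` are
hypothetical names, and treating a size overflow as an allocation failure is
an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <limits>

// Returns nullptr on failure or when n == 0; the caller must check.
template <typename T>
T* UncheckedMallocSketch(std::size_t n) {
  if (n == 0) return nullptr;
  if (n > std::numeric_limits<std::size_t>::max() / sizeof(T)) return nullptr;
  return static_cast<T*>(std::malloc(n * sizeof(T)));
}

// Never returns nullptr for n > 0; aborts the process instead.
template <typename T>
T* MallocSketch(std::size_t n) {
  T* result = UncheckedMallocSketch<T>(n);
  if (result == nullptr && n > 0) {
    std::fputs("Out of memory\n", stderr);
    std::abort();
  }
  return result;
}

// Usage: no null check is needed after the crashing variant.
int SumOfSquares(std::size_t n) {
  int* values = MallocSketch<int>(n);
  assert(n == 0 || values != nullptr);
  int sum = 0;
  for (std::size_t i = 0; i < n; i++) {
    values[i] = static_cast<int>(i * i);
    sum += values[i];
  }
  std::free(values);
  return sum;
}
```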
#### Optional stack-based memory allocation
The `MaybeStackBuffer` class provides a way to allocate memory on the stack
if it is smaller than a given limit, and falls back to allocating it on the
heap if it is larger. This can be useful for efficiently allocating temporary
data if it is typically expected to be small (e.g. file paths).
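The idea can be sketched with a much smaller class than the real one.
`SmallBufferSketch` below is hypothetical and does not mirror the
`MaybeStackBuffer` interface; the member names and the default capacity of 64
elements are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>

template <typename T, std::size_t kStackCapacity = 64>
class SmallBufferSketch {
  static_assert(std::is_trivial<T>::value,
                "this sketch only supports trivial element types");

 public:
  explicit SmallBufferSketch(std::size_t length) : length_(length) {
    // Only fall back to the heap when the inline storage is too small.
    if (length > kStackCapacity) {
      heap_ = static_cast<T*>(std::malloc(length * sizeof(T)));
      if (heap_ == nullptr) std::abort();
    }
  }
  ~SmallBufferSketch() { std::free(heap_); }

  SmallBufferSketch(const SmallBufferSketch&) = delete;
  SmallBufferSketch& operator=(const SmallBufferSketch&) = delete;

  T* data() { return heap_ != nullptr ? heap_ : stack_; }
  std::size_t length() const { return length_; }
  bool IsOnStack() const { return heap_ == nullptr; }

  T& operator[](std::size_t index) {
    assert(index < length_);
    return data()[index];
  }

 private:
  std::size_t length_;
  T* heap_ = nullptr;
  T stack_[kStackCapacity];  // Lives wherever the object itself lives.
};
```

When such an object is declared as a local variable, small contents need no
heap allocation at all, which is the common case for short strings.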
The `Utf8Value`, `TwoByteValue` (i.e. UTF-16 value) and `BufferValue`
(like `Utf8Value`, but copies data from a `Buffer` if one is passed) helpers
inherit from this class and allow accessing the characters in a JavaScript
string this way.
```cpp
static void Chdir(const FunctionCallbackInfo<Value>& args) {
  Environment* env = Environment::GetCurrent(args);
  // ...
  CHECK(args[0]->IsString());
  Utf8Value path(env->isolate(), args[0]);
  int err = uv_chdir(*path);
  if (err) {
    // ... error handling ...
  }
}
```
### Assertions
Node.js provides a few macros that behave similarly to `assert()`:
* `CHECK(expression)` aborts the process with a stack trace
if `expression` is false.
* `CHECK_EQ(a, b)` checks for `a == b`
* `CHECK_GE(a, b)` checks for `a >= b`