/*
* Copyright 2019 WebAssembly Community Group participants
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
//
// Asyncify: async/await style code transformation, that allows pausing
// and resuming. This lets a language support synchronous operations in
// an async manner, for example, you can do a blocking wait, and that will
// be turned into code that unwinds the stack at the "blocking" operation,
// then is able to rewind it back up when the actual async operation
// completes, so the code appears to have been running synchronously
// all the while. Use cases for this include coroutines, python generators,
// etc.
//
// The approach here is a third-generation design after Emscripten's original
// Asyncify and then Emterpreter-Async approaches:
//
// * Old Asyncify rewrote control flow in LLVM IR. A problem is that this
// needs to save all SSA registers as part of the local state, which can be
// very costly. A further increase can happen because of the phis added by
// the control flow transformations. As a result we saw
// pathological cases where the code size increase was unacceptable.
// * Emterpreter-Async avoids any risk of code size increase by running it
// in a little bytecode VM, which also makes pausing/resuming trivial.
// Speed is the main downside here; however, the approach did prove that by
// *selectively* emterpretifying only certain code, many projects can run
// at full speed - often just the "main loop" must be emterpreted, while
// high-speed code that it calls, in which there can be no async operation,
// remains at full speed.
//
// New Asyncify's design learns from both of those:
//
// * The code rewrite is done in Binaryen, that is, at the wasm level. At
// this level we will only need to save wasm locals, which is a much smaller
// number than LLVM's SSA registers.
// * We aim for low but non-zero overhead. Low overhead is very important
// for obvious reasons, while Emterpreter-Async proved it is tolerable to
// have *some* overhead, if the transform can be applied selectively.
//
// This new Asyncify transformation implemented here is simpler than old
// Asyncify but has low overhead when properly optimized. Old Asyncify worked at
// the CFG level and added branches there; new Asyncify on the other hand works
// on the structured control flow of wasm and simply "skips over" code when
// rewinding the stack, and jumps out when unwinding. The transformed code looks
// conceptually like this:
// (We have three phases: normal, unwinding, and rewinding)
//
// void foo(int x) {
// // new prelude
// if (rewinding) {
// loadLocals();
// }
// // main body starts here
// if (normal) {
// // some code we must skip while rewinding
// x = x + 1;
// x = x / 2;
// }
// // If rewinding, do this call only if it is the right one.
// if (normal or check_call_index(0)) { // 0 is index of this bar() call
// bar(x);
// if (unwinding) {
// noteUnWound(0);
// saveLocals();
// return;
// }
// }
// if (normal) {
// // more code we must skip while rewinding
// while (x & 7) {
// x = x + 1;
// }
// }
// [..]
//
// The general idea is that while rewinding we just "keep going forward",
// skipping code we should not run. That is, instead of jumping directly
// to the right place, we have ifs that skip along instead. The ifs for
// rewinding and unwinding should be well-predicted, so the overhead should
// not be very high. However, we do need to reach the right location via
// such skipping, which means that in a very large function the rewind
// takes some extra time. On the other hand, though, code that cannot
// unwind or rewind (like that loop near the end) can run at full speed.
// Overall, this should allow good performance with small overhead that is
// mostly noticed at rewind time.
//
// After this pass is run a new i32 global "__asyncify_state" is added, which
// has the following values:
//
// 0: normal execution
// 1: unwinding the stack
// 2: rewinding the stack
//
// There is also "__asyncify_data" which when rewinding and unwinding
// contains a pointer to a data structure with the info needed to rewind
// and unwind:
//
// { // offsets
// i32 - current asyncify stack location // 0
// i32 - asyncify stack end // 4
// }
//
// Or for wasm64:
//
// { // offsets
// i64 - current asyncify stack location // 0
// i64 - asyncify stack end // 8
// }
//
// The asyncify stack is a representation of the call frame, as a list of
// indexes of calls. In the example above, we saw index "0" for calling "bar"
// from "foo". When unwinding, the indexes are added to the stack; when
// rewinding, they are popped off; the current asyncify stack location is
// updated while doing both operations. The asyncify stack is also used to
// save locals. Note that the stack end location is provided, which is for
// error detection.
//
// Note: all pointers are assumed to be 4-byte (wasm32) / 8-byte (wasm64)
// aligned.
//
// When you start an unwinding operation, you must set the initial fields
// of the data structure, that is, set the current stack location to the
// proper place, and the end to the proper end based on how much memory
// you reserved. Note that asyncify will grow the stack up.
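//
// As a rough sketch (the buffer layout below is just one common convention,
// not something this pass requires): a wasm32 embedder could reserve a single
// buffer whose first 8 bytes hold the data structure above and whose
// remainder is the asyncify stack itself:
//
//   // "buffer" and "BUFFER_SIZE" are illustrative names for memory the
//   // embedder has reserved inside the wasm linear memory.
//   uint32_t* data = (uint32_t*)buffer;
//   data[0] = (uint32_t)buffer + 8;           // current stack location
//   data[1] = (uint32_t)buffer + BUFFER_SIZE; // stack end
//   asyncify_start_unwind(data);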
//
// The pass will also create five functions that let you control unwinding
// and rewinding:
//
// * asyncify_start_unwind(data : iPTR): call this to start unwinding the
// stack from the current location. "data" must point to a data
// structure as described above (with fields containing valid data).
//
// * asyncify_stop_unwind(): call this to note that unwinding has
// concluded. If no other code will run before you start to rewind,
// this is not strictly necessary, however, if you swap between
// coroutines, or even just want to run some normal code during a
// "sleep", then you must call this at the proper time. Otherwise,
// the code will think it is still unwinding when it should not be,
// which means it will keep unwinding in a meaningless way.
//
// * asyncify_start_rewind(data : iPTR): call this to start rewinding the
// stack back up to the location stored in the provided data. This prepares
// for the rewind; to start it, you must call the first function in the
// call stack to be unwound.
//
// * asyncify_stop_rewind(): call this to note that rewinding has
// concluded, and normal execution can resume.
//
// * asyncify_get_state(): call this to get the current value of the
// internal "__asyncify_state" variable as described above.
// It can be used to distinguish between unwinding/rewinding and normal
// calls, so that you know when to start an asynchronous operation and
// when to propagate results back.
//
// These five functions are exported so that you can call them from the
// outside. If you want to manage things from inside the wasm, note that you
// cannot call them directly, as they do not exist until this pass creates
// them. To work
// around that, you can create imports to asyncify.start_unwind,
// asyncify.stop_unwind, asyncify.start_rewind, asyncify.stop_rewind, and
// asyncify.get_state; if those exist when this pass runs then it will turn
// those into direct calls to the functions that it creates. Note that when
// doing everything in wasm like this, Asyncify must not instrument your
// "runtime" support code, that is, the code that initiates unwinds and rewinds
// and stops them. If it did, the unwind would not stop until you left the wasm
// module entirely, etc. Therefore we do not instrument a function if it has a
// call to the four asyncify_* methods. Note that you may need to disable
// inlining if it would cause code that does need to be instrumented to show
// up in that runtime code.
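//
// As a rough sketch of declaring such imports from C (the import_module /
// import_name attributes below are clang's syntax for wasm targets, an
// assumption about your toolchain rather than something this pass provides):
//
//   __attribute__((import_module("asyncify"), import_name("start_unwind")))
//   void asyncify_start_unwind(void* data);
//   __attribute__((import_module("asyncify"), import_name("stop_unwind")))
//   void asyncify_stop_unwind(void);
//   // ... and likewise for start_rewind, stop_rewind, and get_state.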
//
// To use this API, call asyncify_start_unwind when you want to. The call
// stack will then be unwound, and so execution will resume in the JS or
// other host environment on the outside that called into wasm. When you
// return there after unwinding, call asyncify_stop_unwind. Then when
// you want to rewind, call asyncify_start_rewind, and then call the same
// initial function you called before, so that rewinding can begin. The
// rewinding will reach the same function from which you started, since
// we are recreating the call stack. At that point you should call
// asyncify_stop_rewind and then execution can resume normally.
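//
// As a rough sketch of that sequence (how the exports are actually invoked
// depends on your embedder; "my_entry_point" is an illustrative export name):
//
//   asyncify_start_unwind(data);  // at the point where you decide to pause
//   // ... the wasm call stack unwinds; control returns to the embedder ...
//   asyncify_stop_unwind();
//   // later, when it is time to resume:
//   asyncify_start_rewind(data);
//   my_entry_point();             // the same initial function as before
//   // once the rewind reaches the original pause point, call
//   asyncify_stop_rewind();
//   // ... and normal execution continues from there.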
//
// Note that the asyncify_* API calls will verify that the data structure
// is valid, checking if the current location is past the end. If so, they
// will throw a wasm unreachable. No check is done during unwinding or
// rewinding, to keep things fast - in principle, from when unwinding begins
// and until it ends there should be no memory accesses aside from the
// asyncify code itself (and likewise when rewinding), so there should be
// no chance of memory corruption causing odd errors. However, a consequence
// of this low-overhead approach is that an error is only reported when
// unwinding/rewinding finishes, and not at the specific spot where the
// stack end was exceeded.
//
// By default this pass is very careful: it assumes that any import and
// any indirect call may start an unwind/rewind operation. If you know more
// specific information you can inform asyncify about that, which can reduce
// a great deal of overhead, as it can instrument less code. The relevant
// options to wasm-opt etc. are:
//
// --pass-arg=asyncify-imports@module1.base1,module2.base2,module3.base3
//
// Each module.base in that comma-separated list will be considered to
// be an import that can unwind/rewind, and all others are assumed not to
// (aside from the asyncify.* imports, which are always assumed to).
// Each entry can end in a '*' in which case it is matched as a prefix.
//
// The list of imports can be a response file (which is convenient if it
// is long, or you don't want to bother escaping it on the commandline
// etc.), e.g. --pass-arg=asyncify-imports@@file.txt will load the
// contents of file.txt (note the double "@@" - the first is the
// separator for --pass-arg, and the second is the usual convention for
// indicating a response file).
//
// --pass-arg=asyncify-ignore-imports
//
// Ignore all imports (except for asyncify.*), that is, assume none of them
// can start an unwind/rewind. (This is effectively the same as providing
// asyncify-imports with a list of non-existent imports.)
//
// --pass-arg=asyncify-ignore-indirect
//
// Ignore all indirect calls. This implies that you know the call stack
// will never need to be unwound with an indirect call somewhere in it.
// If that is true for your codebase, then this can be extremely useful
// as otherwise it looks like any indirect call can go to a lot of places.
//
// --pass-arg=asyncify-asserts
//
// This enables extra asserts in the output, like checking if we put in
// an unwind/rewind in an invalid place (this can be helpful for manual
// tweaking of the only-list / remove-list, see later).
//
// --pass-arg=asyncify-verbose
//
// Logs out instrumentation decisions to the console. This can help figure
// out why a certain function was instrumented.
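//
// Putting some of the above together, a typical invocation might look like
// this (a sketch; "env.do_sleep" is an illustrative import name, not
// something this pass defines):
//
//   wasm-opt input.wasm --asyncify \
//     --pass-arg=asyncify-imports@env.do_sleep \
//     --pass-arg=asyncify-ignore-indirect \
//     -o output.wasm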
//
// For manual fine-tuning of the list of instrumented functions, there are lists
// that you can set. These must be used carefully, as misuse can break your
// application - for example, if a function is called that should be
// instrumented but isn't because of these options, then bad things can happen.
// Note that even if you test this locally then users running your code in
// production may reach other code paths. Use these carefully!
//
// Each of the lists can be used with a response file (@filename, which is then
// loaded from the file). You can also use '*' wildcard matching in the lists.
//
// --pass-arg=asyncify-removelist@name1,name2,name3
//
// If the "remove-list" is provided, then the functions in it will be
// *removed* from the list of instrumented functions. That is, they won't
// be instrumented even if it looks like they need to be. This can be
// useful if you know things the safe whole-program analysis doesn't, e.g.
// that certain code paths will not be taken, where certain indirect calls
// will go, etc.
//
// --pass-arg=asyncify-addlist@name1,name2,name3
//
// If the "add-list" is provided, then the functions in the list will be
// *added* to the list of instrumented functions, that is, they will be
// instrumented even if otherwise we think they don't need to be. As by
// default everything will be instrumented in the safest way possible,
// this is only useful if you use ignore-indirect and use this to fix up
// some indirect calls that *do* need to be instrumented, or if you will
// do some later transform of the code that adds more call paths, etc.
//
// --pass-arg=asyncify-propagate-addlist
//
// The default behaviour of the addlist does not propagate instrumentation
// status. If this option is set then functions which call a function in
// the addlist will also be instrumented, and those that call them and so
// on.
//
// --pass-arg=asyncify-onlylist@name1,name2,name3
//
// If the "only-list" is provided, then *only* the functions in the list
// will be instrumented, and nothing else.
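//
// For example (a sketch; the function names are illustrative), to keep some
// functions out of the instrumentation, using a '*' wildcard:
//
//   wasm-opt input.wasm --asyncify \
//     --pass-arg=asyncify-removelist@my_leaf_helper,render_* \
//     -o output.wasm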
//
// Note that there are two types of instrumentation that happen for each
// function: if foo() can be part of a pause/resume operation, then we need to
// instrument code inside it to support pausing and resuming, but also, we need
// callers of the function to instrument calls to it. Normally this is already
// taken care of as the callers need to be instrumented as well anyhow. However,
// the ignore-indirect option makes things more complicated, since we can't tell
// where all the calls to foo() are - all we see are indirect calls that do not
// refer to foo() by name. To make it possible for you to use the add-list or
// only-list with ignore-indirect, those lists affect *both* kinds of
// instrumentation. That is, if parent() calls foo() indirectly, and you add
// parent() to the add-list, then the indirect calls in parent() will be
// instrumented to support pausing/resuming, even if ignore-indirect is set.
// Typically you don't need to think about this, and just add all the functions
// that can be on the stack while pausing - what this means is that when you do
// so, indirect calls will just work. (The cost is that an instrumented function
// will check for pausing at all indirect calls, which may be unnecessary in
// some cases; but this is an instrumented function anyhow, and indirect calls
// are slower anyhow, so this simple model seems good enough - a more complex
// model where you can specify "instrument, but not indirect calls from me"
// would likely have little benefit.)
//
// TODO When wasm has GC, extending the live ranges of locals can keep things
// alive unnecessarily. We may want to set locals to null at the end
// of their original range.
//
#include "asmjs/shared-constants.h"
#include "cfg/liveness-traversal.h"
#include "ir/effects.h"
#include "ir/find_all.h"
#include "ir/linear-execution.h"
#include "ir/literal-utils.h"
#include "ir/memory-utils.h"
#include "ir/module-utils.h"
#include "ir/names.h"
#include "ir/utils.h"
#include "pass.h"
#include "passes/pass-utils.h"
#include "support/file.h"
#include "support/string.h"
#include "wasm-builder.h"
#include "wasm.h"
namespace wasm {
namespace {
static const Name ASYNCIFY_STATE = "__asyncify_state";
static const Name ASYNCIFY_GET_STATE = "asyncify_get_state";
static const Name ASYNCIFY_DATA = "__asyncify_data";
static const Name ASYNCIFY_START_UNWIND = "asyncify_start_unwind";
static const Name ASYNCIFY_STOP_UNWIND = "asyncify_stop_unwind";
static const Name ASYNCIFY_START_REWIND = "asyncify_start_rewind";
static const Name ASYNCIFY_STOP_REWIND = "asyncify_stop_rewind";
static const Name ASYNCIFY_UNWIND = "__asyncify_unwind";
static const Name ASYNCIFY = "asyncify";
static const Name START_UNWIND = "start_unwind";
static const Name STOP_UNWIND = "stop_unwind";
static const Name START_REWIND = "start_rewind";
static const Name STOP_REWIND = "stop_rewind";
static const Name ASYNCIFY_GET_CALL_INDEX = "__asyncify_get_call_index";
static const Name ASYNCIFY_CHECK_CALL_INDEX = "__asyncify_check_call_index";
// TODO: having just normal/unwind_or_rewind would decrease code
// size, but make debugging harder
enum class State { Normal = 0, Unwinding = 1, Rewinding = 2 };
enum class DataOffset { BStackPos = 0, BStackEnd = 4, BStackEnd64 = 8 };
const auto STACK_ALIGN = 4;
// A helper class for managing fake global names. Creates the globals and
// provides mappings for using them.
// Fake globals are used to stash and then use return values from calls. We need
// to store them somewhere that is valid Binaryen IR, but also will be ignored
// by the Asyncify instrumentation, so we don't want to use a local. What we do
// is replace the local used to receive a call's result with a fake global.set
// to stash it, then do a fake global.get to receive it afterwards. (We do it in
// two steps so that if we are async, we only do the first and not the second,
// i.e., we don't store to the target local if not running normally).
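// For example, in the C-like pseudocode used earlier, a call whose result
// goes into a local,
//
//   x = foo();
//
// conceptually becomes
//
//   fake_global = foo();
//   if (normal) {
//     x = fake_global;
//   }
//
// (a sketch of the intent; the actual rewrite is of course done on the IR).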
class FakeGlobalHelper {
Module& module;
public:
FakeGlobalHelper(Module& module) : module(module) {
Builder builder(module);
std::string prefix = "asyncify_fake_call_global_";
for (auto type : collectTypes()) {
auto global = prefix + Type(type).toString();
map[type] = global;
rev[global] = type;
module.addGlobal(builder.makeGlobal(
global, type, LiteralUtils::makeZero(type, module), Builder::Mutable));
}
}
~FakeGlobalHelper() {
for (auto& pair : map) {
auto name = pair.second;
module.removeGlobal(name);
}
}
Name getName(Type type) { return map.at(type); }
Type getTypeOrNone(Name name) {
auto iter = rev.find(name);
if (iter != rev.end()) {
return iter->second;
}
return Type::none;
}
private:
std::unordered_map<Type, Name> map;
std::unordered_map<Name, Type> rev;
// Collect the types returned from all calls for which call support globals
// may need to be generated.
using Types = std::unordered_set<Type>;
Types collectTypes() {
ModuleUtils::ParallelFunctionAnalysis<Types> analysis(
module, [&](Function* func, Types& types) {
if (!func->body) {
return;
}
struct TypeCollector : PostWalker<TypeCollector> {
Types& types;
TypeCollector(Types& types) : types(types) {}
void visitCall(Call* curr) {
if (curr->type.isConcrete()) {
types.insert(curr->type);
}
}
void visitCallIndirect(CallIndirect* curr) {
if (curr->type.isConcrete()) {
types.insert(curr->type);
}
}
};
TypeCollector(types).walk(func->body);
});
Types types;
for (auto& pair : analysis.map) {
Types& functionTypes = pair.second;
for (auto t : functionTypes) {
types.insert(t);
}
}
return types;
}
};
class PatternMatcher {
public:
std::string designation;
std::set<Name> names;
std::set<std::string> patterns;
std::set<std::string> patternsMatched;
std::map<std::string, std::string> unescaped;
PatternMatcher(std::string designation,
Module& module,
const String::Split& list)
: designation(designation) {
// The lists contain human-readable strings. Turn them into the
// internal escaped names for later comparisons
for (auto& name : list) {
auto escaped = WasmBinaryReader::escape(name);
unescaped[escaped.toString()] = name;
if (name.find('*') != std::string::npos) {
patterns.insert(escaped.toString());
} else {
auto* func = module.getFunctionOrNull(escaped);
if (!func) {
std::cerr << "warning: Asyncify " << designation
<< "list contained a non-existing function name: " << name
<< " (" << escaped << ")\n";
} else if (func->imported()) {
Fatal() << "Asyncify " << designation
<< "list contained an imported function name (use the import "
"list for imports): "
<< name << '\n';
}
names.insert(escaped.str);
}
}
}
bool match(Name funcName) {
if (names.count(funcName) > 0) {
return true;
} else {
for (auto& pattern : patterns) {
if (String::wildcardMatch(pattern, funcName.toString())) {
patternsMatched.insert(pattern);
return true;
}
}
}
return false;
}
void checkPatternsMatches() {
for (auto& pattern : patterns) {
if (patternsMatched.count(pattern) == 0) {
std::cerr << "warning: Asyncify " << designation
<< "list contained a non-matching pattern: "
<< unescaped[pattern] << " (" << pattern << ")\n";
}
}
}
};
// Analyze the entire module to see which calls may change the state (that
// is, start an unwind or rewind), either in itself or in something called
// by it.
// Handles global module management, needed from the various parts of this
// transformation.
class ModuleAnalyzer {
Module& module;
bool canIndirectChangeState;
struct Info
: public ModuleUtils::CallGraphPropertyAnalysis<Info>::FunctionInfo {
// The function name.
Name name;
// If this function can start an unwind/rewind. We only set this in cases
// where we need to know that fact and also that we need to instrument code
// to handle it (as a result, we do not set it for the bottommost runtime,
// which needs no instrumentation).
bool canChangeState = false;
// If this function is part of the runtime that receives an unwinding
// and starts a rewinding. If so, we do not instrument it, see above.
// This is only relevant when handling things entirely inside wasm,
// as opposed to imports.
bool isBottomMostRuntime = false;
// If this function is part of the runtime that starts an unwinding
// and stops a rewinding. If so, we do not instrument it, see above.
// The difference between the top-most and bottom-most runtime is that
// the top-most part is still marked as changing the state; so things
// that call it are instrumented. This is not done for the bottom.
bool isTopMostRuntime = false;
bool inRemoveList = false;
bool addedFromList = false;
};
using Map = std::map<Function*, Info>;
Map map;
public:
ModuleAnalyzer(Module& module,
std::function<bool(Name, Name)> canImportChangeState,
bool canIndirectChangeState,
const String::Split& removeListInput,
const String::Split& addListInput,
bool propagateAddList,
const String::Split& onlyListInput,
bool verbose)
: module(module), canIndirectChangeState(canIndirectChangeState),
fakeGlobals(module), verbose(verbose) {
PatternMatcher removeList("remove", module, removeListInput);
PatternMatcher addList("add", module, addListInput);
PatternMatcher onlyList("only", module, onlyListInput);
// Rename the asyncify imports so their internal name matches the
// convention. This makes replacing them with the implementations
// later easier.
std::map<Name, Name> renamings;
for (auto& func : module.functions) {
if (func->module == ASYNCIFY) {
if (func->base == START_UNWIND) {
renamings[func->name] = ASYNCIFY_START_UNWIND;
} else if (func->base == STOP_UNWIND) {
renamings[func->name] = ASYNCIFY_STOP_UNWIND;
} else if (func->base == START_REWIND) {
renamings[func->name] = ASYNCIFY_START_REWIND;
} else if (func->base == STOP_REWIND) {
renamings[func->name] = ASYNCIFY_STOP_REWIND;
} else {
Fatal() << "call to unidenfied asyncify import: " << func->base;
}
}
}
ModuleUtils::renameFunctions(module, renamings);
// Scan to see which functions can directly change the state.
// Also handle the asyncify imports, removing them (as we will implement
// them later), and replace calls to them with calls to the later proper
// name.
ModuleUtils::CallGraphPropertyAnalysis<Info> scanner(
module, [&](Function* func, Info& info) {
info.name = func->name;
if (func->imported()) {
// The relevant asyncify imports can definitely change the state.
if (func->module == ASYNCIFY &&
(func->base == START_UNWIND || func->base == STOP_REWIND)) {
info.canChangeState = true;
} else {
info.canChangeState =
canImportChangeState(func->module, func->base);
if (verbose && info.canChangeState) {
std::cout << "[asyncify] " << func->name
<< " is an import that can change the state\n";
}
}
return;
}
struct Walker : PostWalker<Walker> {
Info& info;
Module& module;
bool canIndirectChangeState;
Walker(Info& info, Module& module, bool canIndirectChangeState)
: info(info), module(module),
canIndirectChangeState(canIndirectChangeState) {}
void visitCall(Call* curr) {
if (curr->isReturn) {
Fatal() << "tail calls not yet supported in asyncify";
}
auto* target = module.getFunction(curr->target);
if (target->imported() && target->module == ASYNCIFY) {
// Redirect the imports to the functions we'll add later.
if (target->base == START_UNWIND) {
info.canChangeState = true;
info.isTopMostRuntime = true;
} else if (target->base == STOP_UNWIND) {
info.isBottomMostRuntime = true;
} else if (target->base == START_REWIND) {
info.isBottomMostRuntime = true;
} else if (target->base == STOP_REWIND) {
info.canChangeState = true;
info.isTopMostRuntime = true;
} else {
WASM_UNREACHABLE("call to unidenfied asyncify import");
}
}
}
void visitCallIndirect(CallIndirect* curr) {
if (curr->isReturn) {
Fatal() << "tail calls not yet supported in asyncify";
}
if (canIndirectChangeState) {
info.canChangeState = true;
}
// TODO optimize the other case, at least by type
}
};
Walker walker(info, module, canIndirectChangeState);
walker.walk(func->body);
if (info.isBottomMostRuntime) {
info.canChangeState = false;
// TODO: issue warnings on suspicious things, like a function in
// the bottom-most runtime also doing top-most runtime stuff
// like starting and unwinding.
}
if (verbose && info.canChangeState) {
std::cout << "[asyncify] " << func->name
<< " can change the state due to initial scan\n";
}
});
// Functions in the remove-list are assumed to not change the state.
for (auto& [func, info] : scanner.map) {
if (removeList.match(func->name)) {
info.inRemoveList = true;
if (verbose && info.canChangeState) {
std::cout << "[asyncify] " << func->name
<< " is in the remove-list, ignore\n";
}
info.canChangeState = false;
}
}
// Remove the asyncify imports, if any, and any calls to them.
std::vector<Name> funcsToDelete;
for (auto& [func, info] : scanner.map) {
auto& callsTo = info.callsTo;
if (func->imported() && func->module == ASYNCIFY) {
funcsToDelete.push_back(func->name);
}
std::vector<Function*> callersToDelete;
for (auto* target : callsTo) {
if (target->imported() && target->module == ASYNCIFY) {
callersToDelete.push_back(target);
}
}
for (auto* target : callersToDelete) {
callsTo.erase(target);
}
}
for (auto name : funcsToDelete) {
module.removeFunction(name);
}
auto handleAddList = [&](ModuleAnalyzer::Map& map) {
if (!addListInput.empty()) {
for (auto& func : module.functions) {
if (addList.match(func->name) && removeList.match(func->name)) {
Fatal() << func->name
<< " is found in the add-list and in the remove-list";
}
if (!func->imported() && addList.match(func->name)) {
auto& info = map[func.get()];
if (verbose && !info.canChangeState) {
std::cout << "[asyncify] " << func->name
<< " is in the add-list, add\n";
}
info.canChangeState = true;
info.addedFromList = true;
}
}
}
};
// When propagateAddList is enabled, we must apply the add-list before
// scanner.propagateBack() so that callers of functions in the add-list are
// also instrumented.
if (propagateAddList) {
handleAddList(scanner.map);
}
// The order of propagation in |propagateBack| is non-deterministic, so sort
// the loggings we intend to do.
std::vector<std::string> loggings;
scanner.propagateBack([](const Info& info) { return info.canChangeState; },
[](const Info& info) {
return !info.isBottomMostRuntime &&
!info.inRemoveList;
},
[](Info& info) { info.canChangeState = true; },
[&](const Info& info, Function* reason) {
if (verbose) {
std::stringstream str;
str << "[asyncify] " << info.name
<< " can change the state due to "
<< reason->name << "\n";
loggings.push_back(str.str());
}
},
scanner.IgnoreNonDirectCalls);
if (!loggings.empty()) {
std::sort(loggings.begin(), loggings.end());
for (auto& logging : loggings) {
std::cout << logging;
}
}
map.swap(scanner.map);
if (!onlyListInput.empty()) {
// Only the functions in the only-list can change the state.
for (auto& func : module.functions) {
if (!func->imported()) {
auto& info = map[func.get()];
bool matched = onlyList.match(func->name);
info.canChangeState = matched;
if (matched) {
info.addedFromList = true;
}
if (verbose) {
std::cout << "[asyncify] " << func->name
<< "'s state is set based on the only-list to " << matched
<< '\n';
}
}
}
}
// When propagateAddList is disabled, which is the default behavior,
// functions in the add-list are simply added to the set of instrumented
// functions, without propagating that status to their callers.
if (!propagateAddList) {
handleAddList(map);
}
removeList.checkPatternsMatches();
addList.checkPatternsMatches();
onlyList.checkPatternsMatches();
}
bool needsInstrumentation(Function* func) {
auto& info = map[func];
return info.canChangeState && !info.isTopMostRuntime;
}
bool canChangeState(Expression* curr, Function* func) {
// Look inside to see if we call any of the things we know can change the
// state.
// TODO: caching, this is O(N^2)
struct Walker : PostWalker<Walker> {
void visitCall(Call* curr) {
// We only implement these at the very end, but we know that they
// definitely change the state.
if (curr->target == ASYNCIFY_START_UNWIND ||
curr->target == ASYNCIFY_STOP_REWIND ||
curr->target == ASYNCIFY_GET_CALL_INDEX ||
curr->target == ASYNCIFY_CHECK_CALL_INDEX) {
canChangeState = true;
return;
}
if (curr->target == ASYNCIFY_STOP_UNWIND ||
curr->target == ASYNCIFY_START_REWIND) {
isBottomMostRuntime = true;
return;
}
// The target may not exist if it is one of our temporary intrinsics.
auto* target = module->getFunctionOrNull(curr->target);
if (target && (*map)[target].canChangeState) {
canChangeState = true;
}
}
void visitCallIndirect(CallIndirect* curr) { hasIndirectCall = true; }
Module* module;
ModuleAnalyzer* analyzer;
Map* map;
bool hasIndirectCall = false;
bool canChangeState = false;
bool isBottomMostRuntime = false;
};
Walker walker;
walker.module = &module;
walker.analyzer = this;
walker.map = ↦
walker.walk(curr);
// An indirect call is normally ignored if we are ignoring indirect calls.
// However, see the docs at the top: if the function we are inside was
// specifically added by the user (in the only-list or the add-list) then we
// instrument indirect calls from it (this allows specifically allowing some
// indirect calls but not others).
if (walker.hasIndirectCall &&
(canIndirectChangeState || map[func].addedFromList)) {
walker.canChangeState = true;
}
// The bottom-most runtime can never change the state.
return walker.canChangeState && !walker.isBottomMostRuntime;
}
FakeGlobalHelper fakeGlobals;
bool verbose;
};
// Checks if something performs a call: either a direct or indirect call,
// and perhaps it is dropped or assigned to a local. This captures all the
// cases of a call in flat IR.
static bool doesCall(Expression* curr) {
if (auto* set = curr->dynCast<LocalSet>()) {
curr = set->value;
} else if (auto* drop = curr->dynCast<Drop>()) {
curr = drop->value;
}
return curr->is<Call>() || curr->is<CallIndirect>();
}
class AsyncifyBuilder : public Builder {
public:
Module& wasm;
Type pointerType;
Name asyncifyMemory;
AsyncifyBuilder(Module& wasm, Type pointerType, Name asyncifyMemory)
: Builder(wasm), wasm(wasm), pointerType(pointerType),
asyncifyMemory(asyncifyMemory) {}
Expression* makeGetStackPos() {
return makeLoad(pointerType.getByteSize(),
false,
int(DataOffset::BStackPos),
pointerType.getByteSize(),
makeGlobalGet(ASYNCIFY_DATA, pointerType),
pointerType,
asyncifyMemory);
}
Expression* makeIncStackPos(int32_t by) {
if (by == 0) {
return makeNop();
}
auto literal = Literal::makeFromInt64(by, pointerType);
return makeStore(pointerType.getByteSize(),
int(DataOffset::BStackPos),
pointerType.getByteSize(),
makeGlobalGet(ASYNCIFY_DATA, pointerType),
makeBinary(Abstract::getBinary(pointerType, Abstract::Add),
makeGetStackPos(),
makeConst(literal)),
pointerType,
asyncifyMemory);
}
Expression* makeStateCheck(State value) {
return makeBinary(EqInt32,
makeGlobalGet(ASYNCIFY_STATE, Type::i32),
makeConst(Literal(int32_t(value))));
}
};
// Instrument control flow around calls, adding skips for rewinding.
struct AsyncifyFlow : public Pass {
bool isFunctionParallel() override { return true; }
ModuleAnalyzer* analyzer;
Type pointerType;
Name asyncifyMemory;
std::unique_ptr<Pass> create() override {
return std::make_unique<AsyncifyFlow>(
analyzer, pointerType, asyncifyMemory);
}
AsyncifyFlow(ModuleAnalyzer* analyzer, Type pointerType, Name asyncifyMemory)
: analyzer(analyzer), pointerType(pointerType),
asyncifyMemory(asyncifyMemory) {}
void runOnFunction(Module* module_, Function* func_) override {
module = module_;
func = func_;
builder =
std::make_unique<AsyncifyBuilder>(*module, pointerType, asyncifyMemory);
// If the function cannot change our state, we have nothing to do -
// we will never unwind or rewind the stack here.
if (!analyzer->needsInstrumentation(func)) {
return;
}
// Rewrite the function body.
// Each function we enter will pop one from the stack, which is the index
// of the next call to make.
auto* block = builder->makeBlock(
{builder->makeIf(builder->makeStateCheck(
State::Rewinding), // TODO: such checks can be !normal
makeCallIndexPop()),
process(func->body)});
if (func->getResults() != Type::none) {
// Rewriting control flow may alter things; make sure the function ends in
// something valid (which the optimizer can remove later).
block->list.push_back(builder->makeUnreachable());
}
block->finalize();
func->body = block;
// Making things like returns conditional may alter types.
ReFinalize().walkFunctionInModule(func, module);
}
private:
std::unique_ptr<AsyncifyBuilder> builder;
Module* module;
Function* func;
// Each call in the function has an index, noted during unwind and checked
// during rewind.
Index callIndex = 0;
Expression* process(Expression* curr) {
// The IR is in flat form, which makes this much simpler: there are no
// unnecessarily nested side effects or control flow, so we can add
// skips for rewinding in an easy manner, putting a single if around
// whole chunks of code. Also calls are separated out so that it is easy
// to add the necessary checks for them for unwinding/rewinding support.
//
// Aside from that, we must "linearize" all control flow so that we can
// reach the right part when rewinding, which is done by always skipping
// forward. For example, for an if we do this:
//
// if (cond) { // cond is either a const or a local.get in a flat IR
// side1();
// } else {
// side2();
// }
// =>
// if (rewinding || cond) {
// new_side1();
// }
// if (rewinding || !cond) {
// new_side2();
// }
// where new_sideN is
// if (normal || check_call_index(callIndex)) {
// sideN();
// }
// when sideN can change the state, and
// if (normal) {
// sideN();
// }
// when it does not. (See makeCallSupport() for details.)
//
// This way we will linearly get through all the code in the function,
// if we are rewinding. In a similar way we skip over breaks, etc.; just
// "keep on truckin'".
//
// Note that temp locals added in this way are just normal locals, and in
// particular they are saved and loaded. That way if we resume from the
// first if arm we will avoid the second.
// To avoid recursion, we use stacks here. We process the work for each
// node in two phases as follows:
//
// 1. The "Scan" phase finds children we need to process (ones that may
// change the state), adds Scan tasks for them to the work stack, and
// processes call instructions.
// 2. The "Finish" phase runs after all children have been Scanned and
// Finished. It pops the children's results from the results stack (if
// there were relevant children), and then it pushes its own result.
//
struct Work {
Expression* curr;
enum { Scan, Finish } phase;
};