StringLowering #6271

Merged: 47 commits, Feb 5, 2024

Commits
d35493e
start
kripken Jan 29, 2024
77d5646
werk
kripken Jan 29, 2024
5ebd0a1
work
kripken Jan 29, 2024
3049a99
work
kripken Jan 30, 2024
56f940e
buid
kripken Jan 30, 2024
83dd068
test
kripken Jan 30, 2024
48089e0
work
kripken Jan 30, 2024
7859d8c
work
kripken Jan 30, 2024
955bbe6
progress
kripken Jan 30, 2024
01c7868
work?
kripken Jan 30, 2024
f00ca07
cleanup
kripken Jan 30, 2024
9dae0b3
father
kripken Jan 30, 2024
ba75182
work
kripken Jan 30, 2024
5adfed4
ment
kripken Jan 30, 2024
2c7779a
work
kripken Jan 30, 2024
9b70621
test
kripken Jan 30, 2024
d20f8d5
format
kripken Jan 30, 2024
c252831
rename
kripken Jan 30, 2024
3397899
rename
kripken Jan 30, 2024
883eba4
rename
kripken Jan 30, 2024
d93ca57
work
kripken Jan 30, 2024
07ce36a
work
kripken Jan 30, 2024
4670f8e
update
kripken Jan 30, 2024
b8fb866
work
kripken Jan 30, 2024
3bb6e3d
work
kripken Jan 31, 2024
3a5ad66
work
kripken Jan 31, 2024
e4d0b77
work
kripken Jan 31, 2024
90fbb25
work
kripken Jan 31, 2024
1a3dc63
work
kripken Jan 31, 2024
8fc15f1
work
kripken Jan 31, 2024
694ef94
test
kripken Jan 31, 2024
4ea57ee
test
kripken Jan 31, 2024
86517b0
format
kripken Jan 31, 2024
b2895ca
yolo
kripken Jan 31, 2024
ae3871e
Update src/passes/StringLowering.cpp
kripken Jan 31, 2024
a8cf56a
feedback: simplify to avoid reverse index map
kripken Jan 31, 2024
a2155ed
feedback: fix name
kripken Jan 31, 2024
2c88536
Merge branch 'string.gathering' into string.lowering
kripken Jan 31, 2024
aad2278
Update src/passes/StringLowering.cpp
kripken Jan 31, 2024
9f6bff7
comment
kripken Jan 31, 2024
0e601ea
Merge remote-tracking branch 'myself/string.gathering' into string.ga…
kripken Jan 31, 2024
7d1969f
Merge branch 'string.gathering' into string.lowering
kripken Jan 31, 2024
d5cc29e
fix
kripken Feb 1, 2024
e31cb73
Merge remote-tracking branch 'origin/main' into string.lowering
kripken Feb 1, 2024
0c9c942
Merge remote-tracking branch 'origin/main' into string.lowering
kripken Feb 2, 2024
65fc8d8
clean
kripken Feb 2, 2024
66746ee
update help test
kripken Feb 5, 2024
64 changes: 60 additions & 4 deletions src/passes/StringLowering.cpp
@@ -21,17 +21,18 @@
 // globals, avoiding them appearing in code that can run more than once (which
 // can have overhead in VMs).
 //
-// Building on that, an extended version of StringGathering will also replace
-// those new globals with imported globals of type externref, for use with the
-// string imports proposal. String operations will likewise need to be lowered.
-// TODO
+// StringLowering does the same, and also replaces those new globals with
+// imported globals of type externref, for use with the string imports proposal.
+// String operations will likewise need to be lowered. TODO
 //
 
 #include <algorithm>
 
 #include "ir/module-utils.h"
 #include "ir/names.h"
+#include "ir/type-updating.h"
 #include "pass.h"
+#include "support/json.h"
 #include "wasm-builder.h"
 #include "wasm.h"

@@ -175,6 +176,61 @@ struct StringGathering : public Pass {
  }
};

struct StringLowering : public StringGathering {
  void run(Module* module) override {
    if (!module->features.has(FeatureSet::Strings)) {
      return;
    }

    // First, run the gathering operation so all string.consts are in one place.
    StringGathering::run(module);

    // Lower the string.const globals into imports.
    makeImports(module);

    // Remove all HeapType::string etc. in favor of externref.
    updateTypes(module);

    // Disable the feature here after we lowered everything away.
    module->features.disable(FeatureSet::Strings);
  }

  void makeImports(Module* module) {
    Index importIndex = 0;
    json::Value stringArray;
    stringArray.setArray();
    std::vector<Name> importedStrings;
    for (auto& global : module->globals) {
      if (global->init) {
        if (auto* c = global->init->dynCast<StringConst>()) {
          global->module = "string.const";
          global->base = std::to_string(importIndex);
          importIndex++;
          global->init = nullptr;

          auto str = json::Value::make(std::string(c->string.str).c_str());
Member

We should use string_view rather than c_str to properly handle nul bytes. Maybe we can fix the json API in a separate PR?

Member Author

Yeah, that's worth fixing separately, I agree.
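
To make the concern in this thread concrete, here is a minimal standalone sketch of why going through `c_str()` loses data when a string contains a NUL byte. It does not use Binaryen's json API; the helper names (`lengthViaCStr`, `lengthViaView`, `sampleWithNul`) are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Length as seen by an API that accepts a NUL-terminated const char*.
// Rebuilding a std::string from c_str() stops at the first NUL byte,
// so everything after it is silently dropped.
inline std::size_t lengthViaCStr(const std::string& s) {
  return std::string(s.c_str()).size();
}

// Length as seen by an API that accepts std::string_view, which carries
// an explicit length and therefore keeps embedded NUL bytes.
inline std::size_t lengthViaView(const std::string& s) {
  return std::string_view(s).size();
}

// Hypothetical sample: a 7-byte string with an embedded NUL after "foo".
inline std::string sampleWithNul() {
  return std::string("foo\0bar", 7);
}
```

With these helpers, `lengthViaCStr(sampleWithNul())` sees only the three bytes of `foo`, while `lengthViaView(sampleWithNul())` sees all seven bytes. This is the difference the reviewer suggests addressing in the json API in a follow-up.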

          stringArray.push_back(str);
        }
      }
    }

    // Add a custom section with the JSON.
    std::stringstream stream;
    stringArray.stringify(stream);
    auto str = stream.str();
    auto vec = std::vector<char>(str.begin(), str.end());
    module->customSections.emplace_back(
      CustomSection{"string.consts", std::move(vec)});
  }

  void updateTypes(Module* module) {
    TypeMapper::TypeUpdates updates;
    updates[HeapType::string] = HeapType::ext;
Member

If a module ever uses the fact that string <: any, this is going to cause problems, but that's probably ok for experimentation.

Member Author

kripken, Feb 5, 2024

Good point. If that is ever an issue it seems like we'd need to internalize/externalize in a lot of places...?

Member

This is why I think we should make our internal string type be a subtype of extern instead of any, but of course that would break compatibility with the stringref proposal. This is a change we should make once we don't need to support stringref directly anymore and can get away with only supporting imported strings. Hopefully that will be soon?

Member Author

I see, yeah, that could help. But implementing a not-quite-stringref has downsides too. I'm not sure what's best in the long term, but we don't need to decide now.
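
The subtyping concern in this thread can be sketched with a toy model. The code below is not Binaryen's `HeapType` or `TypeMapper`; `ToyHeap`, `toySuper`, `toyIsSubType`, and `toyLower` are hypothetical names, and the hierarchy assumes only what the thread states: `string <: any`, with `extern` in a separate hierarchy.

```cpp
#include <cassert>
#include <optional>

// Toy stand-ins for a few heap types. NOT Binaryen's HeapType.
enum class ToyHeap { Any, Extern, String };

// Declared supertype in the toy hierarchy, if any.
inline std::optional<ToyHeap> toySuper(ToyHeap t) {
  switch (t) {
    case ToyHeap::String:
      return ToyHeap::Any; // string <: any, as discussed in the thread
    case ToyHeap::Any:
    case ToyHeap::Extern:
      return std::nullopt; // roots of separate hierarchies
  }
  return std::nullopt;
}

// Walk up the supertype chain looking for `super`.
inline bool toyIsSubType(ToyHeap sub, ToyHeap super) {
  for (std::optional<ToyHeap> cur = sub; cur; cur = toySuper(*cur)) {
    if (*cur == super) {
      return true;
    }
  }
  return false;
}

// The rewrite modeled on updateTypes: string -> extern, all else unchanged.
inline ToyHeap toyLower(ToyHeap t) {
  return t == ToyHeap::String ? ToyHeap::Extern : t;
}
```

In this model, `toyIsSubType(ToyHeap::String, ToyHeap::Any)` holds before lowering, but `toyIsSubType(toyLower(ToyHeap::String), ToyHeap::Any)` does not. Code that passed a string where an `any` was expected would therefore no longer type-check after the rewrite, unless a conversion were inserted at each such point, which is the "internalize/externalize in a lot of places" cost mentioned above.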

    TypeMapper(*module, updates).map();
  }
};

Pass* createStringGatheringPass() { return new StringGathering(); }
Pass* createStringLoweringPass() { return new StringLowering(); }

} // namespace wasm
3 changes: 3 additions & 0 deletions src/passes/pass.cpp
@@ -478,6 +478,9 @@ void PassRegistry::registerPasses() {
  registerPass("string-gathering",
               "gathers wasm strings to globals",
               createStringGatheringPass);
  registerPass("string-lowering",
               "lowers wasm strings and operations to imports",
               createStringLoweringPass);
  registerPass(
    "strip", "deprecated; same as strip-debug", createStripDebugPass);
  registerPass("stack-check",
1 change: 1 addition & 0 deletions src/passes/passes.h
@@ -154,6 +154,7 @@ Pass* createSimplifyLocalsNoStructurePass();
Pass* createSimplifyLocalsNoTeeNoStructurePass();
Pass* createStackCheckPass();
Pass* createStringGatheringPass();
Pass* createStringLoweringPass();
Pass* createStripDebugPass();
Pass* createStripDWARFPass();
Pass* createStripProducersPass();
46 changes: 46 additions & 0 deletions test/lit/passes/string-gathering.wast
@@ -1,9 +1,15 @@
;; NOTE: Assertions have been generated by update_lit_checks.py --all-items and should not be edited.

;; RUN: foreach %s %t wasm-opt --string-gathering -all -S -o - | filecheck %s
;; RUN: foreach %s %t wasm-opt --string-lowering -all -S -o - | filecheck %s --check-prefix=LOWER

;; All the strings should be collected into globals and used from there. They
;; should also be sorted deterministically (alphabetically).
;;
;; LOWER also lowers away strings entirely, leaving only imports and a custom
;; section (that part is tested in string-lowering.wast). It also removes all
;; uses of the string heap type, leaving extern instead for the imported
;; strings.

(module
;; Note that $global will be reused: no new global will be added for "foo".
@@ -19,6 +25,15 @@
(global $global (ref string) (string.const "foo"))

;; CHECK: (global $global2 stringref (global.get $string.const_bar))
;; LOWER: (type $0 (func))

;; LOWER: (import "string.const" "0" (global $string.const_bar (ref extern)))

;; LOWER: (import "string.const" "1" (global $string.const_other (ref extern)))

;; LOWER: (import "string.const" "2" (global $global (ref extern)))

;; LOWER: (global $global2 externref (global.get $string.const_bar))
(global $global2 (ref null string) (string.const "bar"))

;; CHECK: (func $a (type $0)
@@ -29,6 +44,14 @@
;; CHECK-NEXT: (global.get $global)
;; CHECK-NEXT: )
;; CHECK-NEXT: )
;; LOWER: (func $a (type $0)
;; LOWER-NEXT: (drop
;; LOWER-NEXT: (global.get $string.const_bar)
;; LOWER-NEXT: )
;; LOWER-NEXT: (drop
;; LOWER-NEXT: (global.get $global)
;; LOWER-NEXT: )
;; LOWER-NEXT: )
(func $a
(drop
(string.const "bar")
@@ -52,6 +75,20 @@
;; CHECK-NEXT: (global.get $global2)
;; CHECK-NEXT: )
;; CHECK-NEXT: )
;; LOWER: (func $b (type $0)
;; LOWER-NEXT: (drop
;; LOWER-NEXT: (global.get $string.const_bar)
;; LOWER-NEXT: )
;; LOWER-NEXT: (drop
;; LOWER-NEXT: (global.get $string.const_other)
;; LOWER-NEXT: )
;; LOWER-NEXT: (drop
;; LOWER-NEXT: (global.get $global)
;; LOWER-NEXT: )
;; LOWER-NEXT: (drop
;; LOWER-NEXT: (global.get $global2)
;; LOWER-NEXT: )
;; LOWER-NEXT: )
(func $b
(drop
(string.const "bar")
@@ -74,23 +111,32 @@
;; Multiple possible reusable globals. Also test ignoring of imports.
(module
;; CHECK: (import "a" "b" (global $import (ref string)))
;; LOWER: (import "a" "b" (global $import (ref extern)))
(import "a" "b" (global $import (ref string)))

;; CHECK: (global $global1 (ref string) (string.const "foo"))
(global $global1 (ref string) (string.const "foo"))

;; CHECK: (global $global2 (ref string) (global.get $global1))
;; LOWER: (import "string.const" "0" (global $global1 (ref extern)))

;; LOWER: (import "string.const" "1" (global $global4 (ref extern)))

;; LOWER: (global $global2 (ref extern) (global.get $global1))
(global $global2 (ref string) (string.const "foo"))

;; CHECK: (global $global3 (ref string) (global.get $global1))
;; LOWER: (global $global3 (ref extern) (global.get $global1))
(global $global3 (ref string) (string.const "foo"))

;; CHECK: (global $global4 (ref string) (string.const "bar"))
(global $global4 (ref string) (string.const "bar"))

;; CHECK: (global $global5 (ref string) (global.get $global4))
;; LOWER: (global $global5 (ref extern) (global.get $global4))
(global $global5 (ref string) (string.const "bar"))

;; CHECK: (global $global6 (ref string) (global.get $global4))
;; LOWER: (global $global6 (ref extern) (global.get $global4))
(global $global6 (ref string) (string.const "bar"))
)
23 changes: 23 additions & 0 deletions test/lit/passes/string-lowering.wast
@@ -0,0 +1,23 @@
;; This file checks the custom section that --string-lowering adds. The other
;; operations are tested in string-gathering.wast (which is auto-updated, unlike
;; this which is manual).

;; RUN: foreach %s %t wasm-opt --string-lowering -all -S -o - | filecheck %s

(module
(func $consts
(drop
(string.const "foo")
)
(drop
(string.const "bar")
)
(drop
(string.const "foo")
)
)
)

;; The custom section should contain foo and bar, and foo only once.
;; CHECK: custom section "string.consts", size 13, contents: "[\"bar\",\"foo\"]"
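
As a rough illustration of where the expected contents in this test come from, the following standalone sketch reproduces the `["bar","foo"]` payload for the three `string.const` uses above. It is a simplification under stated assumptions: it does not use Binaryen or its json API, `toyGather`, `toyImportBase`, and `toyJsonArray` are hypothetical names, `std::set` stands in for the pass's deduplication and sorting, and no JSON escaping is performed. In the pass itself the array follows the order of the module's globals, which can differ from plain alphabetical order when an existing global is reused.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Deduplicate and sort the strings seen in the module (toy stand-in for
// what gathering does in the simple case with no pre-existing globals).
inline std::vector<std::string>
toyGather(const std::vector<std::string>& uses) {
  std::set<std::string> unique(uses.begin(), uses.end());
  return std::vector<std::string>(unique.begin(), unique.end());
}

// Import base name for the string at a given index, mirroring
// global->base = std::to_string(importIndex) in the diff.
inline std::string toyImportBase(std::size_t index) {
  return std::to_string(index);
}

// Serialize as a JSON array of strings. Assumption: inputs need no
// escaping (no quotes, backslashes, or control characters).
inline std::string toyJsonArray(const std::vector<std::string>& strings) {
  std::string out = "[";
  for (std::size_t i = 0; i < strings.size(); i++) {
    if (i > 0) {
      out += ",";
    }
    out += '"';
    out += strings[i];
    out += '"';
  }
  out += "]";
  return out;
}
```

Calling `toyJsonArray(toyGather({"foo", "bar", "foo"}))` yields a 13-byte string, matching the size and contents in the CHECK line, and `toyImportBase(0)` and `toyImportBase(1)` give the `"0"` and `"1"` field names used under the `string.const` import module.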
