-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
shareable array implementation #1739
Conversation
This is part of #532 The reason behind this not redefing ... how to open a file or load data from one is that, as in the majority of the cases after that there is some amount of ... transformation of the data, so that won't help much. The original idea was to have a shareable JSON ... but the more concrete array implementation is faster (in some cases by a lot) and also seems like a better fix for the first iteration of this, as that will be the majority of the data type. So it is very likely that the non array parts will be dropped in favor of doing them later and this getting merged faster. There are still more things to be done and depending on whether or not we want to support iterating (with for-of) it can probably be done without some of the js code.
4cf1ba8
to
7e43c20
Compare
js/initcontext.go
Outdated
case Symbol.iterator: | ||
return function() {return target.iterator()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This unfortunately can't be done through goja.ProxyTrapConifg So if we want to have support for for-of We need to use js proxy (we can use both, but I don't think that will help with anything :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm why can't we use goja.ProxyTrapConifg
? Because of Symbol.iterator
? If so, can't we just get the goja definition of the iterator symbol from the runtime and compare against that in the Get
property of the goja.ProxyTrapConifg
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes,
I am gonna copy you the relevant part(which is most of it) of documentation that I linked ;)
Note that the Proxy may not have Symbol properties when using this as a handler because property keys are passed as strings. get() and getOwnPropertyDescriptor() for Symbol properties will always return undefined; has() and deleteProperty() for Symbol properties will always return false; set() and defineProperty() for Symbol properties will throw a TypeError. If you need Symbol properties implement the handler in JavaScript.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will probably take some breaking of the API to be fixed and also the standard Symbols will need to be exported by goja .. which they currently aren't.
I doubt this will actually make much of a performance difference to be honest ... But it will be kind of nice to not have to execute a script to wrap it I guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although i guess it is worth an issue to ask why the name wasn't a goja.Value to begin with instead of string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't access the symbol from the Go side, but can't we get it from the JS side by just getting Symbol.iterator
from the runtime, even now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, nevermind, somehow missed the first of your 3 comments...
Codecov Report
@@ Coverage Diff @@
## master #1739 +/- ##
==========================================
- Coverage 71.52% 71.45% -0.07%
==========================================
Files 178 180 +2
Lines 13727 13833 +106
==========================================
+ Hits 9818 9885 +67
- Misses 3297 3326 +29
- Partials 612 622 +10
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! It mostly LGTM, save for that wrapping script that uses Proxy
. It would be great if that could be simplified/avoided, but I'm not familiar with goja enough to know if that would eventually be feasible with ProxyTrapConfig
or with some other approach.
Strong 👍 for supporting for...of
, and I think we should also support shareable objects even if it has worse performance than arrays. That will likely be easier to use in a lot of cases, but am also not sure how difficult it is to implement or if we'd be OK with postponing it for a second iteration of this, though probably yes. Maybe @sniku can chime in on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just a few nitpicks. I guess it's OK if we leave shareable objects for later.
js/share.go
Outdated
// TODO rename | ||
// TODO check how it works with setup data | ||
// TODO check how it works if logged | ||
type sharedArray struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logging it will do:
TypeError: Could not convert &{Object 0xc000dfa4b0 0xc0008a4750 false map[] [] 0 0 <nil>} to primitive
at github.com/loadimpact/k6/js/common.Bind.func1 (native)
But I don't know ... what should be logged so 🤷
And passing it by setup ... doesn't work .. but it also doesn't panic - it passes some Object that seems ... empty - this can probably be fixed and made to work, but it will be harder with the cloud/distributed in mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having it work from setup()
would be good, but I don't think it's required for this PoC.
That logging error looks gnarly though... We could log the first and last N elements stringified, if that's easily doable.
52cd423
to
d453b82
Compare
d453b82
to
7cc4418
Compare
js/initcontext.go
Outdated
i.sharesLock.Lock() | ||
defer i.sharesLock.Unlock() | ||
value, ok := i.shares[name] | ||
if !ok { //nolint:nestif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a bit excessive, why not have an early return
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
predominantly because earlier the code at the end was way longer ... but the nolint is no longer necessary actually .. so maybe the linter checking the unneeded nolints ... has a bug 😱
js/initcontext.go
Outdated
// XShare is a constructor returning a shareable read-only array (general objects coming) | ||
// indentified by the name and having their contents be whatever the call returns - again only | ||
// arrays currently supported. | ||
func (i *InitContext) XShare(name string, call goja.Callable) goja.Value { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned in slack, I'd really like not to have new k6-specific superglobal names like open()
, __VU
, etc. Instead, this should be accessible via k6/data/SharedArray
(so, import {SharedArray} from "k6/data";
) or something like that. The k6/data
package will probably be reused by #1539 for other things like segmented iterators and maybe XML/JSON/etc. streaming parsers and segmentation (though those probably will be sub-packages)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw now it should be easy to move this outside of the InitContext
, you can just have the shares registry in the InitEnvironment
that was added in #1706
js/initcontext.go
Outdated
|
||
arr := make([][]byte, len(tmpArr)) | ||
for index := range arr { | ||
arr[index], err = json.Marshal(tmpArr[index]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good enough for the first version, but I wonder if we couldn't figure out something better. For example, gob (or some other binary encoding/decoding) is probably faster.
Or maybe no encoding and decoding is necessary at all, and we can share the raw Go objects and use Object.freeze()
to make it read-only and prevent data races? goja seems to support it, and it even seems to work:
export default function () {
const obj = [1, 2, 3];
obj[1] = 22;
Object.freeze(obj); // comment this for no error
console.log(obj);
obj[2] = 33; // this throws a TypeError if frozen
console.log(obj);
}
Even if it's not perfect and there's some obscure way to modify the object, we'll document that the shared data is read-only, so it'd be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My problem is that this freezes only the top object not any object that is property of the top object. Also, goja does sometimes have an internal state for even simple types that change without any locking or stuff ... because JS is single-threaded by definition.
So IMO this will break horribly, but was one of the initial things I thought about ... also this means that I get an object from 1 VM and put it in another, without any kind of checking that it doesn't for example have functions ... that modify the original VMs properties or something ... as a whole I think while this approach will probably be "faster", unless goja has support for this it will probably be way too much work to actually properly check that things won't blow up, and any change to goja will likely necessitate refactoring.
Maybe gob will be a better fit - maybe we should try it for a second iteration ... I mostly went directly for JSON as we already do this for the setupData ... and also ... JS(ON) ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm didn't realize Object.freeze()
only touches the first layer, but even that's easy to work around. Copying the public domain deepFreeze()
from here (or, originally, here) seems to work well enough:
function deepFreeze(o) {
Object.freeze(o);
if (o === undefined) {
return o;
}
Object.getOwnPropertyNames(o).forEach(function (prop) {
if (o[prop] !== null
&& (typeof o[prop] === "object" || typeof o[prop] === "function")
&& !Object.isFrozen(o[prop])) {
deepFreeze(o[prop]);
}
});
return o;
};
export default function () {
const obj = [1, [1, 2], { c: 3, d: 4 }];
deepFreeze(obj); // comment this for no error
console.log(JSON.stringify(obj));
obj[1].push(3); // this throws a TypeError if frozen
obj[2].f = 5; // and this also
console.log(JSON.stringify(obj));
}
And we already have a JS wrapper anyway, why not go all the way? 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, goja does sometimes have an internal state for even simple types that change without any locking or stuff ... because JS is single-threaded by definition.
So I don't think this is a good idea - we can explore it in a second iteration ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, goja does sometimes have an internal state for even simple types that change without any locking or stuff ... because JS is single-threaded by definition.
Sorry, I didn't address this properly. I know about the internal state, when you pass a goja Runtime
some Go value, it wraps it in its own internal types. And that should be fine, since it doesn't touch the original value, as far as I know. At most it'll make a copy of a string to convert it to utf-16 or something like that...
So I think it's worth it to try using Object.freeze()
(or, rather, deepFreeze()
) even now, in the first iteration. It's going to be fine, even if it doesn't work at 100% and some JS guru can get around it and still modify something. They are doing it on their own risk of a data race... Besides, it's not like we haven't had issues with transforming objects to JSON and back before - we've had different key names and Go strangeness with that as well. And we do it for setup()
because we have to ship the data between instances in the cloud. Here the requirements are much different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't completely understand everything Dmitry says, but it seems like this won't be feasible... 😞
One benefit of recursively freezing objects instead of copying them (i.e. marshaling and unmarshaling), besides improved performance, would have been the better UX. With the current solution, each runtime will get its own copy of the array item, and would seeming be able to make changes in it. But if it accesses the same array item the next time, the changes it made on the previous iteration will be gone, since it'd receive a fresh copy of the data... Not the end of the world, but unless we document it very well, someone is liable to spend a bit of time pulling their hair. So, maybe it's worth recursively Object.freeze()
things even when we're copying the data on the Go side?
This meant that I needed to JSON.parse inside goja so it creates the goja objects as otherwise it can't freeze stuff from golang. At this point I decided that it is better to use JSON.stringify in the first place as well
Some benchmarks with the latest code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM, just some minor suggestions.
Also:
-
I'm out of my depths with the
Object.freeze
discussion, but shouldn't we take Dmitry's advice and use the Go API rather than wrap it in JS? It would probably complicate things a bit, but it should be worth it if it considerably increases performance. -
The benchmark in that gist shows the share test actually using more memory than the no-share test(?). I can understand the CPU increase, but isn't the whole point of this to reduce memory usage?
throw "the previous line should've errored"; | ||
} catch(e) { | ||
if (!e.toString().includes("Host object field 1 cannot be made configurable")) { | ||
throw "wrong error " + e.toString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to see separate Go testcases for each scenario, rather than this large JS script. It would be easier to run them individually and also to modify or add new ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm ... maybe, it used to be fairly short. On the other hand that will increase the amount of time the test takes(goja initialization isn't free even without the babel and corejs), while changes to this test seem ... unlikely unless we add functionality and I don't think we will be adding all that much as I doubt a lot of users will want additional Array like functionality.
I am even starting to question if enabling for-of was worth it ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would still prefer legibility/usability over a slightly longer run time, but it's fine if you don't think it's worth it.
I saw your comment about the 100-200x performance penalty with the iterator, and now I'm not so sure if we should keep it either 😄 I would assume access by index will be more common, and looping by index would work as well, but some users might prefer the for-of
notation and consider the performance loss acceptable(?). We need documentation to explain this either way, so I'm leaning towards leaving it in as is.
@imiric :
|
An interesting use case for this that I just thought of so I am just writing here - we can move it to issue if this gets merged and we think there is merit: An alternative (that will likely be better) is to make it possible for |
} | ||
arr := make([]string, obj.Get("length").ToInteger()) | ||
|
||
cal, err := rt.RunString(`(function(input, output) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain (and maybe add as a code comment here) why we're encoding the data to JSON on the JS side? JSON.stringify()
does something magical that json.Marshal()
doesn't? Or does this save us from utf8<->utf16 conversions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commit comment explains it :P. Using JSON.parse was the easiest way to get a goja object thing that can then be frozen as otherwise json.Decode
gets you a golang map which ... just isn't a goja object. You can make it into a value but it still the golang map underneath and doesn't let itself be frozen (the goja author actually commented on that).
So at that point, I had the question - do I use two different libraries to do the encode/decode or not, and went with "not" ;).
They have some ... interesting behavior differences around stuff like functions and other strange types.
But IMO it is much better to be as JS like as possible - explaining that what we do is JSON.parse/JSON.stringify a value instead of json.Encode/Decode in golang and that is why your strange value doesn't work will be again easier for users. Also possibly make it possible for them to make it work.
Another benefit is this way I iterate only once here instead of twice, or once but use string indexes and Object.Get
😱.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a code comment with that, since I doubt anyone will read the commit description in the future and might try to "optimize" it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the problem .. people don't read git commits 😭
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't played with this since mid-December, but the code LGTM in general.
Having the test in the same module will be better, but if that's annoying to do because of import loops or something like that, maybe you can move just the small try/catch
tests in there, as discrete unit tests for error handling and maybe a small one for sanity check. But leave the big test where it is, as an integration test of sorts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Noticed a couple of typos, but it's not a big deal.
The goheader
linter failures are because they're new files? 😕
No idea ... I had some golangci-lint's cache problems locally, but they seemed to go away. Maybe we are hitting something similar in github actions ... |
This is part of grafana#532 This only implements a shareable array that is generated through a callback. This way any additional processing can be done by any js code once and then the result will be shared between VUs in a readonly fashion.
This is part of #532
The reason behind this not redefing ... how to open a file or load data
from one is that, as in the majority of the cases after that there is
some amount of ... transformation of the data, so that won't help much.
The original idea was to have a shareable JSON ... but the more concrete
array implementation is faster (in some cases by a lot) and also seems
like a better fix for the first iteration of this, as that will be the
majority of the data type. So it is very likely that the non array parts
will be dropped in favor of doing them later and this getting merged
faster.
There are still more things to be done and depending on whether or not
we want to support iterating (with for-of) it can probably be done
without some of the js code.