Add TypedArray support to jsi #248
@wkozyra95 thanks for putting together this PR. It sounds like there are two separate concerns rolled into one PR here, so let me pull them apart.
The philosophy around JSI is to provide a lightweight interface to access engine functionality, but also to keep it simple. It's possible, as you observe, to use TypedArrays by only using the ArrayBuffer class. So the question is, how do we best address issues of usability and performance, without unnecessarily adding methods to jsi::Runtime? I agree that the current mechanism is cumbersome. But, this could be fixed by creating an API which is layered on top of the existing JSI, without adding anything to the Runtime at all. So, how would this perform? I hacked up a test case which implements a use case similar to the one you described, where I wrote HostFunctions which create a small TypedArray and fill in its elements, using the existing APIs, and the one in the PR:
The example using the current APIs is a bit more verbose, because it takes care to cache some lookups to avoid running them on every call, but if this were abstracted into a library, it would only need to be done once and shouldn't be difficult to use. Then, I timed how long it took to run each function 100,000 times, and took the average. The arraybuffer version takes about 7.5µs per call, and the typedarray version takes about 5.3µs per call.
I then explored why arraybuffer was taking longer. The answer turns out to be here: https://github.com/facebook/hermes/blob/master/API/hermes/hermes.cpp#L1877-L1881. If I just hack that code out of my test, arraybuffer is now only 4.5µs per call. If I add a similar bit of measurement to createTypedArray, it increases to about 8.3µs. So, this seems to me like a lot of overhead. The person who understands this best is out of the office right now, so it will take some time to understand what our options are to mitigate the cost. However, if we can do that, I think a better approach would be to implement TypedArray functions as a separate layer on top of JSI, instead of adding seven new methods to Runtime.
All that said, I have another question. I took a quick look at https://github.com/expo/expo/tree/master/packages/expo-gl-cpp and I couldn't quickly see how the small TypedArrays or instanceof calls you refer to in #182 are used. I ask because I think it would help understand if
In part, I wonder if TypedArray is actually preferable for this use case? I added one more test case to my hack, where I create a plain JS array containing a fixed number of numbers:
This is only 4.3µs per call, and is not at all cumbersome. If it still makes sense to use TypedArray, I think that this is a nice pattern to use. It would not be hard to write a createFloat64WithElements() function similar to this, on top of the existing JSI, which efficiently creates and populates a new TypedArray. If it does turn out to be necessary to add complexity to JSI to achieve reasonable performance, then I'm not against it. But I would like to see the due diligence above to make sure the added complexity is justified.
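As a rough JS-side illustration of the two ideas in this comment (this is not the C++ code that produced the timings above), the sketch below shows a helper that creates and populates a Float64Array in one step, and a timing loop that averages over many calls in the spirit of the 100,000-call measurement. The TypeScript `createFloat64WithElements` here is only an analogue of the suggested JSI helper, and `createPlainArray` and `averageMicrosPerCall` are hypothetical names. `Date.now()` has millisecond resolution, so the averages it yields are coarse.

```typescript
// Hypothetical JS-side analogue of the suggested helper: create a
// Float64Array and fill it with the given elements in one call.
function createFloat64WithElements(...elements: number[]): Float64Array {
  return new Float64Array(elements);
}

// The plain-array alternative discussed above, for comparison.
function createPlainArray(...elements: number[]): number[] {
  return elements.slice();
}

// Run `fn` `iterations` times and return the average time per call in
// microseconds. Date.now() is coarse, so use a large iteration count.
function averageMicrosPerCall(fn: () => unknown, iterations: number = 100000): number {
  if (iterations <= 0) {
    throw new RangeError("iterations must be positive");
  }
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    fn();
  }
  const elapsedMs = Date.now() - start;
  return (elapsedMs * 1000) / iterations;
}
```

A comparison could then be run as `averageMicrosPerCall(() => createFloat64WithElements(1, 2, 3))` against `averageMicrosPerCall(() => createPlainArray(1, 2, 3))`. The results depend on the engine and say nothing about the cost of crossing the JSI boundary.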
Hi @mhorowitz, thanks for looking into this so quickly.
I agree that all issues related to that can be solved in a separate library on top of jsi, but this can be said about every class in jsi.
I didn't realize that keeping this API as small as possible is a goal here. My original assumption was that if TypedArrays are implemented as separate entities inside the VM, the API should provide direct access to them. We have direct access to raw memory to read or modify data, but all the other operations are essentially js calls. When you say "lightweight interface", do you mean jsi::Runtime or the entire JSI API?
A lookup like this is possible for constructors, but what about cases where the array is passed from js? For my use case that is more common. To read the data I need to know the TypedArray type, a pointer to the array buffer, the size, and the byteOffset. The type is the most problematic part.
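To make the four pieces of information listed above concrete, here is a small TypeScript sketch of what has to be collected from a TypedArray that arrives from js. `AnyTypedArray`, `ViewInfo`, `describeView`, and `byteLengthOf` are hypothetical names used only for illustration. From native code, each of these fields would be a property read through the JS API, with the type being the awkward one because it is not a plain numeric property.

```typescript
// The classic TypedArray types (BigInt variants omitted for brevity).
type AnyTypedArray =
  | Int8Array | Uint8Array | Uint8ClampedArray
  | Int16Array | Uint16Array
  | Int32Array | Uint32Array
  | Float32Array | Float64Array;

// Everything needed to read the elements from the backing store.
interface ViewInfo {
  kind: string;             // e.g. "Float32Array"
  buffer: ArrayBufferLike;  // backing ArrayBuffer
  byteOffset: number;       // where the view starts inside the buffer
  length: number;           // number of elements, not bytes
  bytesPerElement: number;
}

function describeView(view: AnyTypedArray): ViewInfo {
  return {
    // "[object Float32Array]" -> "Float32Array"
    kind: Object.prototype.toString.call(view).slice(8, -1),
    buffer: view.buffer,
    byteOffset: view.byteOffset,
    length: view.length,
    bytesPerElement: view.BYTES_PER_ELEMENT,
  };
}

// Number of bytes the view covers, starting at byteOffset.
function byteLengthOf(info: ViewInfo): number {
  return info.length * info.bytesPerElement;
}
```

Note that a view can start partway into its buffer, so reading from the start of the ArrayBuffer without applying `byteOffset` would return the wrong elements.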
The code on master uses JSC directly; this branch contains a jsi implementation using the current API, and the last commit, expo/expo@02f4e2f, implements a helper that is essentially the layer on top of jsi you mentioned. There are two ways we interact with TypedArrays.
In a lot of cases they aren't, but that is what the WebGL standard specifies.
Originally I started working on this because I assumed it was the only way, but after reading @tmikov's answer in #182 I realized that there are ways to do all of this with the current API. The performance of the implementation that uses the current API is good enough to support our use case. On the other hand, most users don't use WebGL directly; there is usually some js library that takes WebGlRenderer as an argument, and in those cases there are a lot more unnecessary gl calls that might affect performance. Improvements from direct access to TypedArray could offset that.
No, the only benchmarking I have done was from js. I know that it's not a very good argument to prove my point here, but I wanted to verify whether any changes are noticeable in js. I compared the JSC implementation with hermes (with my changes)
I ran the same tests to compare Hermes with and without the changes, but I don't remember those results. I'll rerun them and get back to you. One problem with the benchmark for those two cases is that I'm not familiar enough with the Hermes code to be 100% sure that my implementation is an efficient way to do this; for all I know, some of the mechanisms I'm using are quite expensive and could be replaced with something better. The JS code that I used for testing was just a loop that called a gl function and passed the same TypedArray object to it as an argument.
Mainly, I mean jsi::Runtime, because VM-dependent code has to be written and tested multiple times, and modifies an ABI which we are not versioning well now, and the more we add, the harder it will be later. I am ok with adding VM-independent layers on top of JSI, as these are easier to iterate and manage. I think we can also take an incremental approach here, where an app or library can provide such a layer themselves, and once we have some experience with performance and developer benefit, it would be easier to import it later.
I don't know the WebGL standard. Is it legal for the caller to pass any kind of TypedArray, and the library is expected to consume it?
I'd rather make suggestions for code improvements than make unnecessary changes to JSI. If you have profiling which shows hot spots, we can discuss ways to optimize them. It may be that we can be creative here. For example, your WebGL implementation could be a hybrid of JavaScript and C++, in a way which is invisible to the caller. If you have a function
In any case, I think the place to start is with some measurements about where the bottlenecks (if any) are.
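One possible reading of the hybrid JavaScript and C++ idea is a thin JS wrapper that unpacks a small TypedArray into plain numbers before calling into native code, so that the C++ side never has to inspect the TypedArray at all. The TypeScript sketch below is an assumption about what such a wrapper could look like, not code from expo-gl or from this PR. `NativeUniform3f` and `makeUniform3fv` are hypothetical names, and the native function is passed in as a parameter so the example stays self-contained.

```typescript
// Hypothetical signature of a native (C++ HostFunction) entry point that
// takes plain numbers instead of a TypedArray.
type NativeUniform3f = (location: number, x: number, y: number, z: number) => void;

// Build the caller-facing function. Callers still pass a TypedArray (or a
// plain array); the unpacking happens in JS and is invisible to them.
function makeUniform3fv(native: NativeUniform3f) {
  return function uniform3fv(location: number, value: Float32Array | number[]): void {
    if (value.length < 3) {
      throw new RangeError("uniform3fv expects at least 3 elements");
    }
    // Only numbers cross the JS/C++ boundary.
    native(location, value[0], value[1], value[2]);
  };
}
```

Whether this is actually faster than reading the TypedArray from C++ would need to be measured, which is consistent with starting from profiling data.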
Sorry it took me so long to answer.
It depends on the function: some functions accept only a specific type, and some of them can accept anything, e.g
I prepared an example that passes a small TypedArray from js to c++: https://gist.github.com/wkozyra95/e368933acf1a7d63e30fee34c6ef9b06
For the current API I intentionally prepared as minimal an example as possible; normally I would also need to check the specific type, or at least verify that it's the correct one with instanceof.
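The two kinds of validation mentioned above (a function that accepts only one specific TypedArray type, and a function that accepts any view) can be sketched on the JS side as follows. `requireFloat32Array` and `requireAnyView` are hypothetical helper names. `instanceof` and `ArrayBuffer.isView` are standard JavaScript, but which check a given gl function needs is not stated in this thread.

```typescript
// For functions that accept only one specific TypedArray type.
function requireFloat32Array(value: unknown): Float32Array {
  if (!(value instanceof Float32Array)) {
    throw new TypeError("expected a Float32Array");
  }
  return value;
}

// For functions that accept any view over an ArrayBuffer.
// Note that ArrayBuffer.isView also returns true for DataView.
function requireAnyView(value: unknown): ArrayBufferView {
  if (!ArrayBuffer.isView(value)) {
    throw new TypeError("expected a TypedArray or DataView");
  }
  return value;
}
```

Done from C++ through the current API, the first check corresponds to the instanceof call described above, which is part of the per-call overhead under discussion.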
It's faster than using the jsi api, but still slower than the implementation exposing typed arrays.
Thanks for preparing this example! A few observations:
Thank you for your help, I appreciate you taking the time.
Closes #182
Current API allows us to work with TypedArrays and ArrayBuffers, but there are cases where it's either inefficient or very cumbersome.
The main use case for me is a WebGL implementation, which makes a lot of per-frame calls that pass typed arrays between js and c++. In a lot of cases, those are very small typed arrays (3 or 4 elements).
A few examples with the current API