fetching a binary file from http & sending it => corruption #1375
Comments
Hi, the difference you're seeing here is because of a type difference: [...] This behavior is part of the discrepancies in how k6 handles binary data. See issue #1020 for details. Ideally both [...] But to address your use case, even with this binary issue aside, currently you wouldn't be able to achieve the memory savings you expect by loading all data in [...] One workaround you can consider is manually splitting the data for each VU, as suggested here. Since you're not dealing with JSON, you would need to request only images for each specific VU, but that pattern would probably work for you. Note that sharing setup data efficiently across VUs has already been discussed and planned (see #532), and with the upcoming #1007/#997 distributed execution changes this kind of setup will be easier and more efficient. I'll close this as these are known issues, but let us know if you have additional questions, and for further support you're welcome to use the community forum.
No, you misread: the length difference is not because of base64. I do decode, and I'm pretty sure there is in fact a bug. Thank you for the tips. I'm now preparing my data by fetching it in the VU loop (I could also fetch it just in the first iteration). But the bug I described does stand.
After ... too much digging (and reading your screenshot backwards, which lost some time), the problem in the particular case of b64encode->b64decode giving different data is the combination of b64decode returning a string and how goja (the JS VM k6 uses) works with strings ... The short of it is that if b64decode returned [...] A possible workaround that I found is to [...] Longer explanation :D (I have probably gotten something wrong, but my conclusions seem to agree with the experiments I have done): JS uses UTF-16 (kind of 😑) for its strings. k6 is written in Go, which uses UTF-8 for its strings. My knowledge on the matter is not much, but the important fact is that the byte representation of a character in one doesn't match the other. So the goja VM translates strings that are not ASCII-only (fun fact ... UTF-8 ASCII is not the same as UTF-16 ASCII so no idea why :D) from k6's internal UTF-8 to UTF-16 when k6 returns a string to the JS VM, and converts them back when a string from goja goes to k6 (a little bit more complicated but ... close enough). Now if we look at this code in the golang playground:
we can see that going from UTF-8 to UTF-16 and back isn't lossless in this case. My gut feeling is that some of those bytes are not actually valid UTF-8, and I would argue the exact reason is not important, as this is clearly not how binary data should be handled in k6; this should just be fixed by using typed arrays for []byte and so on. Additionally, the k6 b64decode returns a string, which makes the whole thing look more and more like utf16.Encode skips/tries to fix what it doesn't understand from the supposedly UTF-8 encoded string that b64decode returns. I would argue b64decode should have either always returned [...] Another issue found along the way is that because the data returned from [...] This should be fixed, but unfortunately, the internal goja JSON implementation is not exported ... there is Object.MarshalJSON, but in the code we have a ... goja.Value so I'm pretty sure it will take more than two lines :(
I think that this should be fixed with a4927b6#diff-787f834ad3403248052890ea97f946bffc88d39d2821b3157b22451081c7c393, so I am closing it |
Environment
Expected Behavior
I am fetching a binary file over HTTP in setup(), and I print the size of the binary both in the setup and in the VU function. In my real program, of course, I want to send the binary over HTTP.
Actual Behavior
I would expect the length of the binary that I fetched to be the same in the setup and in the VU, but it's not. The binary is garbled:
INFO[0000] whew.png body size: 3399
INFO[0000] body size is: 3302
The first line is from the setup, the second from the VU sender.
The discrepancy is quite a lot worse if I don't use base64 encode/decode (in that case the size in the VU sender is about twice as large as before).
Steps to Reproduce the Problem
I have this test code:
I used python to serve files from the local folder:
As an aside, the reason I'm fetching the files over HTTP is that I have lots of files to send, different for each VU. If I fetch the files in the per-VU init code (as I think I'm meant to do), I don't have __VU there (I'm getting "__VU is not defined"). So I'd have to fetch the files for all the VUs in each VU's init code, which would be way too much: I have about 250Mb of data for all VUs together, and 2000 VUs, so if each VU fetched all the data, I'd load 250Mb*2000. So what I tried to do is load the data for all the VUs together just once, in the setup. But now I'm hitting this issue.
EDIT: If I do the HTTP read and write in the VU sending function (not using the setup), then it works:
INFO[0000] whew.png body size: 3399
INFO[0000] body size is: 3399