-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
command,runner: limit environ size when starting command #654
Conversation
Something's not quite right: https://gist.github.com/sourishkrout/3c8153387a81a30ec5e0d55e0d601ba1 On the notebook (client-side), I ran this where downloaded the 1m records users file from https://jsoneditoronline.org/indepth/datasets/json-file-example/
|
Tracing the size against the limit is interesting. Are we comparing the right units on both sides @adambabik? go run . server --address 127.0.0.1:9999 --tls /tmp/runme/tls 2>/dev/null
# Ran on 2024-08-22 15:31:45-07:00 for 2.059s
size: 3746 MaxEnvironSizeInBytes: 128000000
size: 3886 MaxEnvironSizeInBytes: 128000000
size: 3886 MaxEnvironSizeInBytes: 128000000
size: 3799 MaxEnvironSizeInBytes: 128000000
size: 3939 MaxEnvironSizeInBytes: 128000000
size: 3939 MaxEnvironSizeInBytes: 128000000
size: 2100901 MaxEnvironSizeInBytes: 128000000
size: 2101041 MaxEnvironSizeInBytes: 128000000
size: 2101041 MaxEnvironSizeInBytes: 128000000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Long story short is that #648 produces the same error. Please see analysis in the comments.
Upon further investigation is looks like the entire file never makes it into the env store. It appears to truncate it somehow. The chunks seem to closely track the |
@sourishkrout just so I understand the test scenario. In one cell, you run |
Correct. The size is reported (on the Go kernel side) when the ENV is populated (i.e. However, what I'm noticing ENV.Limits.mp4 |
Btw, it appears the last x characters of Possibly something off with the ring buffer in |
Also, for easier "debugging". Here's my log from just cat-ing the large file:
|
Following up on the above. This is very interesting @adambabik. When I reported the length of stdout in with
with
Here's the path for the tracing in
|
b131cbd
to
e9b6e83
Compare
@sourishkrout thanks for the details. I found the bug and fixed it. There were two problems:
Please test it again. I added unit tests that replicated it and now they are passing. |
e9b6e83
to
4e2d691
Compare
Excellent. Thanks for the fix, @adambabik! The output now appears correctly in the output cell (integration with extension) 👍 . However, the original issue that subsequent cells are not running (screenshot below) due to the ENV being too large persists. I suggest you merge this fix and address the remaining issue in a separate PR. Without well-known limits for every OS's ENV size, we should pick a hard limit for storing the last output and trim it accordingly. Trying to be too clever here will likely make it hard to reason about scenarios where the limit is exceeded and/or not even being aware of problems unless reported by users. Wdyt @adambabik? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge an use separate PR for remaining issue as per my comment.
Regarding my last comment on picking a limit per Perhaps we want to lower the limit to 2 MB before merging? |
@sourishkrout thanks for the review. You're right, I was focusing on the second problem, and the environment limit was not fixed. I agree that being "clever" here is useless and a reliable solution is much more complex and really requires detecting the boundary case by case. It's not a simple function of operating system. I have found and run this on my machine:
However, EDIT: 2 MB did not work for me for the length of a single env. I needed to subtract ~128 bytes. So maybe 1 MB per env and 8 MB for the whole environ? |
Work for me. I was doing some surface-level research about |
4e2d691
to
89d7760
Compare
Quality Gate passedIssues Measures |
// // +2 for the '=' and '\0' separators | ||
// return len(k) + len(v) + 2 | ||
// } | ||
const ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sourishkrout these are the limits that worked for me when testing on macOS Ventura and Debian.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It works on Sonoma, too 👍. Let's roll with these for now.
@sourishkrout finally, I found limits that worked for macOS Ventura and Debian. I think we can merge it now. If you want to further tweak the numbers, take a look here. |
This change ensures that the size of individual environment variables, as well as all environment variables (environ) passed to a command, are within limits.
Env can come from various sources like system, session, and project. The problem with exceeding the limit is typically caused by storing the last output of a previous command in a special env called
__
. The algorithm recognized this and trims the prefix of__
. While executing the command, the last output is stored in a ring buffer anyway so the behaviour stays the same. If the variable is not present or trimming is not enough, the error is logged but still the command is tried to be executed.This change also fixes a problem of trimming large outputs when executing commands via the
runnerv2
service.Relates to #648