procfs::process::all_processes() can blow "open files" ulimit on a system with a large number of processes #188
Comments
Ugh, I can see this issue was already foreseen in #125, in this comment in particular.
Does this mean the consumer is responsible for making sure the `Process` objects are short-lived? In that case I'll raise a task in nushell. Some documentation highlighting these responsibilities would be nice.
Hi, thanks for raising the issue. Yes, the consumer is responsible for making sure the `Process` objects are short-lived.

The use-case you described (with nushell's `ps` command) is one where the `Process` objects should be dropped as soon as the needed data has been read. (I also note that their code looks to be written from before #171 was merged, so they're not taking advantage of the benefits of #171 and are getting only the downsides.)

There are other use-cases where it makes sense to keep a `Process` around for a longer time.

So my first inclination is for nushell to adjust their code to account for this. If you open an issue with nushell, I'd be grateful if you could CC me. I'm interested in what the nushell team thinks.
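For illustration, here is a minimal sketch of that short-lived pattern (assuming a procfs version where `all_processes()` returns an iterator of `ProcResult<Process>` and `Process::stat()` is available): copy out the fields you need, and let each `Process`, along with its directory FD, drop before the next iteration.

```rust
use procfs::process::all_processes;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Collect plain data (pid, command name), not Process handles.
    let mut rows: Vec<(i32, String)> = Vec::new();
    for prc in all_processes()? {
        let prc = prc?;         // holds an open FD for /proc/<pid>
        let stat = prc.stat()?; // read what we need while the FD is open
        rows.push((stat.pid, stat.comm));
        // `prc` drops at the end of this iteration, closing the FD.
    }
    for (pid, comm) in rows {
        println!("{:6} {}", pid, comm);
    }
    Ok(())
}
```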
I see the purpose of the current lazy design. Wouldn't it be better to have essentially two APIs? An eagerly evaluated one that reads everything up front and closes the file descriptor, alongside the current lazily evaluated one?
Can you say more about this, and where you see the potential footgun? (Edit: not sure if the …)
Maybe footgun is too harsh a term, but having an FD inside the `Process` struct, as a private field, with no documentation of this implementation detail (yet), would lead to unexpected behavior. A reasonable person would expect to be able to create thousands of `Process` instances and push them into a `Vec` without running out of file descriptors.
Ah, yes, that is quite right. The documentation is clearly insufficient here, and I'll work on that this weekend.

You're also right that we could provide some additional API that grabs all available info about a process and then closes the FD. I'm not inclined to provide one at the moment because it's hard to know what data to load, and I think it's too inefficient to load everything. I'd rather have consumers of this API write their own wrappers that fit their needs. (For example, the previous API always loaded the `/proc/<pid>/stat` structure, but this was wasteful for consumers that never needed this info.)
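For readers wondering what such a consumer-side wrapper might look like, here is a hypothetical sketch (the struct name and field choice are illustrative, not part of procfs): it copies the fields it cares about eagerly and consumes the `Process`, so the FD is closed on return.

```rust
use procfs::process::Process;
use procfs::ProcResult;

/// Hypothetical eager snapshot: plain data, no open file descriptor.
struct ProcSnapshot {
    pid: i32,
    comm: String,
    vsize: u64,
}

impl ProcSnapshot {
    /// Consumes the Process, so its /proc/<pid> FD is closed by the
    /// time this returns.
    fn capture(prc: Process) -> ProcResult<Self> {
        let stat = prc.stat()?;
        Ok(ProcSnapshot {
            pid: stat.pid,
            comm: stat.comm,
            vsize: stat.vsize,
        })
    }
}
```

Thousands of `ProcSnapshot` values can then sit in a `Vec` without pinning any file descriptors open.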
I'd say the addition of the documentation, highlighting the fact that there are open file descriptors, is a satisfactory resolution to this issue.
Looks like the changes in pull request #171 make the `procfs::process::Process` struct hold an open file descriptor for the corresponding `/proc/<pid>` directory, and stats are then lazily loaded with methods.

Unfortunately, keeping a lot of open file descriptors around can hit the "max open files" ulimit, which is 1024 by default on most Linux installs. This happened to me today while playing around with nushell, when the `ps` command did not work, since it had hit the open files limit.

The code in question is this snippet, where `procfs::process::all_processes()` is called and then more `Process` instances are created from there, which really explodes the number of open file descriptors.

Unfortunately I think the change in #171, i.e. wrapping an open file descriptor in a struct and holding it open until such time as it is needed, is a bad idea. Why not just hold the name of the `/proc/<pid>` directory as a String? Convention dictates that a file is opened, data is read, and the file is closed in rapid succession. This is effectively how it is being done by the methods in `impl Process`, except for the open file descriptor for the directory!

To reproduce the issue, I offer this butchered version of `examples/ps.rs`:
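Something along these lines (a reconstructed sketch, since the exact snippet isn't preserved here; it assumes the `Process::new(pid)` constructor and `pid()` accessor from the procfs API of this era):

```rust
// bad-ps.rs: deliberately keep every newly opened Process (and thus
// every /proc/<pid> directory FD) alive in a Vec.
use procfs::process::{all_processes, Process};

fn main() {
    let mut held: Vec<Process> = Vec::new();
    for prc in all_processes().expect("can't list processes") {
        let prc = prc.expect("can't read process");
        // Open a brand-new Process for the same pid and keep it; this
        // panics once the "max open files" limit is exceeded.
        held.push(Process::new(prc.pid()).expect("can't open /proc/<pid>"));
    }
    println!("held {} Process handles", held.len());
}
```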
Run this with a suitably low ulimit, e.g.:

```
ulimit -n 256
cargo run --example bad-ps
```
Note: I'm creating new `Process`es and pushing them into a `Vec` to trigger a panic for illustrative purposes. If I push the original `prc` objects into the vector, the `ProcessesIter` iterator simply finishes early. Either way, the intent is to show that if the `Process`es returned by `all_processes()` are not dropped, the open files limit will be reached.