-
Notifications
You must be signed in to change notification settings - Fork 12.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lldb][Linux] Parse, but don't store "comm" from /proc/stat file #100387
Conversation
As reported in llvm#89710, the %s code used for `comm` could and probably does, overflow the buffer. Likely we haven't seen it cause problems because the following data is overwritten right afterwards. Also scanf isn't a great choice here as this `comm` can include many characters that might trip up %s. We don't actually use `comm`, so parse but don't store it so we're not overflowing anything.
@llvm/pr-subscribers-lldb Author: David Spickett (DavidSpickett) ChangesAs reported in #89710, the %s code used for Also scanf isn't a great choice here as this We don't actually use Full diff: https://github.com/llvm/llvm-project/pull/100387.diff 1 Files Affected:
diff --git a/lldb/source/Host/linux/Host.cpp b/lldb/source/Host/linux/Host.cpp
index 5545f9ef4d70e..0bc736d90ea76 100644
--- a/lldb/source/Host/linux/Host.cpp
+++ b/lldb/source/Host/linux/Host.cpp
@@ -51,11 +51,9 @@ enum class ProcessState {
Zombie,
};
-constexpr int task_comm_len = 16;
-
struct StatFields {
::pid_t pid = LLDB_INVALID_PROCESS_ID;
- char comm[task_comm_len];
+ // comm
char state;
::pid_t ppid = LLDB_INVALID_PROCESS_ID;
::pid_t pgrp = LLDB_INVALID_PROCESS_ID;
@@ -100,8 +98,8 @@ static bool GetStatusInfo(::pid_t Pid, ProcessInstanceInfo &ProcessInfo,
StatFields stat_fields;
if (sscanf(
Rest.data(),
- "%d %s %c %d %d %d %d %d %u %lu %lu %lu %lu %lu %lu %ld %ld %ld %ld",
- &stat_fields.pid, stat_fields.comm, &stat_fields.state,
+ "%d %*s %c %d %d %d %d %d %u %lu %lu %lu %lu %lu %lu %ld %ld %ld %ld",
+ &stat_fields.pid, /* comm, */ &stat_fields.state,
&stat_fields.ppid, &stat_fields.pgrp, &stat_fields.session,
&stat_fields.tty_nr, &stat_fields.tpgid, &stat_fields.flags,
&stat_fields.minflt, &stat_fields.cminflt, &stat_fields.majflt,
|
You can test this locally with the following command:git-clang-format --diff 929b474991ce643cff01eeeec830b29735460db4 0e65e0e28cc4a20fbe91d3a7a76b7a0ed671e0cc --extensions cpp -- lldb/source/Host/linux/Host.cpp View the diff from clang-format here.diff --git a/lldb/source/Host/linux/Host.cpp b/lldb/source/Host/linux/Host.cpp
index 0bc736d90e..5a12fa4d01 100644
--- a/lldb/source/Host/linux/Host.cpp
+++ b/lldb/source/Host/linux/Host.cpp
@@ -99,13 +99,13 @@ static bool GetStatusInfo(::pid_t Pid, ProcessInstanceInfo &ProcessInfo,
if (sscanf(
Rest.data(),
"%d %*s %c %d %d %d %d %d %u %lu %lu %lu %lu %lu %lu %ld %ld %ld %ld",
- &stat_fields.pid, /* comm, */ &stat_fields.state,
- &stat_fields.ppid, &stat_fields.pgrp, &stat_fields.session,
- &stat_fields.tty_nr, &stat_fields.tpgid, &stat_fields.flags,
- &stat_fields.minflt, &stat_fields.cminflt, &stat_fields.majflt,
- &stat_fields.cmajflt, &stat_fields.utime, &stat_fields.stime,
- &stat_fields.cutime, &stat_fields.cstime,
- &stat_fields.realtime_priority, &stat_fields.priority) < 0) {
+ &stat_fields.pid, /* comm, */ &stat_fields.state, &stat_fields.ppid,
+ &stat_fields.pgrp, &stat_fields.session, &stat_fields.tty_nr,
+ &stat_fields.tpgid, &stat_fields.flags, &stat_fields.minflt,
+ &stat_fields.cminflt, &stat_fields.majflt, &stat_fields.cmajflt,
+ &stat_fields.utime, &stat_fields.stime, &stat_fields.cutime,
+ &stat_fields.cstime, &stat_fields.realtime_priority,
+ &stat_fields.priority) < 0) {
return false;
}
|
Ignoring the formatter to keep the change intent clear. The fields we actually use from this struct are:
And I think the some of them are set in process info, but nothing is done with them. So I'll see if we can rely purely on proc/status, for a follow up PR. |
We cannot rely purely on status because we need the times, for work being done by @feg208. I could mark a bunch of them with I think we can make the parsing up to comm better though. |
Actually, @feg208 can you to look at that? An example that I think will break the %s is:
You could parse the pid and comm manually, then scanf the rest. comm can include anything a file name can I think, spaces and brackets certianly. So |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yikes, I didn't realize this was new code. @feg208, we need to fix this soon.
I'm approving this because it strictly improves upon the current situation, but I don't think it's not a complete fix. I think we can skip over the comm
field with a rsplit(')')
. The rest could be parsed with scanf, as it's kinda convenient -- even if old-fashioned. (A more modern approach would be either with std::istringstream or some combination of StringRef::consumeInteger, StringRef::ltrim, llvm::split and friends
…0387) Summary: As reported in #89710, the %s code used for `comm` could and probably does, overflow the buffer. Likely we haven't seen it cause problems because the following data is overwritten right afterwards. Also scanf isn't a great choice here as this `comm` can include many characters that might trip up %s. We don't actually use `comm`, so parse but don't store it so we're not overflowing anything. Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60250556
As reported in #89710, the %s code used for
comm
could and probably does, overflow the buffer. Likely we haven't seen it cause problems because the following data is overwritten right afterwards.Also scanf isn't a great choice here as this
comm
can include many characters that might trip up %s.We don't actually use
comm
, so parse but don't store it so we're not overflowing anything.