Import null pointer information from PDG into static analysis #1086
I still need to test this on some actual code (maybe lighttpd?)
Is this waiting on anything, other than needing a |
LGTM overall. The brokenness in visit_place is surprising, but I agree that the current behavior is wrong as described; see inline comment.
@ahomescu Don't forget to fix the CI. Currently it's failing on the |
Initial comments. I haven't looked at the PDG code yet.
Seems generally reasonable. This approach of collapsing nested projections down to a single byte offset will limit our ability to extract borrowchecker aliasing information from the PDG, if we end up wanting to do that in the future.
Are there any automated tests related to instrumentation or the PDG? I thought we had one or two, but I could be wrong.
CopyRef => unimplemented!(),
AddrOfLocal(ptr, _, size) => {
    // TODO: is this a local from another function?
    mapping.size = size.try_into().ok();
Can this conversion ever fail? If not, it seems like you could make the field size: usize instead of Option<usize> and avoid the None case above.
The field would still need to be size: Option<usize> because we have other cases where we get a pointer without a known size. That comes from the None on line 268.
Some of this is happening because we get pointers from native libraries (e.g. glibc) which we have no control over. I think the best we can do right now is say "this is a pointer we know nothing about", and maybe add some clever heuristics later.
Re the conversion: I could do Some(size.try_into().unwrap()); that might be better for debugging?
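For reference, the two options discussed in this thread can be sketched side by side. This is only an illustration: the Mapping struct and both function names are hypothetical stand-ins, and the event size is assumed to arrive as a u64, which the thread does not confirm.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the per-pointer record discussed above.
#[derive(Debug, Default, PartialEq)]
struct Mapping {
    /// `None` means "a pointer we know nothing about".
    size: Option<usize>,
}

/// Lenient variant: a failed conversion silently becomes "unknown size".
fn record_size_lenient(mapping: &mut Mapping, size: u64) {
    mapping.size = usize::try_from(size).ok();
}

/// Strict variant: a failed conversion panics at the conversion site,
/// which makes the failure visible while debugging.
fn record_size_strict(mapping: &mut Mapping, size: u64) {
    let size = usize::try_from(size).expect("size does not fit in usize");
    mapping.size = Some(size);
}
```

Either way the field stays Option<usize>; the variants differ only in whether a conversion failure is distinguishable from a pointer whose size was never known.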
Looks reasonable overall. Left a few comments.
Record field and index projections for pointers by recording a Project event with a base and target pointer pair.
Add an index value to Project events to differentiate between different projections with the same pointer offset, e.g., (*p).x and (*p).x.a if a is the first field of the structure. This is implemented as a HashMap<u64, Vec<usize>> where each element is a unique combination of field projections. The Project key points to an element in this map. The keys are randomly-generated 64-bit keys with very low probability of collision.
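A rough sketch of such a projection table follows. The ProjectionTable type and its methods are hypothetical names, and one detail is deliberately swapped: the key here is a hash of the field path from the standard library's DefaultHasher, not a randomly generated number as the commit message describes, so the sketch needs no external crate.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Maps a 64-bit key to the chain of field indices it stands for.
#[derive(Default)]
struct ProjectionTable {
    paths: HashMap<u64, Vec<usize>>,
}

impl ProjectionTable {
    /// Returns the key for `path`, registering the path on first use.
    /// Assumption: the key is a hash of the path, not a random number.
    fn key_for(&mut self, path: &[usize]) -> u64 {
        let mut hasher = DefaultHasher::new();
        path.hash(&mut hasher);
        let key = hasher.finish();
        self.paths.entry(key).or_insert_with(|| path.to_vec());
        key
    }

    /// Looks up the field path recorded under `key`, if any.
    fn path(&self, key: u64) -> Option<&[usize]> {
        self.paths.get(&key).map(|p| p.as_slice())
    }
}
```

With this shape, (*p).x would be registered as the path [0] and (*p).x.a as [0, 0]; both have the same byte offset but receive different keys.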
The event assignment-based PDG construction steps were giving incorrect results for indirect accesses, so remove them completely.
Track the provenance of pointers with finer granularity by storing the size of every allocation, local, and constant inside the corresponding events. With this information, we can keep track of the boundaries of every object and track whether a projected pointer falls inside the original allocation.
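The boundary check this enables might look roughly like the sketch below. Object and provenance are hypothetical names, and addresses are modelled as plain usize values; the actual event and graph types are not shown in this conversation.

```rust
/// Hypothetical record of one allocation, local, or constant.
#[derive(Clone, Copy, Debug)]
struct Object {
    base: usize,
    size: usize,
}

impl Object {
    /// True if `ptr` falls strictly inside `[base, base + size)`.
    fn contains(&self, ptr: usize) -> bool {
        ptr >= self.base && ptr - self.base < self.size
    }
}

/// Returns the index of the object that a projected pointer belongs to,
/// or `None` if it falls outside every known object.
fn provenance(objects: &[Object], ptr: usize) -> Option<usize> {
    objects.iter().position(|o| o.contains(ptr))
}
```

Whether a one-past-the-end pointer should still count as belonging to the object is a design choice; this sketch treats it as outside.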
Keep track of the size of every local for the new provenance algorithm.
Add a new arg_ptr() method that allows passing a pointer directly to an event handler. The handler needs to be generic on the pointee type: fn foo_handler<T: ?Sized>(ptr: *const T) { ... } This change allows us to get more information about the pointees from the event handler body, e.g., call std::mem::size_of_val().
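A minimal handler of that shape, for illustration only. The name pointee_size is hypothetical, and the sketch assumes the pointer is non-null and points to a live value, which is why the function is marked unsafe.

```rust
use std::mem;

/// Hypothetical handler that is generic over the pointee type.
///
/// Safety: `ptr` must be non-null and point to a live, valid `T`.
unsafe fn pointee_size<T: ?Sized>(ptr: *const T) -> usize {
    // `?Sized` lets the same handler accept slices and `str`,
    // whose size is only known from the pointer's metadata.
    unsafe { mem::size_of_val(&*ptr) }
}
```

Because of the ?Sized bound, the same handler works for both a *const u32 and a *const [u16], and size_of_val reports the full slice length in bytes for the latter.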
Add a new AddrOfSized event and use it to keep track of the sizes of all constant operands for the new provenance algorithm.
Handle Offset nodes with a base pointer of 0 where the offset is non-zero, potentially resulting in a brand new pointer. One such case occurs in mod_cgi from lighttpd: const uintptr_t baseptr = (uintptr_t)env->b->ptr; for (i = 0; i < env->oused; ++i) envp[i] += baseptr;
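One way to picture the case distinction is the sketch below. OffsetResult and classify_offset are hypothetical names, and addresses are again plain usize values; the real Offset handling in the PDG builder is not shown here.

```rust
#[derive(Debug, PartialEq)]
enum OffsetResult {
    /// Null base and zero offset: still the null pointer.
    Null,
    /// Null base but non-zero result: treat as a brand new pointer.
    Fresh { addr: usize },
    /// Ordinary case: the result is derived from a real base pointer.
    Derived { base: usize, addr: usize },
}

/// Classifies the result of offsetting `base` by `offset` bytes.
fn classify_offset(base: usize, offset: isize) -> OffsetResult {
    // Casting a negative offset to usize and adding with wraparound
    // is equivalent to subtracting its magnitude.
    let addr = base.wrapping_add(offset as usize);
    match (base, addr) {
        (0, 0) => OffsetResult::Null,
        (0, _) => OffsetResult::Fresh { addr },
        _ => OffsetResult::Derived { base, addr },
    }
}
```

In the mod_cgi loop, each envp[i] starts out as a small integer and only becomes a real address once baseptr is added, so the instrumentation can observe an offset from a null base like the Fresh case here.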
Mark each PDG graph with a boolean flag that represents whether that graph corresponds to the null pointer or not. The PDG construction algorithm seems to build one unique graph for all null pointers in the entire program.
Add one test where a function argument can be either null or non-null in the recur() function of the analysis/tests/misc example code.
The dynamic instrumentation code inserts additional locals for its own instrumentation code.
Remove the NON_NULL permission from all nodes in the null graph from the PDG.
If the dynamic analysis tells us a pointer is never NULL (i.e. it never shows up in an is_null graph), then force the NON_NULL permission on it using updates_forbidden. This is potentially an unsound transformation because the pointer may actually be NULL on other inputs not covered by the sample ones.
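The last two commits can be summarized with a small sketch. Everything here is hypothetical: the NON_NULL bit value, the PtrState struct, and apply_null_info are illustrative names, and permissions are modelled as a bare u8 rather than the analysis's actual flag type.

```rust
use std::collections::HashSet;

/// Hypothetical permission bit; the real flag set has more entries.
const NON_NULL: u8 = 0b0000_0001;

#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct PtrState {
    perms: u8,
    /// Permissions the static analysis may no longer change.
    updates_forbidden: u8,
}

/// `seen_null` holds the ids of pointers that appeared in a null graph.
fn apply_null_info(states: &mut [PtrState], seen_null: &HashSet<usize>) {
    for (id, state) in states.iter_mut().enumerate() {
        if seen_null.contains(&id) {
            // Observed as NULL at least once: drop NON_NULL.
            state.perms &= !NON_NULL;
        } else {
            // Never observed as NULL: force NON_NULL and pin it.
            // Unsound if other inputs would make this pointer NULL.
            state.perms |= NON_NULL;
            state.updates_forbidden |= NON_NULL;
        }
    }
}
```

Pinning the bit through updates_forbidden is what makes the dynamic observation override whatever the static analysis would otherwise infer, which is also where the unsoundness noted above comes from.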