
ArcHeap checking (issue #161 follow-up) #312

Open
jasongibson opened this issue Oct 9, 2020 · 5 comments

@jasongibson

Since the ArcHeap team released their tool, I ran it over mimalloc for a
day and it reported some issues. The three things I note are:

  • The only two categories of problem it finds are overlapping-chunk (OC)
    and restricted-write (RW). So far none of the RW crashes it has found
    have reproduced outside of what AFL reported, so perhaps those are
    invalid tests, or they only reproduce with the right initial randomness
    state. The ArcHeap paper doesn't appear to make a hard claim of no
    false positives - just that they didn't observe any. @jakkdu FYI in
    case this is of interest.

  • One of the OC files it created seemingly enters an infinite loop and
    allocates ~7TB of virtual memory. I couldn't attach a debugger to it,
    and it only reproduces every few runs.

  • Another OC file hits its assert() every run. I presume that means
    it's a reliable overlap?

This is with MI_SECURE=4 and mimalloc v1.6.4.
The POC .c files are available.

@insuyun

insuyun commented Oct 10, 2020

Hi. Thank you for running ArcHeap.
Could you share your POC files?
I would like to check them.

Best,
Insu Yun

@daanx
Collaborator

daanx commented Oct 11, 2020

Awesome. I would like to harden mimalloc as much as possible (within reasonable efficiency).
Already, thanks to the ArcHeap tool, mimalloc reliably checks for double frees, and also detects heap block overflows and free-list corruption (thanks Insu!).

However, at some point double-free checking is in tension with reasonable efficiency -- so I think that in the extreme one needs to test with, say, AddressSanitizer, but run in production with mimalloc-secure.

The difficult situation is: 1) an object is allocated at address A and freed; 2) lots of other objects are freed until an entire raw OS block is free; 3) the OS block is reused for new pages or large objects; 4) only now is the object freed again.
Since at 4) the raw OS memory was reused in a completely new way, the metadata may no longer correspond at all to the initial situation, and thus this can lead to all kinds of errors. (mimalloc in secure mode should still not corrupt entries, though it may free an object if one happens to be at address A, for example.)

Now, this is a fundamentally tricky situation: the allocator either (A) reuses raw OS memory when it can, or (B) keeps allocating fresh (virtual) OS memory and never reuses previous memory if the metadata would change. (B) is much more expensive though, and of course, the whole point of an allocator is to "reuse" OS memory efficiently, so I see it as unresolvable in general.

(I believe this is also the argument made by Emery Berger here (issue 12),
he says it more concisely as: "As far as I can tell, this is not a bug in DieHard. This code confuses corruption due to double-frees with aliasing when DieHard (legally and correctly) returns the same memory address after it has been freed.")

Anyways, if there are any POC files I would love to try and see if it exposes a realistic bug or not.
Also see the main-override-static.c test file, which has two tests, double_free1 and double_free2, that come from ArcHeap.

@jasongibson
Author

Yeah, I've got no interest in being pedantic on this. I figured that since what I found was a bit different from what was described in the original ticket, it was worth mentioning. And if nothing else, the odd side effect of the memory growth would be worth understanding in case it happens due to a bug in an actual application.

I emailed the OC POC files, but need to see if I can recreate the RW ones since they were lost in a power outage.

@daanx
Collaborator

daanx commented Oct 14, 2020

Thanks Jason. Yep, it is weird to see memory growth, so I'm not sure whether that is actually metadata corruption (which I would consider a bug for sure!). Thanks for following up.

@insuyun

insuyun commented Oct 15, 2020

@jasongibson, could you share the PoC with me, too?

@daanx I couldn't fully understand the situation you explained earlier. But in DieHard's case, the problem was not reuse as such, but incorrect reuse. In ArcHeap, we don't consider trivial reuse, which returns the same address, as buggy.
For example,

void* a = malloc(0x100);
free(a);
void* b = malloc(0x100);
assert(a == b);  // reused
free(a);         // double free?
void* c = malloc(0x100);
// overlapping since c == a, but we don't consider it as buggy

In this case, a is a dangling pointer from the application's view but not from the allocator's, since the allocator does not know the application's state. In the DieHard issue (emeryberger/DieHard#12 (comment)), we believe it is a bug because the double free returns memory that does not alias any other allocation, i.e., it is different from reuse.
