Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Double free crash on macOS #189

Closed
blinsay opened this issue May 20, 2020 · 17 comments
Closed

Double free crash on macOS #189

blinsay opened this issue May 20, 2020 · 17 comments
Labels
bug Something is borken upstream Caused by an upstream dependency
Milestone

Comments

@blinsay
Copy link

blinsay commented May 20, 2020

I was messing around with smol/surf for the first time and managed to segfault with no unsafe code.

blep(14933,0x70000c2ac000) malloc: *** error for object 0x7fc8816029b0: pointer being freed was not allocated
blep(14933,0x70000cdc1000) malloc: Double free of object 0x7fc881602a60
blep(14933,0x70000c2ac000) malloc: *** set a breakpoint in malloc_error_break to debug

Based on this lldb backtrace it looks like a problem somewhere in isahc and libcurl. The backtrace and code are here:

https://gist.github.com/blinsay/3d5174aaa03f5546cd492054889d8789

I can reproduce this every 3-6 runs of my test program. Happy to poke around and get more information if you let me know what's useful.

Thanks!

@sagebind sagebind added the bug Something is borken label May 20, 2020
@sagebind
Copy link
Owner

Yikes, that's not good, but thanks for the detailed report! I'll take a look at this as soon as I am able. Based on a brief look at the backtrace I wonder if some funky things are going on with thread locals. I have not tested Isahc with smol yet, though I'd be a little surprised if it was smol's fault.

@sagebind
Copy link
Owner

sagebind commented May 22, 2020

Haven't been able to reproduce so far. What exact dependency versions are you using?

@blinsay
Copy link
Author

blinsay commented May 22, 2020

I pulled in surf 2.0.0-alpha.2 which got me isahc 0.92. Here's the relevant chunk of cargo tree:

│   │   ├── isahc v0.9.2
│   │   │   ├── bytes v0.5.4 (*)
│   │   │   ├── crossbeam-channel v0.4.2 (*)
│   │   │   ├── crossbeam-utils v0.7.2 (*)
│   │   │   ├── curl v0.4.29
│   │   │   │   ├── curl-sys v0.4.31+curl-7.70.0
│   │   │   │   │   ├── libc v0.2.70 (*)
│   │   │   │   │   ├── libnghttp2-sys v0.1.3
│   │   │   │   │   │   └── libc v0.2.70 (*)
│   │   │   │   │   │   [build-dependencies]
│   │   │   │   │   │   └── cc v1.0.53
│   │   │   │   │   └── libz-sys v1.0.25
│   │   │   │   │       └── libc v0.2.70 (*)
│   │   │   │   │       [build-dependencies]
│   │   │   │   │       ├── cc v1.0.53 (*)
│   │   │   │   │       └── pkg-config v0.3.17
│   │   │   │   │   [build-dependencies]
│   │   │   │   │   ├── cc v1.0.53 (*)
│   │   │   │   │   └── pkg-config v0.3.17 (*)
│   │   │   │   ├── libc v0.2.70 (*)
│   │   │   │   └── socket2 v0.3.12 (*)
│   │   │   ├── curl-sys v0.4.31+curl-7.70.0 (*)

Happy to put my whole Cargo.lock in a gist if that's useful.

@sagebind
Copy link
Owner

Hmm, I'm testing your code with those dependency versions and not getting anything. Surf likes to have libcurl dynamically linked -- what version of curl are you linked to? We can find out by doing a few things:

  • Run otool -L target/debug/blep and paste its output.
  • Add isahc as a dependency (isahc = { version = "=0.9.2", default-features = false }) and add this to the top of your program: println!("runtime version: {}", isahc::version());. Paste its output.

@sagebind
Copy link
Owner

sagebind commented May 22, 2020

Do I need to have a specific server running on port 8080 to reproduce this?

@sagebind
Copy link
Owner

sagebind commented May 22, 2020

OK, I am able to reproduce, though rarely. At this point, this seems to be some sort of SSL library bug to me, as it always occurs on drop() of curl handles, and libcurl seems to be doing the correct thing here when cleaning up SSL contexts.

Unfortunately I've only managed to reproduce it two times on my mac after several hundred runs, which makes debugging this pretty difficult. Since it happens more frequently on your machine, there's a few things you could do to help narrow down the problem.

The first thing to try would be to build curl without any SSL support, since the test program does not require it. If the problem goes away, then the bug is definitely within the interaction between libcurl and the SSL engine.

@blinsay
Copy link
Author

blinsay commented May 22, 2020

Do I need to have a specific server running on port 8080 to reproduce this?

Sorry for not mentioning that! I was running the httpbin Docker image on 8080.

Run otool -L target/debug/blep and paste its output.

$ otool -L target/debug/blep
target/debug/blep:
        /System/Library/Frameworks/Security.framework/Versions/A/Security (compatibility version 1.0.0, current version 58286.255.3)
        /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1575.17.0)
        /usr/lib/libcurl.4.dylib (compatibility version 7.0.0, current version 9.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.250.1)
        /usr/lib/libresolv.9.dylib (compatibility version 1.0.0, current version 1.0.0)

The first thing to try would be to build curl without any SSL support, since the test program does not require it. If the problem goes away, then the bug is definitely within the interaction between libcurl and the SSL engine.

I'll give that a shot tomorrow.

Thanks for digging!

@blinsay
Copy link
Author

blinsay commented May 22, 2020

Based on some extremely silly forensics (running strings on /usr/lib/libcrypto.42.dylib, which popped up in the original gdb trace) I'm guessing that the crypto lib involved is libressl-22.260.1/libressl-2.6.

@blinsay
Copy link
Author

blinsay commented May 22, 2020

Confirmed with slightly less silly forensics:

$ curl --version
curl 7.54.0 (x86_64-apple-darwin18.0) libcurl/7.54.0 LibreSSL/2.6.5 zlib/1.2.11 nghttp2/1.24.1
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz HTTP2 UnixSockets HTTPS-proxy

@sagebind
Copy link
Owner

sagebind commented May 22, 2020

Perfect, thanks. When I reproduced it, it was also with libcurl/7.54.0 and LibreSSL/2.6.5. I think we're on the right track here. Sometime today I'll try and see if it is reproducible with just the curl crate and nothing else. If so, we can open a ticket upstream.

@sagebind
Copy link
Owner

sagebind commented May 23, 2020

Good news, I've made progress today narrowing down the problem. Here's what I have so far where I am able to reproduce the double free:

[package]
name = "blep"
version = "0.0.1"
edition = "2018"

[dependencies]
curl = "0.4"
use std::{
    sync::{Arc, Barrier},
    thread,
};

fn main() {
    println!("version: {:?}", curl::Version::get());

    let threads = 8;
    let barrier = Arc::new(Barrier::new(threads + 1));

    for _ in 0..threads {
        let barrier = barrier.clone();

        thread::spawn(move || {
            let mut easy = curl::easy::Easy::new();
            easy.url("http://localhost:8080").unwrap();
            easy.perform().unwrap();
            barrier.wait();
        });
    }

    barrier.wait();
}

My guess is still that LibreSSL is doing something stupid with thread locals or some such thing. Since this appears to be a curl + LibreSSL bug, I'm going to convert this minimum reproduction into C and then open a bug report on the curl project.

Just to double-check, it would be great if you could verify that the above program also causes the bug on your machine. Thanks again for discovering this!

@sagebind sagebind added the upstream Caused by an upstream dependency label May 23, 2020
@sagebind
Copy link
Owner

@blinsay What happens if you add curl as a dependency to your program, and call curl::init() first thing in your main() function?

I think this is caused by a curl footgun where curl_global_init() is not thread safe. Meaning, it must be called on the main thread before any other threads are created. The curl crate ensures that this function is invoked exactly once before using any libcurl functions, but in your example program this lazy initialization would be occurring inside a thread, since surf is creating HTTP clients on-demand.

I don't know if you read C, but the following program will also randomly perform a double free, and demonstrates what all our crates composed together in your program are doing wrong:

#include <pthread.h>
#include <curl/curl.h>

#define THREADS 8

// Ensure curl_global_init is called only once.
static pthread_once_t init = PTHREAD_ONCE_INIT;

// Does nothing with the response body.
static size_t write_callback(void *buffer, size_t size, size_t nmemb, void *userp)
{
   return size * nmemb;
}

static void init_routine()
{
    curl_global_init(CURL_GLOBAL_ALL);
}

static void *thread_main(void *arg)
{
    // This lazy initialization is basically what the curl crate does.
    pthread_once(&init, init_routine);

    CURL *curl = curl_easy_init();
    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:8080");
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_callback);
    curl_easy_perform(curl);

    // Double-free occurs somewhere inside here when LibreSSL cleanup is called.
    curl_easy_cleanup(curl);

    return NULL;
}

int main(int argc, char **argv)
{
    pthread_t tid[THREADS];

    // Spawn N threads. One of them will be first to initialize curl.
    for (int i = 0; i < THREADS; i++)
    {
        pthread_create(&tid[i], NULL, thread_main, NULL);
    }

    // Wait for all threads to return. We won't finish this if we SIGABRT.
    for (int i = 0; i < THREADS; i++)
    {
        pthread_join(tid[i], NULL);
    }

    return 0;
}

I will have to think about how to address this.

@sagebind
Copy link
Owner

@blinsay For now the correct solution is to call curl::init() on the main thread before making requests inside other threads, see this upstream issue: alexcrichton/curl-rust#333.

That said, I still intend to fix this in some way because as-is this is quite user-unfriendly, so I will keep this issue labeled as a bug.

@blinsay
Copy link
Author

blinsay commented May 26, 2020

I haven't yet had a crash with curl::init() as my first line in main. The smaller repro and the C-example make a ton of sense.

Thanks for digging deep and pushing this further upstream!

@sagebind sagebind added this to the 0.9.4 milestone Jun 1, 2020
@sagebind
Copy link
Owner

sagebind commented Jun 1, 2020

We have a fix in place upstream that should resolve this more elegantly for users that I can release soon so that you won't have to call curl::init() manually (and also won't have to add curl as a direct dependency).

@sagebind sagebind modified the milestones: 0.9.4, 0.9.5 Jun 10, 2020
@blinsay
Copy link
Author

blinsay commented Jun 13, 2020

awesome, thanks so much!

@sagebind
Copy link
Owner

sagebind commented Jun 24, 2020

@blinsay Great news, as of 0.9.5 this has been fixed! No workaround is required, as init will now be done on the main thread for you automatically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is borken upstream Caused by an upstream dependency
Projects
None yet
Development

No branches or pull requests

2 participants