Diesel Segfaults on host, but not development machine #813
Comments
I'm really bad at debugging stuff like this, so all I can offer is a bunch of links to source code. In your stacktrace, I see the following line from diesel
which is this code:
(Of course the segfault is in unsafe code!) Looking further, it seems that this drop is called at the end of the |
The intriguing part, in my opinion, is that |
Are you compiling on the host itself, or from your development machine? Can you confirm that the versions of libpq are the same? |
It seems my development machine is using
while on my host machine it is |
Are you compiling on the host or target? (I don't think that should cause a segfault, but good to rule out) |
Also do you know whether the connection actually successfully established or not? Can you check if |
I'm compiling on my development machine (as the server does not seem to have the resources needed to compile the crate) |
Stepping through the function with GDB suggests it is establishing the connection successfully (it does not return early) |
Can you try statically linking libpq? ( |
It seems it won't let me compile on my Arch machine
This also happens under an OpenSUSE WSL. |
I updated the postgres version on my host to |
Thanks. You've given me information to reproduce, so I will try to look into it (to be honest though I don't have any ideas). The only thing I can think to try is to compile on your host machine. |
Trying to compile on my host machine literally runs out of memory :( Maybe I'll upscale it for a bit to try out, though |
I did upsize the droplet to be able to actually compile it on the host; it still shows the same behaviour though. |
Closing as this issue has been stale for a while, and there's still nothing actionable we can do. If this issue is still occurring or you can provide additional information, let me know and I'll reopen. |
I'm seeing what appears to be a similar segfault when running Happy to try and help run this down, please let me know what additional information I can provide. diesel
example failing job on travis: https://travis-ci.org/otterandrye/photothing-api/jobs/449963414 |
there's nothing helpful in the DB logs either, just seeing the connections drop when the test binary segfaults:
|
If you could compile Diesel with address sanitizer it might be able to provide more specific information on where and why the segfault is happening. You'll need a nightly compiler and pass one flag to it, see https://github.com/japaric/rust-san for details. |
To me this looks like a bug inside of libpq, because if you look at the code we are passing a pointer to |
I filed an issue against rocket with some more information: rwf2/Rocket#829. I'll try rust-san next time I'm on the appropriate machine and report back. I've got the failure reproduced in a docker container with my rocket app. One other thing: I noticed in the libpq docs that you're supposed to call PQfinish even on failed connection attempts so I wonder if that's relevant. |
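For anyone following the PQfinish remark above, here is a minimal sketch of the general pattern: wrap the raw handle in a guard as soon as it exists, so that cleanup runs on the failure path too. Everything here is a stand-in: `RawHandle`, `fake_connect`, `fake_finish`, `ConnGuard` and `establish` are hypothetical names for illustration, not libpq's or Diesel's actual API, and this says nothing about what Diesel does internally.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical stand-in for a raw connection handle; not libpq's real type.
struct RawHandle {
    connected: bool,
    finished: Rc<Cell<bool>>,
}

// Hypothetical stand-in for the connect call. As with libpq, a handle is
// returned even when the connection attempt itself fails.
fn fake_connect(succeed: bool, finished: Rc<Cell<bool>>) -> RawHandle {
    RawHandle { connected: succeed, finished }
}

// Hypothetical stand-in for PQfinish: it only records that cleanup ran.
fn fake_finish(handle: &RawHandle) {
    handle.finished.set(true);
}

// Guard that owns the handle and always runs the finish step when dropped.
struct ConnGuard(RawHandle);

impl Drop for ConnGuard {
    fn drop(&mut self) {
        fake_finish(&self.0);
    }
}

fn establish(succeed: bool, finished: Rc<Cell<bool>>) -> Result<ConnGuard, String> {
    // Wrap the handle first, so every exit path below goes through `Drop`.
    let guard = ConnGuard(fake_connect(succeed, finished));
    if guard.0.connected {
        Ok(guard)
    } else {
        // `guard` is dropped when this function returns, which runs the finish step.
        Err("connection attempt failed".to_string())
    }
}

fn main() {
    let finished = Rc::new(Cell::new(false));
    let result = establish(false, Rc::clone(&finished));
    println!("failed: {}, cleanup ran: {}", result.is_err(), finished.get());
}
```

The point of creating the guard before checking the status is that no early return can skip the cleanup.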
|
Here's the libpq versions on the container where things are failing:
I'm working on getting the environment needed to reproduce into a |
Ok, I've got the segfault reproducible in a container. Here's the
To reproduce you just need to docker build & docker run the container - after it fails you can optionally start a shell and explore with gdb etc. Please let me know if there's anything else I can do to help here, we're well out of my usual comfort zone :-D |
I've tried to dig into this a bit. First I disabled all unrelated tests until only the one that triggers the bug was left. It turns out that no single test triggers this. At least
OpenSSL error 1
OpenSSL error 2
Next I tried to get some backtraces to see what exactly is happening here. It turns out there are different backtraces for the same crash. What they all have in common is that the segfault happens inside jemalloc while deallocating something. One of them (the last one) does not show libpq or anything Diesel-related, so in my opinion this is clearly not Diesel-related at this point; the error must be somewhere else. So, guessing blindly at what is happening: both libpq and rusoto seem to depend on OpenSSL. Somehow something gets OpenSSL into a state where it does something really strange (possibly corrupting the allocator?). The crashes we observe would then only be symptoms of the actual error.
Backtrace 1
Backtrace 2
Backtrace 3
cc @sfackler as openssl maintainer (because of the rust-openssl error while minimizing), @matthewkmayer as rusoto maintainer (because the second required test depends heavily on rusoto) |
This looks like an interesting bug! Let's try using |
I'll give Other random piece of intel: this seems to have something to do with the travis-ci containers, I wasn't able to reproduce the segfault with either of |
I've managed to write a "simple" program that reproduces this issue:

    // diesel = {version = "=1.3.3", features = ["postgres", "r2d2"]}
    extern crate diesel;
    // reqwest = "=0.9.5"
    extern crate reqwest;

    use std::thread;
    use std::env;
    use reqwest::Client;
    use diesel::r2d2::*;
    use diesel::*;

    type PgPool = Pool<ConnectionManager<PgConnection>>;

    pub fn init_db_pool() -> PgPool {
        let db = env::var("DATABASE_URL").expect("missing database url");
        let manager = ConnectionManager::<PgConnection>::new(db);
        Pool::new(manager).expect("db pool")
    }

    fn request() {
        let url = "https://google.com";
        let _res = Client::new().put(url)
            .body("some content")
            .send()
            .expect("request failed");
    }

    fn main() {
        for i in 0..10 {
            println!("Try {:?}", i);
            let b = thread::spawn(|| {request()});
            let a = thread::spawn(|| {let _ = init_db_pool();});
            a.join();
            b.join();
        }
    }

Running this code in the docker container provided above does the following for me:
(This does also happen on newer nightlies, so this does not depend on the rustc version) @matthewkmayer This means this is not a rusoto issue |
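One experiment that might help narrow this down is to keep both threads but make sure the two setup steps never overlap, and see whether the crash still occurs. The sketch below shows only the serialization pattern using the standard library; the empty closures are placeholders where `init_db_pool()` and `request()` from the program above would go, and `serialized` and `run_once` are hypothetical helper names. It assumes a toolchain with `std::thread::scope` (Rust 1.63 or later), and it is not a confirmed fix or diagnosis.

```rust
use std::sync::Mutex;
use std::thread;

// Runs `step` while holding `lock`, recording when it starts and ends.
fn serialized(lock: &Mutex<()>, events: &Mutex<Vec<String>>, name: &str, step: impl FnOnce()) {
    // Only one thread at a time gets past this line.
    let _guard = lock.lock().unwrap();
    events.lock().unwrap().push(format!("{name}:start"));
    step();
    events.lock().unwrap().push(format!("{name}:end"));
}

// Spawns both steps on separate threads, but never lets them overlap.
fn run_once() -> Vec<String> {
    let init_lock = Mutex::new(());
    let events = Mutex::new(Vec::new());
    thread::scope(|s| {
        // Placeholder closures: a real experiment would call `init_db_pool()`
        // and `request()` here instead of doing nothing.
        s.spawn(|| serialized(&init_lock, &events, "db", || {}));
        s.spawn(|| serialized(&init_lock, &events, "http", || {}));
    });
    // All scoped threads have finished here, so the log is complete.
    events.into_inner().unwrap()
}

fn main() {
    for i in 0..10 {
        println!("Try {}: {:?}", i, run_once());
    }
}
```

If the segfault disappears once the steps are serialized, that would point towards a race during concurrent initialization; if it persists, the overlap is probably not the trigger.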
I've run into a similar issue (and mistakenly opened #2092). Here's my code to reproduce:

    use diesel::pg::PgConnection;
    use diesel::connection::Connection;

    fn main() {
        let connection = PgConnection::establish("postgres://USER:PASSWORD@localhost/")
            .unwrap_or_else(|e| panic!("Error connecting to database, {:?}", e));
        let res = reqwest::get("https://example.com").unwrap();
        panic!("");
    }

Two funny parts: Removing the Edit:
|
Actually, I don't even need the

    use diesel::pg::PgConnection;
    use diesel::connection::Connection;

    fn main() {
        let connection = PgConnection::establish("postgres://USER:PASSWORD@localhost/")
            .unwrap_or_else(|e| panic!("Error connecting to database, {:?}", e));
        let res = reqwest::get("https://example.com").unwrap();
    }

The code above is still failing for me. More details
|
@Munksgaard There is not much the diesel team can do here. |
@weiznich As far as I can see there's no I understand that you don't have a lot of options here, but it's nice to get these things noted for anyone else with the same problem. I still don't quite understand why such a simple program crashes everything. |
Hello,
I'm in the process of writing a small Discord bot using Diesel.
However, my application segfaults on my VPS host (uname -a: Linux ubuntu-512mb-fra1-01 4.4.0-67-generic #88-Ubuntu SMP Wed Mar 8 16:34:45 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux; it's the smallest DigitalOcean droplet size).
It does not do so on my development machine (uname -a: Linux charon 4.10.2-1-ARCH #1 SMP PREEMPT Mon Mar 13 17:13:41 CET 2017 x86_64 GNU/Linux).
Within GDB I get the following message for the segfault:
bt gives me the following stacktrace:
The complete source code I'm running can be found at https://github.com/skeleten/skellybot/tree/ece2a04ec61eaa8a1e62ecc3997aa0b7e4e99c6a
The segfault happens after receiving the first message, which is processed here: https://github.com/skeleten/skellybot/blob/ece2a04ec61eaa8a1e62ecc3997aa0b7e4e99c6a/src/main.rs#L111