Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pool cannot detect idle connection disconnected by MySQL and will trigger read ECONNRESET on newly acquired connection #683

Closed
t83714 opened this issue Nov 20, 2017 · 31 comments

Comments

@t83714
Copy link

t83714 commented Nov 20, 2017

MySQL will disconnect idle connections after certain time frame.

This time frame is determined by MySQL variable wait_timeout.

It seems mysql2's pool implementation cannot detect this type of disconnection (ping is not sufficient).

And when it happens, any queries sent through the acquired connection will trigger a read ECONNRESET error.

At this moment, I have to send a SELECT 1 query to test the acquired connection before use.

And previous code that calls pool.query directly will have to be re-written.

Can Mysql2 provides two extra pool creation options:

  • verifyConnectionInterval: number of seconds before pool should send SELECT 1 query to test connections in the pool.
  • verifyConnection: boolean. If set true, the pool will test a connection in the pool every verifyConnectionInterval seconds.
@sidorares
Copy link
Owner

I saw this problem as well, and I think there is no way to automatically detect "server closed connection" automatically other than doing some kind of heartbeat messages

Are you sure doing .ping(callback) is not enough? Probably not much difference compared to SELECT 1

@sidorares
Copy link
Owner

I think first step would be adding all events supported by mysqljs/mysql pool - https://github.com/mysqljs/mysql/#release

That way you can start/stop heartbeat in your code manually in reponse to release/acquire events

And later we could simplify that and make available as pool option

@t83714
Copy link
Author

t83714 commented Nov 20, 2017

Thanks for your reply.
Will run more tests on using .ping(callback) for heartbeat test and get it back to you.

@t83714
Copy link
Author

t83714 commented Nov 20, 2017

Hi @sidorares ,
Had a few more tests on this.

It seems I can't get .ping(callback) work for heartbeat test purpose but SELECT 1 works well.

Here are the code examples (I used generic-pool for pool management):

Working Example (use SELECT 1):

import mysql2 from "mysql2/promise";
import genericPool from "generic-pool";
import dbConfigInfo from "../config/db.local";

const opts = {
    host: dbConfigInfo.host,
    user: dbConfigInfo.username,
    password: dbConfigInfo.password,
    database: dbConfigInfo.database,
};

const PoolDB = genericPool.createPool({
    create : () => mysql2.createConnection(opts),
    destroy : (connection) => connection.end(),
    validate : (connection) => connection.query(`SELECT 1`).then(()=>true,()=>false),
}, {
    max : 5,
    min : 0,
    testOnBorrow : true
});
const db = {
    execute : async (sql, values, cb) => {
        let conn;
        try{
            conn = await PoolDB.acquire();
            const r = await conn.execute(sql, values, cb);
            await PoolDB.release(conn);
            return r;
        }catch(e){
            if(conn) await PoolDB.destroy(conn);
            throw e;
        }
    },
};

Non-working Example (use .ping(callback)):

import mysql2 from "mysql2/promise";
import genericPool from "generic-pool";
import dbConfigInfo from "../config/db.local";

const opts = {
    host: dbConfigInfo.host,
    user: dbConfigInfo.username,
    password: dbConfigInfo.password,
    database: dbConfigInfo.database,
};

const PoolDB = genericPool.createPool({
    create : () => mysql2.createConnection(opts),
    destroy : (connection) => connection.end(),
    validate : (connection) => connection.ping().then(()=>true,()=>false),
}, {
    max : 5,
    min : 0,
    testOnBorrow : true
});
const db = {
    execute : async (sql, values, cb) => {
        let conn;
        try{
            conn = await PoolDB.acquire();
            const r = await conn.execute(sql, values, cb);
            await PoolDB.release(conn);
            return r;
        }catch(e){
            if(conn) await PoolDB.destroy(conn);
            throw e;
        }
    },
};

Before the tests, I changed my local MySQL wait_timeout variable to only 10 seconds.
Then, I triggered queries every 15 seconds through my web app (call db.execute() defined above).
The first example works well and the second example will trigger Access Denied: Can't add new command when connection is in closed state error after 10 seconds.

@hallh
Copy link

hallh commented Apr 5, 2018

I'm seeing this too, and not only due to wait_timeout.

About detecting closed connections, I've noticed that the number of free connections (during low-load) matches the number of actual open TCP connections. The pool._allConnections queue still remains the same. That is, ignorant of closed connections. But given that the TCP connections are being closed, it should be fairly straight forward to detect.

There are some notes in this gist.

As you can see from the notes, the amount of FREE connections slowly drops after the pool has reached its connectionLimit. I haven't looked into why the server is closing the connections, or whether it actually is the server closing them, but it's fairly common that MySQL does that, so I'd expect the library to just deal with it.

It seems a pretty serious issue, since after all the connections have been dropped - the pool will never regain any of them, practically stalling all subsequent DB queries until the process is restarted.

@midnightcodr
Copy link

I too was hit with the read ECONNRESET error recently and it's caused by default connection timeout with MySQL server. I have a long running program that uses mysql2 pool and make SQL updates whenever needed. During our UAT development, since there are no activities overnight, the queries run on the second day would simply fail with that error. Since node-mysql has better handling for the idled connections problem, I've switched to using node-mysql instead. I might give the validate option a try in the future.

@midnightcodr
Copy link

Did some tests today, I couldn't replicate the read ECONNRESET problem with a low wait_timeout setting. My setup: (doesn't seem setting of interactive_timeout is relavant either)
server: (mysql 5.6 from https://hub.docker.com/_/mysql/)

[mysqld]
wait_timeout=10
interactive_timeout=20

mysql2 version 1.5.3.

My testing code:

const mysql = require('mysql2/promise')
const settings = {
    host: 'localhost',
    user: 'dbuser',
    password: 'password',
    database: 'dbname',
    connectionLimit: 2
}
const sleep = ms => {
    return new Promise(resolve => {
        setTimeout(resolve, ms)
    })
}

const main = async () => {
    const pool = await mysql.createPool(settings)
    // console.log(pool)
    const r = await pool.query('show session variables like "%timeout"')
    console.log(r)
    const [rowsA, fieldsB] = await pool.query('select * from awesomeTbl')
    console.log('to query in 30 seconds')
    await sleep(30000)
    const [rows, fields] = await pool.query('select * from awesomeTbl')
    const [rows2, fields2] = await pool.query('select * from awesomeTbl')

    console.log('to query again in 30 seconds')
    await sleep(30000)
    const [rows1, fields1] = await pool.query('select * from awesomeTbl')

    await pool.end()
}

main()

None of the queries triggers an error.

@anthonywebb
Copy link

anthonywebb commented Feb 10, 2019

I am seeing this as well, it is a shame too, because mysql2 is so much faster with my DB, alas, I cant have the server throwing ECONNRESET without remedy.

@grega913
Copy link

I have problem with ECONNRESET (mysql, pool, knex, node) and I belive if you have the option to tweak this param on server, it should be ok . . .
https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_connect_timeout

@midnightcodr
Copy link

@anthonywebb in one of my recent projects I am being hit with the same problem again and I implemented a quick and dirty solution by querying the mysql server with:

setInterval(() => {
  pool.query('select 1')
}, 120000)

It's ugly but at least it works.

@benbotto
Copy link
Contributor

@midnightcodr How does that ensure that all of the connections in the pool are kept alive? That is, if you have 10 connections in the pool, is it possible that the keep-alive code will simply use the first connection over and over again and let the other 9 die off?

@midnightcodr
Copy link

@benbotto I doubt that's how pooling works. Let's say you specify a pool size of 10, mysql2 (or mysql) does not create or keep 10 connections alive all the time. It only create up to 10 connections when needed. Most of the time you should have fewer than 10 connections alive at the same time - unless your mysql server is mad busy.

@benbotto
Copy link
Contributor

@midnightcodr Sure, that makes sense, but the question still stands. The 10 is arbitrary (and small fries!). Is it possible that your keep-alive code snippet will simply use the first connection over and over again and let the other n connections in the pool die off?

Let's say that 2 connections are created in the pool. After idling for a while, these connections are killed by MySQL, potentially due to the wait_timeout threshold. It appears to me that mysql2 does not detect the timeout/connection abortion, and subsequent queries against either connection will fail with a ECONNRESET error.

So with your keep-alive snippet, will it keep the both connections alive? Or is it possible that only one is kept alive, and the next time that 2 connections are needed the second connection fails with an ECONNRESET error?

And thank you for the response and help!

@sidorares
Copy link
Owner

sidorares commented Dec 11, 2019

Let's say that 2 connections are created in the pool. After idling for a while, these connections are killed by MySQL, potentially due to the wait_timeout threshold. It appears to me that mysql2 does not detect the timeout/connection abortion, and subsequent queries against either connection will fail with a ECONNRESET error.

So there are 2 different questions here: 1) the driver should (ideally) remove connection from the pool immediately after it's killed. 2) A way to prefer to have N hot connections in the pool all the time (and if it gets below that for some reason actively re open new one)

Regarding 1 - I previously had issues where network socket does not get any events at all when other end is closed. I need to test that again, if there is a way to see it's closed (ideally without heartbeats and just via os level notification) we should definitely make sure connection is removed from the pool

Open to more discussion re question 2

Also if anyone can debug what happens for example when you kill server or kill connection using server side sql that would be helpful ( do we get anything fired here?

if (this._closing) {
and here
this.stream.on('error', this._handleNetworkError.bind(this));
)

@benbotto
Copy link
Contributor

Thanks for the response, @sidorares. I don't personally have a need for haveing N hot connections in the pool (2). The case I presented is just a hypothetical, and it's a question about midnightcodr's keep-alive code snippet.

But plain and simple: I'm getting ECONNRESET errors and I'm trying to diagnose and solve the problem.

@sidorares
Copy link
Owner

@benbotto I updated previous comment, can you debug if end or close events get fired when connection is killed because of wait_timeout?

@benbotto
Copy link
Contributor

benbotto commented Dec 11, 2019

Yes.

  • On my local db, I set wait_timeout to 30 (seconds).
  • I created a connection pool with a connectionLimit of 2.
  • I run two queries in rapid succession and see two connections established.
  • After 30 seconds I see two "Notes" in the mysql logs about the two connections being aborted.
  • I wait 60 seconds--long enough for wait_timeout to be exceeded--and execute two more queries in rapid succession.

Neither end nor done fires. I registered event listeners against the pool, and I'm not certain that's what your looking for. Further, I do not get an ECONNRESET error.

https://github.com/benbotto/mysql2-timeout There's what I tried.

docker-compose up -d
docker-compose exec dev bash
npm install # In the container.
node timeout.js

Just to be clear, that unfortunately does not reproduce the ECONNRESET error.

Edit:
Here are the mysql logs that I mentioned:

db_1_e4e1ba884a68 | 2019-12-11T23:36:46.121358Z 2 [Note] Aborted connection 2 to db: 'timeout' user: 'timeout-user' host: '172.31.0.3' (Got timeout reading communication packets)
db_1_e4e1ba884a68 | 2019-12-11T23:36:46.127368Z 3 [Note] Aborted connection 3 to db: 'timeout' user: 'timeout-user' host: '172.31.0.3' (Got timeout reading communication packets)

When I get ECONNRESET errors in my production code, it seems to be accompanied by those same "notes."

@sidorares
Copy link
Owner

I was referring to end/close/error events on connection.stream and not on the public api - see

if (this._closing) {
and
this.stream.on('error', this._handleNetworkError.bind(this));

@sidorares
Copy link
Owner

Thanks for repo repo with docker, I might try it later today

@benbotto
Copy link
Contributor

Sorry. Looking at the code in question. After wait_timeout--once the connection is aborted by the server--close fires and this._closing is undefined. error does not fire. FWIW I'm using version the latest 1.x.x version (1.7.0).

    this.stream.on('close', () => {
      console.log('CLOSE FIRED', this._closing); // CLOSE FIRED undefined
      // we need to set this flag everywhere where we want connection to close
      if (this._closing) {
        return;
      }
[...]

@sidorares
Copy link
Owner

this._closing is the flag that client itself sets so it knows stream end is expected and not an error condition

maybe https://nodejs.org/api/net.html#net_socket_setkeepalive_enable_initialdelay can help ( following answer in https://stackoverflow.com/questions/27356937/how-to-detect-client-side-internet-disconnection-using-node-js )

@benbotto
Copy link
Contributor

I'm checking in my production system if close or end fire when I get the ECONNRESET error. I get that error intermittently in production, so it may take a while. (I get the error about 10 times a day.)

Thank you for the links. I will read through them and see if keep-alive helps.

@sidorares
Copy link
Owner

if it helps I'm happy to set it by default or maybe allow to pass as config parameter

@benbotto
Copy link
Contributor

@sidorares Long night getting this fixed.

In my case the ECONNRESET errors are not due to wait_timeout as I initially suspected, so I have kind of hijacked this issue. I can open another issue if you want. For what it's worth:

  • I can now reliably reproduce the problem, I've fixed the problem that I was having, and I can provide minimal reproduction code if you want. In my case it has to do with networking in Docker Swarm; the connections are legitimately being reset.
  • Per your suggestion, using Net.Socket#setKeepAlive does indeed correct the issue.

A few questions then:

  • Given that the connection resets are network related, is this something worth addressing/handling in mysql2? I understand that this is a low-level driver, and attempting to handle network errors may well be outside of the scope. That being said, a keep-alive option does fix the problem without the need to handle network errors, so it seems like it would be a nice feature.
  • Do you want me to open a second issue?

Again, sorry to hijack this thread. At a glance it looked like the same issue as I was encountering. You got me pointed in the right direction yesterday, and I appreciate the help and the great library!

@midnightcodr
Copy link

@benbotto I am very interested in how you set Net.Socket#setKeepAlive with mysql2. Can you share some code snippet? I suspect some modification to mysql2 lib is required?

@benbotto
Copy link
Contributor

Sure thing, @midnightcodr. Here's the change: benbotto@0943dee

I'm not familiar with the mysql2 code, so I don't know if this changes works in all scenarios. Note that just above the change there are a few branches in the code--different ways that this.stream is created--and I only tested one of those branches. Works for me (tm), YMMV.

@midnightcodr
Copy link

Thanks @benbotto!

@sidorares
Copy link
Owner

@benbotto since you proved setKeepAlive definitely helps I'm happy see it published.

Few thing to note:

connection.stream can be any duplex stream, currently we only assume it has .on() and .write() methods. This is useful when you connect using stream returned from proxy or ssh forward out client - https://github.com/sidorares/node-mysql2/blob/master/documentation/Extras.md#connecting-using-custom-stream

Probably just adding a test if (typeof this.stream.setKeepAlive === 'function') { this.stream.setKeepAlive(true, 10000); } is enough. Maybe we should also make delay parameter configurable

@benbotto
Copy link
Contributor

All right, @sidorares, I added a PR. The PR includes two new configuration parameters: enableKeepAlive and keepAliveInitialDelay. The former defaults to false, and the latter to 0. I opted to keep keep-alive off by default to avoid unexpected side effects in existing code.

#1081

@sidorares
Copy link
Owner

@midnightcodr published as v2.1.0

@nickdnk
Copy link

nickdnk commented Jan 5, 2021

All right, @sidorares, I added a PR. The PR includes two new configuration parameters: enableKeepAlive and keepAliveInitialDelay. The former defaults to false, and the latter to 0. I opted to keep keep-alive off by default to avoid unexpected side effects in existing code.

#1081

You should probably link to this issue in the comments in the code - or at least explain the need for enableKeepAlive since it defaults to false. I spent a good hour trying to find out why my pool was not working :)

Edit: It seems the version I have suffers from #1229 which means that this probably won't solve the problem for me. My promises resolve (no errors) with all the expected rows that I select, but with every field set to 0 or null. It happens randomly and I can't reliably reproduce it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants