Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC Allow repeat calls to connect() to resolve #934

Closed
dhensby opened this issue Nov 7, 2019 · 12 comments · Fixed by #941
Closed

RFC Allow repeat calls to connect() to resolve #934

dhensby opened this issue Nov 7, 2019 · 12 comments · Fixed by #941

Comments

@dhensby
Copy link
Collaborator

dhensby commented Nov 7, 2019

It seems that there is a common use where developers would like to call ConnectionPool.connect() repeatedly to then chain on a request.

For example:

const pool = new ConnectionPool();

function runQuery(query) {
   return pool.connect().then(connectedPool => {
        return connectedPool.request().query('SELECT 1');
    });
}

At the moment, repeat calls to connect() in the above example will throw an error because the pool is already connected.

This also has the advantage of allowing developers not to have to worry about whether the pool is connected or closed. At the moment, to achieve something similar you'd need to do the following:

const pool = new ConnectionPool();

function runQuery(query) {
    return Promise.resolve(pool.connected ? pool : pool.connect()).then(connectedPool => {
        return connectedPool.request().query('SELECT 1');
    });
}

This feels overly verbose and needless and should be simple to solve.

As an enhancement, we may wish to do something similar for when the pool is in a "connecting" state. At the moment if two process call connect() whilst the pool is connecting, we will see an error, which will mean that using the pool.connect().then() pattern will be inherently unsafe for any parallel execution of queries on a pool that's not yet connected.

It feels like we are relying on devs to do too much caretaking of the pool when the library should be able to abstract that away.

Potentially the final solution could just be a lazily invoked connection. eg:

const pool = new ConnectionPool();

function runQuery(query) {
    // internally it will check if the pool has connected, if not it will attempt to connect and then run the query
    return pool.request().query(query);
}

developers can still call connect() to validate the connection is working but they wouldn't need to worry about the connected or connecting state of the pool at all.

@brokenthorn
Copy link

I think there's a further issue here.

I think in your middle code example, the one mitigating the error thrown when the pool is already connected and you call connect(), doesn't mitigate the error thrown if the pool is connecting and you call connect() (a similar error, but slightly different message).

So one might try and fix that with:

function runQuery(query) {
  return Promise
      .resolve(pool.connected ? pool : pool.connecting ? undefined : pool.connect())
      // above, if the Pool was in 'connecting' state, resolves with an undefined value,
      // otherwise resolves with the promise of a connected Pool,           
      .then(connectedPool => {
        // thus, connectedPool can be undefined here (ie. Pool was in 'connecting' state) or
        // an actual connected Pool (ie. Pool was not connected and also not in connecting state):
        if (connectedPool) return connectedPool.request().query("SELECT 1"); // return query result.
        return undefined; // return undefined because the Pool was not connected when we tried to query.
      },
  );
}

But this ends up in a query that might not run. The developer would now have to write retry logic.

I'm thinking of another solution which I'll post soon.

@brokenthorn
Copy link

Here's the other solution I was talking about. This one I actually tested.

I'm using Typescript because the example is from a project of mine. It's not part of a runQuery function but a get a pool that's actually connected before trying to run a query function.

let millisecondsPassed = 0;
// Basically let's write a recursive executor with a timeout
// and 'passed time' threshold so it doesn't recurse forever:
const waitForConnectionPoolToBeReadyExecutor = (
  resolve: (value?: PromiseLike<undefined> | undefined) => void,
  reject: (reason?: any) => void,
) => {
  setTimeout(() => {
    if (millisecondsPassed < 30000) {
      millisecondsPassed += 2000;

      // This variable defaults to false and is asynchronously set by the
      // pool init callback set on the pool constructor:
      // ie.: new(options, callback) where if not error, this._connectionPoolInitialized = true.
      //
      // Could as well be replaced with `this._connectionPool.connected`.
      if (this._connectionPoolInitialized) {
        resolve();
      } else {
        waitForConnectionPoolToBeReadyExecutor(resolve, reject);
      }
    } else
      reject(
        "Aborting start because the connection pool failed to initialize in the last ~30s.",
      );
  }, 2000);
};

// This simply now waits for like 30 seconds, for the pool to become connected.
// Since this is also an async function, if this fails after 30 seconds, execution stops
// and an error is thrown.
await new Promise(waitForConnectionPoolToBeReadyExecutor);

// use the pool: await this._connectionPool.request().query()... etc.

@brokenthorn
Copy link

brokenthorn commented Nov 13, 2019

By the way, I don't know if this can be fixed internally by simply delaying execution until the pool is connected and also triggering a connection attempt if it isn't, without throwing an error back to the user. It might change query execution order which might break some (if not eventually all) apps. Throwing an error (instant feedback) is actually good here.

The best possible solution I see is handling errors in your app.

For example, when you initialise your app and before running any queries, you can call close() on the pool and then connect().

This way you don't get any errors thrown because before calling connect(), you closed the connection pool!

You can then await that connection attempt, which according to the source code, creates just just one connection in the pool just for testing that the pool is correctly configured.

In fact, this seems to be the only use case for the connect() function: testing if the pool can connect before doing anything with it.

And then you can use the pool but that doesn't mean it can't become disconnected for some other reasons.

So in the end, requests, queries, pools, etc., they will all inevitably throw one error or another . So then why not build retry logic / error handling logic inside your app?

The requirement for that would be for you to be able to distinguish between the different errors that can be thrown so you can take different actions and that's already possible so... maybe we're using node-mssql wrong here?

BTW, I just started using this last night, up till 2 AM! 😪 Sorry if I'm overly verbose. Maybe we don't have similar dilemmas after all.

@dhensby
Copy link
Collaborator Author

dhensby commented Nov 13, 2019

Yes - there is also the problem of when the pool is in a connecting state. Ideally the requests would be queued until the connection had been resolved.

If we fired an event when the pool was connected, queued queries could then execute.

The problem with "managing" it all in your app by calling pool.close().then(pool => pool.connect()).then(pool => pool.query(...)); is that what if multiple queries are being run in parallel? Your pool will be closed whilst a query is running or while another has called connect, and so on. We have to work with the parallel nature of JS and we can't have threads closing and opening the pool for every request. This is also problematic because it closes all the pool connections, thus removing the usefulness of the pool - you might as well just create and discard connections as you need them rather than using pooling.

As I imagine it, we could have some kind of promise that is held until the pool is connected:

class ConnectionPool extends EventEmitter {
    constructor () {
        this._connecting = this._connected = false;
    }

    connect () {
        if (!this._connectingPromise) {
            this._connectingPromise = (new Promise((resolve, reject) => {
                this.once('connected', resolve)
                this.once('error', reject)
            })).then(() => {
                this.connected = true
                this._connecting = false
                return this
            }).catch(err => {
                this.connecting = false
                return Promise.reject(err)
            });
        }
        // kick off connection to db
        return this._connectingPromise
    }

    close () {
        this._connectingPromise = null
        this._connecting = this.connected = false
        // close pool
    }
}

class Request {
    constructor(pool) {
        this.pool = pool || globalConnection.pool
    }

    query(sql) {
        return this.pool.connect().then(pool => pool.query(sql))
    }
}

This is untested and just a rough approach, missing lots of details, but I'd imagine this would allow us to queue SQL queries up without having to worry about whether the pool has connected or not.

@brokenthorn
Copy link

brokenthorn commented Nov 13, 2019

I wasn't suggesting that you call close() for every query you run, just once in the initialisation phase of your app/service, in order to validate that the pool can connect (ie. has been configured correctly and there is no other issue preventing connections at the time of initialisation). Calling close() would then allow you to await connect() and know if the pool can connect before proceeding to use the pool. But you could also write your own checker in the form of a promise that waits for pool.connected to be true, while pool.connecting is true.

So now that you know that you have a pool that can connect and more importantly has connected, all subsequent uses of the pool would go as normal, by which I mean no further calls to connect(). You just fetch new requests and run queries with them. You also handle any errors that might occur, including the impossibility to connect. Similar to tedious's callback API, you can handle events OR, because node-mssql supports async/await, you would wrap all awaits in try-catch blocks. Also, I don't think requests acquire a connection from the pool until you trigger a query but I haven't actually checked the implementation.

I really need some validation from the authors. I'm only just starting to test-implement some of the things I'm proposing.

Going back to your example, it's pretty good but it might be missing some key domain details.

A pool is connected if there's at least one connection in the pool (connected or disconnected). This is what I'm assuming.

Your example provides some type of health-check that checks if the pool can create a connection before handing over a request to the developer. This has nothing to do with the actual pool, that might contain already connected connections or even ones that have disconnected and haven't been health-checked.

In my opinion you can use the callback provided on pool creation to check if the newly created pool connected or not. You can wrap that in a promise and wait. You would do this once in the app/service init phase so you can fail early if the pool can't connect due to issues such as bad password or user name. No reason to spin up the app/service if will never be able to connect.

And then the pool hands connections like normal. These connections live in the pool but can always become disconnected and need to be reconnected. It might happen with all or just one. That's why it can't handle these issues internally, I think. It can, for example, try to reconnect a broken connection, but that might fail so requests might fail as well due to that. In the end, you have to handle this situations yourself.

@dhensby
Copy link
Collaborator Author

dhensby commented Nov 13, 2019

@brokenthorn What you suggest in the first half of your post is basically what the library already does and doesn't require any changes to the library. On pool.connect() the pool attempts to make a one off connection to the DB, if that connection is successful the pool is created and the connect() callback is executed (or the Promise is resolved), the pool is ready to use (though there may be no active connections in the pool). If the initial test connection fails then connect() will error. This is current behaviour and, as you say, is for the developer to maintain the state of the pool in their applications.

However, the reason I've proposed this change is because (going by the issues opened on GitHub) many developers seem to struggle with the concept of managing the pool themselves and don't understand why repeat calls to connect fail nor why or when they should close the pool, etc.

A pool is connected if there's at least one connection in the pool (connected or disconnected). This is what I'm assuming.

This is incorrect, a pool is connected if the initial probe connection was successful, there's no need for any connections to actually be in the pool at all, nor for them to be healthy.

@dhensby
Copy link
Collaborator Author

dhensby commented Nov 13, 2019

There are some problems to keep note of, though:

  1. What if someone calls close() and query(), under the proposed pattern the pool would automatically reconnect. Maybe a pool should not be allowed to reconnect unless there is an appropriate config setting set (autoconnect)?
  2. If a user doesn't call connect manually, they won't be aware until later in their application that their connection config is bad

@brokenthorn
Copy link

brokenthorn commented Nov 13, 2019

@brokenthorn What you suggest in the first half of your post is basically what the library already does and doesn't require any changes to the library. On pool.connect() the pool attempts to make a one off connection to the DB, if that connection is successful the pool is created and the connect() callback is executed (or the Promise is resolved), the pool is ready to use (though there may be no active connections in the pool). If the initial test connection fails then connect() will error. This is current behaviour and, as you say, is for the developer to maintain the state of the pool in their applications.

Actually, I don't think it's like that. The pool is created regardless of whether you call pool.connect() or not. Just try it. If you create a new pool and call connect() you get an error that the pool is already connecting. So why call connect() in the first place? Well unless it's meant to be a private method! In that case, if all the logic you just wrote is internal to the library in order to instantiate de pool, sure.

However, the reason I've proposed this change is because (going by the issues opened on GitHub) many developers seem to struggle with the concept of managing the pool themselves and don't understand why repeat calls to connect fail nor why or when they should close the pool, etc.

Yes, I know, I've had the same problem for 2 days since starting to use this library but now I'm pretty sure I was using it wrong. I already described how I think the pool should be used so I won't repeat myself here.

A pool is connected if there's at least one connection in the pool (connected or disconnected). This is what I'm assuming.

This is incorrect, a pool is connected if the initial probe connection was successful, there's no need for any connections to actually be in the pool at all, nor for them to be healthy.

I see! I need to check the source code again but it sounds familiar (it was very late last night but I seem to remember that logic now...). But if that's the case, that means that the way I previously described how I think the pool should be used is probably right.

@dhensby
Copy link
Collaborator Author

dhensby commented Nov 13, 2019

Actually, I don't think it's like that. The pool is created regardless of whether you call pool.connect() or not. Just try it.

@brokenthorn with respect, I maintain this library so I know the internals of it pretty well. The "pool" as an object is created no matter what, but the underlying tarn js pool and probes to the database are not started until connect is called.

const pool = new sql.ConnectionPool(config)

pool.connected // false
pool.connecting // false
pool.healthy // false

const connPromise = pool.connect()

pool.connected // false
pool.connecting // true
pool.healthy // false

await connPromise

pool.connected // true
pool.connecting // false
pool.healthy // true

The alternative is to use the callback pattern, and then connect is called for you (but is still called):

const pool = new sql.ConnectionPool(config, (err => {
    pool.connected // true
    pool.connecting // false
    pool.healthy // true
});

pool.connected // false
pool.connecting // true
pool.healthy // false

that means that the way I previously described how I think the pool should be used is probably right.

That's right, you accurately described how the pool should be used, but some developers struggle with this concept, and so allowing repeat calls to the DB would be helpful, it seems. Changing the pool to allow repeat calls wouldn't really change how most people use the pool, it would just allow developers to call pool.connect() without worrying about the pool state.

@brokenthorn
Copy link

brokenthorn commented Nov 13, 2019

Actually, I don't think it's like that. The pool is created regardless of whether you call pool.connect() or not. Just try it.

@brokenthorn with respect, I maintain this library so I know the internals of it pretty well. The "pool" as an object is created no matter what, but the underlying tarn js pool and probes to the database are not started until connect is called.

That's all you had to say! I didn't know! 😨 I thought we were both just users. Now I understand you're point of view and why you opened this issue.

I'm still inclined to write an MRE because I remember last night, while I was having trouble understanding how to use the library, that just by following the example for async/await from the docs, that the first call to connect() threw an error that the pool is already connecting.

The MRE is basically your first code example from the above reply. That call to connect() threw immediately.

BTW, thank you for taking your time and the respect is mutual.

@dhensby
Copy link
Collaborator Author

dhensby commented Nov 13, 2019

No problem. If you've got code to show for the instance where you were getting an "already connecting" error when calling connect, I could probably give some feedback as that shouldn't be happening if you're only calling connect once (in a separate issue would be best, though).

I've opened a PR (#941) which settles the basic problem of calling ConnectionPool.connect() repeatedly. However, there are still going to be issues with calling request().query() before the connect call has resolved. eg:

const pool = new sql.ConnectionPool(config)

pool.connect() // note no await or then
pool.request().query('SELECT 1').then(result => {
  console.log(result)
}).catch(err => {
  console.error(err) // this will be called because connect hasn't resolved
})

The next step would be fore the requests to check if the pool is connected before continuing instead of throwing an error so you could call pool.connect() in a disassociated way and then queue up queries that will execute once the pool is connected.

@brokenthorn
Copy link

Shouldn't the docs and examples also be updated?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants