Lagoon API Error: No channels left to allocate #1344

Open
vincenzodnp opened this issue Oct 25, 2019 · 6 comments
Labels
1-api-auth API & Authentication subsystem bug

Comments

@vincenzodnp
Contributor

Describe the bug
Looking at the API logs today, I saw this error several times:

[api-100-4bgkx] error: Error: No channels left to allocate 
[api-100-4bgkx]     at Connection.C.freshChannel (/app/node_modules/amqplib/lib/connection.js:451:11) 
[api-100-4bgkx]     at Channel.C.allocate (/app/node_modules/amqplib/lib/channel.js:53:29) 
[api-100-4bgkx]     at tryCatcher (/app/node_modules/bluebird/js/release/util.js:16:23) 
[api-100-4bgkx]     at Function.Promise.attempt.Promise.try (/app/node_modules/bluebird/js/release/method.js:39:29) 
[api-100-4bgkx]     at Channel.C.open (/app/node_modules/amqplib/lib/channel_model.js:68:21) 
[api-100-4bgkx]     at ChannelModel.CM.createChannel (/app/node_modules/amqplib/lib/channel_model.js:48:12) 
[api-100-4bgkx]     at /app/node_modules/rabbitmq-pub-sub/dist/main.js:425:61 
[api-100-4bgkx]     at tryCatcher (/app/node_modules/bluebird/js/release/util.js:16:23) 
[api-100-4bgkx]     at Promise._settlePromiseFromHandler (/app/node_modules/bluebird/js/release/promise.js:517:31) 
[api-100-4bgkx]     at Promise._settlePromise (/app/node_modules/bluebird/js/release/promise.js:574:18) 
[api-100-4bgkx]     at Promise._settlePromiseCtx (/app/node_modules/bluebird/js/release/promise.js:611:10) 
[api-100-4bgkx]     at _drainQueueStep (/app/node_modules/bluebird/js/release/async.js:142:12) 
[api-100-4bgkx]     at _drainQueue (/app/node_modules/bluebird/js/release/async.js:131:9) 
[api-100-4bgkx]     at Async._drainQueues (/app/node_modules/bluebird/js/release/async.js:147:5) 
[api-100-4bgkx]     at Immediate.Async.drainQueues [as _onImmediate] (/app/node_modules/bluebird/js/release/async.js:17:14) 
[api-100-4bgkx]     at runCallback (timers.js:705:18) 'failed to recieve message from queue \'%s\'' 'api.event.deployment.added'
@Schnitzel
Contributor

Check the RabbitMQ broker; the channels are probably full. We can solve this by restarting the API pods, which removes all existing channels.
For the longer term, I think the API needs to close its connection to RabbitMQ after some time (maybe once the WebSocket connection from the browser is also closed?).
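
A minimal sketch of the "close when the WebSocket closes" idea, not Lagoon's actual code. It assumes `connection` behaves like an amqplib connection (`createChannel()` returns a promise for a channel with a promise-returning `close()`) and that `socket` emits a `close` event; the helper name `openChannelForSocket` is hypothetical.

```javascript
// Hypothetical helper: ties the lifetime of one AMQP channel to one socket.
// Assumptions: `connection.createChannel()` resolves to a channel object with
// a promise-returning `close()`, and `socket.once('close', fn)` fires when the
// browser's WebSocket goes away.
async function openChannelForSocket(connection, socket) {
  const channel = await connection.createChannel();

  socket.once('close', () => {
    // Closing the channel frees its channel number on the connection,
    // so a later createChannel() can reuse it.
    channel.close().catch(() => {
      // The channel may already be closed (for example after a broker-side
      // error); there is nothing left to clean up in that case.
    });
  });

  // Caveat: if the socket closed while createChannel() was still pending,
  // the 'close' event has already fired and this channel would leak.
  // Real code would need to check the socket state here.
  return channel;
}
```

With something like this, the number of open channels tracks the number of open WebSockets instead of growing until the pods are restarted.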

@rocketeerbkw
Member

Ya, I requested this issue specifically to improve things on the API side. We should be able to 1) handle the failure more gracefully and 2) manage channels better.

Some light reading/googling suggests we maybe don't need a new channel for every user/websocket connection? And/or close channels when disconnected?
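
One way the "shared channel" idea could look, as a rough sketch under assumptions rather than a tested fix: all subscriptions in a process borrow a single lazily created channel, and the channel is closed once the last borrower releases it. `SharedChannel`, `acquire` and `release` are hypothetical names; `connection` is only assumed to expose an amqplib-style `createChannel()` that returns a promise.

```javascript
// Hypothetical reference-counted wrapper around a single AMQP channel.
class SharedChannel {
  constructor(connection) {
    this.connection = connection; // assumed: createChannel() -> Promise<channel>
    this.channelPromise = null;   // set while a shared channel exists or is opening
    this.refs = 0;                // number of current borrowers
  }

  // Returns the shared channel, opening it on first use.
  async acquire() {
    if (!this.channelPromise) {
      this.channelPromise = this.connection.createChannel();
    }
    const pending = this.channelPromise;
    this.refs += 1;
    try {
      return await pending;
    } catch (err) {
      // Opening failed (for example "No channels left to allocate").
      // Undo the reference and forget the failed attempt so that a later
      // caller can retry instead of the process staying broken.
      this.refs -= 1;
      if (this.channelPromise === pending) {
        this.channelPromise = null;
      }
      throw err;
    }
  }

  // Drops one reference and closes the channel when nobody uses it anymore.
  async release() {
    if (this.refs === 0) return;
    this.refs -= 1;
    if (this.refs === 0 && this.channelPromise) {
      const pending = this.channelPromise;
      this.channelPromise = null;
      const channel = await pending;
      await channel.close();
    }
  }
}
```

Each subscription would call `acquire()` when it starts and `release()` when its WebSocket disconnects. A trade-off to keep in mind: in AMQP 0-9-1 a channel-level error closes the whole channel, so with a shared channel one bad operation affects every subscriber on it.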

@seanhamlin seanhamlin added 1-api-auth API & Authentication subsystem bug labels Jun 29, 2020
@seanhamlin
Contributor

Spotted this on a high-volume cluster; had to scale the API pods from 2 to 5 to get it to calm down.

@Schnitzel
Contributor

I think just restarting all API pods solves this issue as well. The problem is that the API creates a new channel for every GraphQL subscription that is created; if you restart the API, all channels are closed and the cycle starts again.

@shreddedbacon
Member

I was just playing around locally with pages that use subscriptions and wondered why the API logs showed repeated requests for something I had only done once against the API. I'm not quite sure how to debug this, so I added some logger.info calls to a few functions and saw that they were being called multiple times from one initial API request. E.g., I added a logger.info to the restoreLocation function and ran the updateRestore mutation against the API once using GraphiQL, but the log line showed up more than once. I'm guessing the API is also processing these in the queues?

I don't know how you could even fix this. Can we just disable subscriptions? I don't really see them as super useful; I often just refresh the page because some subscription things don't work properly.

@tobybellwood
Member

Let's address subscriptions.

6 participants