Client cleanly ends after periods of inactivity when it shouldn't #21

cthielen · 2015-11-05T16:29:09Z

Unless I'm misconfiguring something, I have two instances of a chat bot using slack-ruby-client which cleanly exit after periods of inactivity. Is there some way to prevent this?

I'm not setting anything up except the API token. I use client.on twice and then do a "client.start!"

It seems to disconnect within 48 hours.

(The particular source in question is here: https://github.com/dssit/dss-chatbot-slack/blob/master/dss-chatbot.rb)

dblock · 2015-11-05T19:48:04Z

I think what happens is that the websocket closes, and the client doesn't auto-reconnect. This is not behavior that I believe belongs int a client, it's just an API wrapper, however a bot framework definitely wants to maintain a connection, and slack-ruby-bot does it here.

For your purposes just add a loop do around your entire code and a try / rescue so that any aborted connection is reopened. You definitely want that anyway because there're other possible errors, such as timeouts.

Or adopt slack-ruby-bot :)

I'm going to keep this open for now, we can debate whether the client should try to maintain connection, and if so, I am not sure how.

mikz · 2015-11-15T18:04:22Z

@dblock I'd consider implementing the reconnect logic inside this client for following reasons:

can client fetch messages that were sent when it was reconnecting?
all users would have to depend on exceptions that can slack ruby client raise (which should be cleaned up anyway, but thats different topic)
every user would have to handle https://api.slack.com/events/team_migration_started

Not saying it needs to be done, but at least consider it. I'm not sure what the "proper" reconnect logic would do.

However, to keep the connection, the ping should be running (https://api.slack.com/rtm#ping_and_pong).
Not only relying on the websocket ping, but also sending custom events to be sure.

dblock · 2015-11-15T18:17:05Z

I would +1 moving the reconnect logic similar to https://github.com/dblock/slack-ruby-bot/blob/4b2d3dfe8f3c59a3f4b44f068585ab0bd62658b1/lib/slack-ruby-bot/server.rb into the client.

cthielen · 2016-02-09T19:11:26Z

It makes sense that the re-connect logic should be in the consumer of this library but the documentation should be updated to reflect this possibly common scenario (if it isn't already updated).

dblock · 2016-02-09T19:15:12Z

We haven't said anything about reconnect. Would you like to take a stab at this in a PR @cthielen?

mikz · 2016-02-10T07:22:30Z

I was thinking about this. In celluloid adapter the callbacks are called synchronously from the read method. So any exceptions in the callback will kill the actor. I wanted to start the celluloid client with supervisor, so it reconnects automatically. The idea was to offer build class method, that would return supervisor instead.

Sent from my iPhone

On 9. 2. 2016, at 20:15, Daniel Doubrovkine (dB.) @dblockdotorg notifications@github.com wrote:

We haven't said anything about reconnect. Would you like to take a stab at this in a PR @cthielen?

—
Reply to this email directly or view it on GitHub.

dblock · 2016-02-10T11:50:42Z

Obviously the client shouldn't be hiding errors, but offering automatic reconnect in the client at this level might be a good idea. It also looks like Slack is experimenting with a reconnect_url, so there's an opportunity to auto-reconnect without anyone noticing.

cthielen · 2016-02-16T16:16:19Z

Hm -- this may be a separate issue but since implementing the exception-handling loop mentioned in this thread, my particular chat bot in question will disconnect from Slack but the exception-handling loop doesn't receive an exception and the Ruby script does not exit. It is presumably sitting on "client.start!" but I cannot confirm this.

Any recommendations on debugging this type of problem?

dblock · 2016-02-16T16:36:33Z

@cthielen Share some code?

You should see an event on :close. If you want the Ruby app to exit at that time you would need to stop EM or Celluloid. I don't think we handle this in the library here because it doesn't want to kill someone's app "by design".

cthielen · 2016-02-16T18:06:40Z

@dblock It's at https://github.com/dssit/dss-chatbot-slack/blob/master/dss-chatbot.rb . The relevant lines are 151-171.

cthielen · 2016-02-16T19:31:51Z

So, to make sure I understand correctly, the gem can emit a 'close' message which, if not caught, the script will continue to run and stay on the "client.start!" line? So one would need to handle the close message in a loop if we wanted our chat bots to stay up?

Also, how does one handle reconnect in the case of a close message short of restarting the Ruby script?

jlyonsmith · 2016-02-16T21:31:09Z

👍 for auto reconnect. Just noticed that this is happening in our build bot. We're using start_async and the :error callbacks are not firing. Makes testing a fix rather tricky.

[edit] OK, I just saw the :close event. But it looks like that calls close after sending the event, so calling start_async in there wouldn't work. I'm trying the approach of raising an error to kill the Actor then letting the Supervisor restart it.

dblock · 2016-02-16T22:08:15Z

It all depends - are you using start! or start_async for example. Where do you call EM.run or whatever Celluloid equivalent, etc.

Lets start with something simple and look at the example in https://github.com/dblock/slack-ruby-client/blob/master/examples/hi_real_time/hi.rb. It uses EventMachine, and calls start!, which internally will do EM.run { ... } which will block until the client tells it to stop. So inside on :close it calls EM.stop.

Are we saying we want to auto-reconnect on close, but not if you close it explicitly (eg. with Ctrl + C)? What happens if there's an error? What happens if the error is account_disabled vs. something transient? It feels like that's the job of a supervisor, as implemented in slack-ruby-bot (example in https://github.com/dblock/slack-ruby-bot/blob/master/examples/market/marketbot.rb).

dblock · 2019-02-25T19:53:20Z

I've released:

slack-ruby-client 0.14.0
slack-ruby-bot 0.12.0
slack-ruby-bot-server 0.9.0

These should all address the problems above via #226. Open a new issue if they don't. Closing this one.

dblock added bug? question realtime api labels Nov 5, 2015

dblock closed this as completed Feb 25, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Client cleanly ends after periods of inactivity when it shouldn't #21

Client cleanly ends after periods of inactivity when it shouldn't #21

cthielen commented Nov 5, 2015

dblock commented Nov 5, 2015

mikz commented Nov 15, 2015

dblock commented Nov 15, 2015

cthielen commented Feb 9, 2016

dblock commented Feb 9, 2016

mikz commented Feb 10, 2016

dblock commented Feb 10, 2016

cthielen commented Feb 16, 2016

dblock commented Feb 16, 2016

cthielen commented Feb 16, 2016

cthielen commented Feb 16, 2016

jlyonsmith commented Feb 16, 2016

dblock commented Feb 16, 2016

dblock commented Feb 25, 2019

Client cleanly ends after periods of inactivity when it shouldn't #21

Client cleanly ends after periods of inactivity when it shouldn't #21

Comments

cthielen commented Nov 5, 2015

dblock commented Nov 5, 2015

mikz commented Nov 15, 2015

dblock commented Nov 15, 2015

cthielen commented Feb 9, 2016

dblock commented Feb 9, 2016

mikz commented Feb 10, 2016

dblock commented Feb 10, 2016

cthielen commented Feb 16, 2016

dblock commented Feb 16, 2016

cthielen commented Feb 16, 2016

cthielen commented Feb 16, 2016

jlyonsmith commented Feb 16, 2016

dblock commented Feb 16, 2016

dblock commented Feb 25, 2019