Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client cleanly ends after periods of inactivity when it shouldn't #21

Closed
cthielen opened this issue Nov 5, 2015 · 14 comments
Closed

Client cleanly ends after periods of inactivity when it shouldn't #21

cthielen opened this issue Nov 5, 2015 · 14 comments

Comments

@cthielen
Copy link

cthielen commented Nov 5, 2015

Unless I'm misconfiguring something, I have two instances of a chat bot using slack-ruby-client which cleanly exit after periods of inactivity. Is there some way to prevent this?

I'm not setting anything up except the API token. I use client.on twice and then do a "client.start!"

It seems to disconnect within 48 hours.

(The particular source in question is here: https://github.com/dssit/dss-chatbot-slack/blob/master/dss-chatbot.rb)

@dblock
Copy link
Collaborator

dblock commented Nov 5, 2015

I think what happens is that the websocket closes, and the client doesn't auto-reconnect. This is not behavior that I believe belongs int a client, it's just an API wrapper, however a bot framework definitely wants to maintain a connection, and slack-ruby-bot does it here.

For your purposes just add a loop do around your entire code and a try / rescue so that any aborted connection is reopened. You definitely want that anyway because there're other possible errors, such as timeouts.

Or adopt slack-ruby-bot :)

I'm going to keep this open for now, we can debate whether the client should try to maintain connection, and if so, I am not sure how.

@mikz
Copy link
Contributor

mikz commented Nov 15, 2015

@dblock I'd consider implementing the reconnect logic inside this client for following reasons:

  • can client fetch messages that were sent when it was reconnecting?
  • all users would have to depend on exceptions that can slack ruby client raise (which should be cleaned up anyway, but thats different topic)
  • every user would have to handle https://api.slack.com/events/team_migration_started

Not saying it needs to be done, but at least consider it. I'm not sure what the "proper" reconnect logic would do.

However, to keep the connection, the ping should be running (https://api.slack.com/rtm#ping_and_pong).
Not only relying on the websocket ping, but also sending custom events to be sure.

@dblock
Copy link
Collaborator

dblock commented Nov 15, 2015

I would +1 moving the reconnect logic similar to https://github.com/dblock/slack-ruby-bot/blob/4b2d3dfe8f3c59a3f4b44f068585ab0bd62658b1/lib/slack-ruby-bot/server.rb into the client.

@cthielen
Copy link
Author

cthielen commented Feb 9, 2016

It makes sense that the re-connect logic should be in the consumer of this library but the documentation should be updated to reflect this possibly common scenario (if it isn't already updated).

@dblock
Copy link
Collaborator

dblock commented Feb 9, 2016

We haven't said anything about reconnect. Would you like to take a stab at this in a PR @cthielen?

@mikz
Copy link
Contributor

mikz commented Feb 10, 2016

I was thinking about this. In celluloid adapter the callbacks are called synchronously from the read method. So any exceptions in the callback will kill the actor. I wanted to start the celluloid client with supervisor, so it reconnects automatically. The idea was to offer build class method, that would return supervisor instead.

Sent from my iPhone

On 9. 2. 2016, at 20:15, Daniel Doubrovkine (dB.) @dblockdotorg notifications@github.com wrote:

We haven't said anything about reconnect. Would you like to take a stab at this in a PR @cthielen?


Reply to this email directly or view it on GitHub.

@dblock
Copy link
Collaborator

dblock commented Feb 10, 2016

Obviously the client shouldn't be hiding errors, but offering automatic reconnect in the client at this level might be a good idea. It also looks like Slack is experimenting with a reconnect_url, so there's an opportunity to auto-reconnect without anyone noticing.

@cthielen
Copy link
Author

Hm -- this may be a separate issue but since implementing the exception-handling loop mentioned in this thread, my particular chat bot in question will disconnect from Slack but the exception-handling loop doesn't receive an exception and the Ruby script does not exit. It is presumably sitting on "client.start!" but I cannot confirm this.

Any recommendations on debugging this type of problem?

@dblock
Copy link
Collaborator

dblock commented Feb 16, 2016

@cthielen Share some code?

You should see an event on :close. If you want the Ruby app to exit at that time you would need to stop EM or Celluloid. I don't think we handle this in the library here because it doesn't want to kill someone's app "by design".

@cthielen
Copy link
Author

@dblock It's at https://github.com/dssit/dss-chatbot-slack/blob/master/dss-chatbot.rb . The relevant lines are 151-171.

@cthielen
Copy link
Author

So, to make sure I understand correctly, the gem can emit a 'close' message which, if not caught, the script will continue to run and stay on the "client.start!" line? So one would need to handle the close message in a loop if we wanted our chat bots to stay up?

Also, how does one handle reconnect in the case of a close message short of restarting the Ruby script?

@jlyonsmith
Copy link

👍 for auto reconnect. Just noticed that this is happening in our build bot. We're using start_async and the :error callbacks are not firing. Makes testing a fix rather tricky.

[edit] OK, I just saw the :close event. But it looks like that calls close after sending the event, so calling start_async in there wouldn't work. I'm trying the approach of raising an error to kill the Actor then letting the Supervisor restart it.

@dblock
Copy link
Collaborator

dblock commented Feb 16, 2016

It all depends - are you using start! or start_async for example. Where do you call EM.run or whatever Celluloid equivalent, etc.

Lets start with something simple and look at the example in https://github.com/dblock/slack-ruby-client/blob/master/examples/hi_real_time/hi.rb. It uses EventMachine, and calls start!, which internally will do EM.run { ... } which will block until the client tells it to stop. So inside on :close it calls EM.stop.

Are we saying we want to auto-reconnect on close, but not if you close it explicitly (eg. with Ctrl + C)? What happens if there's an error? What happens if the error is account_disabled vs. something transient? It feels like that's the job of a supervisor, as implemented in slack-ruby-bot (example in https://github.com/dblock/slack-ruby-bot/blob/master/examples/market/marketbot.rb).

@dblock
Copy link
Collaborator

dblock commented Feb 25, 2019

I've released:

  • slack-ruby-client 0.14.0
  • slack-ruby-bot 0.12.0
  • slack-ruby-bot-server 0.9.0

These should all address the problems above via #226. Open a new issue if they don't. Closing this one.

@dblock dblock closed this as completed Feb 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants