[Xenial] Execution stuck in 'requested' state on a fresh RabbitMQ #3290

Closed
arm4b opened this issue Mar 15, 2017 · 10 comments

@arm4b
Member

arm4b commented Mar 15, 2017

When executing any st2 action for the first time on a fresh, clean RabbitMQ immediately after st2 startup, it runs forever and gets stuck in the requested state:

root@ubuntu16:~# st2 run core.local echo 123
....................................................................................................................................................
root@ubuntu16:~# st2 execution list
+--------------------------+------------+--------------+------------------------+------------------------+----------------------+
| id                       | action.ref | context.user | status                 | start_timestamp        | end_timestamp        |
+--------------------------+------------+--------------+------------------------+------------------------+----------------------+
| 58c96cadc8980518d7cf8ada | core.local | st2admin     | requested              | Wed, 15 Mar 2017       |                      |
|                          |            |              |                        | 16:32:45 UTC           |                      |
+--------------------------+------------+--------------+------------------------+------------------------+----------------------+


root@ubuntu16:~# st2 execution get 58c96cadc8980518d7cf8ada
id: 58c96cadc8980518d7cf8ada
status: requested
parameters: 
  cmd: echo 123
result: None

My guess is that early on st2 is still busy bootstrapping RabbitMQ and for some reason can't trigger an action when running things for the first time (perhaps the topic/queue is not yet created, or the message is lost).
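For scripted checks, the status can be pulled out of the `st2 execution get` output shown above. This is only a minimal sketch, assuming the plain `key: value` text format from this report; `execution_status` and `is_requested` are hypothetical helper names, not part of st2:

```shell
# Hypothetical helper: read `st2 execution get <id>` output on stdin
# and print the value of its top-level "status:" line.
execution_status() {
    awk -F': *' '$1 == "status" { print $2; exit }'
}

# Hypothetical helper: succeed (exit 0) only if the execution read from
# stdin is still in the "requested" state.
is_requested() {
    [ "$(execution_status)" = "requested" ]
}
```

It could be used as `st2 execution get 58c96cadc8980518d7cf8ada | is_requested && echo "still stuck"`.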

Reproduce

Requirements to reproduce:

  • OS is Ubuntu Xenial
  • RabbitMQ is clean
  • st2 was just started
  • action runs immediately after st2 start

Script to reproduce

I could reproduce it every time with this script:

#!/bin/bash

# output executed commands
set -o xtrace

sudo st2ctl stop
# emulate fresh & clean RabbitMQ
sudo rabbitmqctl stop_app
sudo rabbitmqctl reset
sudo rabbitmqctl start_app

# isolate & make sure the problem is not with RabbitMQ startup
sleep 30
# but with StackStorm startup itself
sudo st2ctl start

# The command is stuck and runs forever
# See: st2 execution list
st2 run core.local echo 123

This is similar to StackStorm/st2-packages#445 (comment)
The problem is more serious than it looks: it is a blocker for automation when deploying StackStorm in production. An execution that gets stuck immediately after startup is a pretty bad thing.

I originally thought this could be solved in packaging, but after reproducing it, this looks more like a StackStorm core problem.
cc @m4dcoder @Kami @lakshmi-kannan.
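Until the root cause is fixed, automation can at least avoid hanging forever on the stuck command. A minimal sketch, assuming GNU coreutils `timeout` is available (it exits with status 124 when the limit is reached); `run_with_limit` is a hypothetical wrapper name:

```shell
# Hypothetical wrapper: run a command, but give up after LIMIT seconds.
# Usage: run_with_limit LIMIT_SECONDS COMMAND [ARGS...]
run_with_limit() {
    local limit="$1"
    shift
    local rc=0
    # `timeout` (GNU coreutils) returns 124 if the command was cut off
    timeout "$limit" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "command timed out after ${limit}s: $*" >&2
    fi
    return "$rc"
}
```

For example, `run_with_limit 60 st2 run core.local echo 123` would return 124 instead of blocking. This only bounds the client-side wait; the execution itself would presumably still be listed as requested.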

@lakshmi-kannan
Contributor

I'll look into this. cc: @Kami, @m4dcoder

arm4b changed the title from "[Xenial] Execution stuck in 'requested' state on a clean RabbitMQ" to "[Xenial] Execution stuck in 'requested' state on a fresh RabbitMQ" on Apr 11, 2017
@Kami
Member

Kami commented Aug 3, 2017

An interesting thing about this issue is that I can't replicate it if I set the number of action runner workers to 1.

This makes it look like some kind of weird race condition inside the action runner processes.

@arm4b
Member Author

arm4b commented Aug 3, 2017

Great!
+1 step for better problem isolation

@Kami
Member

Kami commented Aug 4, 2017

Confirmed that #3648 fixes this.

@armab it would also be good if you could confirm this with the latest v2.4dev packages :)

Kami added this to the 2.4.0 milestone on Aug 4, 2017
@Kami
Member

Kami commented Aug 4, 2017

Btw, while testing the fix I found that we can also remove sleep 30 from the script above to speed up testing a bit, and it still works fine (it seems RabbitMQ starts up quite fast, so the sleep is not really necessary).
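If a wait ever turns out to be needed on slower machines, polling for readiness is a common alternative to a fixed sleep. A minimal sketch; `wait_until` is a hypothetical helper, and using `rabbitmqctl status` as the readiness check is an assumption rather than something the fix relies on:

```shell
# Hypothetical helper: retry a command until it succeeds.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
    local attempts="$1" delay="$2"
    shift 2
    local i=1
    while [ "$i" -le "$attempts" ]; do
        # Success ends the wait immediately
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Do not sleep after the final failed attempt
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

In the script above, `sleep 30` could then become something like `wait_until 30 1 sudo rabbitmqctl status`, which returns as soon as the check passes and gives up after 30 attempts.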

@nmaludy
Member

nmaludy commented Aug 6, 2017

+1, this just bit me in the Puppet module.

I believe the lines @Kami is referring to are: https://github.com/StackStorm/st2-packages/blob/master/scripts/st2bootstrap-deb.sh#L557-L561

arm4b pushed a commit to StackStorm/st2-packages that referenced this issue on Aug 7, 2017
@arm4b
Member Author

arm4b commented Aug 7, 2017

@Kami Awesome job!

This allows removing the horrible timeout 30 hack from the curl|bash installer: StackStorm/st2-packages#480

@sadanand25

Did you find a solution for "Execution stuck in 'requested' state on a fresh RabbitMQ"?

I am facing the same issue. If so, please provide the steps.

@arm4b
Member Author

arm4b commented Jun 26, 2019

@sadanand25 Yes, this specific issue was fixed in an earlier StackStorm version, see #3648

@arm4b
Member Author

arm4b commented Jun 26, 2019

I've rechecked the startup using the previous instructions and it is indeed fixed.
