Prevent Node-RED from Entering Safe Mode After Multiple Restarts #4552

Open
2 tasks done
muenir opened this issue Sep 24, 2024 · 12 comments
Labels: customer request, feature-request

muenir commented Sep 24, 2024

Description

Currently, after multiple restarts, Node-RED automatically enters safe mode with the message: “Node-RED restart loop detected. Restarting in safe mode.” This can result in extended downtimes, where flows are not running, particularly during off-peak hours such as nighttime. This is problematic as it can lead to critical services being offline for several hours.

Request:
As an administrator, I would like the ability to configure a parameter (preferably as an environment variable) to prevent Node-RED instances from starting in safe mode after restarts. This will ensure that after a restart, flows remain active and the instance continues functioning without requiring manual intervention.

Expected Benefit:
By introducing this configuration option, I will be able to minimize downtime and avoid lengthy periods where flows are not running, especially during unattended periods like overnight restarts. This ensures continuous service availability and improves overall system resilience.
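One way the requested switch could be read is sketched below. The variable name `FF_DISABLE_SAFE_MODE` and the helper `safeModeAllowed` are hypothetical, chosen only for illustration; the issue does not name a variable, and this is not the launcher's actual code.

```typescript
// Hypothetical environment variable name; the issue does not specify one.
const DISABLE_SAFE_MODE_VAR = "FF_DISABLE_SAFE_MODE";

// A plain map of environment values, so the helper stays easy to test.
type Env = Record<string, string | undefined>;

/** Returns true when the instance may still fall back to safe mode. */
function safeModeAllowed(env: Env): boolean {
  const raw = (env[DISABLE_SAFE_MODE_VAR] ?? "").trim().toLowerCase();
  // Only an explicit opt-out disables safe mode; any other value keeps
  // today's behaviour.
  const optedOut = raw === "1" || raw === "true" || raw === "yes";
  return !optedOut;
}
```

Treating every unrecognised value as "keep safe mode" means a typo in the variable cannot silently remove the protection.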

Tasks

  1. Steve-Mcl
  2. Steve-Mcl

Which customers would this be available to

CE/EE (originally Team + Enterprise Tiers (EE); @Steve-Mcl modified scope after discussion with @joepavitt)

Have you provided an initial effort estimate for this issue?

I am not a FlowFuse team member

@muenir added the feature-request and needs-triage labels Sep 24, 2024
@knolleary added the customer request label Sep 24, 2024
@joepavitt
Contributor

@muenir thanks for raising this. Just to check the details here: we put the instance into safe mode when we've detected multiple hangs/restart loops, to prevent this continuing infinitely.

Whilst we can offer a configuration option here to disable that (or to configure in more detail when safe mode is enabled), I'm struggling to see the value in turning it off entirely, as your application will just continue to crash/loop. Or do you expect it to auto-recover at some point?
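The trade-off described above can be sketched as a small decision function. This is only an illustration under assumptions: the option names (`maxRestarts`, `windowMs`, `safeModeEnabled`) and the sliding-window rule are hypothetical and are not taken from the actual launcher.

```typescript
// Hypothetical option names; the real settings are not defined in this thread.
interface BootLoopOptions {
  maxRestarts: number;      // restarts tolerated inside the window
  windowMs: number;         // length of the sliding window in milliseconds
  safeModeEnabled: boolean; // false = never fall back to safe mode
}

/**
 * Decides whether the next start should be in safe mode, given the
 * timestamps (ms) of previous restarts and the current time.
 */
function shouldEnterSafeMode(
  restartTimes: number[],
  now: number,
  opts: BootLoopOptions
): boolean {
  if (!opts.safeModeEnabled) {
    return false; // the administrator has opted out entirely
  }
  // Count only the restarts that happened inside the window.
  const recent = restartTimes.filter((t) => now - t <= opts.windowMs);
  return recent.length >= opts.maxRestarts;
}
```

With safe mode disabled the function always returns false, which is exactly the case questioned here: a flow that crashes on every start would keep looping.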

@muenir
Author

muenir commented Sep 25, 2024

I understand your point. In some cases, multiple restarts may happen due to temporary issues, such as memory issues/leaks. It’s also possible that a specific part of the flow is triggered at certain times, causing a restart (e.g., a buggy custom or function node). Since we’re already detecting restarts, we can respond to them more effectively. However, the extended downtime of flows, especially overnight, is a significant concern. Again, this would be just an optional and even temporary flag that would be set ...

@joepavitt moved this to Todo in 🛠 Development Oct 11, 2024
@joepavitt removed the needs-triage label Oct 11, 2024
@joepavitt moved this to Scheduled in ☁️ Product Planning Oct 11, 2024
@joepavitt added this to the 2.10 milestone Oct 11, 2024
@knolleary modified the milestones: 2.10, 2.11 Oct 25, 2024
@joepavitt modified the milestones: 2.11, 2.12 Nov 29, 2024
@Steve-Mcl self-assigned this Dec 13, 2024
@Steve-Mcl
Contributor

Steve-Mcl commented Dec 13, 2024

Tasks

  • Update FlowFuse to present and persist bootloop settings
  • Update launcher to accept and use bootloop settings
  • Update docker driver to get settings for project and pass to launcher
  • Update k8s driver to get settings for project and pass to launcher
  • Update local-fs driver to get settings for project and pass to launcher

@hardillb does this look about right (or can you think of a means of getting settings from platform to instance without having to alter all the drivers?)

@Steve-Mcl moved this from Todo to In Design in 🛠 Development Dec 13, 2024
@hardillb
Contributor

@Steve-Mcl shouldn't need any changes to the drivers, all this should be handled by the nr-launcher and contained in the existing settings bundle

@Steve-Mcl
Contributor

Hmmm, but then how do we get new options (as yet undefined in forge) across to the launcher? (Sorry, I am clearly rusty in this area; pointers appreciated.)

@hardillb
Contributor

Instance/id/settings api

@Steve-Mcl
Contributor

Steve-Mcl commented Dec 14, 2024

Thanks Ben you helped me find the code I was looking for - it was 10 lines below where I already added the new variables! DOH!

### Tasks
- [ ]  Update FlowFuse to present and persist bootloop settings
- [ ]  Update launcher to accept and use bootloop settings (default to existing constant)
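The second task, falling back to the existing constant when no setting arrives from the platform, might look roughly like the sketch below. The default values and the names `DEFAULT_BOOTLOOP` and `resolveBootLoopSettings` are placeholders invented for illustration; the launcher's real constants are not given in this thread.

```typescript
// Placeholder defaults standing in for the launcher's existing constants.
// The real values are not stated in this issue.
const DEFAULT_BOOTLOOP = { maxRestarts: 5, windowMs: 30000, safeModeEnabled: true };
type BootLoopSettings = typeof DEFAULT_BOOTLOOP;

/** Merges settings received from the platform over the built-in defaults. */
function resolveBootLoopSettings(
  fromPlatform?: Partial<BootLoopSettings>
): BootLoopSettings {
  const s: Partial<BootLoopSettings> = fromPlatform ?? {};
  // Accept only positive, finite numbers; otherwise keep the default.
  const positive = (v: unknown, fallback: number): number =>
    typeof v === "number" && isFinite(v) && v > 0 ? v : fallback;
  return {
    maxRestarts: positive(s.maxRestarts, DEFAULT_BOOTLOOP.maxRestarts),
    windowMs: positive(s.windowMs, DEFAULT_BOOTLOOP.windowMs),
    safeModeEnabled:
      typeof s.safeModeEnabled === "boolean"
        ? s.safeModeEnabled
        : DEFAULT_BOOTLOOP.safeModeEnabled,
  };
}
```

Validating each field separately means an instance whose settings predate the new options, or carry a bad value, behaves exactly as it does today.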

@joepavitt moved this from In Design to In Progress in 🛠 Development Dec 16, 2024
@Steve-Mcl
Contributor

Steve-Mcl commented Dec 16, 2024

@joepavitt

This issue is scoped as

Which customers would this be available to

Team + Enterprise Tiers (EE)

However, that would make implementation a lot more difficult and TBH, I see no value in only supporting this option in team+enterprise.

Are you happy for me to scope this to "CE/EE"?

@hardillb
Contributor

@Steve-Mcl can we not just hide the option based on a feature flag in the team type? We have a number of these options already

@Steve-Mcl
Contributor

Steve-Mcl commented Dec 16, 2024

> @Steve-Mcl can we not just hide the option based on a feature flag in the team type? We have a number of these options already

Sure, but I don't really see this as a "value add", and doing this just overcomplicates it IMO.

Also, I would need to either ignore the setting in the launcher (based on tier) or inhibit sending it to launcher & defaulting to "off".
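For comparison, the second alternative mentioned above (inhibit sending the setting and default to "off") could be sketched as follows. The flag name `bootLoopSettings`, the `disableSafeMode` field, and both interfaces are hypothetical and do not reflect the real FlowFuse data model.

```typescript
// Hypothetical shapes, for illustration only.
interface TeamType {
  features: Record<string, boolean | undefined>;
}
interface InstanceSettings {
  disableSafeMode?: boolean;
}

/**
 * Builds the settings passed on to the launcher. When the team type lacks
 * the (hypothetical) feature flag, the stored value is ignored and the
 * option defaults to off.
 */
function settingsForLauncher(
  teamType: TeamType,
  stored: InstanceSettings
): InstanceSettings {
  const allowed = teamType.features["bootLoopSettings"] === true;
  return {
    disableSafeMode: allowed ? stored.disableSafeMode === true : false,
  };
}
```

Even in this small form, tier gating adds a second code path that has to be kept in step with the UI, which is the extra complexity being objected to.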

@Steve-Mcl
Contributor

PS, this was not scoped by FF personnel. If I were the person scheduling this work, I would most likely have chosen CE/EE, so I just want clarification from Joe.

@Steve-Mcl
Contributor

Scope changed to CE/EE in agreement with Joe.

@Steve-Mcl moved this from In Progress to Review in 🛠 Development Dec 16, 2024