Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(tari-pulse): await shutdown signal in main loop #6696

Merged

Conversation

MCozhusheck
Copy link
Contributor

Description

Await shutdown signal and reduce error logging when node failed to compare blocks hashes.

Motivation and Context

Error logs were spamming on not supported network, greatly reducing readability and shutdown signal wasn't working propely.

How Has This Been Tested?

Run local node on esmeralda network (which is not supported yet) and watch for error logs. Also check shutdown behavior.

What process can a PR reviewer use to test or verify this change?

Same as above

Breaking Changes

  • None
  • Requires data directory on base node to be deleted
  • Requires hard fork
  • Other - Please specify

Copy link

github-actions bot commented Nov 20, 2024

Test Results (CI)

    3 files    129 suites   35m 43s ⏱️
1 344 tests 1 344 ✅ 0 💤 0 ❌
4 030 runs  4 030 ✅ 0 💤 0 ❌

Results for commit 5bbd47f.

♻️ This comment has been updated with latest results.

Copy link

github-actions bot commented Nov 20, 2024

Test Results (Integration tests)

36 tests  +36   36 ✅ +36   15m 3s ⏱️ + 15m 3s
11 suites +11    0 💤 ± 0 
 2 files   + 2    0 ❌ ± 0 

Results for commit 5bbd47f. ± Comparison against base commit 0610bcc.

♻️ This comment has been updated with latest results.

Copy link
Contributor

@hansieodendaal hansieodendaal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, looking good.

Please change

    // Interval to check if the base node is still in sync with the network
    pub tari_pulse_interval: Duration,

to

    /// Interval to check if the base node is still in sync with the network
    #[serde(with = "serializers::seconds")]
    pub tari_pulse_interval: Duration,

base_layer/core/src/base_node/tari_pulse_service/mod.rs Outdated Show resolved Hide resolved
@MCozhusheck MCozhusheck changed the title fix(tari-pulse: await shutdown signal in main loop fix(tari-pulse): await shutdown signal in main loop Nov 20, 2024
hansieodendaal
hansieodendaal previously approved these changes Nov 21, 2024
Copy link
Contributor

@hansieodendaal hansieodendaal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK

Some nits

base_layer/core/src/base_node/tari_pulse_service/mod.rs Outdated Show resolved Hide resolved
Comment on lines 144 to 146
interval = time::interval(Duration::from_secs(interval_in_secs * 2));
interval.tick().await;
continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto no need to wait again

Suggested change
interval = time::interval(Duration::from_secs(interval_in_secs * 2));
interval.tick().await;
continue;
interval = time::interval(Duration::from_secs(interval_in_secs * 2));
continue;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, we need to .await first for next invoke to actually sleep.

@hansieodendaal
Copy link
Contributor

hansieodendaal commented Nov 21, 2024

Please change

    // Interval to check if the base node is still in sync with the network
    pub tari_pulse_interval: Duration,

to

    /// Interval to check if the base node is still in sync with the network
    #[serde(with = "serializers::seconds")]
    pub tari_pulse_interval: Duration,

This must be changed in common/config/presets/c_base_node_c.toml

hansieodendaal added a commit to hansieodendaal/tari that referenced this pull request Nov 21, 2024
Fixed the base node shutdown on Ubuntu whereas pressing Ctrl-C multiple times rendered the
command line interface in an undefined state. This was not issue on Windows.

Note: The change to 'base_layer\core\src\base_node\tari_pulse_service\mod.rs' is just to
enable testing (get rid of the repeated command-line spam), and will be fixed in tari-project#6696.
@hansieodendaal hansieodendaal mentioned this pull request Nov 21, 2024
4 tasks
base_layer/core/src/base_node/tari_pulse_service/mod.rs Outdated Show resolved Hide resolved
let interval_in_secs = interval.period().as_secs();
if interval_in_secs > (60 * 30) {
warn!(target: LOG_TARGET, "Reached maximum retry interval of 30 minutes. Exiting");
break;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not break here, just leave the interval at max 30 mins

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also after removing this tick().await after assign new interval the loop continues to max value right away.
Screenshot from 2024-11-22 11-59-15

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like after assigning new interval we have to tick().await to make sure that the next iteration of the loop actually awaits

SWvheerden pushed a commit that referenced this pull request Nov 22, 2024
Description
---
Fixed the base node shutdown on Ubuntu (and hopefully on MacOS) whereas
pressing Ctrl-C multiple times rendered the command line interface in an
undefined state. This was not issue on Windows.

**Note:** The change to
'base_layer\core\src\base_node\tari_pulse_service\mod.rs' is an interim
change to enable testing (getting rid of the repeated command-line spam)
but not required for the base node to shut down. More changes in #6696.

Motivation and Context
---
The base node did not shut down properly on Ubuntu and MacOs.

How Has This Been Tested?
---
System-level testing on Windows and Ubuntu (WSL).

What process can a PR reviewer use to test or verify this change?
---
System-level testing - still outstanding on MacOS

<!-- Checklist -->
<!-- 1. Is the title of your PR in the form that would make nice release
notes? The title, excluding the conventional commit
tag, will be included exactly as is in the CHANGELOG, so please think
about it carefully. -->


Breaking Changes
---

- [x] None
- [ ] Requires data directory on base node to be deleted
- [ ] Requires hard fork
- [ ] Other - Please specify

<!-- Does this include a breaking change? If so, include this line as a
footer -->
<!-- BREAKING CHANGE: Description what the user should do, e.g. delete a
database, resync the chain -->
@MCozhusheck MCozhusheck marked this pull request as ready for review November 22, 2024 12:48
@MCozhusheck MCozhusheck requested a review from a team as a code owner November 22, 2024 12:48
Co-authored-by: Hansie Odendaal <39146854+hansieodendaal@users.noreply.github.com>
@SWvheerden SWvheerden merged commit 321e9ba into tari-project:development Nov 26, 2024
12 checks passed
hansieodendaal added a commit to hansieodendaal/tari that referenced this pull request Nov 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants