Persistence - Remove Unowned Keys #568

Open · wants to merge 2 commits into base: unstable
Conversation

@singku commented May 28, 2024

Proposing this PR per the discussion in issue #539.

In cluster mode, after startup loading completes, remove keys that should not be served by this node, based on the cluster's slot assignment.

Also adds stat fields to the server to count the total keys removed and slots skipped during the last loading.
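
For context, here is a minimal sketch of what such a post-load cleanup pass could look like. This is not the PR's actual code: the helpers `slotIsOwnedBySelf()`, `countKeysInOneSlot()`, and `deleteKeysInOneSlot()`, as well as the stat field names, are hypothetical stand-ins for the server's internal APIs.

```c
/* Minimal sketch of a post-load cleanup pass (not the PR's actual code).
 * All helper names below are hypothetical stand-ins for internal APIs. */
#include <stddef.h>

#define CLUSTER_SLOTS 16384

extern int slotIsOwnedBySelf(int slot);      /* hypothetical: 1 if this node serves the slot */
extern size_t countKeysInOneSlot(int slot);  /* hypothetical: keys currently stored in the slot */
extern size_t deleteKeysInOneSlot(int slot); /* hypothetical: deletes the slot's keys, returns count */

/* Counters mirroring the PR's new stat fields (names assumed, not from the PR). */
static size_t stat_removed_unowned_keys = 0;
static size_t stat_skipped_unowned_slots = 0;

/* Runs once after startup loading completes: drop every key that lives in a
 * slot this node does not own under the current cluster slot assignment. */
void removeUnownedKeysAfterLoading(void) {
    stat_removed_unowned_keys = 0;
    stat_skipped_unowned_slots = 0;
    for (int slot = 0; slot < CLUSTER_SLOTS; slot++) {
        if (slotIsOwnedBySelf(slot)) continue;
        if (countKeysInOneSlot(slot) == 0) {
            stat_skipped_unowned_slots++; /* unowned but empty: nothing to delete */
            continue;
        }
        stat_removed_unowned_keys += deleteKeysInOneSlot(slot);
    }
}
```

Iterating per slot keeps the pass proportional to the 16384 slots plus the keys that actually need deleting, rather than requiring a scan of the entire keyspace.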

@singku force-pushed the unstable branch 2 times, most recently from 2eb2028 to eb85e5f on May 28, 2024 at 19:54.

codecov bot commented May 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.59%. Comparing base (aaa7834) to head (3299f99).
Report is 133 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #568      +/-   ##
============================================
+ Coverage     70.35%   70.59%   +0.23%     
============================================
  Files           112      114       +2     
  Lines         61467    61703     +236     
============================================
+ Hits          43248    43562     +314     
+ Misses        18219    18141      -78     
Files with missing lines   Coverage Δ
src/cluster.c              88.26% <100.00%> (-0.09%) ⬇️
src/server.c               88.63% <100.00%> (+0.07%) ⬆️
src/server.h               100.00% <ø> (ø)

... and 58 files with indirect coverage changes

@madolson (Member) left a comment


We should also implement a test for this. I know it will probably be a bit tricky. I'm thinking of:

  1. Create an RDB file with data in all slots.
  2. Start a node from that RDB with a nodes.conf that doesn't own all the slots.

@singku (Author) commented May 29, 2024

> We should also implement a test for this. I know it will probably be a bit tricky. I'm thinking of:
>
>   1. Create an RDB file with data in all slots.
>   2. Start a node from that RDB with a nodes.conf that doesn't own all the slots.

If we move the deletion after verifyClusterConfigWithData, we can't simply do this test, because unassigned slots with keys will be reclaimed by the node. We can only test it in cluster mode with multiple shards, while also performing slot migration in the test.

@singku (Author) commented Jun 4, 2024

@madolson gentle ping for additional input.

@singku (Author) commented Jul 30, 2024

@hpatro Thanks for reviewing this PR. I just rebased it to resolve conflicts and applied changes per your comments; please take a look.

@hpatro (Contributor) commented Aug 9, 2024

Mostly LGTM, thanks @singku!

For cluster, after startup loading, remove keys
that shouldn't be served by this server based on slot
assignment of a cluster.

Signed-off-by: Liang Tang <tangliang@google.com>
@hpatro (Contributor) commented Aug 13, 2024

@madolson Could you take a pass? LGTM.

@singku (Author) commented Sep 17, 2024

> @madolson Could you take a pass? LGTM.

Gentle ping...

Co-authored-by: Harkrishn Patro <bunty.hari@gmail.com>
Signed-off-by: Ping Xie <pingxie@outlook.com>
@PingXie (Member) commented Sep 29, 2024

For a more generic solution, I think we need to shed unowned keys during loading. The current fix drops the unowned keys after loading completes, but the node might have already OOM'd before reaching that point.

> We should also implement a test for this.

+1

> If we move the deletion after verifyClusterConfigWithData, we can't simply do this test, because unassigned slots with keys will be reclaimed by the node. We can only test it in cluster mode with multiple shards, while also performing slot migration in the test.

This change does need to be tested in cluster mode. Have you tried out @madolson's recommendation?
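
To make the shed-during-loading idea concrete, here is a rough sketch of the per-key decision that could run inside the RDB load loop. `keyHashSlot()` is the real CRC16-based slot mapping from src/cluster.c; `slotIsOwnedBySelf()`, the stat counter, and the simplified loop shape (including the `dbAddRDBLoad()` call as shown) are assumptions, not the actual rdb.c code.

```c
/* Rough sketch: discard unowned keys while the RDB is still being loaded,
 * so they never occupy memory (not the actual rdb.c code). */
extern unsigned int keyHashSlot(char *key, int keylen); /* real CRC16 slot mapping (src/cluster.c) */
extern int slotIsOwnedBySelf(int slot);                 /* hypothetical ownership check */
extern size_t stat_removed_unowned_keys;                /* hypothetical counter from the earlier sketch */

/* Inside the per-key RDB loading loop, after key (sds) and val (robj *)
 * have been deserialized: */
if (server.cluster_enabled &&
    !slotIsOwnedBySelf((int)keyHashSlot(key, (int)sdslen(key)))) {
    /* Key hashes to a slot this node does not serve: free it immediately
     * instead of inserting it into the keyspace. */
    sdsfree(key);
    decrRefCount(val);
    stat_removed_unowned_keys++;
} else {
    dbAddRDBLoad(db, key, val);  /* normal path: keep the key */
}
```

Compared with the post-load pass, this checks ownership once per key rather than once per slot, but it bounds peak memory during loading to the keys the node actually owns, which is what addresses the OOM concern.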
