Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: supports the master-slave mode #342

Conversation

panlei-coder
Copy link
Collaborator

@panlei-coder panlei-coder commented Jun 5, 2024

issue: #211
需要先合并 pikiwidb/braft#6 这个 pr。

由于该 pr 需要修改 braft 部分的源码,所以在 braft.cmake 中先改成了我自己的仓库,等后续合并了 braft 对应的 pr 之后再改回来。

主从模式的实现大体的逻辑和 raft 集群的逻辑一致,具体的命令如下:
主从
master:master init
slave:slaveof ip port (learner) (如果检测到自己是followr,则先移除集群,再以learner的方式加入)
slaveof no one 解除主从关系

braft
leader:raft.cluster init
follower:raft.cluster join ip:port (如果检测到自己是learner,则先解除主从关系,再重新加入集群)
raft.cluster remove peer_id

Summary by CodeRabbit

Summary by CodeRabbit

  • 新功能

    • 引入用于管理Raft集群内主从关系的命令。
    • 添加IP地址和端口的验证函数。
  • 改进

    • 增强Raft节点信息显示,包含领导者和学习者节点。
    • 更新集群操作,支持指定角色如跟随者或学习者。
  • 重构

    • 在测试中提取获取peer ID和成员状态的逻辑到一个单独的函数,以提高可读性和可重用性。
  • 错误修复

    • 解决与Raft集群操作中处理学习者相关的问题。

@github-actions github-actions bot added the ✏️ Feature New feature or request label Jun 5, 2024
Copy link

coderabbitai bot commented Jun 5, 2024

Warning

Rate limit exceeded

@panlei-coder has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 15 minutes and 28 seconds before requesting another review.

How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

Commits

Files that changed from the base of the PR and between 54897e8 and b5cbcae.

Walkthrough

最近的更改主要集中在增强 Raft 集群管理功能。关键更新包括修改 CMake 中的构建配置、更新 braft 依赖项、添加 Raft 集群操作的新命令,并完善对 Raft 节点(包括学习节点)的处理逻辑。此外,引入了用于验证 IP 和端口的新函数,并改善了测试套件的可读性和可重用性。

Changes

Files Change Summary
CMakeLists.txt ADDRESS_SANITIZER 的默认设置从 ON 更改为 OFF
cmake/braft.cmake 更新 braft 依赖项的 GIT_REPOSITORYGIT_TAG
src/base_cmd.h 添加新的命令名称 kCmdNameMasterkCmdNameSlaveof,删除了不必要的分号。
src/cmd_admin.cc 更新获取和显示 Raft 节点信息的逻辑。
src/cmd_raft.cc 重构并添加集群操作函数,引入了主从角色的新命令。
src/cmd_raft.h 引入新的类 MasterCmdSlaveofCmd 来处理 Raft 集群命令。
src/cmd_table_manager.cc 向命令表中添加了 MasterSlaveof 命令。
src/praft/praft.cc 添加并修改函数以处理 Raft 集群操作中的学习节点。
src/praft/praft.h 添加新参数和方法以支持 Raft 集群操作中的学习节点。
src/pstd/pstd_string.cc, src/pstd/pstd_string.h 引入验证 IP 地址和端口的函数。
tests/consistency_test.go 重构测试函数,通过提取逻辑到新函数中提高可读性和可重用性。

Sequence Diagram(s)

sequenceDiagram
    participant Admin as Admin
    participant InfoCmd as InfoCmd
    participant RaftNode as RaftNode
    participant RaftCluster as RaftCluster
    participant PRaft as PRaft

    Admin->>InfoCmd: 请求 Raft 信息
    InfoCmd->>PRaft: 获取领导者 ID
    PRaft-->>InfoCmd: 返回领导者 ID
    InfoCmd->>PRaft: 获取节点信息
    PRaft-->>InfoCmd: 返回节点信息
    InfoCmd-->>Admin: 显示 Raft 信息

    Admin->>RaftCluster: 添加节点(学习者)
    RaftCluster->>PRaft: 添加 Peer(学习者)
    PRaft-->>RaftCluster: 确认添加
    RaftCluster-->>Admin: 节点已添加

    Admin->>RaftCluster: 移除节点(学习者)
    RaftCluster->>PRaft: 移除 Peer(学习者)
    PRaft-->>RaftCluster: 确认移除
    RaftCluster-->>Admin: 节点已移除
Loading

Poem

在代码的天地里,兔子们嬉戏,
Raft 集群今天变得更加强大。
学习者与领导者,新命令齐飞,
节点和谐共舞,生机勃勃。
IP 与端口,检查更周全,
错误与漏洞,我们不放过。
欢庆吧,程序员,系统光明,
在字节的巢穴里,一切安好。 🐇✨


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Outside diff range and nitpick comments (4)
src/cmd_raft.cc (2)

Line range hint 154-198: Review the cluster joining logic for error handling and message clarity.

The ClusterJoin function could improve its error messages for better clarity and debugging. Specifically, the message for an already initialized node could be more descriptive.

- "A node that has been added to a cluster must be removed from the old cluster before it can be added to the new cluster"
+ "Node initialization error: Please remove the node from its current cluster before adding it to a new one."

Line range hint 237-253: Ensure proper validation of IP and port in the DoCmdJoin function.

The function DoCmdJoin uses the GetIpAndPortFromEndPoint utility to parse the IP and port but does not handle the case where the IP or port might be invalid. Consider adding error handling for this scenario.

+ if (!ip_port.has_value()) {
+   return client->SetRes(CmdRes::kErrOther, "Invalid IP or port provided.");
+ }
src/praft/praft.cc (2)

Line range hint 37-48: Ensure proper synchronization and error handling in ClusterCmdContext::Set.

The method ClusterCmdContext::Set modifies several member variables under a mutex lock, which is good for thread safety. However, there's no error handling if client_ is not nullptr when this method is called, which could lead to silent logical errors if the method is misused. Consider adding an error log or exception in this case.


291-294: Clarify the role of the learner flag in PRaft::SendNodeAddRequest.

The method PRaft::SendNodeAddRequest appends "learner" to the request if is_learner is true. It would be beneficial for maintainability to add a comment explaining why this distinction is necessary, especially for future developers or for debugging.

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between eec3030 and 021a199.

Files selected for processing (12)
  • CMakeLists.txt (1 hunks)
  • cmake/braft.cmake (1 hunks)
  • src/base_cmd.h (2 hunks)
  • src/cmd_admin.cc (1 hunks)
  • src/cmd_raft.cc (7 hunks)
  • src/cmd_raft.h (1 hunks)
  • src/cmd_table_manager.cc (1 hunks)
  • src/praft/praft.cc (9 hunks)
  • src/praft/praft.h (4 hunks)
  • src/pstd/pstd_string.cc (2 hunks)
  • src/pstd/pstd_string.h (2 hunks)
  • tests/consistency_test.go (1 hunks)
Files skipped from review due to trivial changes (1)
  • CMakeLists.txt
Additional context used
golangci-lint
tests/consistency_test.go

27-27: undefined: redis (typecheck)


28-28: undefined: redis (typecheck)


31-31: undefined: Describe (typecheck)


819-819: undefined: redis (typecheck)


37-37: undefined: BeforeAll (typecheck)


44-44: undefined: Expect (typecheck)


49-49: undefined: Expect (typecheck)


50-50: undefined: Expect (typecheck)


80-80: undefined: AfterAll (typecheck)


90-90: undefined: BeforeEach (typecheck)


105-105: undefined: AfterEach (typecheck)


117-117: undefined: It (typecheck)


156-156: undefined: It (typecheck)


189-189: undefined: It (typecheck)

Additional comments not posted (11)
src/cmd_raft.h (1)

83-107: Addition of MasterCmd and SlaveofCmd classes to handle master-slave commands.

Consider adding unit tests to ensure the functionality of these new commands.

src/pstd/pstd_string.h (1)

101-103: Addition of IsValidIP and IsValidPort functions to validate IP addresses and ports.

Consider adding unit tests to ensure the robustness of these new utility functions.

src/praft/praft.h (1)

Line range hint 54-142: Modifications and additions to handle learners in the Raft cluster.

Ensure thorough testing, especially integration tests, to verify the correct handling of learners in various cluster scenarios.

src/cmd_table_manager.cc (1)

73-74: Addition of Master and Slaveof commands to the command table with specific priorities.

Ensure testing to verify that these commands are correctly integrated and functional within the system.

src/cmd_admin.cc (1)

198-226: Updates to InfoCmd::InfoRaft function to improve the display of Raft node information.

Consider adding comments or documentation to explain these changes for the benefit of future maintainers.

src/cmd_raft.cc (2)

87-99: Validate the number of arguments for the RAFT.NODE ADD command.

The validation logic for the number of arguments in the DoCmdAdd function is correctly implemented, ensuring that either 4 or 5 arguments are provided. This is crucial for maintaining the command's integrity.


118-136: Check the role type handling in the RAFT.NODE REMOVE command.

The handling of the role type (learner or follower) in the DoCmdRemove function appears correct, but ensure that this aligns with the expected behavior in all use cases.

Verification successful

The implementation of the DoCmdRemove function in src/cmd_raft.cc correctly handles the role type (learner or follower) for the RAFT.NODE REMOVE command. The function checks the number of arguments and sets the is_learner flag based on the fourth argument if provided.

  • src/cmd_raft.cc lines 118-136: The role type handling in the DoCmdRemove function aligns with the expected behavior.
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the role type handling in RAFT.NODE REMOVE command.

# Test: Search for the function usage. Expect: Correct handling of role types.
rg --type python $'DoCmdRemove'

Length of output: 61



Script:

#!/bin/bash
# Description: Verify the role type handling in RAFT.NODE REMOVE command.

# Search for the definition of the DoCmdRemove function in C++ files.
rg --type cpp 'DoCmdRemove'

# Search for the usage of the DoCmdRemove function in C++ files.
rg --type cpp 'DoCmdRemove('

Length of output: 448



Script:

#!/bin/bash
# Description: Extract the implementation of the DoCmdRemove function from src/cmd_raft.cc.

# Extract the implementation of the DoCmdRemove function.
ast-grep --lang cpp --pattern $'void DoCmdRemove(PClient* client) {
  $$$
}' src/cmd_raft.cc

Length of output: 93



Script:

#!/bin/bash
# Description: Extract the implementation of the DoCmdRemove function from src/cmd_raft.cc.

# Search for the implementation of the DoCmdRemove function.
rg --type cpp --context 20 'void DoCmdRemove(PClient* client)' src/cmd_raft.cc

Length of output: 80



Script:

#!/bin/bash
# Description: Extract the implementation of the DoCmdRemove function from src/cmd_raft.cc.

# Search for the implementation of the DoCmdRemove function with increased context lines.
rg --type cpp --context 40 'DoCmdRemove' src/cmd_raft.cc

Length of output: 6018

src/base_cmd.h (1)

45-46: Addition of new command constants for master-slave mode.

The constants kCmdNameMaster and kCmdNameSlaveof have been correctly added to support the new master-slave mode functionality. This is a necessary update for the new feature.

src/pstd/pstd_string.cc (1)

662-674: Validate IP address and port number utility functions.

The functions IsValidIP and IsValidPort are correctly implemented to validate IP addresses and port numbers, respectively. These functions are crucial for ensuring the validity of network addresses in the application.

src/praft/praft.cc (2)

306-309: Ensure consistency in handling the learner flag in PRaft::SendNodeRemoveRequest.

The method PRaft::SendNodeRemoveRequest also appends "learner" based on the is_learner flag. Verify that this behavior is consistent and correctly documented across all methods that use this flag.


161-161: Validate the use of is_learner flag in PRaft::Init.

The is_learner flag is set based on the ClusterCmdContext. Ensure that this flag is appropriately used throughout the system and that its value is consistent with the expected behavior in all contexts where it's accessed.

Verification successful

To ensure the is_learner flag is used consistently and correctly across the system, we need to verify its initialization and usage in the relevant methods and contexts. Let's start by examining the initialization and setting of the is_learner flag in praft.cc and its usage in methods like AddPeer and RemovePeer.


To ensure the is_learner flag is used consistently and correctly across the system, we need to re-run the verification scripts with adjusted patterns to ensure they match the code correctly.


The previous attempts to verify the usage of the is_learner flag did not produce any output, which suggests that the patterns used might not have matched the code correctly. To ensure a thorough verification, we need to adjust our approach and re-run the verification scripts with more precise patterns.


The verification of the is_learner flag usage across the system has been successful. The flag is initialized and set correctly in praft.cc, and its usage in methods like AddPeer and RemovePeer is consistent with the expected behavior. Additionally, the flag is appropriately used in the ClusterRemove and ClusterJoin functions in cmd_raft.cc.

  • is_learner is initialized in src/praft/praft.h:

    bool is_learner_ = false;
  • Usage in AddPeer method in src/praft/praft.cc:

    butil::Status PRaft::AddPeer(const std::string& peer, bool is_learner) {
      if (!node_) {
        ERROR_LOG_AND_STATUS("Node is not initialized");
      }
      braft::SynchronizedClosure done;
      if (is_learner) {
        node_->add_learner(peer, &done);
      } else {
        node_->add_peer(peer, &done);
      }
    }
  • Usage in RemovePeer method in src/praft/praft.cc:

    butil::Status PRaft::RemovePeer(const std::string& peer, bool is_learner) {
      if (!node_) {
        return ERROR_LOG_AND_STATUS("Node is not initialized");
      }
      braft::SynchronizedClosure done;
      if (is_learner) {
        node_->remove_learner(peer, &done);
      } else {
        node_->remove_peer(peer, &done);
      }
    }
  • Usage in ClusterRemove function in src/cmd_raft.cc:

    static void ClusterRemove(PClient* client, bool is_learner) {
      // Function implementation
    }
  • Usage in ClusterJoin function in src/cmd_raft.cc:

    static void ClusterJoin(PClient* client, std::string&& ip, int port, bool is_learner) {
      // Function implementation
    }

The is_learner flag is used consistently and correctly across the system.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the usage of `is_learner` flag across the system.

# Test: Search for the usage of `is_learner`. Expect: Consistent and correct usage across the system.
rg --type cpp 'is_learner'

Length of output: 1719



Script:

#!/bin/bash
# Verify the initialization and setting of `is_learner` in `praft.cc`
ast-grep --lang cpp --pattern $'bool is_learner_ = $_;' src/praft/praft.cc

# Verify the usage of `is_learner` in `AddPeer` and `RemovePeer` methods in `praft.cc`
ast-grep --lang cpp --pattern $'butil::Status PRaft::AddPeer(const std::string& peer, bool is_learner) {
  $$$
}' src/praft/praft.cc

ast-grep --lang cpp --pattern $'butil::Status PRaft::RemovePeer(const std::string& peer, bool is_learner) {
  $$$
}' src/praft/praft.cc

# Verify the usage of `is_learner` in `cmd_raft.cc`
ast-grep --lang cpp --pattern $'static void ClusterRemove(PClient* client, bool is_learner) {
  $$$
}' src/cmd_raft.cc

ast-grep --lang cpp --pattern $'static void ClusterJoin(PClient* client, std::string&& ip, int port, bool is_learner) {
  $$$
}' src/cmd_raft.cc

Length of output: 612



Script:

#!/bin/bash
# Verify the initialization and setting of `is_learner` in `praft.cc`
ast-grep --lang cpp --pattern 'bool is_learner_ = $_;' src/praft/praft.cc

# Verify the usage of `is_learner` in `AddPeer` and `RemovePeer` methods in `praft.cc`
ast-grep --lang cpp --pattern 'butil::Status PRaft::AddPeer(const std::string& peer, bool is_learner) {
  $$$
}' src/praft/praft.cc

ast-grep --lang cpp --pattern 'butil::Status PRaft::RemovePeer(const std::string& peer, bool is_learner) {
  $$$
}' src/praft/praft.cc

# Verify the usage of `is_learner` in `cmd_raft.cc`
ast-grep --lang cpp --pattern 'static void ClusterRemove(PClient* client, bool is_learner) {
  $$$
}' src/cmd_raft.cc

ast-grep --lang cpp --pattern 'static void ClusterJoin(PClient* client, std::string&& ip, int port, bool is_learner) {
  $$$
}' src/cmd_raft.cc

Length of output: 612



Script:

#!/bin/bash
# Verify the initialization and setting of `is_learner` in `praft.cc`
rg --type cpp 'bool is_learner_ ='

# Verify the usage of `is_learner` in `AddPeer` and `RemovePeer` methods in `praft.cc`
rg --type cpp 'butil::Status PRaft::AddPeer' -A 10
rg --type cpp 'butil::Status PRaft::RemovePeer' -A 10

# Verify the usage of `is_learner` in `cmd_raft.cc`
rg --type cpp 'static void ClusterRemove' -A 10
rg --type cpp 'static void ClusterJoin' -A 10

Length of output: 2722

Comment on lines +819 to +839
func getPeerID(client *redis.Client, ctx context.Context) (string, bool) {
info, err := client.Do(ctx, "info", "raft").Result()
Expect(err).NotTo(HaveOccurred())
info_str := info.(string)
scanner := bufio.NewScanner(strings.NewReader(info_str))
var peer_id string
var is_member bool
for scanner.Scan() {
line := scanner.Text()
if strings.Contains(line, "raft_peer_id") {
parts := strings.Split(line, ":")
if len(parts) >= 2 {
peer_id = parts[1]
is_member = true
break
}
}
}

return peer_id, is_member
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure the redis package is correctly imported to resolve the 'undefined: redis' error.

+ import "github.com/redis/go-redis/v9"

Committable suggestion was skipped due to low confidence.

Tools
golangci-lint

819-819: undefined: redis (typecheck)

Comment on lines +792 to +816
// raft
for _, follower := range followers {
info, err := follower.Do(ctx, "info", "raft").Result()
Expect(err).NotTo(HaveOccurred())
info_str := info.(string)
scanner := bufio.NewScanner(strings.NewReader(info_str))
var peer_id string
var is_member bool
for scanner.Scan() {
line := scanner.Text()
if strings.Contains(line, "raft_peer_id") {
parts := strings.Split(line, ":")
if len(parts) >= 2 {
peer_id = parts[1]
is_member = true
break
}
}
}
peer_id, is_member := getPeerID(follower, ctx)

if is_member {
ret, err := follower.Do(ctx, "raft.node", "remove", peer_id).Result()
Expect(err).NotTo(HaveOccurred())
Expect(ret).To(Equal(OK))
}
}
})

for _, f := range followers {
ret, err := f.Do(ctx, "slaveof", "127.0.0.1", "12111").Result()
Expect(err).NotTo(HaveOccurred())
Expect(ret).To(Equal(OK))
}

time.Sleep(10 * time.Second)

for _, f := range followers {
ret, err := f.Do(ctx, "slaveof", "no", "one").Result()
Expect(err).NotTo(HaveOccurred())
Expect(ret).To(Equal(OK))
}
})
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure the It function from the Ginkgo testing framework is correctly imported to resolve the 'undefined: It' error.

+ import . "github.com/onsi/ginkgo/v2"
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
// raft
for _, follower := range followers {
info, err := follower.Do(ctx, "info", "raft").Result()
Expect(err).NotTo(HaveOccurred())
info_str := info.(string)
scanner := bufio.NewScanner(strings.NewReader(info_str))
var peer_id string
var is_member bool
for scanner.Scan() {
line := scanner.Text()
if strings.Contains(line, "raft_peer_id") {
parts := strings.Split(line, ":")
if len(parts) >= 2 {
peer_id = parts[1]
is_member = true
break
}
}
}
peer_id, is_member := getPeerID(follower, ctx)
if is_member {
ret, err := follower.Do(ctx, "raft.node", "remove", peer_id).Result()
Expect(err).NotTo(HaveOccurred())
Expect(ret).To(Equal(OK))
}
}
})
for _, f := range followers {
ret, err := f.Do(ctx, "slaveof", "127.0.0.1", "12111").Result()
Expect(err).NotTo(HaveOccurred())
Expect(ret).To(Equal(OK))
}
time.Sleep(10 * time.Second)
for _, f := range followers {
ret, err := f.Do(ctx, "slaveof", "no", "one").Result()
Expect(err).NotTo(HaveOccurred())
Expect(ret).To(Equal(OK))
}
})
+ import . "github.com/onsi/ginkgo/v2"
// raft
for _, follower := range followers {
peer_id, is_member := getPeerID(follower, ctx)
if is_member {
ret, err := follower.Do(ctx, "raft.node", "remove", peer_id).Result()
Expect(err).NotTo(HaveOccurred())
Expect(ret).To(Equal(OK))
}
}
for _, f := range followers {
ret, err := f.Do(ctx, "slaveof", "127.0.0.1", "12111").Result()
Expect(err).NotTo(HaveOccurred())
Expect(ret).To(Equal(OK))
}
time.Sleep(10 * time.Second)
for _, f := range followers {
ret, err := f.Do(ctx, "slaveof", "no", "one").Result()
Expect(err).NotTo(HaveOccurred())
Expect(ret).To(Equal(OK))
}
})

src/praft/praft.cc Outdated Show resolved Hide resolved
src/praft/praft.cc Show resolved Hide resolved
src/praft/praft.cc Show resolved Hide resolved
Comment on lines 19 to 20
GIT_REPOSITORY https://github.com/panlei-coder/braft.git # "https://github.com/pikiwidb/braft.git"
GIT_TAG fix_learner_bug # master
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tip

Codebase Verification

The temporary changes to GIT_REPOSITORY and GIT_TAG are still present in the cmake/braft.cmake file.

  • GIT_REPOSITORY https://github.com/panlei-coder/braft.git
  • GIT_TAG fix_learner_bug

Ensure to revert these changes once the permanent fix is merged into the main braft repository.

Analysis chain

Temporary change to GIT_REPOSITORY and GIT_TAG to address a specific issue.

Ensure to revert these changes once the permanent fix is merged into the main braft repository.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Check if the temporary changes to GIT_REPOSITORY and GIT_TAG have been reverted.

# Test: Search for the temporary repository and tag in the cmake file. Expect: No occurrences after the fix is merged.
rg --type cmake $'GIT_REPOSITORY https://github.com/panlei-coder/braft.git|GIT_TAG fix_learner_bug' cmake/braft.cmake

Length of output: 266

Comment on lines +26 to +50
static void ClusterRemove(PClient* client, bool is_learner) {
// Get the leader information
braft::PeerId leader_peer_id(PRAFT.GetLeaderID());
// @todo There will be an unreasonable address, need to consider how to deal with it
if (leader_peer_id.is_empty()) {
client->SetRes(CmdRes::kErrOther,
"The leader address of the cluster is incorrect, try again or delete the node from another node");
return;
}

// Connect target
std::string peer_ip = butil::ip2str(leader_peer_id.addr.ip).c_str();
auto port = leader_peer_id.addr.port - pikiwidb::g_config.raft_port_offset;
auto peer_id = client->argv_[2];
auto ret = PRAFT.GetClusterCmdCtx().Set(ClusterCmdType::kRemove, client, std::move(peer_ip), port, is_learner,
std::move(peer_id));
if (!ret) { // other clients have removed
return client->SetRes(CmdRes::kErrOther, "Other clients have removed");
}
PRAFT.GetClusterCmdCtx().ConnectTargetNode();
INFO("Sent remove request to leader successfully");

// Not reply any message here, we will reply after the connection is established.
client->Clear();
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure proper error handling when the leader's address is incorrect.

The function ClusterRemove attempts to handle an incorrect leader address by setting an error response, but it does not log this significant event. Consider adding a log statement before returning to aid in debugging.

+ INFO("Incorrect leader address detected.");
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
static void ClusterRemove(PClient* client, bool is_learner) {
// Get the leader information
braft::PeerId leader_peer_id(PRAFT.GetLeaderID());
// @todo There will be an unreasonable address, need to consider how to deal with it
if (leader_peer_id.is_empty()) {
client->SetRes(CmdRes::kErrOther,
"The leader address of the cluster is incorrect, try again or delete the node from another node");
return;
}
// Connect target
std::string peer_ip = butil::ip2str(leader_peer_id.addr.ip).c_str();
auto port = leader_peer_id.addr.port - pikiwidb::g_config.raft_port_offset;
auto peer_id = client->argv_[2];
auto ret = PRAFT.GetClusterCmdCtx().Set(ClusterCmdType::kRemove, client, std::move(peer_ip), port, is_learner,
std::move(peer_id));
if (!ret) { // other clients have removed
return client->SetRes(CmdRes::kErrOther, "Other clients have removed");
}
PRAFT.GetClusterCmdCtx().ConnectTargetNode();
INFO("Sent remove request to leader successfully");
// Not reply any message here, we will reply after the connection is established.
client->Clear();
}
static void ClusterRemove(PClient* client, bool is_learner) {
// Get the leader information
braft::PeerId leader_peer_id(PRAFT.GetLeaderID());
// @todo There will be an unreasonable address, need to consider how to deal with it
if (leader_peer_id.is_empty()) {
INFO("Incorrect leader address detected.");
client->SetRes(CmdRes::kErrOther,
"The leader address of the cluster is incorrect, try again or delete the node from another node");
return;
}
// Connect target
std::string peer_ip = butil::ip2str(leader_peer_id.addr.ip).c_str();
auto port = leader_peer_id.addr.port - pikiwidb::g_config.raft_port_offset;
auto peer_id = client->argv_[2];
auto ret = PRAFT.GetClusterCmdCtx().Set(ClusterCmdType::kRemove, client, std::move(peer_ip), port, is_learner,
std::move(peer_id));
if (!ret) { // other clients have removed
return client->SetRes(CmdRes::kErrOther, "Other clients have removed");
}
PRAFT.GetClusterCmdCtx().ConnectTargetNode();
INFO("Sent remove request to leader successfully");
// Not reply any message here, we will reply after the connection is established.
client->Clear();
}

CMakeLists.txt Outdated
@@ -33,7 +33,7 @@ IF(CMAKE_BUILD_TYPE STREQUAL "Release")
MESSAGE("Building in Release mode")
ELSE()
MESSAGE("Building in Debug mode")
OPTION(ADDRESS_SANITIZER "Enable AddressSanitizer (default sanitizer)" ON)
OPTION(ADDRESS_SANITIZER "Enable AddressSanitizer (default sanitizer)" OFF)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

重新 OPEN , ON

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Outside diff range, codebase verification and nitpick comments (1)
src/cmd_raft.cc (1)

26-50: 改进 ClusterRemove 函数

新增的 ClusterRemove 函数封装了节点移除逻辑,改进了代码组织和可重用性。然而,它缺少对重要事件的日志记录。

建议在处理错误和成功发送请求时添加日志记录,以便调试和监控。

+  INFO("Attempting to remove node from cluster.");
...
+  INFO("Incorrect leader address detected.");
...
+  INFO("Sent remove request to leader successfully.");
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 021a199 and 54897e8.

Files selected for processing (5)
  • src/cmd_admin.cc (1 hunks)
  • src/cmd_raft.cc (7 hunks)
  • src/praft/praft.cc (9 hunks)
  • src/praft/praft.h (4 hunks)
  • tests/consistency_test.go (1 hunks)
Additional context used
Learnings (1)
src/cmd_admin.cc (1)
Learnt from: hahahashen
PR: OpenAtomFoundation/pikiwidb#338
File: src/cmd_admin.cc:248-289
Timestamp: 2024-07-10T14:45:11.380Z
Learning: The error messages in `InfoCmd::InfoRaft` were updated to use more standard English for improved readability and professionalism.
Additional comments not posted (26)
src/praft/praft.h (10)

55-55: 新增参数 is_learner

Set 方法新增了 bool is_learner 参数,这使得可以区分学习者和非学习者节点。这一改动与支持主从模式的 PR 目标一致。


63-63: 添加 const 限定符

GetClusterCmdType 方法添加了 const 限定符,确保该方法不会修改对象的状态,提高了代码的安全性和清晰度。


64-64: 添加 const 限定符

GetClient 方法添加了 const 限定符,确保该方法不会修改对象的状态,提高了代码的安全性和清晰度。


65-65: 添加 const 限定符

GetPeerIp 方法添加了 const 限定符,确保该方法不会修改对象的状态,提高了代码的安全性和清晰度。


66-66: 添加 const 限定符

GetPort 方法添加了 const 限定符,确保该方法不会修改对象的状态,提高了代码的安全性和清晰度。


67-67: 添加 const 限定符

GetPeerID 方法添加了 const 限定符,确保该方法不会修改对象的状态,提高了代码的安全性和清晰度。


68-68: 新增方法 IsLearner

新增的 IsLearner 方法提供了对 is_learner_ 成员变量的访问,这对于新功能是必要的。


108-108: 新增参数 is_learner

AddPeer 方法新增了 bool is_learner 参数,这使得可以区分学习者和非学习者节点。这一改动与支持主从模式的 PR 目标一致。


109-109: 新增参数 is_learner

RemovePeer 方法新增了 bool is_learner 参数,这使得可以区分学习者和非学习者节点。这一改动与支持主从模式的 PR 目标一致。


143-143: 新增方法 GetListLearners

新增的 GetListLearners 方法增强了功能,允许检索学习者节点,这是新主从模式所必需的。

src/cmd_admin.cc (1)

197-225: 改进 InfoRaft 方法

InfoRaft 方法更新了获取 leader ID 的方式,使用了 PRAFT.GetLeaderID(),并新增了报告普通节点和学习者节点的逻辑。这些改动提高了 Raft 状态报告的清晰度和完整性。

src/cmd_raft.cc (7)

87-103: 改进 DoCmdAdd 方法

DoCmdAdd 方法更新后接受一个可选的 is_learner 参数,使其在添加节点时能够区分学习者和跟随者角色。这些改动提高了命令的灵活性。


118-138: 改进 DoCmdRemove 方法

DoCmdRemove 方法更新后在当前节点不是 leader 时使用新的 ClusterRemove 函数,并接受一个可选的 is_learner 参数。这些改动简化了移除过程,并集中处理节点移除请求的逻辑。


Line range hint 156-181: 新增 ClusterInit 函数

新增的 ClusterInit 函数封装了集群初始化逻辑,改进了代码组织。


182-200: 新增 ClusterJoin 函数

新增的 ClusterJoin 函数封装了加入集群的逻辑,改进了代码组织。


Line range hint 239-255: 改进 DoCmdJoin 方法

DoCmdJoin 方法更新后调用新的 ClusterJoin 函数,简化了命令结构。


302-304: 改进 DoCmdJoin 方法

DoCmdJoin 方法更新后调用新的 ClusterJoin 函数,并将 is_learner 设置为 true,确保 SlaveofCmd 类在加入集群时正确处理学习者节点。


306-324: 改进 DoCmdRemove 方法

DoCmdRemove 方法更新后调用新的 ClusterRemove 函数,并将 is_learner 设置为 true,确保 SlaveofCmd 类在从集群中移除时正确处理学习者节点。

src/praft/praft.cc (6)

Line range hint 37-48:
更改已批准:添加了 is_learner 参数。

此更改正确地将新参数 is_learner 分配给类成员 is_learner_,与方法的其余部分一致。


161-161: 更改已批准:在 PRaft::Init 方法中设置 is_learner

此更改正确地将学习者状态集成到节点初始化过程中。


310-313: 更改已批准:在 PRaft::SendNodeAddRequest 方法中处理学习者节点。

此更改通过在请求中添加 "learner" 关键字来正确处理学习者节点的情况。


325-328: 更改已批准:在 PRaft::SendNodeRemoveRequest 方法中处理学习者节点。

此更改通过在请求中添加 "learner" 关键字来正确处理学习者节点的情况。


522-532: 更改已批准:在 PRaft::AddPeer 方法中处理学习者节点。

此更改通过在 is_learner 为 true 时调用 add_learner 方法来正确处理学习者节点的添加。


543-553: 更改已批准:在 PRaft::RemovePeer 方法中处理学习者节点。

此更改通过在 is_learner 为 true 时调用 remove_learner 方法来正确处理学习者节点的移除。

tests/consistency_test.go (2)

792-816: 更改已批准:在 ThreeNodesClusterConstructionTest 测试用例中添加了设置和取消从节点的逻辑。

这些新命令对于测试集群在不同配置下的行为至关重要,改进了一致性测试的健壮性。


819-839: 更改已批准:添加了 getPeerID 辅助函数。

此函数通过减少测试代码中的冗余,提高了代码的可读性和可维护性。

Comment on lines 246 to 252
butil::Status PRaft::GetListLearners(std::vector<braft::PeerId>* learners) {
if (!node_) {
ERROR_LOG_AND_STATUS("Node is not initialized");
}

return node_->list_learners(learners);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

改进错误处理:在 PRaft::GetListLearners 方法中返回更具描述性的错误状态。

当前方法在节点未初始化时返回默认构造的 butil::Status 对象,这可能无法清楚地向调用者传达错误状态。建议返回更具描述性的错误状态。

-  ERROR_LOG_AND_STATUS("Node is not initialized");
+  return butil::Status(EINVAL, "Node is not initialized");
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
butil::Status PRaft::GetListLearners(std::vector<braft::PeerId>* learners) {
if (!node_) {
ERROR_LOG_AND_STATUS("Node is not initialized");
}
return node_->list_learners(learners);
}
butil::Status PRaft::GetListLearners(std::vector<braft::PeerId>* learners) {
if (!node_) {
return butil::Status(EINVAL, "Node is not initialized");
}
return node_->list_learners(learners);
}

@AlexStocks
Copy link
Contributor

先 close 掉,后面再考虑 master-slave mode

@AlexStocks AlexStocks closed this Aug 10, 2024
@Issues-translate-bot
Copy link

Bot detected the issue body's language is not English, translate it automatically.


Close it first and consider master-slave mode later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
✏️ Feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants