Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add proto #2

Merged
merged 2 commits into from
Apr 17, 2024
Merged

feat: add proto #2

merged 2 commits into from
Apr 17, 2024

Conversation

FrankYang0529
Copy link
Contributor

@FrankYang0529 FrankYang0529 commented Feb 20, 2024

issue: longhorn/longhorn#6744

Test Plan

Please build images from following PRs:

Case 1: New cluster

  1. Clone longhorn/longhorn.
git clone https://github.com/longhorn/longhorn.git
cd longhorn/chart
  1. Install the cluster with your images. For example:
helm install longhorn . --namespace longhorn-system --create-namespace \
    --set image.longhorn.engine.repository=frankyang/longhorn-engine \
    --set image.longhorn.engine.tag="arm64-latest" \
    --set image.longhorn.manager.repository=frankyang/longhorn-manager \
    --set image.longhorn.manager.tag="arm64-latest" \
    --set image.longhorn.instanceManager.repository=frankyang/longhorn-instance-manager \
    --set image.longhorn.instanceManager.tag="arm64-latest" \
    --set image.pullPolicy="Always" \
    --set image.longhorn.backingImageManager.repository=frankyang/backing-image-manager \
    --set image.longhorn.backingImageManager.tag="v3_20240417" \
    --set image.longhorn.shareManager.repository=frankyang/longhorn-share-manager \
    --set image.longhorn.shareManager.tag="arm64-latest" \
    --set defaultSettings.systemManagedPodsImagePullPolicy="always"
  1. Create a volume or snapshot. Check basic operations can work.

Case 2: Upgrade from v1.6.1

  1. Clone longhorn/longhorn and checkout to v1.6.1.
git clone https://github.com/longhorn/longhorn.git
cd longhorn/chart
git checkout v1.6.1
  1. Install v1.6.1 cluster.
helm install longhorn . --namespace longhorn-system --create-namespace
  1. Create a Volume, PVC, and try to write some data to the volume.
  2. Upgrade the cluster to new images. For example.
git checkout master
helm upgrade longhorn . --namespace longhorn-system \
    --set image.longhorn.engine.repository=frankyang/longhorn-engine \
    --set image.longhorn.engine.tag="arm64-latest" \
    --set image.longhorn.manager.repository=frankyang/longhorn-manager \
    --set image.longhorn.manager.tag="arm64-latest" \
    --set image.longhorn.instanceManager.repository=frankyang/longhorn-instance-manager \
    --set image.longhorn.instanceManager.tag="arm64-latest" \
    --set image.pullPolicy="Always" \
    --set image.longhorn.backingImageManager.repository=frankyang/backing-image-manager \
    --set image.longhorn.backingImageManager.tag="v3_20240417" \
    --set image.longhorn.shareManager.repository=frankyang/longhorn-share-manager \
    --set image.longhorn.shareManager.tag="arm64-latest"
  1. Before upgrading engine, delete a replica of the volume and make sure it can come back.
  2. Detach the volume. Upgrade the engine to latest. Make sure it can be attached and data is still there.

@FrankYang0529
Copy link
Contributor Author

Hi @innobead, currently, different LH projects use different protobuf package naming.

  • longhorn-engine: ptypes
  • longhorn-instance-manager: imrpc
  • longhorn-spdk-engine: spdkrpc
  • backing-image-manager: longhorn.backingimagemanager.pkg.rpc
  • longhorn-share-manager: don't specify package name

Originally, I would like to fix the naming in this PR like pkg.<service name>. However, this introduces another problem. New generated code has different invoke path. Let's use ReplicaCreate for example:

  • longhorn-engine ptypes: /ptypes.ReplicaService/ReplicaCreate
  • longhorn-types enginerpc: /pkg.enginerpc.ReplicaService/ReplicaCreate

In a new LH cluster, this is not a problem. However, when we upgrade an old LH cluster, we may have both old and new longhorn-instance-manager and longhorn-engine. This will be an issue. I get an error like:

level=error msg="Failed to initialize im client to instance-manager-b3561682e04c14c53967b1711a0905e7 before monitoring" func="controller.(*InstanceManagerController).startMonitoring" file="instance_manager_controller.go:1373" controller=longhorn-instance-manager error="failed to check version of Instance Manager Instance Service Client for instance-manager-b3561682e04c14c53967b1711a0905e7 IP 10.244.133.199: failed to get version: rpc error: code = Unimplemented desc = unknown service pkg.imrpc.InstanceService" instance manager=instance-manager-b3561682e04c14c53967b1711a0905e7 node=worker-2

We have two ways to solve the problem:

  1. We introduce an old grpc server to conversion the request to the new server.
    Pros: we can unify protobuf package naming in longhorn/types.
    Cons: The longhorn-manager architecture will be complex.

  2. We keep the package name.
    Pros: We don't need any change for longhorn-manager server.
    Cons: The naming is different even if we have a repository to collect all protobuf.

@innobead
Copy link
Member

innobead commented Feb 23, 2024

As discussed with/ @FrankYang0529 , we will adopt option 2 to keep the package name as usual, as this is the original design principle of Protobuf (ex: backward compatibility) before we figure out a better way to mitigate this design debt.

@innobead
Copy link
Member

@ejweber @shuo-wu @ChanYiLin Please review this.

@FrankYang0529 FrankYang0529 force-pushed the LH-6744 branch 3 times, most recently from ee78f3e to 9ddb72a Compare April 8, 2024 02:34
@innobead innobead requested a review from derekbit April 10, 2024 15:21
Copy link
Member

@innobead innobead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just one suggestion about folder naming.


package bimrpc;

option go_package = "github.com/longhorn/types/pkg/bimrpc";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
option go_package = "github.com/longhorn/types/pkg/bimrpc";
option go_package = "github.com/longhorn/types/pkg/generated/bimrpcpb";

For Python generated module, suggest generated-py/... instead of rpc-py. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@innobead, If we want to update go_package in bimrpc, I will update all go_package with prefix github.com/longhorn/types/pkg/generated for consistency. The generated-py path also looks good to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated it. Thank you.

@innobead
Copy link
Member

@FrankYang0529 In addition, please use Github Action instead and we only need to run this on amd64.

@shuo-wu
Copy link
Contributor

shuo-wu commented Apr 10, 2024

I am thinking we have the 3rd option:
Retain the old gRPC client code for existing longhorn-instance-manager and longhorn-engine. Then remove those codes in the next minor release.

@FrankYang0529
Copy link
Contributor Author

I am thinking we have the 3rd option: Retain the old gRPC client code for existing longhorn-instance-manager and longhorn-engine. Then remove those codes in the next minor release.

Hi @shuo-wu, when users upgrade LH, longhorn-instance-manager needs to connect old and new longhorn-engine. If we change protobuf package name, longhorn-instance-manager has to create two gRPC clients, one for old protobuf and another for new. There are 3 package names need to be changed: imrpc, ptypes, and smrpc. This may introduce some complexity in longhorn-manager and longhorn-instance-manager.

@derekbit
Copy link
Member

derekbit commented Apr 11, 2024

I am thinking we have the 3rd option: Retain the old gRPC client code for existing longhorn-instance-manager and longhorn-engine. Then remove those codes in the next minor release.

Hi @shuo-wu, when users upgrade LH, longhorn-instance-manager needs to connect old and new longhorn-engine. If we change protobuf package name, longhorn-instance-manager has to create two gRPC clients, one for old protobuf and another for new. There are 3 package names need to be changed: imrpc, ptypes, and smrpc. This may introduce some complexity in longhorn-manager and longhorn-instance-manager.

@FrankYang0529 Have you tested the combability after upgrading Longhorn, for example, creating and deleting volumes in mixed old/new longhorn-manager, instance-instance-manager, and longhorn-engine cluster?

@FrankYang0529
Copy link
Contributor Author

I am thinking we have the 3rd option: Retain the old gRPC client code for existing longhorn-instance-manager and longhorn-engine. Then remove those codes in the next minor release.

Hi @shuo-wu, when users upgrade LH, longhorn-instance-manager needs to connect old and new longhorn-engine. If we change protobuf package name, longhorn-instance-manager has to create two gRPC clients, one for old protobuf and another for new. There are 3 package names need to be changed: imrpc, ptypes, and smrpc. This may introduce some complexity in longhorn-manager and longhorn-instance-manager.

@FrankYang0529 Have you tested the combability after upgrading Longhorn, for example, creating and deleting volumes in mixed old/new longhorn-manager, instance-instance-manager, and longhorn-engine cluster?

Yes, in my test steps, I think these two steps cover it.

5. Before upgrading engine, delete a replica of the volume and make sure it can come back.
6. Detach the volume. Upgrade the engine to latest. Make sure it can be attached and data is still there.

rm -rf ./protobuf/vendor/protobuf/src/google/protobuf
DIR="./protobuf/vendor/protobuf/src/google/protobuf"
mkdir -p $DIR
wget https://raw.githubusercontent.com/protocolbuffers/protobuf/v24.3/src/google/protobuf/empty.proto -P $DIR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make v24.3 as a global variable PROTOBUF_VER

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated it. Thank you.

Comment on lines +5 to +8
* protoc-gen-go 1.31.0
* protoc-gen-go-grpc 1.3.0
* python protobuf 4.24.3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Where should developers make changes if they want to update?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I add Dockerfile.dapper* to .gitignore, so we don't see it here. I will add it and force push again. We can change it in Dockerfile.dapper in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I add Dockerfile.dapper. There are some variables can change the version. Thanks.

Signed-off-by: PoAn Yang <yangpoan@gmail.com>
Signed-off-by: PoAn Yang <yangpoan@gmail.com>
@FrankYang0529
Copy link
Contributor Author

@FrankYang0529 In addition, please use Github Action instead and we only need to run this on amd64.

Added it. However, this is first PR to add GItHub Action, so it looks like it's not triggered in this PR.

@innobead
Copy link
Member

No worries. You can update the action later if it doesn't work.

Copy link
Member

@innobead innobead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@innobead innobead merged commit 43dea92 into longhorn:main Apr 17, 2024
1 check passed
@FrankYang0529 FrankYang0529 deleted the LH-6744 branch April 17, 2024 07:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants