Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: make API and controller changes to support manual disk selection #229

Merged
merged 12 commits into from
Aug 31, 2022

Conversation

iamniting
Copy link
Member

@iamniting iamniting commented Jul 18, 2022

Allow user to select the block devices while VG creation in the LVMCluster CR.

Signed-off-by: Nitin Goyal nigoyal@redhat.com

api/v1alpha1/lvmcluster_types.go Show resolved Hide resolved
api/v1alpha1/lvmcluster_types.go Outdated Show resolved Hide resolved
api/v1alpha1/lvmvolumegroup_types.go Outdated Show resolved Hide resolved
@iamniting iamniting force-pushed the api branch 4 times, most recently from a469dce to deb1d65 Compare July 22, 2022 10:21
@iamniting iamniting force-pushed the api branch 2 times, most recently from 3bc331d to 66bdc36 Compare July 27, 2022 10:15
@iamniting iamniting changed the title feat: make API changes to support manual disk selection feat: make API and controller changes to support manual disk selection Jul 27, 2022
@@ -254,6 +254,10 @@ spec:
node:
description: Node is the name of the node
type: string
reason:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like something that can be part of the make bundle commit.

@@ -208,6 +215,46 @@ func (r *LVMClusterReconciler) reconcile(ctx context.Context, instance *lvmv1alp
return ctrl.Result{}, nil
}

func (r *LVMClusterReconciler) verifyLvmClusterSpec(ctx context.Context, instance *lvmv1alpha1.LVMCluster) error {

// make sure no device overlap with another VGs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if the user decides to give the device names instead of device paths?
I know we encourage them to use paths, but still if they decide to use device names instead, then this validation will always fail, if I'm not wrong.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even in that case, it should work, as the device names will only be the same. I am just loading the devices into a map; if the key already exists, we are erroring out.

@@ -208,6 +215,46 @@ func (r *LVMClusterReconciler) reconcile(ctx context.Context, instance *lvmv1alp
return ctrl.Result{}, nil
}

func (r *LVMClusterReconciler) verifyLvmClusterSpec(ctx context.Context, instance *lvmv1alpha1.LVMCluster) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

method name can be more specific about what we are trying to verify instead of verifyLvmClusterSpec

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comments still holds. The function only validates the device classes in the Spec. The function name should reflect that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a new func for this logic and that will be called from here

@@ -108,6 +114,11 @@ type LVMClusterStatus struct {
// Ready describes if the LVMCluster is ready.
// +optional
Ready bool `json:"ready,omitempty"`

// Reason describes the reason behind any success or failure.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean reason behind failure only? Don't think we add any success reason.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can also have a reason behind the success which will always be the exact string. I think success reason should not matter much, as much as failure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to use conditions here if possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added conditions

instance.Status.Reason = err.Error()

// Apply status changes
err = r.Client.Status().Update(ctx, instance)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, updating status should not be part of the verification function. It should only be concerned about the verification part.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This verification function is called precisely after getting the instance and we won't get to an actual update call as we will be returning from here only, so we need to do an update here itself only and only if there is a failure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also check for node selector when validating the devices overlaps. It is possible for the same device path in different nodes to be used in different device classes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checking node selectors as well now


// Reason describes the reason behind any success or failure.
// +optional
Reason string `json:"reason,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we add reason for error specific to this PR only?
If that's the case, then we need to add reasons for other failures as well in a separate PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am updating the reason wherever required.

// TODO: Implement this
func filterMatchingDevices(blockDevices []internal.BlockDevice, volumeGroup *lvmv1alpha1.LVMVolumeGroup) ([]internal.BlockDevice, []internal.BlockDevice, error) {
// currently just match all devices
func (r *VGReconciler) filterMatchingDevices(blockDevices []internal.BlockDevice, volumeGroup *lvmv1alpha1.LVMVolumeGroup) ([]internal.BlockDevice, []internal.BlockDevice, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit test should be added for this function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment still valid.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add them later once we agree with the actual code changes to minimize the changes in unit tests.

// TODO: Implement this
func filterMatchingDevices(blockDevices []internal.BlockDevice, volumeGroup *lvmv1alpha1.LVMVolumeGroup) ([]internal.BlockDevice, []internal.BlockDevice, error) {
// currently just match all devices
func (r *VGReconciler) filterMatchingDevices(blockDevices []internal.BlockDevice, volumeGroup *lvmv1alpha1.LVMVolumeGroup) ([]internal.BlockDevice, []internal.BlockDevice, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you plan to the use the unmatched disks return value? If not, then we can avoid returning that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no plans we can remove it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still see this return value in the code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missed it earlier, made changes now

if ok {
filteredBlockDevices = append(filteredBlockDevices, blockDevice)
} else {
err := fmt.Errorf("can not find disk name %s in the available block devices", path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
err := fmt.Errorf("can not find disk name %s in the available block devices", path)
err := fmt.Errorf("can not find disk name %s in the available block devices", diskName)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both will be the same only, So updating to path should make any differences

@@ -108,6 +114,11 @@ type LVMClusterStatus struct {
// Ready describes if the LVMCluster is ready.
// +optional
Ready bool `json:"ready,omitempty"`

// Reason describes the reason behind any success or failure.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to use conditions here if possible.

instance.Status.Reason = err.Error()

// Apply status changes
err = r.Client.Status().Update(ctx, instance)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also check for node selector when validating the devices overlaps. It is possible for the same device path in different nodes to be used in different device classes.

if device.DiskByPath != "" {
args = append(args, device.DiskByPath)
} else {
args = append(args, device.KName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't think it will work if complete device name with path (for example: dev/sdb) is not provided in the args

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should work as we are getting the full path now.

}

// ListBlockDevices using the lsblk command
func ListBlockDevices(exec Executor) ([]BlockDevice, error) {
// var output bytes.Buffer
var blockDeviceMap map[string][]BlockDevice
columns := "NAME,ROTA,TYPE,SIZE,MODEL,VENDOR,RO,STATE,KNAME,SERIAL,PARTLABEL,FSTYPE"
args := []string{"--json", "-o", columns}
args := []string{"--json", "--paths", "-o", columns}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is --paths argument needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using --paths will give us the full path and make our job easy while comparing the paths and while adding paths into some commands we don't have to add /dev/ anymore.

Copy link
Contributor

@sp98 sp98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some comments are not addressed

bundle/manifests/lvm.topolvm.io_lvmclusters.yaml Outdated Show resolved Hide resolved
api/v1alpha1/lvmcluster_types.go Outdated Show resolved Hide resolved
controllers/conditions.go Outdated Show resolved Hide resolved
controllers/lvm_volumegroup.go Outdated Show resolved Hide resolved
@@ -208,6 +215,46 @@ func (r *LVMClusterReconciler) reconcile(ctx context.Context, instance *lvmv1alp
return ctrl.Result{}, nil
}

func (r *LVMClusterReconciler) verifyLvmClusterSpec(ctx context.Context, instance *lvmv1alpha1.LVMCluster) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comments still holds. The function only validates the device classes in the Spec. The function name should reflect that.

@iamniting iamniting force-pushed the api branch 3 times, most recently from e1fda0d to a0a6098 Compare August 22, 2022 07:29
devices := make(map[string]map[string]string)

for _, deviceClass := range instance.Spec.Storage.DeviceClasses {
if deviceClass.DeviceSelector != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we (in the future) have two deviceClasses - one taking up all the disks and one with manual disk selection?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should not support such cases IMO. This code will just check the manual disk paths only as other device class wont have any list of devices in the spec.

r.Log.Error(err, "failed to verify device paths")

setLVMClusterCRvalidCondition(&instance.Status.Conditions, corev1.ConditionFalse, "CRInvalid", err.Error())
return err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we use setLVMClusterCRvalidCondition when there's a verify issue?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are setting the condition in both cases either in failure or in success.

Can you pls elaborate on what you mean?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need more coffee :D sorry

return []internal.BlockDevice{}, err
}
} else {
err = fmt.Errorf("unsupported disk path format %s", path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add to the error message that only /dev/disk/by-path and /dev/ links are currently supported

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


blockDevice, ok := hasExactDisk(blockDevices, diskName)

if filepath.Dir(path) == internal.DiskByPathPrefix {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why can't we use this flow for all /dev/disk/by-* options?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can, We need more conditions to do that. Also as of now, all other patterns other than diskbypath will go into the else condition starting with dev.

}

devices := []string{}
if diskPattern == internal.DiskByPathPrefix {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idea: why not always use the disk path in the status?!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had this discussion earlier with @nbalacha and we came to the conclusion that we will show the paths as given in the spec. It will be easier for a user to compare them from the spec.

make sure device paths does not overlap with another VG or within the
same VG.

Signed-off-by: Nitin Goyal <nigoyal@redhat.com>
Signed-off-by: Nitin Goyal <nigoyal@redhat.com>
implement filterMatchingDevices

Signed-off-by: Nitin Goyal <nigoyal@redhat.com>
if VolumeGroup CR uses paths add paths in the devices else add names

Signed-off-by: Nitin Goyal <nigoyal@redhat.com>
update the error in the reason

Signed-off-by: Nitin Goyal <nigoyal@redhat.com>
Signed-off-by: Nitin Goyal <nigoyal@redhat.com>
Signed-off-by: Nitin Goyal <nigoyal@redhat.com>
Signed-off-by: Nitin Goyal <nigoyal@redhat.com>
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 31, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 31, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: iamniting, nbalacha, sp98

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 31, 2022
@openshift-merge-robot openshift-merge-robot merged commit e022411 into openshift:main Aug 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants