[BUG] Recreating VPC/Subnet after their deletion yields errors #4530

Closed
SkalaNetworks opened this issue Sep 17, 2024 · 5 comments · Fixed by #4533
Labels
bug Something isn't working subnet vpc

Comments

@SkalaNetworks
Contributor

Kube-OVN Version

v1.13.0

Kubernetes Version

v1.30.4+k0s

K0s

Operation-system/Kernel Version

"Debian GNU/Linux 12 (bookworm)"
6.1.0-18-amd64

Description

After deleting VPCs and their associated subnets, recreating them identically yields database errors.

Steps To Reproduce

  1. Create namespaces "ns1" and "ns2"
  2. Apply the following configuration with "kubectl apply -f test.yaml":
kind: Vpc
apiVersion: kubeovn.io/v1
metadata:
  name: test-vpc-1
spec:
  namespaces:
  - ns1
---
kind: Vpc
apiVersion: kubeovn.io/v1
metadata:
  name: test-vpc-2
spec:
  namespaces:
    - ns2
---

kind: Subnet
apiVersion: kubeovn.io/v1
metadata:
  name: net1
spec:
  vpc: test-vpc-1
  cidrBlock: 10.0.1.0/24,fd00:10:10::/64
  namespaces:
    - ns1
---
kind: Subnet
apiVersion: kubeovn.io/v1
metadata:
  name: net2
spec:
  vpc: test-vpc-2
  cidrBlock: 10.0.1.0/24
  protocol: IPv4
  namespaces:
    - ns2

---
apiVersion: v1
kind: Pod
metadata:
  namespace: ns1
  name: vpc1-pod
spec:
  containers:
    - name: vpc1-pod
      image: docker.io/library/nginx:alpine
---
apiVersion: v1
kind: Pod
metadata:
  namespace: ns2
  name: vpc2-pod
spec:
  containers:
    - name: vpc2-pod
      image: docker.io/library/nginx:alpine
  3. Observe that the pods are created successfully
  4. Run "kubectl delete -f test.yaml"
  5. Observe that everything is removed
  6. Apply the configuration again
  7. The pods are stuck in ContainerCreating
  8. kube-ovn-controller logs the following errors:
kube-ovn-controller E0917 03:57:46.434360       7 ovn.go:216] error occurred in transact with operations [{Op:insert Table:Logical_Switch_Port Row:map[addresses:{GoSet:[router]} external_ids:{GoMap:map[ls:net2 vendor:kube-ovn]} name:net2-test-vpc-2 options:{GoMap:map[router-port:test-vpc-2-net2]} type:router] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUID: UUIDName:u0939685268} {Op:mutate Table:Logical_Switch Row:map[] Rows:[] Columns:[] Mutations:[{Column:ports Mutator:insert Value:{GoSet:[{GoUUID:u0939685268}]}}] Timeout:<nil> Where:[where column _uuid == {f438ac99-0318-4e97-ac71-4414fd0532c7}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUID: UUIDName:} {Op:insert Table:Logical_Router_Port Row:map[external_ids:{GoMap:map[lr:test-vpc-2 vendor:kube-ovn]} mac:02:69:fc:f6:b1:b0 name:test-vpc-2-net2 networks:{GoSet:[10.0.1.1/24]}] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUID: UUIDName:u0939685269} {Op:mutate Table:Logical_Router Row:map[] Rows:[] Columns:[] Mutations:[{Column:ports Mutator:insert Value:{GoSet:[{GoUUID:u0939685269}]}}] Timeout:<nil> Where:[where column _uuid == {e1b10e54-63c1-467a-944b-12a2cdb48e4a}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUID: UUIDName:}] with operation errors []: constraint violation: Transaction causes multiple rows in "Logical_Router_Port" table to have identical values (test-vpc-2-net2) for index on column "name".  First row, with UUID 1023b6db-1f76-460a-8542-94c1584e80dd, existed in the database before this transaction and was not modified by the transaction.  Second row, with UUID d8488262-9683-4c06-8fb4-5bcd289a804a, was inserted by this transaction.
kube-ovn-controller E0917 03:57:46.434491       7 ovn-nb.go:72] create logical patch port net2-test-vpc-2 and test-vpc-2-net2: constraint violation: Transaction causes multiple rows in "Logical_Router_Port" table to have identical values (test-vpc-2-net2) for index on column "name".  First row, with UUID 1023b6db-1f76-460a-8542-94c1584e80dd, existed in the database before this transaction and was not modified by the transaction.  Second row, with UUID d8488262-9683-4c06-8fb4-5bcd289a804a, was inserted by this transaction.
kube-ovn-controller E0917 03:57:46.434527       7 subnet.go:717] create logical switch net2: create logical patch port net2-test-vpc-2 and test-vpc-2-net2: constraint violation: Transaction causes multiple rows in "Logical_Router_Port" table to have identical values (test-vpc-2-net2) for index on column "name".  First row, with UUID 1023b6db-1f76-460a-8542-94c1584e80dd, existed in the database before this transaction and was not modified by the transaction.  Second row, with UUID d8488262-9683-4c06-8fb4-5bcd289a804a, was inserted by this transaction.
kube-ovn-controller E0917 03:57:46.434586       7 controller.go:1312] "Unhandled Error" err="error syncing add/update subnet \"net2\": create logical patch port net2-test-vpc-2 and test-vpc-2-net2: constraint violation: Transaction causes multiple rows in \"Logical_Router_Port\" table to have identical values (test-vpc-2-net2) for index on column \"name\".  First row, with UUID 1023b6db-1f76-460a-8542-94c1584e80dd, existed in the database before this transaction and was not modified by the transaction.  Second row, with UUID d8488262-9683-4c06-8fb4-5bcd289a804a, was inserted by this transaction., requeuing" logger="UnhandledError"

Current Behavior

VPCs/Subnets fail to work with the same name after they've been deleted and recreated

Expected Behavior

VPCs and subnets can be deleted and recreated with the same names without errors.

SkalaNetworks added the bug label Sep 17, 2024

dosubot bot commented Sep 17, 2024

To resolve the database errors when recreating VPCs and Subnets with the same names after deletion in Kube-OVN v1.13.0 on Kubernetes v1.30.4+k0s, you need to ensure that VPCs and Subnets do not share the same name. The provided code snippet from pkg/webhook/subnet.go shows that there is a validation check in place to prevent VPCs and Subnets from having the same name.

Here is the relevant part of the code:

vpcList := &ovnv1.VpcList{}
if err := v.cache.List(ctx, vpcList); err != nil {
    return ctrlwebhook.Errored(http.StatusBadRequest, err)
}
for _, item := range vpcList.Items {
    if item.Name == o.Name {
        err := errors.New("vpc and subnet cannot have the same name")
        return ctrlwebhook.Errored(http.StatusBadRequest, err)
    }
}

This code checks if any existing VPC has the same name as the Subnet being created and returns an error if it does. To resolve the issue, ensure that when you recreate VPCs and Subnets, they have unique names that do not conflict with each other.

Additionally, the pkg/controller/subnet.go file contains logic to handle various aspects of subnet creation and updates, including validation and formatting. For example, the validateVpcBySubnet function ensures that the VPC associated with a subnet is in a valid state and does not have overlapping namespaces with other VPCs. This further emphasizes the importance of maintaining unique and non-conflicting names and configurations for VPCs and Subnets.

Here is a snippet from pkg/controller/subnet.go:

func (c *Controller) validateVpcBySubnet(subnet *kubeovnv1.Subnet) (*kubeovnv1.Vpc, error) {
    vpc, err := c.vpcsLister.Get(subnet.Spec.Vpc)
    if err != nil {
        klog.Errorf("failed to get subnet's vpc '%s', %v", subnet.Spec.Vpc, err)
        return vpc, err
    }

    if !vpc.Status.Standby {
        err = fmt.Errorf("the vpc '%s' not standby yet", vpc.Name)
        klog.Error(err)
        return vpc, err
    }

    if !vpc.Status.Default {
        for _, ns := range subnet.Spec.Namespaces {
            if !slices.Contains(vpc.Spec.Namespaces, ns) {
                err = fmt.Errorf("namespace '%s' is out of range to custom vpc '%s'", ns, vpc.Name)
                klog.Error(err)
                return vpc, err
            }
        }
    } else {
        vpcs, err := c.vpcsLister.List(labels.Everything())
        if err != nil {
            klog.Errorf("failed to list vpc, %v", err)
            return vpc, err
        }
        for _, vpc := range vpcs {
            if (subnet.Annotations[util.VpcLastName] == "" && subnet.Spec.Vpc != vpc.Name ||
                subnet.Annotations[util.VpcLastName] != "" && subnet.Annotations[util.VpcLastName] != vpc.Name) &&
                !vpc.Status.Default && util.IsStringsOverlap(vpc.Spec.Namespaces, subnet.Spec.Namespaces) {
                err = fmt.Errorf("namespaces %v are overlap with vpc '%s'", subnet.Spec.Namespaces, vpc.Name)
                klog.Error(err)
                return vpc, err
            }
        }
    }
    return vpc, nil
}

By ensuring unique names and validating configurations, you can avoid database errors when recreating VPCs and Subnets [1][2][3].


@SkalaNetworks
Contributor Author

So apparently VPCs and subnets shouldn't have the same name. This is a very weird requirement; is it warranted?


dosubot bot commented Sep 17, 2024

Yes, it is warranted that VPCs and subnets should not have the same name in Kube-OVN. This is enforced in the VpcCreateHook function, where it checks if any existing subnet has the same name as the VPC being created and returns an error if they do.

for _, item := range subnetList.Items {
    if item.Name == vpc.Name {
        err := errors.New("vpc and subnet cannot have the same name")
        return ctrlwebhook.Errored(http.StatusBadRequest, err)
    }
}

This requirement helps to avoid conflicts and ensures the integrity of the system [1].


@oilbeater
Collaborator

@bobz965 do you remember why we have this restriction?

@bobz965
Collaborator

bobz965 commented Sep 18, 2024

VPC and subnets shouldn't have the same name

Maybe because the LSP and LRP would have the same name when the VPC and subnet have the same name. But the LSP and LRP are both Port resources?
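
A small sketch of that naming concern, assuming the pattern visible in the controller logs above, where subnet net2 in VPC test-vpc-2 gets the patch ports net2-test-vpc-2 (LSP) and test-vpc-2-net2 (LRP). `patchPortNames` is a hypothetical helper written for illustration, not Kube-OVN's actual function.

```go
package main

import "fmt"

// patchPortNames is a hypothetical helper. The "<subnet>-<vpc>" and
// "<vpc>-<subnet>" patterns are inferred from the log lines in this
// issue, not taken from the Kube-OVN source.
func patchPortNames(vpc, subnet string) (lsp, lrp string) {
	lsp = subnet + "-" + vpc // logical switch port
	lrp = vpc + "-" + subnet // logical router port
	return lsp, lrp
}

func main() {
	// Names from the issue: the two ports get distinct names.
	lsp, lrp := patchPortNames("test-vpc-2", "net2")
	fmt.Println(lsp, lrp)

	// If the VPC and the subnet share a name, both ports collapse
	// to the same string.
	lsp, lrp = patchPortNames("same", "same")
	fmt.Println(lsp, lrp, lsp == lrp)
}
```

Under this assumed pattern, identical VPC and subnet names yield identical LSP and LRP names. Whether that is a real problem, given that the two ports live in different tables, is the open question in the comment above.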


@zhangzujian zhangzujian linked a pull request Sep 18, 2024 that will close this issue
1 task