Warning "Detected conflicting tunnel peer for prefix" and "Detected conflicting encryption key index for prefix" #44

Open
ruslan-y opened this issue May 3, 2024 · 2 comments
Labels: bug (Something isn't working), help wanted (Extra attention is needed)

Comments


ruslan-y commented May 3, 2024

Hi there!

We've deployed Cilium and Netreap to our DEV cluster following the guide in README.md.
After a month of running-in and testing on the DEV cluster, we decided to roll it out to the PROD cluster.
Testing on DEV was successful, but one thing has stopped us from finally switching to Cilium.
The PROD cluster has more than 300 hosts.
Sometimes we see alarming warnings in the Cilium log:

level=warning msg="Detected conflicting tunnel peer for prefix. This may cause connectivity issues for this address." cidr=172.16.42.41/32 conflictingResource=node//host2 conflictingTunnelPeer=ip-addr resource=node//host2 subsys=ipcache
level=warning msg="Detected conflicting encryption key index for prefix. This may cause connectivity issues for this address." cidr=172.16.42.41/32 conflictingKey=255 conflictingResource=node//host2 key=255 resource=node//host1 subsys=ipcache
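
For reference, the entries behind these warnings can be inspected from inside the agent container; the container name below is a placeholder for our unit name, and cilium node list / cilium bpf ipcache list are standard agent CLI commands:

docker exec <cilium-container> cilium node list
docker exec <cilium-container> cilium bpf ipcache list | grep 172.16.42.41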

Our configuration is below:

Systemd Unit File

[Unit]
Description=Cilium Agent
After=docker.service
Requires=docker.service
After=consul.service
Wants=consul.service
Before=nomad.service

[Service]
Restart=always
ExecStartPre=-/usr/bin/docker stop %n
ExecStartPre=-/usr/bin/docker rm %n
ExecStart=/usr/bin/docker run --rm --name %n \
  -v /var/run/cilium:/var/run/cilium \
  -v /sys/fs/bpf:/sys/fs/bpf \
  --env CONSUL_HTTP_TOKEN=<secret_token> \
  --net=host \
  --cap-add NET_ADMIN \
  --cap-add NET_RAW \
  --cap-add IPC_LOCK \
  --cap-add SYS_MODULE \
  --cap-add SYS_ADMIN \
  --cap-add SYS_RESOURCE \
  --privileged \
  cilium/cilium:v1.14.5 \
  cilium-agent --kvstore consul --kvstore-opt consul.address=127.0.0.1:8500 \
    --kvstore-periodic-sync=5m \
    --enable-ipv6=false  \
    --tunnel-protocol=geneve \
    --enable-wireguard \
    --encrypt-node \
    --enable-bpf-masquerade=true \
    --kube-proxy-replacement=true \
    --enable-l7-proxy=false  \
    --prometheus-serve-addr=127.0.0.1:9962 \
    --ipv4-range 172.16.0.0/16

[Install]
WantedBy=multi-user.target

/etc/docker/daemon.json

{
    "default-address-pools": [
      {
        "base": "172.17.0.0/16",
        "size": 24
      }
    ]
  }

/opt/cni/config/cilium.conflist

{
  "name": "cilium",
  "cniVersion": "1.0.0",
  "plugins": [
    {
      "type": "cilium-cni",
      "enable-debug": false
    },
    {
      "type": "portmap",
      "capabilities": {"portMappings": true}
    }
  ]
}

/opt/cni/bin

bandwidth  bridge  cilium-cni  dhcp  dummy  firewall  host-device  host-local  ipvlan  loopback  macvlan  portmap  ptp  sbr  static  tap  tuning  vlan  vrf

Netreap system job

job "netreap" {
  datacenters = ["dc1"]
  priority    = "100"
  type        = "system"
  meta {
    RENDER_STAMP = "2024-05-01_02:48:41PM"
  }
  constraint {
      attribute = "${attr.plugins.cni.version.cilium-cni}"
      operator  = ">="
      value     = "1.15.3"
  }
  constraint {
      attribute = "${attr.plugins.cni.version.cilium-cni}"
      operator  = "is_set"
  }
  
  group "netreap"  {
    
    vault {
      policies = ["nomad-services"]
    }
    
    restart {
      interval = "10m"
      attempts = 5
      delay    = "15s"
      mode     = "delay"
    } 
    service {
      name = "netreap"
      tags = [ "netreap" ]
    }
    task "netreap" {
      driver = "docker"
      template {
        destination = "secrets/file.env"
        env         = true
        change_mode = "restart"
        data        = <<EOT
NETREAP_CILIUM_CIDR="172.16.0.0/16"
NOMAD_ADDR="https://127.0.0.1:4646"
NETREAP_DEBUG="true"
NOMAD_CLIENT_KEY="/etc/nomad/ssl/client-key.pem"
NOMAD_CLIENT_CERT="/etc/nomad/ssl/client.pem"
NOMAD_CAPATH="/etc/nomad/ssl/nomad-ca.pem"
{{- with secret "kv/netreap/prod" }}
CONSUL_HTTP_TOKEN="{{ .Data.data.CONSUL_HTTP_TOKEN }}"
{{- end }}
{{- with secret "kv/netreap/prod" }}
NOMAD_TOKEN="{{ .Data.data.NOMAD_TOKEN }}"
{{- end }}
EOT
      }
      
      config {
        image = "registry"
        network_mode = "host"
        auth {
          username = "[MASKED]"
          password = "[MASKED]"
        }
        volumes = [
          "/etc/nomad/ssl:/etc/nomad/ssl",
          "/var/run/cilium:/var/run/cilium",
        ]
      }
      resources {
          cpu = 200
          memory = 300
      }
    }
  }
}

Netreap Version

v0.2.0

Cilium Version

Client: 1.14.5 85db28be 2023-12-11T14:30:29+01:00 go version go1.20.12 linux/amd64
Daemon: 1.14.5 85db28be 2023-12-11T14:30:29+01:00 go version go1.20.12 linux/amd64

Kernel Version

Linux ax51-host110 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux

Nomad Version

Nomad v1.5.6
BuildDate 2023-05-19T18:26:13Z
Revision 8af70885c02ab921dedbdf6bc406a1e886866f80

Consul Version

Consul v1.14.7
Revision d97acc0a
Build Date 2023-05-16T01:36:41Z

When we run cilium-agent with --ipv4-range 172.16.0.0/16, as specified in the documentation, every host ends up with the same subnet, 172.16.0.0/16:

host1    ext ip addr       172.16.0.0/16                                  local
host2    ext ip addr       172.16.0.0/16                                  kvstore
host3    ext ip addr       172.16.0.0/16                                  kvstore
host4    ext ip addr       172.16.0.0/16                                  kvstore
host5    ext ip addr       172.16.0.0/16                                  kvstore
host6    ext ip addr       172.16.0.0/16                                  kvstore
host7    ext ip addr       172.16.0.0/16                                  kvstore
host8    ext ip addr       172.16.0.0/16                                  kvstore
host9    ext ip addr       172.16.0.0/16                                  kvstore
host10   ext ip addr       172.16.0.0/16                                  kvstore

I guess this may be the cause of the conflicts, which is why we see them in the Cilium log.
If I understand correctly, Netreap is not responsible for IPAM the way the operator is in Kubernetes.
Could you please explain how this is supposed to work?
In any case, any advice for a production-ready cluster would be welcome; it would be great to hear your opinion on this.

We also ran cilium-agent with --ipv4-range auto, but that address range is not enough for us:

host1    ext ip addr       10.231.0.0/16                                  local
host2    ext ip addr       10.72.0.0/16                                   kvstore
host3    ext ip addr       10.201.0.0/16                                  kvstore
host4    ext ip addr       10.70.0.0/16                                   kvstore
host5    ext ip addr       10.75.0.0/16                                   kvstore
host6    ext ip addr       10.104.0.0/16                                  kvstore
host7    ext ip addr       10.109.0.0/16                                  kvstore
host8    ext ip addr       10.154.0.0/16                                  kvstore
host9    ext ip addr       10.208.0.0/16                                  kvstore
host10   ext ip addr       10.23.0.0/16                                   kvstore
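
One idea we are considering (our own assumption, not something taken from the documentation) is to keep 172.16.0.0/16 but carve a distinct per-node endpoint prefix out of it instead of passing the full /16 to every agent, e.g. on the N-th host:

    --ipv4-range 172.16.N.0/24 \

so that each node announces a unique prefix, much like --ipv4-range auto does but inside our own address space. We have not yet verified whether this makes the warnings go away.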
ruslan-y added the bug and help wanted labels on May 3, 2024

ruslan-y commented May 3, 2024

I also noticed that the --cilium-cidr flag has been removed in the new version.
What is the reason for this?

deverton-godaddy (Contributor) commented

--cilium-cidr is no longer needed as netreap now validates node membership by querying Nomad directly rather than guessing based on the IP address.
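
Roughly speaking (this is an illustration, not netreap's exact code), the check now amounts to asking Nomad for its node list and matching against it, e.g.

curl -s -H "X-Nomad-Token: $NOMAD_TOKEN" "$NOMAD_ADDR/v1/nodes" | jq '.[].Name'

rather than inferring membership from a CIDR.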

As for the issue with the conflicting IPs, I suspect that has more to do with the Cilium configuration, and it looks like you're having more luck asking over in cilium/cilium#32188.
