syslog source AWS NLB healthcheck tcp memory leak #17923

christophemorio · 2023-07-10T08:17:19Z

A note for the community

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

On AWS EKS, a syslog TCP source exposed through a LoadBalancer Service of type NLB shows slowly but continuously increasing memory usage.
The rate of increase is correlated with the number of nodes.

In a nutshell, by default a health check is sent every 30s for each node:
NLB --> all nodes kube-proxy tcp/31xxx --> vector pods tcp/9514

It looks like the TCP health check made by the AWS NLB generates a memory leak.
As a workaround, the kube-proxy-less override below moves the health check off the syslog TCP port, and memory usage then stays flat:

apiVersion: v1
kind: Service
spec:
  externalTrafficPolicy: Local
  ...
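
For anyone wanting to reproduce this without an NLB, the health-check pattern described above (a TCP connection that is opened and then closed without sending any data) can be approximated locally. The following is a minimal Python sketch, not the NLB's actual implementation: `tcp_probe`, `run_probes`, and `start_counting_listener` are hypothetical helper names, and the real health checker may tear down connections differently.

```python
import socket
import threading


def tcp_probe(host, port, timeout=2.0):
    """Open a TCP connection and close it immediately, sending no data."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_probes(host, port, count):
    """Run `count` probes back to back and return how many connected."""
    return sum(tcp_probe(host, port) for _ in range(count))


def start_counting_listener():
    """Stand-in listener for local testing (assumption: not Vector itself).

    Returns the listening socket and a list that collects the (ip, port)
    of every accepted peer.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen()
    peers = []

    def accept_loop():
        while True:
            try:
                conn, addr = server.accept()
            except OSError:  # listener was closed
                return
            peers.append(addr)
            conn.close()

    threading.Thread(target=accept_loop, daemon=True).start()
    return server, peers
```

Pointing `run_probes("127.0.0.1", 9514, 1000)` at a local Vector instance that runs the syslog source from this issue, while watching the process's memory, would be one way to check whether connect-and-close traffic alone is enough to drive the growth.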

Configuration

---
apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/name: vector-test
  name: vector-test-config
  namespace: default
data:
  main-config.yml: |
    ---
      sources:
        syslog_source:
          type: syslog
          address: 0.0.0.0:9514
          mode: tcp
      sinks:
        debug_file:
          type: file
          inputs:
            - syslog_source
          encoding:
            codec: json
          path: /tmp/syslog.log

---
apiVersion: v1
kind: Pod
metadata:
  labels:
    app.kubernetes.io/name: vector-test
    vector.dev/exclude: "true"
  name: vector-test
  namespace: default
spec:
  containers:
  - args:
    - --config-dir
    - /etc/vector/
    env:
    - name: VECTOR_LOG
      value: debug
    image: docker.io/timberio/vector:0.31.0-debian
    name: vector
    ports:
    - containerPort: 9514
      name: syslog
      protocol: TCP
    resources:
      limits:
        cpu: "1"
        memory: 128Mi
      requests:
        cpu: "1"
        memory: 128Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
    volumeMounts:
    - mountPath: /etc/vector/
      name: config
      readOnly: true
  volumes:
  - name: config
    projected:
      sources:
      - configMap:
          name: vector-test-config

---
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
  labels:
    app.kubernetes.io/name: vector-test
  name: vector-test
  namespace: default
spec:
  # Workaround, set traffic policy to 'Local' to avoid memory increase:
  # externalTrafficPolicy: Local
  ports:
  - name: syslog
    port: 9514
    protocol: TCP
    targetPort: 9514
  selector:
    app.kubernetes.io/name: vector-test
  type: LoadBalancer

Version

0.31.0

Debug Output

2023-07-10T07:55:38.618432Z DEBUG source{component_kind="source" component_id=syslog_source component_type=syslog component_name=syslog_source}:connection{peer_addr=172.16.22.59:21598}: vector::sources::util::net::tcp: Accepted a new connection. peer_addr=172.16.22.59:21598
2023-07-10T07:55:38.618505Z DEBUG source{component_kind="source" component_id=syslog_source component_type=syslog component_name=syslog_source}:connection{peer_addr=172.16.22.59:21598}: vector::sources::util::net::tcp: Connection closed.
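
To see how many of these connect-and-close connections arrive and where they come from, debug lines like the two above can be tallied. The following is a rough Python sketch that assumes the log format shown here (IPv4 `peer_addr=ip:port` on the "Accepted a new connection." line); `summarize_accepts` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the "Accepted a new connection." debug line shown above.
# Assumption: IPv4 peers only, formatted as peer_addr=<ip>:<port>.
ACCEPT_RE = re.compile(
    r"Accepted a new connection\. peer_addr=(?P<ip>[0-9.]+):(?P<port>\d+)"
)


def summarize_accepts(lines):
    """Return (connections per source IP, number of distinct ip:port pairs)."""
    per_ip = Counter()
    distinct_peers = set()
    for line in lines:
        match = ACCEPT_RE.search(line)
        if match is None:
            continue  # not an accept line, e.g. "Connection closed."
        per_ip[match["ip"]] += 1
        distinct_peers.add((match["ip"], match["port"]))
    return per_ip, len(distinct_peers)
```

Feeding it the pod's log output (for example, lines read from a saved log file) gives a per-IP connection count plus the number of distinct `ip:port` pairs seen.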



dsmith3197 · 2024-01-09T17:32:18Z

This was likely due to the peer_addr tag that's added to internal metrics. Can you try upgrading to the latest version of Vector (v0.35.0) and let us know if that resolves the issue? See #18982.
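
As a rough illustration of why a per-connection tag could make memory grow with health-check traffic, here is a toy Python model. It is a sketch under stated assumptions, not Vector's metrics code: `ToyRegistry`, `simulate`, and the metric name `connections_total` are all hypothetical, and the model simply keeps one counter per unique tag combination.

```python
from collections import defaultdict


class ToyRegistry:
    """Toy metrics registry: one counter per unique (name, tags) combination.

    Hypothetical illustration only; this is not Vector's implementation.
    """

    def __init__(self):
        self._series = defaultdict(int)

    def increment(self, name, **tags):
        key = (name, tuple(sorted(tags.items())))
        self._series[key] += 1

    def series_count(self):
        return len(self._series)


def simulate(checks, tag_peer_addr):
    """Record one metric per health check and return the series count."""
    registry = ToyRegistry()
    for i in range(checks):
        tags = {"component_id": "syslog_source"}
        if tag_peer_addr:
            # Assumption: each check arrives from a fresh ephemeral port,
            # so the tag value is new almost every time.
            tags["peer_addr"] = f"172.16.22.59:{20000 + i}"
        registry.increment("connections_total", **tags)
    return registry.series_count()
```

With 2880 checks (one day at one check per 30s from a single prober), `simulate(2880, True)` ends up holding 2880 series, while `simulate(2880, False)` holds 1. In this model, more nodes means more probers and therefore faster growth.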

christophemorio · 2024-01-12T13:55:35Z

Thanks for the heads-up. We are currently testing the latest version and the issue is indeed gone.
🙇🏼

christophemorio added the type: bug A code related bug. label Jul 10, 2023

christophemorio closed this as completed Jan 12, 2024
