new event bus breaks sensors #2278

Closed
rwong2888 opened this issue Oct 27, 2022 · 13 comments · Fixed by #2293
Labels
bug Something isn't working

Comments

@rwong2888

When we update the JetStream EventBus version to 2.9.1, our sensors crash-loop with the error below. The only way we have found to recover is to delete the argo-event namespace and let Argo CD restore it.

{"level":"info","ts":1666883882.918018,"logger":"argo-events.sensor","caller":"leaderelection/leaderelection.go:144","msg":"Not the LEADER, stand by ...","sensorName":"gcf"}
{"level":"info","ts":1666883889.7350574,"logger":"argo-events.sensor","caller":"leaderelection/leaderelection.go:153","msg":"Becoming a Candidate, stand by ...","sensorName":"gcf"}
{"level":"info","ts":1666884061.5108724,"logger":"argo-events.sensor","caller":"leaderelection/leaderelection.go:150","msg":"I'm the LEADER, starting ...","sensorName":"gcf"}
{"level":"info","ts":1666884061.5112243,"logger":"argo-events.sensor","caller":"base/jetstream.go:80","msg":"NATS auth strategy: Basic","sensorName":"gcf"}
{"level":"info","ts":1666884061.5205114,"logger":"argo-events.sensor","caller":"base/jetstream.go:102","msg":"Connected to NATS Jetstream server.","sensorName":"gcf"}
{"level":"info","ts":1666884061.5221837,"logger":"argo-events.sensor","caller":"base/jetstream.go:115","msg":"No need to create Stream 'default' as it already exists","sensorName":"gcf"}
{"level":"error","ts":1666884061.5252142,"logger":"argo-events.sensor","caller":"sensor/sensor_jetstream.go:60","msg":"failed to Create Key/Value Store for sensor gcf, err: nats: stream name already in use","sensorName":"gcf","stacktrace":"github.com/argoproj/argo-event
s/eventbus/jetstream/sensor.(*SensorJetstream).Initialize\n\t/home/runner/work/argo-events/argo-events/eventbus/jetstream/sensor/sensor_jetstream.go:60\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).listenEvents.func1\n\t/home/runner/work/argo-events/argo-eve
nts/sensors/listener.go:109\ngithub.com/argoproj/argo-events/common.DoWithRetry.func1\n\t/home/runner/work/argo-events/argo-events/common/retry.go:106\nk8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24
.3/pkg/util/wait/wait.go:220\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:233\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection\n\t/home/runner
/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:226\nk8s.io/apimachinery/pkg/util/wait.ExponentialBackoff\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:421\ngithub.com/argoproj/argo-events/common.DoWithRetry\n\t/home/runner/w
ork/argo-events/argo-events/common/retry.go:105\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).listenEvents\n\t/home/runner/work/argo-events/argo-events/sensors/listener.go:108\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).Start.func1\n\t/home/run
ner/work/argo-events/argo-events/sensors/listener.go:66"}
{"level":"info","ts":1666884063.061998,"logger":"argo-events.sensor","caller":"base/jetstream.go:80","msg":"NATS auth strategy: Basic","sensorName":"gcf"}
{"level":"info","ts":1666884063.0722675,"logger":"argo-events.sensor","caller":"base/jetstream.go:102","msg":"Connected to NATS Jetstream server.","sensorName":"gcf"}
{"level":"info","ts":1666884063.0728028,"logger":"argo-events.sensor","caller":"base/jetstream.go:115","msg":"No need to create Stream 'default' as it already exists","sensorName":"gcf"}
{"level":"error","ts":1666884063.0752885,"logger":"argo-events.sensor","caller":"sensor/sensor_jetstream.go:60","msg":"failed to Create Key/Value Store for sensor gcf, err: nats: stream name already in use","sensorName":"gcf","stacktrace":"github.com/argoproj/argo-event
s/eventbus/jetstream/sensor.(*SensorJetstream).Initialize\n\t/home/runner/work/argo-events/argo-events/eventbus/jetstream/sensor/sensor_jetstream.go:60\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).listenEvents.func1\n\t/home/runner/work/argo-events/argo-eve
nts/sensors/listener.go:109\ngithub.com/argoproj/argo-events/common.DoWithRetry.func1\n\t/home/runner/work/argo-events/argo-events/common/retry.go:106\nk8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24
.3/pkg/util/wait/wait.go:220\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:233\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection\n\t/home/runner
/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:226\nk8s.io/apimachinery/pkg/util/wait.ExponentialBackoff\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:421\ngithub.com/argoproj/argo-events/common.DoWithRetry\n\t/home/runner/w
ork/argo-events/argo-events/common/retry.go:105\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).listenEvents\n\t/home/runner/work/argo-events/argo-events/sensors/listener.go:108\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).Start.func1\n\t/home/run
ner/work/argo-events/argo-events/sensors/listener.go:66"}
{"level":"info","ts":1666884064.3643985,"logger":"argo-events.sensor","caller":"base/jetstream.go:80","msg":"NATS auth strategy: Basic","sensorName":"gcf"}
{"level":"info","ts":1666884064.370844,"logger":"argo-events.sensor","caller":"base/jetstream.go:102","msg":"Connected to NATS Jetstream server.","sensorName":"gcf"}
{"level":"info","ts":1666884064.37135,"logger":"argo-events.sensor","caller":"base/jetstream.go:115","msg":"No need to create Stream 'default' as it already exists","sensorName":"gcf"}
{"level":"error","ts":1666884064.373799,"logger":"argo-events.sensor","caller":"sensor/sensor_jetstream.go:60","msg":"failed to Create Key/Value Store for sensor gcf, err: nats: stream name already in use","sensorName":"gcf","stacktrace":"github.com/argoproj/argo-events
/eventbus/jetstream/sensor.(*SensorJetstream).Initialize\n\t/home/runner/work/argo-events/argo-events/eventbus/jetstream/sensor/sensor_jetstream.go:60\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).listenEvents.func1\n\t/home/runner/work/argo-events/argo-even
ts/sensors/listener.go:109\ngithub.com/argoproj/argo-events/common.DoWithRetry.func1\n\t/home/runner/work/argo-events/argo-events/common/retry.go:106\nk8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.
3/pkg/util/wait/wait.go:220\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:233\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection\n\t/home/runner/
go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:226\nk8s.io/apimachinery/pkg/util/wait.ExponentialBackoff\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:421\ngithub.com/argoproj/argo-events/common.DoWithRetry\n\t/home/runner/wo
rk/argo-events/argo-events/common/retry.go:105\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).listenEvents\n\t/home/runner/work/argo-events/argo-events/sensors/listener.go:108\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).Start.func1\n\t/home/runn
er/work/argo-events/argo-events/sensors/listener.go:66"}
{"level":"info","ts":1666884066.1804144,"logger":"argo-events.sensor","caller":"base/jetstream.go:80","msg":"NATS auth strategy: Basic","sensorName":"gcf"}
{"level":"info","ts":1666884066.1925776,"logger":"argo-events.sensor","caller":"base/jetstream.go:102","msg":"Connected to NATS Jetstream server.","sensorName":"gcf"}
{"level":"info","ts":1666884066.194199,"logger":"argo-events.sensor","caller":"base/jetstream.go:115","msg":"No need to create Stream 'default' as it already exists","sensorName":"gcf"}
{"level":"error","ts":1666884066.1979318,"logger":"argo-events.sensor","caller":"sensor/sensor_jetstream.go:60","msg":"failed to Create Key/Value Store for sensor gcf, err: nats: stream name already in use","sensorName":"gcf","stacktrace":"github.com/argoproj/argo-event
s/eventbus/jetstream/sensor.(*SensorJetstream).Initialize\n\t/home/runner/work/argo-events/argo-events/eventbus/jetstream/sensor/sensor_jetstream.go:60\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).listenEvents.func1\n\t/home/runner/work/argo-events/argo-eve
nts/sensors/listener.go:109\ngithub.com/argoproj/argo-events/common.DoWithRetry.func1\n\t/home/runner/work/argo-events/argo-events/common/retry.go:106\nk8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24
.3/pkg/util/wait/wait.go:220\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:233\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection\n\t/home/runner
/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:226\nk8s.io/apimachinery/pkg/util/wait.ExponentialBackoff\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:421\ngithub.com/argoproj/argo-events/common.DoWithRetry\n\t/home/runner/w
ork/argo-events/argo-events/common/retry.go:105\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).listenEvents\n\t/home/runner/work/argo-events/argo-events/sensors/listener.go:108\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).Start.func1\n\t/home/run
ner/work/argo-events/argo-events/sensors/listener.go:66"}
{"level":"info","ts":1666884067.8530138,"logger":"argo-events.sensor","caller":"base/jetstream.go:80","msg":"NATS auth strategy: Basic","sensorName":"gcf"}
{"level":"info","ts":1666884067.860567,"logger":"argo-events.sensor","caller":"base/jetstream.go:102","msg":"Connected to NATS Jetstream server.","sensorName":"gcf"}
{"level":"info","ts":1666884067.8610892,"logger":"argo-events.sensor","caller":"base/jetstream.go:115","msg":"No need to create Stream 'default' as it already exists","sensorName":"gcf"}
{"level":"error","ts":1666884067.8636787,"logger":"argo-events.sensor","caller":"sensor/sensor_jetstream.go:60","msg":"failed to Create Key/Value Store for sensor gcf, err: nats: stream name already in use","sensorName":"gcf","stacktrace":"github.com/argoproj/argo-event
s/eventbus/jetstream/sensor.(*SensorJetstream).Initialize\n\t/home/runner/work/argo-events/argo-events/eventbus/jetstream/sensor/sensor_jetstream.go:60\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).listenEvents.func1\n\t/home/runner/work/argo-events/argo-eve
nts/sensors/listener.go:109\ngithub.com/argoproj/argo-events/common.DoWithRetry.func1\n\t/home/runner/work/argo-events/argo-events/common/retry.go:106\nk8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24
.3/pkg/util/wait/wait.go:220\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:233\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection\n\t/home/runner
/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:226\nk8s.io/apimachinery/pkg/util/wait.ExponentialBackoff\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:421\ngithub.com/argoproj/argo-events/common.DoWithRetry\n\t/home/runner/w
ork/argo-events/argo-events/common/retry.go:105\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).listenEvents\n\t/home/runner/work/argo-events/argo-events/sensors/listener.go:108\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).Start.func1\n\t/home/run
ner/work/argo-events/argo-events/sensors/listener.go:66"}
{"level":"fatal","ts":1666884067.8637202,"logger":"argo-events.sensor","caller":"sensors/listener.go:67","msg":"failed to start","sensorName":"gcf","error":"failed after retries: nats: stream name already in use","stacktrace":"github.com/argoproj/argo-events/sensors.(*S
ensorContext).Start.func1\n\t/home/runner/work/argo-events/argo-events/sensors/listener.go:67"}
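
The log above repeats the same "stream name already in use" error five times before the final fatal entry. When triaging a similar paste, that pattern can be confirmed with a small script. The following is only a minimal sketch, assuming each entry is a JSON object that starts with `{"level"` and may be hard-wrapped across several lines as in the paste above; `iter_entries` and `summarize` are hypothetical helper names, not part of Argo Events.

```python
import json
from collections import Counter


def iter_entries(lines):
    """Yield parsed JSON log entries, rejoining entries that were hard-wrapped."""
    buffer = ""
    for line in lines:
        # A line starting with '{"level"' begins a new entry; if the previous
        # entry never parsed, discard it rather than glue the two together.
        if line.startswith('{"level"') and buffer:
            buffer = ""
        buffer += line.rstrip("\r\n")
        try:
            entry = json.loads(buffer)
        except json.JSONDecodeError:
            # Entry is still incomplete; wait for the next physical line.
            continue
        buffer = ""
        yield entry


def summarize(lines):
    """Count error and fatal entries, keyed by (level, msg)."""
    counts = Counter()
    for entry in iter_entries(lines):
        if entry.get("level") in ("error", "fatal"):
            counts[(entry["level"], entry.get("msg", ""))] += 1
    return counts
```

Calling `summarize(open("sensor.log"))` on a saved copy of the log (the file name is an assumption) returns a `Counter` whose keys pair the level with the message, which makes repeated retries of one failure easy to spot.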
@rwong2888 rwong2888 added the bug Something isn't working label Oct 27, 2022
@whynowy
Member

whynowy commented Oct 30, 2022

@juliev0 - Can you take a look?

@juliev0 juliev0 self-assigned this Oct 30, 2022
@juliev0
Contributor

juliev0 commented Oct 30, 2022

@rwong2888 Can you provide the previous and new EventBus specs? (or otherwise, just want to make sure I understand - you went from an older Jetstream version to a newer one but otherwise nothing else changed, right?)

@rwong2888
Author

rwong2888 commented Oct 31, 2022

We upgraded Argo Events from v1.7.2 to v1.7.3. After v1.7.3 was rolled out, we upgraded the EventBus from version 2.8.1 to 2.9.1. Since then, the sensors crash-loop, and we basically wipe out the entire namespace and let Argo CD rebuild everything to settle it all down.

We saw similar behaviour when we tried to delete the Argo CD application on v1.7.2, and we can presumably reproduce it the same way on v1.7.3. We were attempting to add the PruneLast annotation to the EventBus, because I believe the EventBus resource is deleted before the sensors, as the ordering is alphabetical. Perhaps we could just rename it to z-default as a workaround.

apiVersion: argoproj.io/v1alpha1
kind: EventBus
metadata:
  name: default
spec:
  jetstream:
    # kubectl describe configmap argo-events-controller-config
    version: 2.9.1
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              controller: eventbus-controller
              eventbus-name: default
          topologyKey: kubernetes.io/hostname
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                controller: eventbus-controller
                eventbus-name: default
            topologyKey: failure-domain.beta.kubernetes.io/zone
    persistence:
      storageClassName: standard
      accessMode: ReadWriteOnce
      volumeSize: 20Gi
    nodeSelector:
      cloud.google.com/gke-nodepool: argo-events
    tolerations:
    - key: "app"
      operator: "Equal"
      value: "argo-events"
      effect: "NoSchedule"
    # streamConfig: |             # see default values in argo-events-controller-config
    #   maxAge: 24h
    # settings: |
    #   max_file_store: 1GB       # see default values in argo-events-controller-config
    # startArgs:
    #   - "-D"                    # debug-level logs

@juliev0
Contributor

juliev0 commented Oct 31, 2022

got it, thanks

@juliev0
Contributor

juliev0 commented Oct 31, 2022

@whynowy I guess we do preserve state when a Sensor is deleted and re-created, but the intention is that we do not preserve state when an EventBus is deleted and re-created, right? (basically, deleting the EventBus is the way to purge all of that state and start from scratch)

@juliev0
Contributor

juliev0 commented Nov 4, 2022

Discussed with @whynowy. We will preserve the sensor's trigger state in this case (i.e. just use the existing key/value store).
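
For readers following along, the behavior described here amounts to a get-or-create lookup: reuse the store if it exists, otherwise create it. The sketch below only illustrates that idea and is not the code from #2293. `FakeKeyValueManager`, `BucketExistsError`, and `get_or_create_bucket` are hypothetical names, and the in-memory manager merely stands in for the real JetStream key/value API.

```python
class BucketExistsError(Exception):
    """Raised by the fake manager when a bucket name is already taken."""


class FakeKeyValueManager:
    """In-memory stand-in for a key/value store API (hypothetical)."""

    def __init__(self):
        self._buckets = {}

    def create_bucket(self, name):
        # Mirrors the failure mode in this issue: creating twice is an error.
        if name in self._buckets:
            raise BucketExistsError(f"stream name already in use: {name}")
        self._buckets[name] = {}
        return self._buckets[name]

    def get_bucket(self, name):
        # Returns None when the bucket does not exist.
        return self._buckets.get(name)


def get_or_create_bucket(manager, name):
    """Reuse an existing bucket (keeping its state) or create a fresh one."""
    bucket = manager.get_bucket(name)
    if bucket is not None:
        return bucket
    try:
        return manager.create_bucket(name)
    except BucketExistsError:
        # Another replica created it between the lookup and the create.
        bucket = manager.get_bucket(name)
        if bucket is None:
            raise
        return bucket
```

With this shape, a sensor that restarts against an EventBus that still holds its store picks up the existing trigger state instead of failing on the create call.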

@omarkalloush

omarkalloush commented Jan 9, 2023

This issue is still happening.

@juliev0
Contributor

juliev0 commented Jan 9, 2023

fix: if key/value store already exists use that #2293

Can you please provide Sensor log, Sensor spec, and old and new EventBus specs? And which version of Argo Events are you using?

@omarkalloush

  • Argo Events version 1.7.4
  • Current EventBus spec (the old spec was the same except that it used version 2.8.1)
apiVersion: argoproj.io/v1alpha1
kind: EventBus
metadata:
  name: default
spec:
  jetstream:
    # kubectl describe configmap argo-events-controller-config
    version: 2.9.1
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              controller: eventbus-controller
              eventbus-name: default
          topologyKey: kubernetes.io/hostname
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                controller: eventbus-controller
                eventbus-name: default
            topologyKey: failure-domain.beta.kubernetes.io/zone
    persistence:
      storageClassName: standard
      accessMode: ReadWriteOnce
      volumeSize: 20Gi
    # streamConfig: |             # see default values in argo-events-controller-config
    #   maxAge: 24h
    # settings: |
    #   max_file_store: 1GB       # see default values in argo-events-controller-config
    # startArgs:
    #   - "-D"                    # debug-level logs
  • EventBus Logs
main [96] 2023/01/12 21:55:41.432337 [INF]   Cluster:  default
main [96] 2023/01/12 21:55:41.432338 [INF]   Name:     eventbus-default-js-0
main [96] 2023/01/12 21:55:41.432345 [INF]   Node:     7zgMgllI
main [96] 2023/01/12 21:55:41.432347 [INF]   ID:       NCRU4WSWFJ6KHSSI7NMPZLHUFKHUBSUUFYDQLHZI3QXSDSY47WQQWRCC
main [96] 2023/01/12 21:55:41.432354 [WRN] Plaintext passwords detected, use nkeys or bcrypt
main [96] 2023/01/12 21:55:41.432360 [INF] Using configuration file: /etc/nats-config/nats-js.conf
main [96] 2023/01/12 21:55:41.433689 [INF] Starting http monitor on 0.0.0.0:8222
main [96] 2023/01/12 21:55:41.433869 [INF] Starting JetStream
main [96] 2023/01/12 21:55:41.434156 [INF]     _ ___ _____ ___ _____ ___ ___   _   __  __
main [96] 2023/01/12 21:55:41.434167 [INF]  _ | | __|_   _/ __|_   _| _ \ __| /_\ |  \/  |
main [96] 2023/01/12 21:55:41.434169 [INF] | || | _|  | | \__ \ | | |   / _| / _ \| |\/| |
main [96] 2023/01/12 21:55:41.434171 [INF]  \__/|___| |_| |___/ |_| |_|_\___/_/ \_\_|  |_|
main [96] 2023/01/12 21:55:41.434172 [INF]
main [96] 2023/01/12 21:55:41.434174 [INF]          https://docs.nats.io/jetstream
main [96] 2023/01/12 21:55:41.434176 [INF]
main [96] 2023/01/12 21:55:41.434178 [INF] ---------------- JETSTREAM ----------------
main [96] 2023/01/12 21:55:41.434183 [INF]   Max Memory:      11.73 GB
main [96] 2023/01/12 21:55:41.434186 [INF]   Max Storage:     1.00 TB
main [96] 2023/01/12 21:55:41.434188 [INF]   Store Directory: "/data/jetstream/store/jetstream"
main [96] 2023/01/12 21:55:41.434201 [INF]   Encryption:      ChaCha20-Poly1305
main [96] 2023/01/12 21:55:41.434206 [INF] -------------------------------------------
main [96] 2023/01/12 21:55:41.434539 [WRN]   Error decrypting our stream metafile: cipher: message authentication failed
main [96] 2023/01/12 21:55:41.434607 [WRN]   Error decrypting our stream metafile: cipher: message authentication failed
main [96] 2023/01/12 21:55:41.434685 [WRN]   Error decrypting our stream metafile: cipher: message authentication failed
main [96] 2023/01/12 21:55:41.434742 [WRN]   Error decrypting our stream metafile: cipher: message authentication failed
main [96] 2023/01/12 21:55:41.434807 [WRN]   Error decrypting our stream metafile: cipher: message authentication failed
main [96] 2023/01/12 21:55:41.434843 [INF] Starting JetStream cluster
main [96] 2023/01/12 21:55:41.434850 [INF] Creating JetStream metadata controller
main [96] 2023/01/12 21:55:41.435004 [ERR] Error creating filestore: cipher: message authentication failed
main [96] 2023/01/12 21:55:41.435018 [FTL] Can't start JetStream: cipher: message authentication failed
  • Sensor specs
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: github-operations
spec:
  replicas: 3
  template:
    serviceAccountName: operate-workflow-sa

  dependencies:
    - name: testing
      eventSourceName: github-operations
      eventName: testing
      filters:
        exprs:
          - #@ push_on_branch(ref="refs/heads/main")
        data:
          - #@ changes_on_file(file="[README.md]")
        # time:
        #   #! Time in UTC.
        #   start: "12:00:00"
        #   stop: "22:00:00"
        # script: |
        #   local day=os.date( "!%A", os.time() - 4 * 60 * 60 )
        #   if (day == "Saturday" or day == "Sunday") then return false end
        #   return true

  triggers:

    - template:
        conditions: "testing"
        name: testing
        argoWorkflow:
          operation: submit
          source:
            git:
              filePath: workflows/ci.yaml
              url: #@ data.values["argo-workflows"].resources.repo
              branch: #@ data.values["argo-workflows"].resources.branch

  • Sensor logs
{"level":"info","ts":1673560476.049813,"logger":"argo-events.sensor","caller":"cmd/start.go:77","msg":"starting sensor server","sensorName":"github-operations","version":"v1.7.4"}                                
{"level":"info","ts":1673560476.0499656,"logger":"argo-events.sensor","caller":"metrics/metrics.go:172","msg":"starting metrics server","sensorName":"github-operations"}                                          
{"level":"fatal","ts":1673560476.0581808,"logger":"argo-events.sensor","caller":"leaderelection/leaderelection.go:126","msg":"failed to new Nats Rpc","sensorName":"github-operations","error":"nats: no servers av
ailable for connection","stacktrace":"github.com/argoproj/argo-events/common/leaderelection.(*natsEventBusElector).RunOrDie\n\t/home/runner/work/argo-events/argo-events/common/leaderelection/leaderelection.go:12
6\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).Start\n\t/home/runner/work/argo-events/argo-events/sensors/listener.go:64\ngithub.com/argoproj/argo-events/sensors/cmd.Start\n\t/home/runner/work/argo-
events/argo-events/sensors/cmd/start.go:79\ngithub.com/argoproj/argo-events/cmd/commands.NewSensorCommand.func1\n\t/home/runner/work/argo-events/argo-events/cmd/commands/sensor.go:14\ngithub.com/spf13/cobra.(*Co
mmand).execute\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:920\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:1044\ngithub
.com/spf13/cobra.(*Command).Execute\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:968\ngithub.com/argoproj/argo-events/cmd/commands.Execute\n\t/home/runner/work/argo-events/argo-events/cmd/
commands/root.go:19\nmain.main\n\t/home/runner/work/argo-events/argo-events/cmd/main.go:8\nruntime.main\n\t/opt/hostedtoolcache/go/1.19.3/x64/src/runtime/proc.go:250"} 

The only solution we found was to delete the argo-event namespace and let Argo CD restore it.

@juliev0
Contributor

juliev0 commented Jan 12, 2023

sor","caller":"sensor/sensor_jetstream.go:60","msg":"failed to Create Key/Value Store for sensor gcf, err: nats: stream name already in use","sensorName":"gcf","stacktrace":"github.com/argoproj/argo-event
s/eventbus/jetstream/sensor.(*SensorJetstream).Initialize\n\t/home/runner/work/argo-events/argo-events/eventbus/jetstream/sensor/sensor_jetstream.go:60\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).listenEvents.func1\n\t/home/runner/work/argo-events/argo-eve
nts/sensors/listener.go:109\ngithub.com/argoproj/argo-events/common.DoWithRetry.func1\n\t/home/runner/work/argo-events/argo-events/common/retry.go:106\nk8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24
.3/pkg/util/wait/wait.go:220\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:233\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection\n\t/home/runner
/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:226\nk8s.io/apimachinery/pkg/util/wait.ExponentialBackoff\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.24.3/pkg/util/wait/wait.go:421\ngithub.com/argoproj/argo-events/common.DoWithRetry\n\t/home/runner/w
ork/argo-events/argo-events/common/retry.go:105\ngithub.com/argoproj/argo-events/sensors.(*SensorContext).listenEvents\n\t/home/runner/work/argo-events/argo-events/sensors/listener.go:108\ngithub.com/argoproj/argo-event

Based on your Sensor log, this doesn't seem to be the same error as the one in this issue - or am I missing something?

@omarkalloush

So this issue was first created because I was trying to add the PruneLast option to the EventBus, and when I deleted the application from Argo CD to test that, we got the logs that @rwong2888 provided. That was before the fix was released.

After that, we upgraded to the version that contained that fix and when I tried again these are the logs that we are seeing now.

@juliev0
Contributor

juliev0 commented Jan 13, 2023

> So this issue was first created because I was trying to add the PruneLast option to the EventBus, and when I deleted the application from Argo CD to test that, we got the logs that @rwong2888 provided. That was before the fix was released.
>
> After that, we upgraded to the version that contained that fix and when I tried again these are the logs that we are seeing now.

Okay, seems like a new issue then, right? Do you want to open up a new bug for it?

@omarkalloush

sure thing, thanks
