Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flaky test TestCtlV3AuthAndWatchJWT #17556

Open
chaochn47 opened this issue Mar 8, 2024 · 3 comments
Open

Flaky test TestCtlV3AuthAndWatchJWT #17556

chaochn47 opened this issue Mar 8, 2024 · 3 comments

Comments

@chaochn47
Copy link
Member

Which Github Action / Prow Jobs are flaking?

E2E / test (linux-386-e2e) (pull_request)

https://github.com/etcd-io/etcd/actions/runs/8199906821/job/22425749141?pr=17555

Which tests are flaking?

TestCtlV3AuthAndWatchJWT

Github Action / Prow Job link

https://github.com/etcd-io/etcd/actions/runs/8199906821/job/22425749141?pr=17555

Reason for failure (if possible)

jwt token is expired after the 1s TTL.

2024-03-08T07:05:13.928 test started to launch etcd process but this step is after e2e.NewConfigJWT().

So it is expected that after more than 1s that the token is expired and watch is cancelled by server.

{
  "level":"warn",
  "ts":"2024-03-08T07:05:15.007417Z",
  "caller":"auth/jwt.go:66",
  "msg":"failed to parse a JWT token",
  "error":"Token is expired"
}

Log snippets from the above github action failed test.

2024-03-08T07:05:13.9419929Z {"Time":"2024-03-08T07:05:13.928191133Z","Action":"output","Package":"go.etcd.io/etcd/tests/v3/e2e","Test":"TestCtlV3AuthAndWatchJWT","Output":"    logger.go:130: 2024-03-08T07:05:13.928Z\tINFO\tspawning process\t{\"args\": [\"/home/runner/work/etcd/etcd/bin/etcd\", \"--name=TestCtlV3AuthAndWatchJWT-test-0\", \"--listen-client-urls=http://localhost:20000\", \"--advertise-client-urls=http://localhost:20000\", \"--listen-peer-urls=http://localhost:20001\", \"--initial-advertise-peer-urls=http://localhost:20001\", \"--initial-cluster-token=new\", \"--data-dir\", \"/tmp/TestCtlV3AuthAndWatchJWT2770679286/002\", \"--snapshot-count=10000\", \"--auth-token=jwt,pub-key=/home/runner/work/etcd/etcd/tests/fixtures/server.crt,priv-key=/home/runner/work/etcd/etcd/tests/fixtures/server.key.insecure,sign-method=RS256,ttl=1s\", \"--initial-cluster-token=new\", \"--experimental-initial-corrupt-check=true\", \"--initial-cluster=TestCtlV3AuthAndWatchJWT-test-0=http://localhost:20001\", \"--initial-cluster-state=new\"], \"working-dir\": \"/tmp/TestCtlV3AuthAndWatchJWT2770679286/001\", \"name\": \"TestCtlV3AuthAndWatchJWT-test-0\", \"environment-variables\": [\"ETCD_VERIFY=all\", \"EXPECT_DEBUG=true\", \"PATH=/opt/hoste"}
2024-03-08T07:05:15.0112928Z {"Time":"2024-03-08T07:05:15.007575666Z","Action":"output","Package":"go.etcd.io/etcd/tests/v3/e2e","Test":"TestCtlV3AuthAndWatchJWT","Output":"/home/runner/work/etcd/etcd/bin/etcd (TestCtlV3AuthAndWatchJWT-test-0) (15561): {\"level\":\"warn\",\"ts\":\"2024-03-08T07:05:15.007417Z\",\"caller\":\"auth/jwt.go:66\",\"msg\":\"failed to parse a JWT token\",\"error\":\"Token is expired\"}\r\n"}
2024-03-08T07:05:15.0119638Z {"Time":"2024-03-08T07:05:15.007610442Z","Action":"output","Package":"go.etcd.io/etcd/tests/v3/e2e","Test":"TestCtlV3AuthAndWatchJWT","Output":"/home/runner/work/etcd/etcd/bin/etcd (TestCtlV3AuthAndWatchJWT-test-0) (15561): {\"level\":\"warn\",\"ts\":\"2024-03-08T07:05:15.007475Z\",\"caller\":\"auth/store.go:1076\",\"msg\":\"invalid auth token\",\"token\":\"***"}\r\n"}

2024-03-08T07:05:15.2553257Z {"Time":"2024-03-08T07:05:15.007948053Z","Action":"output","Package":"go.etcd.io/etcd/tests/v3/e2e","Test":"TestCtlV3AuthAndWatchJWT","Output":"/home/runner/work/etcd/etcd/bin/etcdctl (/home/runner/work/etcd/etcd/bin/etcdctl_--endpoints=http://localhost:20000_--dial-timeout=7s_--user=test-user:pass_watch_key_--rev_1) (15623): watch was canceled (rpc error: code = Unauthenticated desc = etcdserver: invalid auth token)\r\n"}
2024-03-08T07:05:15.2559479Z {"Time":"2024-03-08T07:05:15.008528444Z","Action":"output","Package":"go.etcd.io/etcd/tests/v3/e2e","Test":"TestCtlV3AuthAndWatchJWT","Output":"/home/runner/work/etcd/etcd/bin/etcdctl (/home/runner/work/etcd/etcd/bin/etcdctl_--endpoints=http://localhost:20000_--dial-timeout=7s_--user=test-user:pass_watch_key_--rev_1) (15623): Error: watch is canceled by the server\r\n"}
2024-03-08T07:05:15.2565723Z {"Time":"2024-03-08T07:05:15.013482979Z","Action":"output","Package":"go.etcd.io/etcd/tests/v3/e2e","Test":"TestCtlV3AuthAndWatchJWT","Output":"    ctl_v3_auth_test.go:287: watchTest #0: ctlV3Watch error (match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [5] after running [/home/runner/work/etcd/etcd/bin/etcdctl --endpoints=http://localhost:20000 --dial-timeout=7s --user=test-user:pass watch key --rev 1]], last lines:\n"}
2024-03-08T07:05:15.2570620Z {"Time":"2024-03-08T07:05:15.013497486Z","Action":"output","Package":"go.etcd.io/etcd/tests/v3/e2e","Test":"TestCtlV3AuthAndWatchJWT","Output":"        watch was canceled (rpc error: code = Unauthenticated desc = etcdserver: invalid auth token)\r\n"}
2024-03-08T07:05:15.2573693Z {"Time":"2024-03-08T07:05:15.013504229Z","Action":"output","Package":"go.etcd.io/etcd/tests/v3/e2e","Test":"TestCtlV3AuthAndWatchJWT","Output":"        Error: watch is canceled by the server\r\n"}
2024-03-08T07:05:15.2576128Z {"Time":"2024-03-08T07:05:15.01351024Z","Action":"output","Package":"go.etcd.io/etcd/tests/v3/e2e","Test":"TestCtlV3AuthAndWatchJWT","Output":"        )\n"}

Anything else we need to know?

Please update the test to use longer TTL... I am not sure whether other tests are also impacted by this race condition.

https://github.com/etcd-io/etcd/blob/main/tests/e2e/ctl_v3_auth_test.go#L33

@ah8ad3
Copy link
Contributor

ah8ad3 commented Mar 12, 2024

I might be able to fix this if it's ok i'll take a look at this.

@chaochn47
Copy link
Member Author

Thanks @ah8ad3! Assigned to you. Just got hit again in https://github.com/etcd-io/etcd/actions/runs/8289059783/job/22684775771?pr=17555...

@ah8ad3
Copy link
Contributor

ah8ad3 commented Mar 15, 2024

As i found we have a TestAuthJWTExpire that tests the expire conditions, so i think we are good to increase the TTL to something larger, Also we had the same issue here too #16523 that got fixed with increasing the ttl. I'm testing it to see affects on other tests meanwhile i thought sharing this and ask for your'e opinion would be a good idea too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

2 participants