The PostCommit XVR Direct job is flaky #30517

Open
github-actions bot opened this issue Mar 5, 2024 · 10 comments

Comments

github-actions bot commented Mar 5, 2024

The PostCommit XVR Direct job is failing over 50% of the time.
Please visit https://github.com/apache/beam/actions/workflows/beam_PostCommit_XVR_Direct.yml?query=is%3Afailure+branch%3Amaster to see the logs.
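The thread does not show how the bot arrives at the "over 50%" figure. As a rough illustration only, a failure-rate check over recent run conclusions might look like the sketch below. The function names, the threshold parameter, and the sample data are all hypothetical and are not taken from the Beam repository.

```python
def failure_rate(conclusions):
    """Fraction of runs whose conclusion is "failure" (0.0 for an empty list)."""
    if not conclusions:
        return 0.0
    return sum(c == "failure" for c in conclusions) / len(conclusions)


def is_flaky(conclusions, threshold=0.5):
    """Hypothetical rule: flag the workflow when more than `threshold` of runs failed."""
    return failure_rate(conclusions) > threshold


# Made-up sample of recent run conclusions, newest first.
recent = ["failure", "success", "failure", "failure", "success"]
print(failure_rate(recent), is_flaky(recent))
```

With three failures out of five runs, the sample crosses the assumed 50% threshold, so the workflow would be flagged.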

Contributor Author

Reopening since the workflow is still flaky


tvalentyn commented Aug 29, 2024

This seems to have been perma-red (consistently failing) for a while. Log excerpt from a recent run:

2024-08-29T17:46:16.8649631Z System Go installation: /usr/local/go/bin/go is go version go1.21.0 linux/amd64; Preparing to use /home/runner/go/bin/go1.22.5
2024-08-29T17:46:17.0648275Z go1.22.5: already downloaded in /home/runner/sdk/go1.22.5
2024-08-29T17:46:17.0665947Z /home/runner/go/bin/go1.22.5 test -v ./test/integration/xlang ./test/integration/io/xlang/... -p 3 -v -timeout 3h --runner=portable --project=apache-beam-testing --region=us-central1 --environment_type=DOCKER --environment_config=apache/beam_go_sdk:dev --staging_location=gs://temp-storage-for-end-to-end-tests/staging-validatesrunner-test/test10288 --temp_location=gs://temp-storage-for-end-to-end-tests/temp-validatesrunner-test/test10288 --endpoint=localhost:34069 --kafka_jar=/runner/_work/beam/beam/sdks/java/testing/kafka-service/build/libs/beam-sdks-java-testing-kafka-service-testKafkaService-2.60.0-SNAPSHOT.jar --expansion_jar=io:/runner/_work/beam/beam/sdks/java/io/expansion-service/build/libs/beam-sdks-java-io-expansion-service-2.60.0-SNAPSHOT.jar --expansion_jar=schemaio:/runner/_work/beam/beam/sdks/java/extensions/schemaio-expansion-service/build/libs/beam-sdks-java-extensions-schemaio-expansion-service-2.60.0-SNAPSHOT.jar --expansion_jar=debeziumio:/runner/_work/beam/beam/sdks/java/io/debezium/expansion-service/build/libs/beam-sdks-java-io-debezium-expansion-service-2.60.0-SNAPSHOT.jar --expansion_jar=gcpio:/runner/_work/beam/beam/sdks/java/io/google-cloud-platform/expansion-service/build/libs/beam-sdks-java-io-google-cloud-platform-expansion-service-2.60.0-SNAPSHOT.jar --bq_dataset=apache-beam-testing.beam_bigquery_io_test_temp --bt_instance=projects/apache-beam-testing/instances/beam-test --expansion_addr=test:localhost:39707
2024-08-29T17:46:17.0689704Z go: downloading cloud.google.com/go/bigtable v1.29.0
2024-08-29T17:46:17.0691189Z go: downloading github.com/lib/pq v1.10.9
2024-08-29T17:46:17.0693048Z go: downloading github.com/go-sql-driver/mysql v1.8.1
2024-08-29T17:46:17.1648532Z go: downloading filippo.io/edwards25519 v1.1.0
2024-08-29T17:46:17.2648855Z go: downloading go.opentelemetry.io/otel/sdk/metric v1.24.0
2024-08-29T17:46:17.2650985Z go: downloading cloud.google.com/go/monitoring v1.20.3
2024-08-29T17:46:17.2652702Z go: downloading go.opentelemetry.io/otel/sdk v1.24.0
2024-08-29T19:32:10.7892423Z ##[error]The operation was canceled.
2024-08-29T19:32:10.8228117Z ##[group]Run actions/upload-artifact@v4
2024-08-29T19:32:10.8229144Z with:
2024-08-29T19:32:10.8230291Z   name: JUnit Test Results
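To see where a run like this stalled, it can help to look for the longest silence between consecutive log lines. The following is a minimal sketch, assuming every line starts with a timestamp in the format shown in the excerpt above; the helper names `parse_ts` and `largest_gap` are made up for illustration.

```python
from datetime import datetime, timedelta


def parse_ts(line):
    """Parse the leading UTC timestamp of a log line."""
    stamp = line.split(" ", 1)[0]
    # The logs carry 7 fractional digits; datetime handles 6, so trim to microseconds.
    date_part, frac = stamp.rstrip("Z").split(".")
    return datetime.strptime(f"{date_part}.{frac[:6]}", "%Y-%m-%dT%H:%M:%S.%f")


def largest_gap(lines):
    """Return (gap, line_before, line_after) for the longest pause between adjacent lines."""
    stamped = [(parse_ts(line), line) for line in lines]
    pairs = zip(stamped, stamped[1:])
    (t0, before), (t1, after) = max(pairs, key=lambda p: p[1][0] - p[0][0])
    return t1 - t0, before, after


# Three lines copied from the excerpt above.
log = [
    "2024-08-29T17:46:17.2652702Z go: downloading go.opentelemetry.io/otel/sdk v1.24.0",
    "2024-08-29T19:32:10.7892423Z ##[error]The operation was canceled.",
    "2024-08-29T19:32:10.8228117Z ##[group]Run actions/upload-artifact@v4",
]
gap, before, after = largest_gap(log)
print(gap)
print(after)
```

On these lines the longest gap is about 1 hour 46 minutes, between the last dependency download and the cancellation message, so the excerpt contains no test output at all before the run was canceled.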

@tvalentyn

Looks like we have an xlang test that runs with a 3-hour time limit; it passes on Python 3.12 and fails on Python 3.8 after timing out at 2.5 hours.

@tvalentyn

The failing test is the GoUsingJava xlang suite, which does not use Python. The job passes on Python 3.12 because the 3.12 suite excludes the GoUsingJava xlang variant, since we only need to run it for one Python version. It appears to be a known issue that the GoUsingJava xlang scenario does not work on some runners.
cc: @Abacn @lostluck, who can correct me if they disagree with this assessment.

@lostluck

It's a known issue, and it's also not a release blocker. The fact is that we have spent very little time making xlang for Go robust, and the people tasked with that have moved on. This is also not something that would be common for users, since they'd need to manually spin up the Python Portable runner.


Abacn commented Sep 11, 2024

Last time I checked, this was a few failing xlang tests, and now it's timing out. New issues have likely accumulated, which is unfortunately common for tests that stay perma-red for a long time.

For the same reason, I agree with disabling the GoUsingJava part of the test so that the other tasks can still be monitored.

Abacn closed this as completed Sep 18, 2024
github-actions bot added this to the 2.60.0 Release milestone Sep 18, 2024
github-actions bot reopened this Oct 2, 2024

github-actions bot commented Oct 2, 2024

Reopening since the workflow is still flaky


Abacn commented Oct 2, 2024

This was pullLicense flakiness, fixed by #32626. Moving to the next milestone for monitoring.

Abacn modified the milestones: 2.60.0 Release, 2.61.0 Release Oct 2, 2024
@damccorm

github-actions bot reopened this Oct 31, 2024
Contributor Author

Reopening since the workflow is still flaky

damccorm removed this from the 2.61.0 Release milestone Oct 31, 2024