Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run non dask TPC-H benchmarks on schedule or label #1083

Merged
merged 10 commits into from
Oct 24, 2023

Conversation

milesgranger
Copy link
Contributor

#1044 (comment)

@crusaderky do you have a preference if these are ran as frequently as existing tests, or shall I add/extend a workflow to run once a day/ect?

@milesgranger milesgranger changed the title Remove flag skiping polars, duckdb, and pyspark tpch tests Remove flag skipping polars, duckdb, and pyspark tpch tests Oct 16, 2023
@crusaderky
Copy link
Contributor

I think it makes sense to run them only on nightly or when the PR contains a specific tag, exactly like with workflows?

- name: Determine if workflows should be run
# Run workflows on PRs with `workflows` label and nightly cron job
if: |
github.event_name == 'schedule'
|| (github.event_name == 'pull_request' && contains(github.event.pull_request.labels.*.name, 'workflows'))
run: |
# Put EXTRA_OPTIONS into $GITHUB_ENV so it can be used in subsequent workflow steps
export EXTRA_OPTIONS="--run-workflows"
echo $EXTRA_OPTIONS
echo EXTRA_OPTIONS=$EXTRA_OPTIONS >> $GITHUB_ENV

@milesgranger
Copy link
Contributor Author

@crusaderky see what you think about the added block. No rush though, thanks! 🙏

@mrocklin
Copy link
Member

No rush though, thanks!

I'll actually add a bit of a rush to this. I'd like to remove the need to add this keyword to run TPCH tests. Ideally I'd like to do this well before Thursday's demo day (so that I can write and update a readme)

@milesgranger
Copy link
Contributor Author

milesgranger commented Oct 19, 2023

Gentle ping @crusaderky

I believe @mrocklin is advocating we ought to just run them all the time, in that case maybe the block added in c44c916 ought to go in favor of just removing the flag done originally in 550c245

@milesgranger milesgranger changed the title Remove flag skipping polars, duckdb, and pyspark tpch tests Run non dask TPC-H benchmarks on schedule or label Oct 19, 2023
@crusaderky
Copy link
Contributor

LGTM; retesting in both configurations before merge

@crusaderky
Copy link
Contributor

crusaderky commented Oct 19, 2023

No rush though, thanks!

I'll actually add a bit of a rush to this. I'd like to remove the need to add this keyword to run TPCH tests. Ideally I'd like to do this well before Thursday's demo day (so that I can write and update a readme)

+1 for this. I assume @mrocklin is experiencing the same kind of frustration that I face daily when running @slow tests in distributed locally. I would rather have a flag that is ON by default and is added to ci:

on the command line: pytest -m 'not nondask_tpch'
in the header of the test files: pytestmark = pytest.mark.nondask_tpch

This follows the same design of avoid_ci in distributed.

@mrocklin
Copy link
Member

I'm actually fine with these never being run on CI. Mostly I do everything here manually. I want to target this directory

py.test tests/tpch

And have it run. I don't want to have to specify --tpch-non-dask every time I do this.

I think that my personal preference here is for us to not flag things on or off, but instead be more precise with our py.test foo commands about which directories to run.

@mrocklin mrocklin mentioned this pull request Oct 19, 2023
@milesgranger milesgranger force-pushed the remove-flag-skipping-duckdb-pyspark-polars branch from 2e64e63 to 4519fef Compare October 20, 2023 07:42
@milesgranger
Copy link
Contributor Author

That makes sense!
I've added some logic to check if the path(s) passed to pytest include any of the workflows or tpch tests then auto adds the appropriate flag to run those tests. (or technically, doesn't apply the skip)

A simple example:

pytest tests/ -k test_polars will not run any TPC-H Polars tests, where pytest tests/tpch -k test_polars will, because it includes the path into the tpch tests; same goes for pytest tests/workflows will then run all workflows, where pytest tests/ won't without the appropriate --run-workflows flag.

@milesgranger milesgranger force-pushed the remove-flag-skipping-duckdb-pyspark-polars branch from 4519fef to 79f510e Compare October 20, 2023 07:59
@crusaderky
Copy link
Contributor

LGTM; rerunning in both use cases before I approve

@crusaderky crusaderky added workflows Related to representative Dask user workflows tpch labels Oct 24, 2023
@crusaderky crusaderky closed this Oct 24, 2023
@crusaderky crusaderky reopened this Oct 24, 2023
@crusaderky crusaderky merged commit 40a036e into main Oct 24, 2023
26 checks passed
@crusaderky crusaderky deleted the remove-flag-skipping-duckdb-pyspark-polars branch October 24, 2023 23:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tpch workflows Related to representative Dask user workflows
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants