Add manual benchmark workflow and S3 result persistence #429
base: main
Conversation
@fabianliebig Looking at your commit history, I think the "fixup" did not do what you wanted it to do, you might want to check that out again ;)
(force-pushed from f1cdf02 to 87dce85)
Unrelated: The "Build Docs" action currently fails since a link to an old NeurIPS paper in the Transfer Learning user guide seems to be broken. Need to create a separate PR for this.
Code looks quite clean and nice. Some things need minor adjustments :)
@@ -0,0 +1,89 @@
"""Classes for persisting and loading experiment results to a S3-Bucket."""
"using a S3-Bucket", otherwise it will be hard to find a preposition fitting for both "persisting" and "loading"
Currently only persisting is implemented and I forgot to change the module description. I would rather remove the loading part for now :)
_bucket_name: str = field(validator=instance_of(str), init=False)
"""The name of the S3 bucket where the results are stored."""

_object_session = boto3.session.Session()
Can you update the PR description with some information about what this boto3 is? Seems to be rather important and central, would appreciate a bit more information on it.
Sure, that is the software package provided by AWS (the AWS SDK for Python) to interact with AWS components.
@_branch.default
def _default_branch(self) -> str:
    repo = Repo(search_parent_directories=True)
Just to make sure: I guess the search_parent_directories is necessary since the call will execute the benchmark module, which is not the repository, and hence needs to go up to the BayBE repository?
Yes, exactly :D
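For context, the mechanism behind GitPython's `search_parent_directories=True` can be sketched as a walk up the directory tree until a `.git` folder is found; the helper below is an illustrative stand-in, not code from the PR.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Walk upwards from `start` until a directory containing `.git` is found,
    mimicking what GitPython's `search_parent_directories=True` does."""
    for candidate in [start, *start.parents]:
        if (candidate / ".git").exists():
            return candidate
    return None
```

This is why invoking the benchmark module from anywhere inside the BayBE checkout still resolves the enclosing repository.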
)

result_dict = result.to_dict()
result_dict["best_possible_result"] = benchmark.best_possible_result
Not sure if possible (or even within scope), but would it be possible to get the respective attributes of the benchmark object and have these automatically be converted to key-value pairs? This would change the logic from "explicitly including everything" to "explicitly excluding only a few fields" and would make this very robust to potential changes in the benchmark. Would however like to hear @AdrianSosic's and @Scienfitz's opinions on that.
@AVHopp I tried to adopt your idea as closely as possible, let me know what you think :)
result_dict["optimal_function_inputs"] = benchmark.optimal_function_inputs
result_dict["description"] = benchmark.description

client.put_object(
I guess this is the call that actually stores stuff in the S3-Bucket, right? What happens if this call fails for whatever reason, does this yield a reasonable error message or should this be put within a try-except block?
Boto3 comes with different self-defined exceptions which have (depending on the error) meaningful semantics (permission errors, ServiceUnavailableException, etc.). It is not always perfectly described where the error is coming from, but at least it gives you a hint :).
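If one did want the explicit try-except suggested above, a minimal sketch could wrap the upload and attach the target key to the error; the function name and message are assumptions, and the fallback import only keeps the sketch runnable without botocore installed.

```python
try:
    from botocore.exceptions import ClientError
except ImportError:  # allow the sketch to run without boto3/botocore
    class ClientError(Exception):
        pass


def safe_put_object(client, bucket: str, key: str, body: bytes) -> None:
    """Upload `body`, re-raising boto3's ClientError with the target key."""
    try:
        client.put_object(Bucket=bucket, Key=key, Body=body)
    except ClientError as err:
        # The original ClientError (AccessDenied, service unavailable, ...)
        # stays attached as the cause.
        raise RuntimeError(
            f"Failed to persist benchmark result under s3://{bucket}/{key}"
        ) from err
```

Since the client is injected, a stub object with a `put_object` method can stand in for a real S3 client when testing this wrapper.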
@@ -2,12 +2,16 @@
# Run this via 'python -m benchmarks' from the root directory.

from benchmarks.domains import BENCHMARKS
from benchmarks.persistance import S3ExperimentResultPersistence


def main():
What happens if someone now runs a benchmark locally? I guess since the environment variables necessary for persisting are not set, the code would crash after finishing the first benchmark, right? So maybe the persistance_handler call should only be done if the corresponding environment variables are set.
Also, would this then mean that we can get rid of the validation within the object itself? Would this be reasonable or bad code design? (summoning @AdrianSosic here)
That would be the case, yes. Would you want to store it locally? That might be reasonable so that the results can be reused afterwards instead of relying on volatile objects and printed output.
So one option I see here is to use an abstraction layer over S3 that can connect to different file storage backends. One library for that is fsspec (used to be part of foundry-dev-tools, not sure if it still is). With that, you can hide the backend behind an API, which has the advantage that you can flexibly configure where stuff actually gets stored. That way, we could configure it such that results get written to a default location on disk when run locally, and both the S3 path and the local one can be configured via the same BAYBE_BENCHMARK_PERSISTENCE_PATH env variable.
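The fsspec idea could look roughly like this; the env-variable name comes from the comment above, while the default path, the key layout, and the function name are assumptions for illustration.

```python
import json
import os

import fsspec  # backend-agnostic file-system abstraction


def persist_result(key: str, payload: dict) -> None:
    """Write `payload` under `key`, letting fsspec pick the backend from
    the URL scheme (s3://..., file://..., ...)."""
    base = os.environ.get(
        "BAYBE_BENCHMARK_PERSISTENCE_PATH", "file:///tmp/benchmark-results"
    )
    # auto_mkdir lets the local backend create missing parent directories.
    with fsspec.open(f"{base}/{key}", "w", auto_mkdir=True) as file:
        json.dump(payload, file)
```

Pointing the variable at `s3://some-bucket/prefix` or a local directory would then switch backends without changing the calling code.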
But you can check what foundry-dev-tools uses nowadays; perhaps there are more modern alternatives available. Although the package is quite solid, I think.
@AdrianSosic I've added another suggestion but feel free to exchange on further improvements and restructurings :)
id: generate-token
uses: actions/create-github-app-token@v1
with:
  app-id: ${{ vars.APP_ID }}
This is the part where we still need to add some information/configuration to the repository itself, right? For the sake of documentation, can you elaborate a little bit on that?
Sure. I think at some point a README will be necessary :D. The following information will need to be added:
- APP_ID: The autogenerated number of the GitHub app used to generate a token for fetching temporary credentials for the runner.
- APP_PRIVATE_KEY: The key from the GitHub app to assume its role.
- AWS_ROLE_TO_ASSUME: The role to assume within the AWS-factory (must be deposited as a role to assume for the OIDC authentication)
- TEST_RESULT_S3_BUCKET: The name of the bucket where the results should be stored.
Refactor S3 persistence class docstrings for clarity and consistency
Rename workflow run ID variable for clarity and update related documentation
Update environment variable references for benchmark persistence
Rename manual benchmark workflow for clarity
Fix spelling errors in persistence module and update import statements
Hi @fabianliebig, here are a first few thoughts
@@ -145,6 +145,7 @@ benchmarking = [
    "baybe[chem]",
    "baybe[onnx]",
    "baybe[simulation]",
    "boto3>=1.0.0",
Was there any specific reason for 1.0.0? If not, we could also just ditch the version completely. Let me know.
There was a jump from 0.0.22 to 1.0.0 in 2015, which usually indicates a major breaking change, and since the current version is 1.35.64 I thought the choice would be reasonable. We could also drop the version, but I would want to pin the smallest working version just to be sure.
@define
class S3ExperimentResultPersistence:
I haven't really looked at the details of the class yet, but already the name indicates to me that a class is perhaps not the ideal architecture here. What I mean is that the class actually does not represent anything, it's not "an object of something". Rather, it's only responsible for "doing" one thing, namely persisting stuff. And that is reflected by the somewhat weird name.
My initial reflex was: this should probably just be a function that uses clever dependency injection. What I see by taking a glimpse at your code is that you need some part that is responsible for "connecting to the file backend" and one part that "specifies baybe context". Would it make sense, e.g., to define some sort of "connector" object / protocol that then gets simply passed to persist_result? I think this would also play very nicely with the fsspec idea, since you can then still easily abstract away the actual type of backend used.
The more important question is whether we want to hide the method by which an item is created. I see that class more as some kind of interface implementation (which it was initially) to extend if needed. If we use fsspec and there is an S3 implementation for it, we must define the path explicitly. This necessitates different key constructions, which the function would need to handle. We could build a wrapper around the path construction to access everything as a file (and don't get me wrong, I like that Unix-like idea). S3 is not a file system but a database and provides different consistency models, but we could also use a file system such as EFS or a database. However, it seems to me that this might fall under the YAGNI principle.
Rename workflow to 'Run Benchmark'
Add Benchmark class with serialization support
Refactor S3ExperimentResultPersistence to streamline result dictionary creation
Sanitize branch names for S3 more comprehensible
…xibility and add local persistence option
…ter flexibility and add local persistence option Remove unused import for make_dict_unstructure_fn in config.py
…ter flexibility and add local persistence option Implement method overrides for persist_new_result in S3Persistence and LocalPersistence classes
Hi everyone,
This pull request introduces a class and a workflow designed to store the results of a benchmark run on an S3 bucket.
The key used for storage includes the identifier for the benchmark itself, the branch used, the release version, the current date, the commit hash, and the workflow ID as a fallback, in that order. This fallback is necessary because a range of dates is selected when building the visuals, which cannot accurately reflect the time, creating theoretical edge cases where identical keys are possible. Furthermore, the boto3 package is added to interact with AWS components.
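The key layout described above could be sketched as follows; the exact separators and formatting in the PR may differ, and all argument values are placeholders.

```python
from datetime import date


def build_result_key(
    benchmark_id: str,
    branch: str,
    version: str,
    commit_hash: str,
    workflow_run_id: str,
) -> str:
    """Compose the storage key: benchmark / branch / release version /
    date / commit hash / workflow run ID (tie-breaker against collisions)."""
    return "/".join(
        [
            benchmark_id,
            branch,
            version,
            date.today().isoformat(),
            commit_hash,
            workflow_run_id,
        ]
    )
```

Since the workflow run ID is unique per run, two runs of the same benchmark on the same commit and day still produce distinct keys.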
I will update the changelog and the lockfiles once the change is final.
I look forward to your feedback.