
Add manual benchmark workflow and S3 result persistence #429

Open · wants to merge 13 commits into base: main
Conversation

fabianliebig
Collaborator

@fabianliebig fabianliebig commented Nov 14, 2024

Hi everyone,

This pull request introduces a class and a workflow designed to store the results of a benchmark run on an S3 bucket.

The key used for storage is composed of the identifier of the benchmark itself, the branch, the release version, the current date, the commit hash, and the workflow run ID as a fallback, in that order. The fallback is necessary because the visuals are built over a selected range of dates that cannot reflect the exact time, creating theoretical edge cases where otherwise identical keys could collide. Furthermore, the boto3 package is added to interact with AWS components.
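For illustration, the ordering described above can be sketched as a small key builder. This is a hypothetical helper, not the PR's actual code; the separator, date format, and function name are assumptions:

```python
from datetime import datetime, timezone


def build_result_key(
    benchmark_id: str,
    branch: str,
    version: str,
    commit_hash: str,
    workflow_run_id: str,
) -> str:
    """Compose an S3 object key from run metadata, most to least specific.

    The component order follows the PR description; separator and date
    format are illustrative assumptions.
    """
    date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    return (
        f"{benchmark_id}/{branch}/{version}/"
        f"{date}/{commit_hash}/{workflow_run_id}/result.json"
    )
```

Because the date alone cannot distinguish two runs on the same day, the trailing workflow run ID is what keeps keys unique.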

I will update the changelog and the lockfiles once the change is final.

I look forward to your feedback.

@AVHopp
Collaborator

AVHopp commented Nov 18, 2024

@fabianliebig Looking at your commit history, I think the "fixup" did not do what you wanted it to do; you might want to check that again ;)

@fabianliebig fabianliebig force-pushed the feature/persisting-benchmarking-results branch from f1cdf02 to 87dce85 on November 18, 2024 at 09:10
Collaborator

Unrelated: The Build Docs Action currently fails since a link to an old NeurIPS paper in the Transfer Learning user guide seems to be broken. Need to create a separate PR for this.

Collaborator

@AVHopp AVHopp left a comment

Code looks quite clean and nice. Some things need minor adjustments :)

@@ -0,0 +1,89 @@
"""Classes for persisting and loading experiment results to a S3-Bucket."""
Collaborator

"using a S3-Bucket", otherwise it will be hard to find a preposition fitting for both "persisting" and "loading"

Collaborator Author

Currently only persisting is implemented, and I forgot to update the module description. I would rather remove the loading part for now :)

_bucket_name: str = field(validator=instance_of(str), init=False)
"""The name of the S3 bucket where the results are stored."""

_object_session = boto3.session.Session()
Collaborator

Can you update the PR description with some information about what boto3 is? It seems rather important and central; I would appreciate a bit more information on it.

Collaborator Author

Sure, that is the software package provided by AWS (the AWS SDK for Python) to interact with its components.

benchmarks/persistance/s3_persistance.py: two outdated comment threads (resolved)

@_branch.default
def _default_branch(self) -> str:
repo = Repo(search_parent_directories=True)
Collaborator

Just to make sure: I guess the search_parent_directories is necessary since the call executes the benchmark module, which is not itself the repository, and hence needs to walk up to the BayBE repository?

Collaborator Author

Yes, exactly :D
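For illustration, the same upward repository discovery can be mimicked with plain git via the stdlib (a hypothetical sketch, not the PR's GitPython code; git itself walks up the directory tree to find the enclosing repository, which is what search_parent_directories enables in GitPython):

```python
import subprocess


def current_branch() -> str:
    """Return the checked-out branch name of the enclosing repository.

    git discovers the repository by walking up from the current working
    directory, mirroring Repo(search_parent_directories=True).
    """
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```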

)

result_dict = result.to_dict()
result_dict["best_possible_result"] = benchmark.best_possible_result
Collaborator

Not sure if possible (or even within scope), but would it be possible to get the respective attributes of the benchmark object and have these automatically converted to key-value pairs? This would change the logic from "explicitly including everything" to "explicitly excluding only a few fields" and would make this very robust to potential changes in the benchmark. I would however like to hear @AdrianSosic's and @Scienfitz's opinions on that.

Collaborator Author

@fabianliebig fabianliebig Nov 19, 2024

@AVHopp I tried to adopt your idea as closely as possible; let me know what you think :)
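The "explicitly excluding only a few fields" idea can be sketched with stdlib dataclasses (the PR itself uses attrs, where attrs.asdict with a filter argument achieves the same effect; the class and field names below are made up for illustration):

```python
from dataclasses import dataclass, fields


@dataclass
class Benchmark:
    """Illustrative stand-in for the real benchmark class."""

    identifier: str
    description: str
    best_possible_result: float
    callable_fn: object = None  # assumed non-serializable, to be excluded


# Fields to leave out of the persisted result; everything else is included.
EXCLUDED_FIELDS = {"callable_fn"}


def benchmark_metadata(benchmark: Benchmark) -> dict:
    """Collect all attributes as key-value pairs, excluding only a few."""
    return {
        f.name: getattr(benchmark, f.name)
        for f in fields(benchmark)
        if f.name not in EXCLUDED_FIELDS
    }
```

New fields added to the class later would then be persisted automatically, which is the robustness the review comment asks about.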

result_dict["optimal_function_inputs"] = benchmark.optimal_function_inputs
result_dict["description"] = benchmark.description

client.put_object(
Collaborator

I guess this is the call that actually stores stuff in the S3-Bucket, right? What happens if this call fails for whatever reason, does this yield a reasonable error message or should this be put within a try-except block?

Collaborator Author

Boto3 comes with its own exception types which have (depending on the error) meaningful semantics (permission errors, ServiceUnavailableException, etc.). It is not always perfectly described where the error is coming from, but at least it gives you a hint :)

@@ -2,12 +2,16 @@
# Run this via 'python -m benchmarks' from the root directory.

from benchmarks.domains import BENCHMARKS
from benchmarks.persistance import S3ExperimentResultPersistence


def main():
Collaborator

What happens if someone now runs a benchmark locally? I guess since the environment variables necessary for persisting are not set, the code would crash after finishing the first benchmark, right? So maybe the persistance_handler call should only be made if the corresponding environment variables are set.
Also, would this mean that we can get rid of the validation within the object itself? Would that be reasonable or bad code design? (summoning @AdrianSosic here)

Collaborator Author

That would be the case, yes. Would you want to store it locally? That might be reasonable, so the result can be used afterwards instead of relying on volatile objects and printed output.
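The guard proposed above could look roughly like this (a sketch; the function name is made up, and the environment variable is the bucket variable mentioned later in this thread):

```python
import os


def s3_persistence_configured() -> bool:
    """Return True only when the env vars required for S3 persistence exist.

    A local benchmark run can then skip (or redirect) persistence instead
    of crashing after the first benchmark finishes.
    """
    required = ("TEST_RESULT_S3_BUCKET",)
    return all(name in os.environ for name in required)
```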

Collaborator

So one option I see here is to use an abstraction layer over S3 that can connect to different file storage backends. One library for that is fsspec (it used to be part of foundry-dev-tools, not sure if it still is). With that, you can hide the backend behind an API, which has the advantage that you can flexibly configure where stuff actually gets stored. That way, we could configure it such that results get written to a default location on disk when run locally, and both the S3 path and the local one can be configured via the same BAYBE_BENCHMARK_PERSISTENCE_PATH env variable.
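A minimal sketch of the fsspec idea, assuming fsspec is installed (the memory backend stands in for s3:// or file:// here, and the helper name and path are illustrative):

```python
import fsspec


def persist_payload(target_url: str, payload: bytes) -> None:
    """Write bytes to whatever backend the URL scheme selects.

    fsspec resolves the backend from the scheme ("s3://", "file://",
    "memory://", ...), so the same call works locally and against S3.
    """
    with fsspec.open(target_url, "wb") as f:
        f.write(payload)
```

Only the URL prefix would change between CI (s3://) and a local run (file://), which is exactly the single-env-variable configuration described above.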

Collaborator

But you can check what foundry-dev-tools uses nowadays; perhaps there are more modern alternatives available these days. The package is quite solid, though, I think.

Collaborator Author

@AdrianSosic I've added another suggestion, but feel free to propose further improvements and restructurings :)

.github/workflows/manual_benchmark.yml: outdated comment thread (resolved)
id: generate-token
uses: actions/create-github-app-token@v1
with:
app-id: ${{ vars.APP_ID }}
Collaborator

This is the part where we still need to add some information/configuration to the repository itself, right? For the sake of documentation, can you elaborate a little bit on that?

Collaborator Author

Sure. I think at some point a README will be necessary :D. The following information will need to be added:

  • APP_ID: The autogenerated number of the GitHub app used to generate a token for fetching temporary credentials for the runner.
  • APP_PRIVATE_KEY: The key from the GitHub app to assume its role.
  • AWS_ROLE_TO_ASSUME: The role to assume within the AWS-factory (must be deposited as a role to assume for the OIDC authentication).
  • TEST_RESULT_S3_BUCKET: The name of the bucket where the results should be stored.

Refactor S3 persistence class docstrings for clarity and consistency
Rename workflow run ID variable for clarity and update related documentation
Update environment variable references for benchmark persistence
Rename manual benchmark workflow for clarity
Fix spelling errors in persistence module and update import statements
Collaborator

@AdrianSosic AdrianSosic left a comment

Hi @fabianliebig, here are a first few thoughts

.github/workflows/manual_benchmark.yml Outdated Show resolved Hide resolved
@@ -145,6 +145,7 @@ benchmarking = [
"baybe[chem]",
"baybe[onnx]",
"baybe[simulation]",
"boto3>=1.0.0",
Collaborator

Was there any specific reason for 1.0.0? If not, we could also just ditch the version completely. Let me know

Collaborator Author

There was a jump from 0.0.22 to 1.0.0 in 2015, which usually indicates a major breaking change, and since the current version is 1.35.64, I thought the choice was reasonable. We could also drop the version, but I would want to test against the smallest allowed version first, just to be sure.




@define
class S3ExperimentResultPersistence:
Collaborator

I haven't really looked at the details of the class yet, but already the name indicates to me that a class is perhaps not the ideal architecture here. What I mean is that the class does not actually represent anything; it is not "an object of something". Rather, it is only responsible for "doing" one thing, namely persisting stuff, and that is reflected by the somewhat weird name.

My initial reflex was: this should probably just be a function that uses clever dependency injection. What I see by taking a glimpse at your code is that you need one part that is responsible for connecting to the file backend and one part that specifies the BayBE context. Would it make sense, e.g., to define some sort of "connector" object/protocol that then simply gets passed to persist_result? I think this would also play very nicely with the fsspec idea, since you could then still easily abstract away the actual type of backend used.
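The connector idea could be sketched roughly like this (all names are invented for illustration; the real design would be up to the PR):

```python
from typing import Protocol


class ResultConnector(Protocol):
    """Anything that can store a payload under a key, backend-agnostic."""

    def write(self, key: str, payload: bytes) -> None: ...


class InMemoryConnector:
    """Trivial backend, e.g. for tests; an S3 or fsspec connector would
    implement the same single method."""

    def __init__(self) -> None:
        self.store: dict[str, bytes] = {}

    def write(self, key: str, payload: bytes) -> None:
        self.store[key] = payload


def persist_result(connector: ResultConnector, key: str, payload: bytes) -> None:
    # The BayBE-specific part (key construction, metadata) stays outside;
    # the backend is injected, so swapping S3 for local disk changes only
    # the connector argument.
    connector.write(key, payload)
```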

Collaborator Author

@fabianliebig fabianliebig Nov 18, 2024

The more important question is whether we want to hide the method by which an item is created. I see that class more as a kind of interface implementation (which it was initially) to be extended if needed. If we use fsspec and there is an S3 implementation for it, we must define the path explicitly. This necessitates different key constructions, which the function would need to handle. We could build a wrapper around the path construction to access everything as a file (and don't get me wrong, I like that Unix-like idea). However, S3 is not a file system but an object store and provides different consistency models; we could also use a file system such as EFS, or a database. It seems to me, though, that this might fall under the YAGNI principle.

Add Benchmark class with serialization support
Refactor S3ExperimentResultPersistence to streamline result dictionary creation
Sanitize branch names for S3 more comprehensible
…ter flexibility and add local persistence option

Remove unused import for make_dict_unstructure_fn in config.py
…ter flexibility and add local persistence option

Implement method overrides for persist_new_result in S3Persistence and LocalPersistence classes
3 participants