Add manual benchmark workflow and S3 result persistence #429
base: main
Conversation
@fabianliebig Looking at your commit history, I think the "fixup" did not do what you wanted it to do, you might want to check that out again ;)
(force-pushed from f1cdf02 to 87dce85)
Unrelated: The "Build Docs" action currently fails since a link to an old NeurIPS paper in the Transfer Learning user guide seems to be broken. Need to create a separate PR for this.
Code looks quite clean and nice. Some things need minor adjustments :)
@@ -0,0 +1,89 @@
"""Classes for persisting and loading experiment results to a S3-Bucket."""
"using a S3-Bucket", otherwise it will be hard to find a preposition fitting for both "persisting" and "loading"
Currently only persisting is implemented and I forgot to change the module description. I would rather remove the loading part for now :)
_bucket_name: str = field(validator=instance_of(str), init=False)
"""The name of the S3 bucket where the results are stored."""

_object_session = boto3.session.Session()
Can you update the PR description with some information about what this boto3 is? Seems to be rather important and central, would appreciate a bit more information on it.
Sure, that is the software package provided by AWS (the AWS SDK for Python) to interact with AWS components.
@_branch.default
def _default_branch(self) -> str:
    repo = Repo(search_parent_directories=True)
Just to make sure: I guess the search_parent_directories is necessary since the call will execute the benchmark module, which is not the repository, and hence needs to go up to the BayBE repository?
Yes, exactly :D
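For context, the mechanism behind GitPython's `search_parent_directories=True` can be sketched as a walk up the directory tree until a `.git` folder is found; the helper below is an illustrative stand-in, not code from the PR.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Walk upwards from `start` until a directory containing `.git` is found,
    mimicking what GitPython's `search_parent_directories=True` does."""
    for candidate in [start, *start.parents]:
        if (candidate / ".git").exists():
            return candidate
    return None
```

This is why invoking the benchmark module from anywhere inside the BayBE checkout still resolves the enclosing repository.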
)

result_dict = result.to_dict()
result_dict["best_possible_result"] = benchmark.best_possible_result
Not sure if possible (or even within scope), but would it be possible to get the respective attributes of the benchmark object and have these automatically be converted to key-value pairs? This would change the logic from "explicitly including everything" to "explicitly excluding only a few fields" and would make this very robust to potential changes in the benchmark. Would however like to hear @AdrianSosic's and @Scienfitz's opinions on that.
@AVHopp I tried to adopt your idea as closely as possible, let me know what you think :)
result_dict["optimal_function_inputs"] = benchmark.optimal_function_inputs
result_dict["description"] = benchmark.description

client.put_object(
I guess this is the call that actually stores stuff in the S3-Bucket, right? What happens if this call fails for whatever reason, does this yield a reasonable error message or should this be put within a try-except block?
Boto3 comes with different self-defined exceptions which have (depending on the error) meaningful semantics (permission errors, ServiceUnavailableException, etc.). It is not always perfectly described where the error is coming from, but at least it gives you a hint :).
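If one did want the explicit try-except suggested above, a minimal sketch could wrap the upload and attach the target key to the error; the function name and message are assumptions, and the fallback import only keeps the sketch runnable without botocore installed.

```python
try:
    from botocore.exceptions import ClientError
except ImportError:  # allow the sketch to run without boto3/botocore
    class ClientError(Exception):
        pass


def safe_put_object(client, bucket: str, key: str, body: bytes) -> None:
    """Upload `body`, re-raising boto3's ClientError with the target key."""
    try:
        client.put_object(Bucket=bucket, Key=key, Body=body)
    except ClientError as err:
        # The original ClientError (AccessDenied, service unavailable, ...)
        # stays attached as the cause.
        raise RuntimeError(
            f"Failed to persist benchmark result under s3://{bucket}/{key}"
        ) from err
```

Since the client is injected, a stub object with a `put_object` method can stand in for a real S3 client when testing this wrapper.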
@@ -2,12 +2,16 @@
# Run this via 'python -m benchmarks' from the root directory.

from benchmarks.domains import BENCHMARKS
from benchmarks.persistance import S3ExperimentResultPersistence


def main():
What happens if someone now runs a benchmark locally? I guess since the environment variables necessary for persisting are not set, the code would crash after finishing the first benchmark, right? So maybe the persistance_handler call should only be done if the corresponding environment variables are set.
Also, would this then mean that we can get rid of the validation within the object itself? Would this be reasonable or bad code design? (summoning @AdrianSosic here)
That would be the case, yes. Would you want to store it locally? That might be reasonable so that the results can be reused afterwards instead of relying on volatile objects and printed output.
So one option I see here is to use an abstraction layer over S3 that can connect to different file storage backends. One library for that is fsspec (used to be part of foundry-dev-tools, not sure if it still is). With that, you can hide the backend behind an API, which has the advantage that you can flexibly configure where stuff actually gets stored. That way, we could configure it such that results get written to a default location on disk when run locally, and both the S3 path and the local one can be configured via the same BAYBE_BENCHMARK_PERSISTENCE_PATH env variable.
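The fsspec idea could look roughly like this; the env-variable name comes from the comment above, while the default path, the key layout, and the function name are assumptions for illustration.

```python
import json
import os

import fsspec  # backend-agnostic file-system abstraction


def persist_result(key: str, payload: dict) -> None:
    """Write `payload` under `key`, letting fsspec pick the backend from
    the URL scheme (s3://..., file://..., ...)."""
    base = os.environ.get(
        "BAYBE_BENCHMARK_PERSISTENCE_PATH", "file:///tmp/benchmark-results"
    )
    # auto_mkdir lets the local backend create missing parent directories.
    with fsspec.open(f"{base}/{key}", "w", auto_mkdir=True) as file:
        json.dump(payload, file)
```

Pointing the variable at `s3://some-bucket/prefix` or a local directory would then switch backends without changing the calling code.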
But you can check what foundry-dev-tools uses nowadays; perhaps there are more modern alternatives available. Although the package is quite solid, I think.
@AdrianSosic I've added another suggestion but feel free to exchange on further improvements and restructurings :)
id: generate-token
uses: actions/create-github-app-token@v1
with:
  app-id: ${{ vars.APP_ID }}
This is the part where we still need to add some information/configuration to the repository itself, right? For the sake of documentation, can you elaborate a little bit on that?
Sure. I think at some point a README will be necessary :D. The following information will need to be added:
- APP_ID: The autogenerated number of the GitHub app used to generate a token for fetching temporary credentials for the runner.
- APP_PRIVATE_KEY: The key from the GitHub app to assume its role.
- AWS_ROLE_TO_ASSUME: The role to assume within the AWS-factory (must be deposited as a role to assume for the OIDC authentication)
- TEST_RESULT_S3_BUCKET: The name of the bucket where the results should be stored.
Refactor S3 persistence class docstrings for clarity and consistency
Rename workflow run ID variable for clarity and update related documentation
Update environment variable references for benchmark persistence
Rename manual benchmark workflow for clarity
Fix spelling errors in persistence module and update import statements
Hi @fabianliebig, here are a first few thoughts
@@ -145,6 +145,7 @@ benchmarking = [
    "baybe[chem]",
    "baybe[onnx]",
    "baybe[simulation]",
    "boto3>=1.0.0",
Was there any specific reason for 1.0.0? If not, we could also just ditch the version completely. Let me know.
There was a jump from 0.0.22 to 1.0.0 in 2015, which usually indicates a major breaking change, and since the current version is 1.35.64 I thought the choice would be reasonable. We could also drop the version, but I would want to pin the smallest working version just to be sure.
@define
class S3ExperimentResultPersistence:
I haven't really looked at the details of the class yet, but already the name indicates to me that a class is perhaps not the ideal architecture here. What I mean is that the class actually does not represent anything, it's not "an object of something". Rather, it's only responsible for "doing" one thing, namely persisting stuff. And that is reflected by the somewhat weird name.
My initial reflex was: this should probably just be a function that uses clever dependency injection. What I see by taking a glimpse at your code is that you need some part that is responsible for "connecting to the file backend" and one part that "specifies baybe context". Would it make sense, e.g., to define some sort of "connector" object / protocol that then gets simply passed to persist_result? I think this would also play very nicely with the fsspec idea, since you can then still easily abstract away the actual type of backend used.
The more important question is whether we want to hide the method by which an item is created. I see that class more as some kind of interface implementation (which it was initially) to extend if needed. If we use fsspec and there is an S3 implementation for it, we must define the path explicitly. This necessitates different key constructions, which the function would need to handle. We could build a wrapper around the path construction to access everything as a file (and don't get me wrong, I like that Unix-like idea). S3 is not a file system but a database and provides different consistency models, but we could also use a file system such as EFS or a database. However, it seems to me that this might fall under the YAGNI principle.
Rename workflow to 'Run Benchmark'
Add Benchmark class with serialization support
Refactor S3ExperimentResultPersistence to streamline result dictionary creation
Sanitize branch names for S3 more comprehensible
…xibility and add local persistence option
…ter flexibility and add local persistence option Remove unused import for make_dict_unstructure_fn in config.py
…ter flexibility and add local persistence option Implement method overrides for persist_new_result in S3Persistence and LocalPersistence classes
Hi everyone,
This pull request introduces a class and a workflow designed to store the results of a benchmark run on an S3 bucket.
The key used for storage includes the identifier for the benchmark itself, the branch used, the release version, the current date, the commit hash, and the workflow ID as a fallback, in that order. This fallback is necessary because a range of dates is selected when building the visuals, which cannot accurately reflect the time, creating theoretical edge cases where identical keys are possible. Furthermore, the boto3 package is added to interact with AWS components.
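The key layout described above could be sketched as follows; the exact separators and formatting in the PR may differ, and all argument values are placeholders.

```python
from datetime import date


def build_result_key(
    benchmark_id: str,
    branch: str,
    version: str,
    commit_hash: str,
    workflow_run_id: str,
) -> str:
    """Compose the storage key: benchmark / branch / release version /
    date / commit hash / workflow run ID (tie-breaker against collisions)."""
    return "/".join(
        [
            benchmark_id,
            branch,
            version,
            date.today().isoformat(),
            commit_hash,
            workflow_run_id,
        ]
    )
```

Since the workflow run ID is unique per run, two runs of the same benchmark on the same commit and day still produce distinct keys.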
I will update the changelog and the lockfiles once the change is final.
I look forward to your feedback.