AGBenchmark: Codebase clean-up #6650
Commits on Dec 28, 2023
20862ca refactor(benchmark): Deduplicate configuration loading logic
- Move the configuration loading logic to a separate `load_agbenchmark_config` function in `agbenchmark/config.py` module.
- Replace the duplicate loading logic in `conftest.py`, `generate_test.py`, `ReportManager.py`, `reports.py`, and `__main__.py` with calls to `load_agbenchmark_config` function.
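The deduplication described in this commit can be pictured with a minimal sketch. Only the function name `load_agbenchmark_config` comes from the commit message; the `config_dir` parameter, the `config.json` file name, and the plain `dict` return type are assumptions made for illustration and may not match the actual module.

```python
import json
from pathlib import Path


def load_agbenchmark_config(config_dir: Path) -> dict:
    """Load the benchmark config from one place instead of per-module copies.

    Assumption: the config lives in `<config_dir>/config.json`.
    """
    config_file = config_dir / "config.json"
    try:
        with config_file.open("r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        # Re-raise with a message that names the expected location
        raise FileNotFoundError(f"No benchmark config found at {config_file}") from None
    except json.JSONDecodeError as e:
        raise ValueError(f"Invalid JSON in {config_file}: {e}") from e
```

Each former call site would then call this one function, so error handling and file location logic only need to change in a single module.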
c14cfd8 fix(benchmark): Fix type errors, linting errors, and clean up CLI validation in __main__.py
- Fixed type errors and linting errors in `__main__.py`
- Improved the readability of CLI argument validation by introducing a separate function for it
Commits on Dec 29, 2023
4a32265 refactor(benchmark): Lint and typefix app.py
- Rearranged and cleaned up import statements
- Fixed type errors caused by improper use of `psutil` objects
- Simplified a number of `os.path` usages by converting to `pathlib`
- Use `Task` and `TaskRequestBody` classes from `agent_protocol_client` instead of `.schema`
60b9148 refactor(benchmark): Replace `.agent_protocol_client` by `agent-protocol-client`, clean up schema.py
- Remove `agbenchmark.agent_protocol_client` (an offline copy of `agent-protocol-client`).
- Add `agent-protocol-client` as a dependency and change imports to `agent_protocol_client`.
- Fix type annotation on `agent_api_interface.py::upload_artifacts` (`ApiClient` -> `AgentApi`).
- Remove all unused types from schema.py (= most of them).
9fb7b75
14d52b8
41b4972 refactor(benchmark): Improve typing, response validation, and readability in app.py
- Simplified response generation by leveraging type checking and conversion by FastAPI.
- Introduced use of `HTTPException` for error responses.
- Improved naming, formatting, and typing in `app.py::create_evaluation`.
- Updated the docstring on `app.py::create_agent_task`.
- Fixed return type annotations of `create_single_test` and `create_challenge` in generate_test.py.
- Added default values to optional attributes on models in report_types_v2.py.
- Removed unused imports in `generate_test.py`
Commits on Dec 30, 2023
4064eb7 refactor(benchmark): Clean up logging and print statements
- Introduced use of the `logging` library for unified logging and better readability.
- Converted most print statements to use `logger.debug`, `logger.warning`, and `logger.error`.
- Improved descriptiveness of log statements.
- Removed unnecessary print statements.
- Added log statements to unspecific and non-verbose `except` blocks.
- Added `--debug` flag, which sets the log level to `DEBUG` and enables a more comprehensive log format.
- Added `.utils.logging` module with `configure_logging` function to easily configure the logging library.
- Converted raw escape sequences in `.utils.challenge` to use `colorama`.
- Renamed `generate_test.py::generate_tests` to `load_challenges`.
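A rough sketch of what a `configure_logging` helper driven by a `--debug` flag could look like, using only the standard library. The function name is taken from the commit message; the `debug` parameter, the two format strings, and the `INFO` default level are assumptions and not the module's actual contents.

```python
import logging

# Hypothetical formats: a compact one for normal runs, a detailed one for --debug
SIMPLE_FORMAT = "%(asctime)s %(levelname)s  %(message)s"
DEBUG_FORMAT = "%(asctime)s %(levelname)s %(name)s:%(lineno)d  %(message)s"


def configure_logging(debug: bool = False) -> None:
    """Configure the root logger once, at application startup."""
    logging.basicConfig(
        level=logging.DEBUG if debug else logging.INFO,
        format=DEBUG_FORMAT if debug else SIMPLE_FORMAT,
        force=True,  # replace handlers left over from any earlier configuration
    )


# Module-level logger, as used in place of bare print statements
logger = logging.getLogger(__name__)
```

With this in place, a CLI entrypoint would call `configure_logging(debug=True)` when `--debug` is passed, and individual modules would only ever call `logger.debug(...)`, `logger.warning(...)`, or `logger.error(...)`.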
56d8d83 refactor(benchmark): Remove unused server.py and agent_interface.py::run_agent
- Remove unused server.py file
- Remove unused run_agent function from agent_interface.py
1aa1261 refactor(benchmark): Clean up conftest.py
- Fix and add type annotations
- Rewrite docstrings
- Disable or remove unused code
- Fix definition of arguments and their types in `pytest_addoption`
d89c7ea refactor(benchmark): Clean up generate_test.py file
- Refactored the `create_single_test` function for clarity and readability
- Removed unused variables
- Made creation of `Challenge` subclasses more straightforward
- Made bare `except` more specific
- Renamed `Challenge.setup_challenge` method to `run_challenge`
- Updated type hints and annotations
- Made minor code/readability improvements in `load_challenges`
- Added a helper function `_add_challenge_to_module` for attaching a Challenge class to the current module
294f6ff
1ea4123 refactor(benchmark): Simplify const determination in agent_interface.py
- Simplify the logic that determines the value of `HELICONE_GRAPHQL_LOGS`
c7cf2c7 fix(benchmark): Register category markers to prevent warnings
- Use the `pytest_configure` hook to register the known challenge categories as markers. Otherwise, Pytest will raise "unknown marker" warnings at runtime.
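As an illustration of the marker registration this commit describes, a `conftest.py` hook might look roughly like the following. `pytest_configure` and `config.addinivalue_line` are standard pytest APIs; the `CHALLENGE_CATEGORIES` list and the marker description text are placeholders, not the benchmark's actual categories.

```python
# Sketch of a conftest.py fragment

# Hypothetical category names; the real list comes from the challenge specs
CHALLENGE_CATEGORIES = ["coding", "data", "general"]


def pytest_configure(config) -> None:
    """Register each challenge category as a known pytest marker."""
    for category in CHALLENGE_CATEGORIES:
        # Same effect as listing the marker under `markers` in the ini file
        config.addinivalue_line("markers", f"{category}: challenge category")
```

Once registered, decorating a test with `@pytest.mark.coding` no longer triggers the unknown-marker warning, and `pytest -m coding` selects it.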
1db4bdc
420469e refactor(benchmark): Update agent_api_interface.py
- Add type annotations to `copy_agent_artifacts_into_temp_folder` function
- Add note about broken endpoint in the `agent_protocol_client` library
- Remove unused variable in `run_api_agent` function
- Improve readability and resolve linting error
c3f2162 feat(benchmark): Improve and centralize pathfinding
- Search path hierarchy for applicable `agbenchmark_config`, rather than assuming it's in the current folder.
- Create `agbenchmark.utils.path_manager` with `AGBenchmarkPathManager` and exporting a `PATH_MANAGER` const.
- Replace path constants defined in __main__.py with usages of `PATH_MANAGER`.
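The path hierarchy search mentioned in the first bullet could be sketched as below. The folder name `agbenchmark_config` is from the commit message; the function name `find_config_folder`, its `start` parameter, and the choice to raise `FileNotFoundError` are hypothetical and not taken from `AGBenchmarkPathManager`.

```python
from pathlib import Path
from typing import Optional


def find_config_folder(start: Optional[Path] = None) -> Path:
    """Walk from `start` up to the filesystem root looking for `agbenchmark_config`."""
    start = (start or Path.cwd()).resolve()
    # Check the starting directory first, then each ancestor in turn
    for directory in (start, *start.parents):
        candidate = directory / "agbenchmark_config"
        if candidate.is_dir():
            return candidate
    raise FileNotFoundError(
        f"No 'agbenchmark_config' folder found in {start} or its parents"
    )
```

The nearest match wins, so running the benchmark from a nested subfolder of an agent project still picks up that project's config rather than one higher up the tree.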
fab5366 feat(benchmark/cli): Clean up and improve CLI
- Updated commands, options, and their descriptions to be more intuitive and consistent
- Moved slow imports into the entrypoints that use them to speed up application startup
- Fixed type hints to match output types of Click options
- Hid deprecated `agbenchmark start` command
- Refactored code to improve readability and maintainability
- Moved main entrypoint into `run` subcommand
- Fixed `version` and `serve` subcommands
- Added `click-default-group` package to allow using `run` implicitly (for backwards compatibility)
- Renamed `--no_dep` to `--no-dep` for consistency
- Fixed string formatting issues in log statements
Commits on Jan 1, 2024
956b439 refactor(benchmark/config): Move AgentBenchmarkConfig and related functions to config.py
- Move the `AgentBenchmarkConfig` class from `utils/data_types.py` to `config.py`.
- Extract the `calculate_info_test_path` function from `utils/data_types.py` and move it to `config.py` as a private helper function `_calculate_info_test_path`.
- Move `load_agent_benchmark_config()` to `AgentBenchmarkConfig.load()`.
- Changed simple getter methods on `AgentBenchmarkConfig` to calculated properties.
- Update all code references according to the changes mentioned above.
292ea9e refactor(benchmark): Fix ReportManager init parameter types and use pathlib
- Fix the type annotation of the `benchmark_start_time` parameter in `ReportManager.__init__`, which was mistyped as `str` instead of `datetime`.
- Change the type of the `filename` parameter in the `ReportManager.__init__` method from `str` to `Path`.
- Rename `self.filename` to `self.report_file` in `ReportManager`.
- Change the way the report file is created, opened and saved to use the `Path` object.
8990b23 refactor(benchmark): Improve typing surrounding ChallengeData and clean up its implementation
- Use `ChallengeData` objects instead of untyped `dict` in app.py, generate_test.py, reports.py.
- Remove unnecessary methods `serialize`, `get_data`, `get_json_from_path`, `deserialize` from `ChallengeData` class.
- Remove unused methods `challenge_from_datum` and `challenge_from_test_data` from `ChallengeData` class.
- Update function signatures and annotations of `create_challenge` and `generate_single_test` functions in generate_test.py.
- Add types to function signatures of `generate_single_call_report` and `finalize_reports` in reports.py.
- Remove unnecessary `challenge_data` parameter (in generate_test.py) and fixture (in conftest.py).
3ccb093 refactor(benchmark): Clean up generate_test.py, conftest.py and __main__.py
- Cleaned up generate_test.py and conftest.py
- Consolidated challenge creation logic in the `Challenge` class itself, most notably the new `Challenge.from_challenge_spec` method.
- Moved challenge selection logic from generate_test.py to the `pytest_collection_modifyitems` hook in conftest.py.
- Converted methods in the `Challenge` class to class methods where appropriate.
- Improved argument handling in the `run_benchmark` function in `__main__.py`.
6fe5149 refactor(benchmark/config): Merge AGBenchmarkPathManager into AgentBenchmarkConfig and reduce fragmented/global state
- Merge the functionality of `AGBenchmarkPathManager` into `AgentBenchmarkConfig` to consolidate the configuration management.
- Remove the `.path_manager` module containing `AGBenchmarkPathManager`.
- Pass the `AgentBenchmarkConfig` and its attributes through function arguments to reduce global state and improve code clarity.
e09ec4e feat(benchmark/serve): Configurable port for `serve` subcommand
- Added `--port` option to `serve` subcommand to allow for specifying the port to run the API on.
- If no `--port` option is provided, the port will default to the value specified in the `PORT` environment variable, or 8080 if not set.
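The fallback order described here (explicit `--port`, then the `PORT` environment variable, then 8080) can be sketched with the standard library. Note that this uses `argparse` as a stand-in; the benchmark's CLI is built on Click, and the `resolve_port` function name is hypothetical.

```python
import argparse
import os
from typing import Optional, Sequence


def resolve_port(argv: Optional[Sequence[str]] = None) -> int:
    """Return the port to serve on: --port, else $PORT, else 8080."""
    parser = argparse.ArgumentParser(prog="serve")
    parser.add_argument("--port", type=int, default=None, help="Port to run the API on")
    args = parser.parse_args(argv)
    if args.port is not None:
        return args.port  # an explicit option always wins
    # Fall back to the environment, then to the hard-coded default
    return int(os.environ.get("PORT", 8080))
```

For example, `resolve_port(["--port", "8123"])` returns 8123 regardless of the environment, while `resolve_port([])` consults `PORT` before settling on 8080.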
116f8c9 feat(benchmark/cli): Add `config` subcommand
- Added a new subcommand `config` to the AGBenchmark CLI, to display information about the present AGBenchmark config.
Commits on Jan 2, 2024
fb15bf9 fix(benchmark): Gracefully handle incompatible challenge spec files in app.py
- Added a check to skip deprecated challenges
- Added logging to allow debugging of the loading process
- Added handling of validation errors when parsing challenge spec files
- Added missing `spec_file` attribute to `ChallengeData`
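The loading behaviour listed above (skip deprecated specs, log progress, tolerate invalid files, record the spec file) might be approximated as follows. This is a stdlib-only sketch: the `load_challenge_specs` name, the `data.json` file name, the `deprecated` folder convention, and the `REQUIRED_KEYS` check are all assumptions standing in for the project's own model validation.

```python
import json
import logging
from pathlib import Path
from typing import List

logger = logging.getLogger(__name__)

REQUIRED_KEYS = {"name", "task"}  # hypothetical minimal schema


def load_challenge_specs(challenges_dir: Path) -> List[dict]:
    """Load every usable spec under `challenges_dir`, skipping bad ones."""
    specs = []
    for spec_file in sorted(challenges_dir.rglob("data.json")):
        if "deprecated" in spec_file.relative_to(challenges_dir).parts:
            logger.debug("Skipping deprecated challenge spec %s", spec_file)
            continue
        try:
            spec = json.loads(spec_file.read_text(encoding="utf-8"))
            if not isinstance(spec, dict):
                raise ValueError("spec must be a JSON object")
            missing = REQUIRED_KEYS - spec.keys()
            if missing:
                raise ValueError(f"missing keys: {sorted(missing)}")
        except ValueError as e:  # also covers json.JSONDecodeError
            logger.warning("Skipping incompatible challenge spec %s: %s", spec_file, e)
            continue
        spec["spec_file"] = str(spec_file)  # remember where the spec came from
        specs.append(spec)
    return specs
```

The key design point is that one malformed file produces a warning and is skipped, rather than aborting the whole load.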
b786a29 refactor(benchmark): Move `run_benchmark` entrypoint to main.py, use it in `/reports` endpoint
- Move `run_benchmark` and `validate_args` from __main__.py to main.py
- Replace agbenchmark subprocess in `app.py:run_single_test` with `run_benchmark`
- Move `get_unique_categories` from __main__.py to challenges/__init__.py
- Move `OPTIONAL_CATEGORIES` from __main__.py to challenge.py
- Reduce operations on updates.json (including `initialize_updates_file`) outside of API
27c5459 refactor(benchmark): Remove unused `/updates` endpoint and all related code
- Remove `updates_json_file` attribute from `AgentBenchmarkConfig`
- Remove `get_updates` and `_initialize_updates_file` in app.py
- Remove `append_updates_file` and `create_update_json` functions in agent_api_interface.py
- Remove call to `append_updates_file` in challenge.py
d6195b4 refactor(benchmark/config): Clean up and update docstrings on `AgentBenchmarkConfig`
- Add and update docstrings
- Change base class from `BaseModel` to `BaseSettings`, allow extras for backwards compatibility
- Make naming of path attributes on `AgentBenchmarkConfig` more consistent
- Remove unused `agent_home_directory` attribute
- Remove unused `workspace` attribute
7b92e81 fix(benchmark): Restore mechanism to select (optional) categories in agent benchmark config
2b56e67
25c1aae fix(benchmark): Rename left-behind references to `AgentBenchmarkConfig` path attributes
2135019 fix(benchmark): Update agent-protocol-client to v1.1.0
- Fixes issue with fetching task artifact listings