Building an agent with DSPy that can interact with the WebArena environment.
We aim to achieve state-of-the-art (SOTA) performance on the WebArena benchmark by implementing various ideas like:
The sample agent solves the task of finding the walking distance between two locations on OpenStreetMap.
- Install WebArena dependencies
cd webarena
uv venv -p 3.11 --seed
source .venv/bin/activate
pip install -r requirements.txt
playwright install
pip install -e .
- Configure the environment
# export MAP="http://ec2-3-131-244-37.us-east-2.compute.amazonaws.com:3000"
export MAP="https://www.openstreetmap.org"
export SHOPPING="<your_shopping_site_domain>:7770"
export SHOPPING_ADMIN="<your_e_commerce_cms_domain>:7780/admin"
export REDDIT="<your_reddit_domain>:9999"
export GITLAB="<your_gitlab_domain>:8023"
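Before moving on, it can help to confirm that the variables above are actually set in the current shell. The following is a minimal sketch, not part of WebArena: the helper name `find_unconfigured` is hypothetical, and it assumes all five variables are needed, which may not hold if you only run the map task.

```python
import os

# Variable names taken from the export lines above.
REQUIRED_VARS = ["MAP", "SHOPPING", "SHOPPING_ADMIN", "REDDIT", "GITLAB"]


def find_unconfigured(env):
    """Return names that are unset, empty, or still hold a <placeholder>."""
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        # A leftover "<your_..._domain>" placeholder counts as not configured.
        if not value or "<" in value:
            problems.append(name)
    return problems


if __name__ == "__main__":
    missing = find_unconfigured(os.environ)
    if missing:
        print("Not configured:", ", ".join(missing))
    else:
        print("All site URLs are set.")
```

Run it in the same shell where you ran the `export` lines, since the variables are only visible to child processes of that shell.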
- Obtain the auto-login cookies for all websites
mkdir -p ./.auth
python browser_env/auto_login.py
- Copy the map configs to the config_data folder
python scripts/generate_test_data.py
For Fedora:
grep -ol "\"map\"" config_files/*.json | xargs cp -t ../config_data/
For Mac:
grep -ol "\"map\"" config_files/*.json | xargs -I {} cp {} ../config_data/
rm ../config_data/test*.json
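If you would rather not maintain separate Fedora and Mac commands, the copy step can be approximated in Python. This is a sketch under assumptions: the function name `copy_map_configs` is hypothetical, it matches the same literal text `"map"` that the `grep` commands look for, and it skips `test*.json` files at copy time instead of deleting them afterwards. Unlike the shell commands, it also creates the destination folder if it is missing and does not remove `test*.json` files that already exist there.

```python
import shutil
from pathlib import Path


def copy_map_configs(src_dir, dst_dir):
    """Copy JSON files containing the literal text "map", skipping test*.json."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.glob("*.json")):
        if path.name.startswith("test"):
            continue  # mirrors the later `rm ../config_data/test*.json`
        if '"map"' in path.read_text(encoding="utf-8"):
            shutil.copy(path, dst / path.name)
            copied.append(path.name)
    return copied


if __name__ == "__main__":
    # Paths match the shell commands above; run from inside the webarena folder.
    names = copy_map_configs("config_files", "../config_data")
    print(f"Copied {len(names)} map configs")
```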
- Set up the environment from the root directory
source webarena/.venv/bin/activate
python -m scripts.evaluate.debug_webarena