Sandbox Capabilities Framework #1251

Closed
rbren opened this issue Apr 20, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@rbren
Collaborator

rbren commented Apr 20, 2024

Summary
We have an existing use case for a Jupyter-aware agent, which always runs in a sandbox where Jupyter is available. There are some other scenarios I can think of where an agent might want some guarantees about what it can do with the sandbox:

  • We might want a "postgres migration writer", which needs access to a postgres instance
  • We might have a "cypress test creator" agent, which would need access to cypress
  • Further down the road, we might want to have an Open Interpreter agent, which needs access to osascript
  • etc etc

This proposal would allow agents to guarantee that certain programs are available in the sandbox, or that certain services are running in a predictable way.

What if we did something like this:

Motivation
We want agents to have certain guarantees about the sandbox environment, but we also want the sandbox interface to stay generic: something like "you have a bash terminal".

The latter is especially important because we want users to be able to bring their own sandbox images. For example, you might use an off-the-shelf Haskell image if your project uses Haskell. Otherwise you'd need to go through the install process every time you start OpenDevin, or maintain a fork of the sandbox.

Technical Design

  • For every requirement we support (e.g. jupyter, postgres, cypress), we have a bash script that
    • checks if it's installed
    • if not, installs it
    • maybe starts something in the background
  • Let agents specify a list of requirements
    • e.g. CodeActAgent could say requirements: ['jupyter']
  • When we start the Agent+Sandbox pair, we run the necessary bash scripts
    • should be pretty quick if the requirement is already built into the image
  • Then the agent has some guarantees about the requirement being met, and how it's running
    • e.g. we can put in the prompt "there's a postgres server running on port 5432, user foo, password bar"
  • If there are specific ways of interacting with that environment (e.g. for jupyter, it seems we have to write to a websocket that's open in the sandbox?), the agent can implement custom Actions, like run_in_jupyter
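
To make the flow above concrete, here is a minimal sketch of how requirements, agent declarations, and sandbox setup could fit together. Everything in it is an assumption for illustration, not existing OpenDevin code: the `Requirement` dataclass, the `REQUIREMENTS` registry, the `prepare_sandbox` function, and the `run_in_sandbox` callable (a stand-in for whatever runs a bash command in the sandbox and returns its exit code) are all hypothetical names. The bash snippets assume a Debian-based image with pip available.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Requirement:
    """One sandbox capability (hypothetical structure for illustration)."""
    name: str
    setup_script: str  # bash: check if installed, install if not, maybe start a service
    prompt_hint: str   # the guarantee we can state in the agent's prompt


# Hypothetical registry of supported requirements.
# The scripts assume a Debian-based image; real scripts would likely live in files.
REQUIREMENTS: Dict[str, Requirement] = {
    "jupyter": Requirement(
        name="jupyter",
        # Install only if the binary is missing, so setup is quick on prebuilt images.
        setup_script="command -v jupyter >/dev/null 2>&1 || pip install jupyter",
        prompt_hint="Jupyter is installed in the sandbox.",
    ),
    "postgres": Requirement(
        name="postgres",
        # Creating a specific user and password is omitted from this sketch.
        setup_script=(
            "command -v psql >/dev/null 2>&1 || "
            "(apt-get update && apt-get install -y postgresql)\n"
            "service postgresql start"
        ),
        prompt_hint="There's a postgres server running on port 5432.",
    ),
}


class CodeActAgent:
    # Agents declare what they need; the runtime is responsible for providing it.
    requirements: List[str] = ["jupyter"]


def prepare_sandbox(agent_cls: type, run_in_sandbox: Callable[[str], int]) -> List[str]:
    """Run the setup script for each declared requirement.

    Returns the prompt hints for the requirements that were set up, so the
    caller can append them to the agent's prompt.
    """
    hints: List[str] = []
    for name in getattr(agent_cls, "requirements", []):
        requirement = REQUIREMENTS.get(name)
        if requirement is None:
            raise ValueError(f"Unsupported sandbox requirement: {name}")
        exit_code = run_in_sandbox(requirement.setup_script)
        if exit_code != 0:
            raise RuntimeError(f"Setup failed for requirement: {name}")
        hints.append(requirement.prompt_hint)
    return hints
```

Because the sandbox is only touched through `run_in_sandbox`, the same function would work with any user-supplied image that offers a bash terminal, which keeps the sandbox interface generic. Custom Actions such as run_in_jupyter are left out of this sketch.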

Alternatives to Consider

  • Building a bunch of stuff into one big sandbox
  • Building special sandboxes that are required by certain agents (e.g. a JupyterSandbox)

Additional context
https://opendevin.slack.com/archives/C06QKSD9UBA/p1713552591042089

@rbren
Collaborator Author

rbren commented Apr 20, 2024

@mlejva curious to get your thoughts on this one!

rbren added the enhancement label Apr 20, 2024

@Mike-FreeAI

@PierrunoYT here

@PierrunoYT
Contributor

Devin has reviewed the Sandbox Capabilities Framework as outlined in issue #1251 and finds the proposal to be comprehensive and well-thought-out. It addresses the key concerns and provides a solid foundation for future development and integration.

@Mike-FreeAI

Devin has reviewed the Sandbox Capabilities Framework as outlined in issue #1251 and finds the proposal to be comprehensive and well-thought-out. It addresses the key concerns and provides a solid foundation for future development and integration.

come to my github repos

@rbren
Collaborator Author

rbren commented Apr 24, 2024

This is in!

rbren closed this as completed Apr 24, 2024

3 participants