-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RLlib] Introduce new OldAPIStack decorator; Do-over of all API decorators #43657
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. More cleanup and finally I can feel the last breath of the old stack :D
@@ -11,7 +11,7 @@ | |||
tf1, _, _ = try_import_tf() | |||
|
|||
|
|||
@ExperimentalAPI | |||
@OldAPIStack |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why old stack? Isnt't this the class SingleAgentEnvRunner
and MultiAgentEnvRunner
inherit from?
@@ -19,7 +19,7 @@ | |||
logger = logging.getLogger(__name__) | |||
|
|||
|
|||
@PublicAPI | |||
@OldAPIStack |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imo we should keep this until we have a better solution to vectorize multi-agent environments as pettingzoo
doe snot provide vectorization, yet.
from ray.rllib.utils.torch_utils import sequence_mask | ||
from ray.rllib.utils.framework import TensorType | ||
|
||
torch, nn = try_import_torch() | ||
|
||
|
||
@OldAPIStack |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's actually the state on attention? WIll the new stack get encoders and heads with attention? Also, I see more often in papers an MLP encoder followed by an LSTM head - maybe we should provide an LSTM head in the same breath, too.
@@ -60,7 +59,7 @@ | |||
logger = logging.getLogger(__name__) | |||
|
|||
|
|||
@DeveloperAPI | |||
@OldAPIStack |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bye bye policy :) I like it.
@@ -12,7 +13,7 @@ def _linear_interpolation(left, right, alpha): | |||
return left + alpha * (right - left) | |||
|
|||
|
|||
@PublicAPI | |||
@DeveloperAPI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might think about a sinusoidal schedule - I see this more often in the papers now.
…api_stack_decorator
…d_api_stack_decorator
Introduce new OldAPIStack decorator; Do-over of all API decorators
Why are these changes needed?
Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.