Fix TD3 DDPG Implementation: Move Sampling Inside Gradient Step Loop
Fix implementation mismatch #147

alessandroassirelli98 · 2024-04-15T17:08:10Z

This pull request addresses a discrepancy between the algorithms in the original TD3 and DDPG papers and the current implementation in the repository. Specifically, the current implementation samples a batch from memory once, outside the gradient step loop, so every gradient step within an update works on the same batch. This diverges from the procedure outlined in the papers. We have moved the sampling inside the gradient step loop, so each gradient step draws its own batch, which aligns the implementation with the original papers and with the Spinning Up description.
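To make the difference concrete, here is a minimal sketch of the two loop structures. It is not the skrl code: `ReplayMemory`, `update_sampling_outside`, `update_sampling_inside`, and the `gradient_step` callback are hypothetical names used only for illustration, and the actual critic/actor updates are left out.

```python
import random


class ReplayMemory:
    """Hypothetical stand-in for a replay buffer (not skrl's memory class)."""

    def __init__(self, transitions, seed=0):
        self.transitions = list(transitions)
        self.rng = random.Random(seed)
        self.sample_calls = 0  # counts how often a batch was drawn

    def sample(self, batch_size):
        self.sample_calls += 1
        return self.rng.sample(self.transitions, batch_size)


def update_sampling_outside(memory, gradient_steps, batch_size, gradient_step):
    """Behavior before the fix: one batch is drawn and reused for every step."""
    batch = memory.sample(batch_size)
    for _ in range(gradient_steps):
        gradient_step(batch)


def update_sampling_inside(memory, gradient_steps, batch_size, gradient_step):
    """Behavior after the fix: a new batch is drawn for each gradient step."""
    for _ in range(gradient_steps):
        batch = memory.sample(batch_size)
        gradient_step(batch)
```

With `gradient_steps = 1` both variants behave the same; the difference only shows up when several gradient steps are run per update.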

Toni-SM · 2024-04-16T14:32:43Z

Hi @alessandroassirelli98

Great!

Please, also update:

docs

skrl/docs/source/api/agents/ddpg.rst

Lines 50 to 51 in 631613a

    
           | :green:`# sample a batch from memory`
           | [:math:`s, a, r, s', d`] :math:`\leftarrow` states, actions, rewards, next_states, dones of size :guilabel:`batch_size`

skrl/docs/source/api/agents/td3.rst

Lines 50 to 51 in 631613a

    
           | :green:`# sample a batch from memory`
           | [:math:`s, a, r, s', d`] :math:`\leftarrow` states, actions, rewards, next_states, dones of size :guilabel:`batch_size`

CHANGELOG under the following version (create it if it doesn't exist)

## [1.2.0] - Unreleased
### Fixed
- YOUR_FIX_IMPLEMENTATION_SHORT_DESCRIPTION

alessandroassirelli98 · 2024-04-16T15:30:51Z

Hi @Toni-SM !
I see the current version is 1.1.0.
Should I just add a comment on that version in the changelog, like so?

Toni-SM · 2024-04-16T16:01:44Z

> I see the current version is 1.1.0.

Version 1.1.0 is already released (see: https://github.com/Toni-SM/skrl/releases)

New additions/changes/fixes (like the one you proposed) will be released under version 1.2.0. Please, use this for CHANGELOG:

## [1.2.0] - Unreleased
### Fixed
- YOUR_FIX_IMPLEMENTATION_SHORT_DESCRIPTION

Fix implementation mismatch

3c1b712

Toni-SM mentioned this pull request Apr 16, 2024

Fix TD3 DDPG Implementation: Move Sampling Inside Gradient Step Loop #144

Closed

Updated doc

f8e0b27

alessandroassirelli98 and others added 2 commits April 16, 2024 18:53

set changelog

3a08596

Update CHANGELOG.md

94f18fa

Toni-SM approved these changes Apr 17, 2024


Toni-SM merged commit a3fb36c into Toni-SM:develop Apr 17, 2024
1 check passed
