ABR_Sim Results Replication #8

hashbrown512 · 2020-03-31T20:14:22Z

I had discussed with @hongzimao issues with replicating results on ABRSimEnv
This post doesn’t need a response, just posting here so others can learn from it.

I had initially had issues replicating results on the ABRSimEnv.
The A2C agent in the Park paper contains scores of around ~420+-210

I was able to replicate the scores on ABR using code from @hongzimao here: abr_agents.zip

Entropy Ratio | Average Episode Score and Standard Deviation for 100,000 actions
10.0: | 517.3681106430971 +- 405.73426203813045
5.0: | 524.5324282999072 +- 400.950983685324

I was able to reach similar results using the same parameters in an A2C agent from stable-baselines modified with entropy decay, and a vf_coef of 0.25 a2c_stable_baselines.zip

Entropy Ratio | Average Episode Score and Standard Deviation for 100,000 actions
10.0 | 441.72765 +- 343.60534
5.0 | 420.04653 +- 178.98197
However, I initially ran the same experiments with RMSProp (default parameters) for optimization and was not able to beat the robustMPC and buffer based heuristics.

Thanks for the help!!

hongzimao · 2020-03-31T21:38:40Z

Thanks for sharing! :)

xuzhiyuan1528 · 2020-07-19T14:56:00Z

Thanks for sharing! :)

I noticed there are several implemented baseline agents for the task congestion_control in eval_baseline.py (from the above file abr_agents.zip):

from congestion_control_agents.static_agent import StaticAgent
from congestion_control_agents.tcp_vegas_agent import TCPVegasAgent

I am wondering if it possible to share the codes of these baseline agents? Thus I can learn how to implement the classical algorithms and use them for comparison in Park.

Thanks a lot for your help!

hongzimao · 2020-07-19T18:57:47Z

I will just copy&paste the agent's code here for the classical control since they are not too long.

import numpy as np


class StaticAgent(object):
    def __init__(self, state_space, action_space, *args, **kwargs):
        self.state_space = state_space
        self.action_space = action_space

        self.rew_record = 0

        self.cwnd = args[0]  # e.g., 100 cwnd
        self.rate = args[1]  # e.g., 1e8 mbps

    def get_action(self, obs, prev_reward, prev_done, prev_info):
        self.rew_record = self.rew_record * 0.8 + prev_reward * 0.2
        return np.array([self.cwnd, self.rate])

and

import numpy as np


class TCPVegasAgent(object):
    def __init__(self, state_space, action_space, *args, **kwargs):
        self.state_space = state_space
        self.action_space = action_space

        self.rew_record = 0
        self.cwnd = args[0]
        self.rate = args[1]

        self.min_rtt = np.inf
        self.pkt_size = 1500  # bytes

    def get_action(self, obs, prev_reward, prev_done, prev_info):
        self.rew_record = self.rew_record * 0.8 + prev_reward * 0.2

        obs_rtt = obs[-3]
        obs_rout = obs[-1]

        if self.min_rtt > obs_rtt:
            self.min_rtt = obs_rtt

        queue_delay = obs_rtt - self.min_rtt  # microseconds
        throughput = obs_rout # bytes/sec
        num_pkts = queue_delay / 1e6 * throughput / self.pkt_size

        if num_pkts < 20:
            return np.array([self.cwnd + 1, self.rate])
        elif num_pkts > 20:
            return np.array([self.cwnd - 1, self.rate])
        else:
            return np.array([self.cwnd, self.rate])

xuzhiyuan1528 · 2020-07-19T22:01:24Z

Thanks a lot for the code!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ABR_Sim Results Replication #8

ABR_Sim Results Replication #8

hashbrown512 commented Mar 31, 2020

hongzimao commented Mar 31, 2020

xuzhiyuan1528 commented Jul 19, 2020

hongzimao commented Jul 19, 2020

xuzhiyuan1528 commented Jul 19, 2020

ABR_Sim Results Replication #8

ABR_Sim Results Replication #8

Comments

hashbrown512 commented Mar 31, 2020

hongzimao commented Mar 31, 2020

xuzhiyuan1528 commented Jul 19, 2020

hongzimao commented Jul 19, 2020

xuzhiyuan1528 commented Jul 19, 2020