
ABR_Sim Results Replication #8

Open
hashbrown512 opened this issue Mar 31, 2020 · 4 comments

Comments

@hashbrown512
Contributor

I had discussed with @hongzimao some issues replicating results on ABRSimEnv. This post doesn't need a response; I'm just posting it here so others can learn from it.

I initially had trouble replicating results on the ABRSimEnv: the A2C agent in the Park paper reports scores of around 420 ± 210.

I was able to replicate the scores on ABR using code from @hongzimao here: abr_agents.zip

| Entropy Ratio | Average Episode Score ± Std. Dev. (100,000 actions) |
| --- | --- |
| 10.0 | 517.3681106430971 ± 405.73426203813045 |
| 5.0 | 524.5324282999072 ± 400.950983685324 |

I was able to reach similar results using the same parameters in an A2C agent from stable-baselines, modified with entropy decay and a vf_coef of 0.25: a2c_stable_baselines.zip

| Entropy Ratio | Average Episode Score ± Std. Dev. (100,000 actions) |
| --- | --- |
| 10.0 | 441.72765 ± 343.60534 |
| 5.0 | 420.04653 ± 178.98197 |
However, when I initially ran the same experiments with RMSProp (default parameters) as the optimizer, I was not able to beat the robustMPC and buffer-based heuristics.
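For readers trying to reproduce this, here is a minimal sketch of the kind of entropy-coefficient decay described above. The function name, arguments, and the linear schedule with a small floor are illustrative assumptions; the exact schedule used in the attached zips may differ.

```python
def entropy_coef(step, total_steps, start=5.0, end=0.1):
    """Linearly anneal the entropy-bonus weight from `start` down to `end`.

    `start` plays the role of the "entropy ratio" above (e.g. 5.0 or 10.0);
    the floor `end` is an assumption, not taken from the attached code.
    """
    frac = min(step / total_steps, 1.0)
    return start + frac * (end - start)

# The annealed weight would then scale the entropy term of the A2C loss,
# alongside the value-loss weight vf_coef = 0.25 mentioned above:
#   loss = pg_loss + 0.25 * value_loss - entropy_coef(step, T) * entropy
```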

Thanks for the help!!

@hongzimao
Contributor

Thanks for sharing! :)

@xuzhiyuan1528

Thanks for sharing! :)

I noticed there are several implemented baseline agents for the congestion_control task in eval_baseline.py (from the abr_agents.zip file above):

```python
from congestion_control_agents.static_agent import StaticAgent
from congestion_control_agents.tcp_vegas_agent import TCPVegasAgent
```

I am wondering if it is possible to share the code of these baseline agents, so that I can learn how to implement the classical algorithms and use them for comparison in Park.

Thanks a lot for your help!

@hongzimao
Contributor

I'll just copy & paste the agents' code for the classical congestion-control baselines here, since they are not too long.

```python
import numpy as np


class StaticAgent(object):
    """Baseline that always emits a fixed congestion window and pacing rate."""

    def __init__(self, state_space, action_space, *args, **kwargs):
        self.state_space = state_space
        self.action_space = action_space

        self.rew_record = 0

        self.cwnd = args[0]  # e.g., 100 cwnd
        self.rate = args[1]  # e.g., 1e8 bps

    def get_action(self, obs, prev_reward, prev_done, prev_info):
        # Exponentially weighted moving average of the reward
        self.rew_record = self.rew_record * 0.8 + prev_reward * 0.2
        return np.array([self.cwnd, self.rate])
```
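As a side note, the `rew_record` line in both agents keeps an exponentially weighted moving average of the reward (decay 0.8, weight 0.2 on the newest sample). A small standalone illustration with made-up reward values:

```python
# Standalone illustration of the agents' reward EWMA update:
#   rew_record <- 0.8 * rew_record + 0.2 * prev_reward
rew_record = 0.0
for prev_reward in (10.0, 10.0, 10.0):
    rew_record = rew_record * 0.8 + prev_reward * 0.2
# steps: 2.0 -> 3.6 -> 4.88, converging toward 10 for a constant reward
```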

and

```python
import numpy as np


class TCPVegasAgent(object):
    """Vegas-style baseline: adjust cwnd to keep about 20 packets queued."""

    def __init__(self, state_space, action_space, *args, **kwargs):
        self.state_space = state_space
        self.action_space = action_space

        self.rew_record = 0
        self.cwnd = args[0]
        self.rate = args[1]

        self.min_rtt = np.inf
        self.pkt_size = 1500  # bytes

    def get_action(self, obs, prev_reward, prev_done, prev_info):
        self.rew_record = self.rew_record * 0.8 + prev_reward * 0.2

        obs_rtt = obs[-3]
        obs_rout = obs[-1]

        # Track the minimum RTT seen so far as the propagation-delay baseline
        if self.min_rtt > obs_rtt:
            self.min_rtt = obs_rtt

        queue_delay = obs_rtt - self.min_rtt  # microseconds
        throughput = obs_rout  # bytes/sec
        num_pkts = queue_delay / 1e6 * throughput / self.pkt_size

        # Additive increase/decrease around a target backlog of 20 packets
        if num_pkts < 20:
            return np.array([self.cwnd + 1, self.rate])
        elif num_pkts > 20:
            return np.array([self.cwnd - 1, self.rate])
        else:
            return np.array([self.cwnd, self.rate])
```
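To make the backlog estimate above concrete, here is a worked example with made-up numbers (units follow the comments in the code: RTT in microseconds, throughput in bytes/sec, 1500-byte packets):

```python
pkt_size = 1500        # bytes, as in TCPVegasAgent
min_rtt = 50_000       # hypothetical 50 ms base RTT, in microseconds
obs_rtt = 80_000       # hypothetical current RTT sample, in microseconds
obs_rout = 1_500_000   # hypothetical receive throughput, bytes/sec

queue_delay = obs_rtt - min_rtt                      # 30_000 us of queueing
num_pkts = queue_delay / 1e6 * obs_rout / pkt_size   # 0.03 s * 1.5e6 / 1500
# num_pkts == 30.0 > 20, so the agent would return cwnd - 1 (back off)
```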

@xuzhiyuan1528

Thanks a lot for the code!
