
Cache Env #9

Open
hashbrown512 opened this issue Apr 3, 2020 · 4 comments
@hashbrown512
Contributor

Hi, I have a few questions about the cache environment.

I can submit fixes, but I wanted to see if there was another version of the environment first. Is the environment that is currently on GitHub what was used in the Park paper evaluation?

The Park cache environment will crash during execution: the provided test traces are numbered 0 to 999, but the random integer drawn in reset(low=1, high=1001) can be anywhere between 1 and 1000, so index 1000 refers to a trace that does not exist.

    def reset(self, low=1, high=1001):
        new_trace = self.np_random.randint(low, high)

Objects that have never been seen before are assigned a last request time of 500. What is the reasoning behind assigning this constant for the time?

    def get_state(self, obj=[0, 0, 0, 0]):
        '''
        Return the state of the object,  [obj_size, cache_size_online_remain, self.last_req_time_dict[obj_id]]
        '''
        obj_time, obj_id, obj_size = obj[0], obj[1], obj[2]
        try:
            req = self.req - self.cache[obj_id][1]
        except IndexError:
            try:
                req = self.req - self.non_cache[obj_id][1]
            except IndexError:
                req = 500
        state = [obj_size, self.cache_remain, req]
    def step(self, action, obj):
    ....
        # Initialize the last request time
        try:
            self.last_req_time_dict[obj_id] = req - self.cache[obj[1]][1]
        except IndexError:
            try:
                self.last_req_time_dict[obj_id] = req - self.non_cache[obj[1]][1]
            except IndexError:
                self.last_req_time_dict[obj_id] = 500
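One way to avoid repeating the magic number would be to factor the lookup into a helper with the unseen-object gap as a parameter. This is a hypothetical sketch, not code from the repo: the function name is invented, and the containers are assumed to be dicts mapping obj_id to (obj, last_request_time).

```python
# Assumed default; the repo hard-codes 500 in two places.
DEFAULT_UNSEEN_GAP = 500

def last_request_gap(req, cache, non_cache, obj_id,
                     unseen_gap=DEFAULT_UNSEEN_GAP):
    """Time since obj_id's last request, or unseen_gap if never seen."""
    if obj_id in cache:
        return req - cache[obj_id][1]
    if obj_id in non_cache:
        return req - non_cache[obj_id][1]
    return unseen_gap
```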

The Park paper states that you use an open dataset containing 500 million requests. When running the environment with trace = real, it deterministically starts at the beginning of the trace and continues for all 500 million requests; it is not chunked into separate episodes like the 'test' traces. Is this intended for evaluation purposes?

The Park paper additionally states that the cache environment supports training an eviction agent together with the admission agent. Is there code for this available? It looks like this isn’t implemented in the repo. No worries at all if this wasn't completed.

Thanks for the help!

@hongzimao
Contributor

I'm cc'ing @haonanw98 who developed this part of the environment to better answer your questions.

@haonanw98
Collaborator

haonanw98 commented Apr 4, 2020

Sorry about the crash. I will fix this soon. (The code here might be slightly different from the code we used for the Park evaluation.)

The reason for assigning this time constant to unseen objects is to help training. The general idea is that we should assign a gap (the time from the last request to now) for an unseen object that is large, but not too large. If we directly assign a very large value (e.g., INT_MAX), then the agent will never admit it. You can change this value according to the dataset you use.

The trace = real option is just for you to download the dataset. It is not chunked into separate episodes because this real dataset cannot be used directly for RL training; you should subsample it. For more details, please take a look at our paper Learning Caching Policies with Subsampling (http://mlforsystems.org/assets/papers/neurips2019/learning_wang_2019.pdf).
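For illustration only (this is not the authors' subsampling procedure, which is described in the linked paper): one simple way to turn a long trace into fixed-length training episodes is to slice it into non-overlapping chunks.

```python
def chunk_trace(trace, episode_len):
    """Split a long request trace into fixed-length episode chunks.

    Illustration only: the subsampling method in the linked paper
    may differ. Any trailing partial chunk is dropped.
    """
    return [trace[i:i + episode_len]
            for i in range(0, len(trace) - episode_len + 1, episode_len)]
```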

In follow-up research, we found that it is not necessary to train an eviction agent together with the admission agent, as the eviction agent can itself serve as the admission agent. I do have a version of Park caching for training an eviction agent.

@hongzimao
Contributor

Also @hashbrown512, please feel free to submit a fix as a pull request and @haonanw98 can help merge it into the main repo.

@hashbrown512
Contributor Author

@hongzimao @haonanw98 Thank you both for your prompt responses, I really appreciate the help!! I will submit a PR later today for the crashing issue, and I can additionally expose the unseen-object time constant as a parameter in the config file and do some cleanup of the cache.py file. This will be my first PR in a public repo :')
