[llvm-ic-v0/cBench-v0] Leaderboard submission: Random Search.
This adds a pure random policy that selects actions randomly until a
fixed number of steps (the "patience" of the search) have been
evaluated without an improvement to reward. The search stops after a
predetermined amount of search time has elapsed.

The patience value was selected by running 200 random searches on
programs from the `blas-v0` dataset and selecting the value that
produced the best average reward.
ChrisCummins committed Mar 22, 2021
1 parent 6b9dbd1 commit b84fa03
Showing 9 changed files with 780 additions and 0 deletions.
4 changes: 4 additions & 0 deletions README.md
@@ -154,8 +154,12 @@ environment on the 23 benchmarks in the `cBench-v1` dataset.

| Author | Algorithm | Links | Date | Walltime (mean) | Codesize Reduction (geomean) |
| --- | --- | --- | --- | --- | --- |
| Facebook | Random search (t=10800) | [write-up](leaderboard/llvm_codesize/random_search/README.md), [results](leaderboard/llvm_codesize/random_search/results_p125_t10800.csv) | 2021-03 | 10,512.356s | **1.062×** |
| Facebook | Random search (t=3600) | [write-up](leaderboard/llvm_codesize/random_search/README.md), [results](leaderboard/llvm_codesize/random_search/results_p125_t3600.csv) | 2021-03 | 3,630.821s | 1.061× |
| Facebook | Greedy search | [write-up](leaderboard/llvm_codesize/e_greedy/README.md), [results](leaderboard/llvm_codesize/e_greedy/results_e0.csv) | 2021-03 | 169.237s | 1.055× |
| Facebook | Random search (t=60) | [write-up](leaderboard/llvm_codesize/random_search/README.md), [results](leaderboard/llvm_codesize/random_search/results_p125_t60.csv) | 2021-03 | 91.215s | 1.045× |
| Facebook | e-Greedy search (e=0.1) | [write-up](leaderboard/llvm_codesize/e_greedy/README.md), [results](leaderboard/llvm_codesize/e_greedy/results_e10.csv) | 2021-03 | 152.579s | 1.041× |
| Facebook | Random search (t=10) | [write-up](leaderboard/llvm_codesize/random_search/README.md), [results](leaderboard/llvm_codesize/random_search/results_p125_t10.csv) | 2021-03 | **42.939s** | 1.031× |

# Contributing

22 changes: 22 additions & 0 deletions leaderboard/llvm_codesize/random_search/BUILD
@@ -0,0 +1,22 @@
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

py_library(
    name = "random_search",
    srcs = ["random_search.py"],
    deps = [
        "//leaderboard/llvm_codesize:eval_policy",
    ],
)

py_test(
    name = "random_search_test",
    timeout = "short",
    srcs = ["random_search_test.py"],
    deps = [
        ":random_search",
        "//tests:test_main",
    ],
)
130 changes: 130 additions & 0 deletions leaderboard/llvm_codesize/random_search/README.md
@@ -0,0 +1,130 @@
# Random Search

**tldr;**
A pure random policy that records the best result found within a fixed time
budget.

**Authors:**
Facebook AI Research

**Results:**

| | Walltime (mean) | Codesize Reduction (geomean) |
| --- | --- | --- |
| [Random search, p=1.25, t=10](results_p125_t10.csv) | 42.939s | 1.031× |
| [Random search, p=1.25, t=60](results_p125_t60.csv) | 91.215s | 1.045× |
| [Random search, p=1.25, t=3600](results_p125_t3600.csv) | 3,630.821s | 1.061× |
| [Random search, p=1.25, t=10800](results_p125_t10800.csv) | 10,512.356s | 1.062× |


**Publication:**
<!-- TODO(cummins): Add CompilerGym citation when ready. -->

**CompilerGym version:**
0.1.4

**Open source?**
Yes, MIT licensed. [Source Code](random_search.py).

**Did you modify the CompilerGym source code?**
No.

**What parameters does the approach have?**
Search time *t*, patience *p* (see "Description" below).

**What range of values were considered for the above parameters?** Search time
is fixed and is indicated by the leaderboard entry name, e.g. "Random search
(t=30)" means a random search that runs for at least 30 seconds. Eight values for
patience were considered: *n/4*, *n/2*, *3n/4*, *n*, *5n/4*, *3n/2*, *7n/4*, and
*2n*, where *n* is the size of the initial action space. The patience value was
selected using the `blas-v0` dataset for validation; see "Tuning the patience
parameter" below.
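
As a rough illustration of how these ratios translate into step counts, the
sketch below mirrors the `Patience = patience_ratio * action_space_size` rule
from the script's flag description. The function name `patience_from_ratio` and
the action space size of 120 are hypothetical, picked only to make the
arithmetic visible; the real size comes from the environment.

```python
def patience_from_ratio(action_space_size: int, patience_ratio: float) -> int:
    """Patience = patience_ratio * action_space_size, truncated to an integer."""
    return int(action_space_size * patience_ratio)


# Hypothetical action space size, chosen only for illustration.
n = 120
ratios = [0.25, 0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00]
for ratio in ratios:
    print(f"patience_ratio={ratio:.2f} -> patience={patience_from_ratio(n, ratio)}")
```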

**Is the policy deterministic?**
No.

## Description

This approach uses a simple random agent on the action space of the target
program. This is equivalent to running `python -m
compiler_gym.bin.random_search` on the programs in the test set.

The random search operates by selecting actions randomly until a fixed number of
steps (the "patience" of the search) have been evaluated without an improvement
to reward. The search stops after a predetermined amount of search time has
elapsed.

Pseudo-code for this search is:

```c++
float search_with_patience(CompilerEnv env, int patience, int search_time) {
  float best_reward = -INFINITY;
  int end_time = time() + search_time;
  while (time() < end_time) {
    env.reset();
    int p = patience;
    do {
      env.step(env.action_space.sample());
      if (env.reward() > best_reward) {
        p = patience;  // Reset patience every time progress is made.
        best_reward = env.reward();
      }
    } while (--p && time() < end_time && !env.done());
    // The episode terminates when patience or search time is exhausted, or
    // a terminal state is reached.
  }
  return best_reward;
}
```
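
For readers who prefer something executable, here is a minimal Python sketch of
the same idea. It is not the CompilerGym implementation: `ToyEnv` is a
hypothetical stand-in with made-up reward dynamics, and the sketch decrements
patience only on steps that fail to improve the best reward, which follows the
prose description rather than the exact loop above.

```python
import random
import time


class ToyEnv:
    """Hypothetical stand-in for a compiler environment (not the CompilerGym API)."""

    def __init__(self, num_actions=8, seed=0):
        self.num_actions = num_actions
        self.rng = random.Random(seed)
        self.total = 0.0

    def reset(self):
        self.total = 0.0

    def sample_action(self):
        return self.rng.randrange(self.num_actions)

    def step(self, action):
        # Toy dynamics: each action shifts the cumulative reward by a fixed amount.
        self.total += (action - self.num_actions / 2) / self.num_actions
        return self.total


def search_with_patience(env, patience, search_time):
    """Return the best cumulative reward found and the actions that reached it."""
    best_reward = float("-inf")
    best_actions = []
    end_time = time.monotonic() + search_time
    while time.monotonic() < end_time:
        env.reset()
        actions = []
        remaining = patience
        while remaining > 0 and time.monotonic() < end_time:
            action = env.sample_action()
            actions.append(action)
            reward = env.step(action)
            if reward > best_reward:
                best_reward, best_actions = reward, list(actions)
                remaining = patience  # Progress made: reset patience.
            else:
                remaining -= 1
    return best_reward, best_actions
```

Recording the action sequence alongside the reward is what allows the best
episode to be replayed afterwards to recover the final environment state.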

To reproduce the search, run the [random_search.py](random_search.py) script.

### Machine Description

| | Hardware Specification |
| ------ | --------------------------------------------- |
| OS | Ubuntu 20.04 |
| CPU | Intel Xeon Gold 6230 CPU @ 2.10GHz (80× core) |
| Memory | 754.5 GiB |

### Tuning the patience parameter

The patience value is specified as a ratio of the LLVM action space size. The
`--patience_ratio` value was selected by running 200 random searches on
programs from the `blas-v0` dataset and selecting the value that produced the
best average reward:

```sh
#!/usr/bin/env bash
set -euo pipefail
MAX_BENCHMARKS=20
REPS_PER_BENCHMARK=10
SEARCH_TIME=30
VALIDATION_DATASET=blas-v0
for patience_ratio in 0.25 0.50 0.75 1.00 1.25 1.50 1.75 2.00 ; do
    echo -e "\nEvaluating --patience_ratio=$patience_ratio"
    logfile="blas_t${SEARCH_TIME}_${patience_ratio}.csv"
    # Run the search with the loop variable; names are case-sensitive, so
    # ${patience_ratio} and ${logfile} must match the assignments above.
    python random_search.py \
        --max_benchmarks="${MAX_BENCHMARKS}" \
        --n="${REPS_PER_BENCHMARK}" \
        --dataset="${VALIDATION_DATASET}" \
        --search_time="${SEARCH_TIME}" \
        --patience_ratio="${patience_ratio}" \
        --logfile="${logfile}"
    python -m compiler_gym.bin.validate --env=llvm-ic-v0 \
        --reward_aggregation=geomean < "${logfile}" | tail -n4
done
```
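
The selection step that follows the loop above can be sketched in Python. This
is an illustrative sketch only: it assumes the per-benchmark rewards for each
ratio have already been extracted into a dictionary, and the helper names and
the numbers in `example` are made up rather than measured results.

```python
import math


def geomean(values):
    """Geometric mean of strictly positive values."""
    if not values:
        raise ValueError("geomean requires at least one value")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def best_patience_ratio(rewards_by_ratio):
    """Return the ratio whose rewards have the highest geometric mean."""
    return max(rewards_by_ratio, key=lambda ratio: geomean(rewards_by_ratio[ratio]))


# Made-up rewards, for illustration only; these are not measured results.
example = {
    0.50: [1.01, 1.02, 1.00],
    1.25: [1.03, 1.05, 1.02],
    2.00: [1.02, 1.03, 1.01],
}
print(best_patience_ratio(example))
```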

The patience value that returned the best geomean reward can then be used as the
value for the search on the test set. For example:

```sh
$ python random_search.py \
    --search_time=30 \
    --patience_ratio=1.25 \
    --logfile=results_p125_t30.csv
```
78 changes: 78 additions & 0 deletions leaderboard/llvm_codesize/random_search/random_search.py
@@ -0,0 +1,78 @@
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
"""An implementation of a random search policy for the LLVM codesize task.

The search is the same as the included compiler_gym.bin.random_search. See
README.md for a detailed description.
"""
import os
import sys
from time import sleep

import gym
from absl import flags

from compiler_gym.envs import LlvmEnv
from compiler_gym.random_search import RandomAgentWorker

# Import the ../eval_policy.py helper.
sys.path.insert(0, os.path.dirname(os.path.realpath(__file__)) + "/..")
from eval_policy import eval_policy # noqa pylint: disable=wrong-import-position

flags.DEFINE_float(
    "patience_ratio",
    1.0,
    "The ratio of patience to the size of the action space. "
    "Patience = patience_ratio * action_space_size",
)
flags.DEFINE_integer(
    "search_time",
    60,
    "The minimum number of seconds to run the random search for. After this "
    "many seconds have elapsed the best results are aggregated from the "
    "search threads and the search is terminated.",
)
FLAGS = flags.FLAGS


def random_search(env: LlvmEnv) -> None:
    """Run a random search on the given environment."""
    patience = int(env.action_space.n * FLAGS.patience_ratio)

    # Start parallel random search workers.
    workers = [
        RandomAgentWorker(
            make_env=lambda: gym.make("llvm-ic-v0", benchmark=env.benchmark),
            patience=patience,
        )
        for _ in range(FLAGS.nproc)
    ]
    for worker in workers:
        worker.start()

    sleep(FLAGS.search_time)

    # Stop the workers.
    for worker in workers:
        worker.alive = False
    for worker in workers:
        worker.join()

    # Aggregate the best results.
    best_actions = []
    best_reward = -float("inf")
    for worker in workers:
        if worker.best_returns > best_reward:
            best_reward, best_actions = worker.best_returns, list(worker.best_actions)

    # Replay the best sequence of actions to produce the final environment
    # state.
    for action in best_actions:
        _, _, done, _ = env.step(action)
        assert not done


if __name__ == "__main__":
    eval_policy(random_search)
36 changes: 36 additions & 0 deletions leaderboard/llvm_codesize/random_search/random_search_test.py
@@ -0,0 +1,36 @@
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
"""Tests for //leaderboard/llvm_codesize/random_search:random_search."""
import pytest
from absl import flags

from leaderboard.llvm_codesize.random_search.random_search import (
    eval_policy,
    random_search,
)
from tests.test_main import main as _test_main

FLAGS = flags.FLAGS


def test_random_search():
    FLAGS.unparse_flags()
    FLAGS(
        [
            "argv0",
            "--n=1",
            "--max_benchmarks=1",
            "--search_time=1",
            "--nproc=1",
            "--patience_ratio=0.1",
            "--novalidate",
        ]
    )
    with pytest.raises(SystemExit):
        eval_policy(random_search)


if __name__ == "__main__":
    _test_main()