[examples] Add a top-level README file #468

Merged (7 commits) on Oct 20, 2021
6 changes: 3 additions & 3 deletions README.md
@@ -97,9 +97,9 @@ In Python, import `compiler_gym` to use the environments:
>>> env.close() # closes the environment, freeing resources
```

See the [documentation website](http://facebookresearch.github.io/CompilerGym/)
for tutorials, further details, and API reference. See the [examples](/examples)
directory for pytorch integration, agent implementations, etc.
See the [examples](/examples) directory for agent implementations, environment
extensions, and more. See the [documentation
website](http://facebookresearch.github.io/CompilerGym/) for the API reference.


## Leaderboards
5 changes: 5 additions & 0 deletions compiler_gym/util/shell_format.py
@@ -28,3 +28,8 @@ def emph(stringable: Any) -> str:
def plural(quantity: int, singular: str, plural: str) -> str:
"""Return the singular or plural word."""
return singular if quantity == 1 else plural


def indent(string: str, n=4) -> str:
"""Indent a multi-line string by given number of spaces."""
return "\n".join(" " * n + x for x in str(string).split("\n"))
226 changes: 226 additions & 0 deletions examples/README.md
@@ -0,0 +1,226 @@
# CompilerGym Examples <!-- omit in toc -->

This directory contains code samples for everything from implementing simple
RL agents to adding support for entirely new compilers. Is there an example that
you think is missing? If so, please [contribute](/CONTRIBUTING.md)!


**Table of contents:**

- [Autotuning](#autotuning)
- [Performing a random walk of an environment](#performing-a-random-walk-of-an-environment)
- [GCC Autotuning (genetic algorithms, hill climbing, + more)](#gcc-autotuning-genetic-algorithms-hill-climbing--more)
- [Makefile integration](#makefile-integration)
- [Random search using the LLVM C++ API](#random-search-using-the-llvm-c-api)
- [Reinforcement learning](#reinforcement-learning)
- [PPO and integration with RLlib](#ppo-and-integration-with-rllib)
- [Actor-critic](#actor-critic)
- [Tabular Q learning](#tabular-q-learning)
- [Extending CompilerGym](#extending-compilergym)
- [Example CompilerGym service](#example-compilergym-service)
- [Example loop unrolling](#example-loop-unrolling)
- [Miscellaneous](#miscellaneous)
- [Exhaustive search of bounded action spaces](#exhaustive-search-of-bounded-action-spaces)
- [Estimating the immediate and cumulative reward of actions and benchmarks](#estimating-the-immediate-and-cumulative-reward-of-actions-and-benchmarks)


## Autotuning

### Performing a random walk of an environment

The [random_walk.py](random_walk.py) script runs a single episode of a
CompilerGym environment, logging the action taken and reward received at each
step. Example usage:

```sh
$ python random_walk.py --env=llvm-v0 --step_min=100 --step_max=100 \
--benchmark=cbench-v1/dijkstra --reward=IrInstructionCount

=== Step 1 ===
Action: -lower-constant-intrinsics (changed=False)
Reward: 0.0
Step time: 805.6us

=== Step 2 ===
Action: -forceattrs (changed=False)
Reward: 0.0
Step time: 229.8us

...

=== Step 100 ===
Action: -globaldce (changed=False)
Reward: 0.0
Step time: 213.9us

Completed 100 steps in 91.6ms (1091.3 steps / sec).
Total reward: 161.0
Max reward: 111.0 (+68.94% at step 31)
```

For further details run: `python random_walk.py --help`.


### GCC Autotuning (genetic algorithms, hill climbing, + more)

The [gcc_search.py](gcc_search.py) script contains implementations of several
autotuning techniques for the GCC environment. It was used to produce the
results for the GCC experiments in the [CompilerGym
whitepaper](https://arxiv.org/pdf/2109.08267.pdf). For further details run:
`python gcc_search.py --help`.
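
As a rough illustration of what such an autotuner is built on, the sketch below
runs a plain random search over the GCC environment using only the standard
`compiler_gym` / gym API. It is not the `gcc_search.py` implementation: the
`gcc-v0` environment is assumed to be installed and runnable (it needs Docker
or a local GCC), the trial count and episode length are arbitrary, and a
default reward space is assumed to be configured.

```python
# Minimal random-search sketch over the GCC environment (not gcc_search.py).
# Assumes gcc-v0 is installed and runnable; reward may be None if no default
# reward space is configured, in which case set env.reward_space first.
import compiler_gym


def random_search(num_episodes: int = 10, episode_length: int = 20):
    best_reward, best_actions = float("-inf"), []
    with compiler_gym.make("gcc-v0") as env:
        for _ in range(num_episodes):
            env.reset()
            episode_reward, actions = 0.0, []
            for _ in range(episode_length):
                action = env.action_space.sample()
                _, reward, done, _ = env.step(action)
                episode_reward += reward or 0.0
                actions.append(action)
                if done:
                    break
            if episode_reward > best_reward:
                best_reward, best_actions = episode_reward, actions
    return best_reward, best_actions


if __name__ == "__main__":
    print(random_search())
```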


### Makefile integration

The [makefile_integration](makefile_integration/) directory demonstrates a
simple integration of CompilerGym into a C++ Makefile config. For details see
the [Makefile](makefile_integration/Makefile).


### Random search using the LLVM C++ API

While not intended for the majority of users, it is entirely straightforward to
skip CompilerGym's Python frontend and interact with the C++ APIs directly. The
[RandomSearch.cc](RandomSearch.cc) file demonstrates a simple parallelized
random search implemented for the LLVM compiler service. Run it using:

```
bazel run -c opt //examples:RandomSearch -- --benchmark=benchmark://cbench-v1/crc32
```

For further details run: `bazel run -c opt //examples:RandomSearch -- --help`


## Reinforcement learning


### PPO and integration with RLlib

<a href="https://colab.research.google.com/github/facebookresearch/CompilerGym/blob/stable/examples/getting-started.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Colab" height="20">
</a>

The [rllib.ipynb](rllib.ipynb) notebook demonstrates integrating CompilerGym
with the popular [RLlib](https://docs.ray.io/en/master/rllib.html) reinforcement
learning library. The notebook covers registering a custom environment that
uses a constrained subset of the LLVM environment's action space and a finite
time horizon, and training a PPO agent using separate train/val/test datasets.
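
For orientation, here is a condensed sketch of that integration. It assumes
Ray/RLlib 1.x (which provides `ray.rllib.agents.ppo.PPOTrainer`); the
constrained action space and dataset splits from the notebook are omitted, and
the benchmark, horizon, and config values are placeholders.

```python
# Condensed sketch of the RLlib integration (not the notebook itself).
# Assumes Ray/RLlib 1.x; benchmark, horizon, and config values are placeholders.
import compiler_gym
import ray
from gym.wrappers import TimeLimit
from ray import tune
from ray.rllib.agents.ppo import PPOTrainer


def make_env(_config):
    env = compiler_gym.make(
        "llvm-ic-v0",                    # LLVM with instruction-count reward
        observation_space="Autophase",   # fixed-size numeric observation
        benchmark="cbench-v1/dijkstra",
    )
    return TimeLimit(env, max_episode_steps=45)  # finite time horizon


if __name__ == "__main__":
    ray.init()
    tune.register_env("compiler_gym", make_env)
    trainer = PPOTrainer(env="compiler_gym", config={"num_workers": 1})
    for _ in range(3):
        print(trainer.train()["episode_reward_mean"])
```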


### Actor-critic

The [actor_critic](actor_critic.py) script contains a simple actor-critic
example using PyTorch. The objective is to minimize the size of a benchmark
(program) using LLVM compiler passes. At each step there is a choice of which
pass to pick next and an episode consists of a sequence of such choices,
yielding the number of saved instructions as the overall reward. To simplify
the learning task, only a (configurable) subset of LLVM passes is considered
and every episode has the same (configurable) length.

For further details run: `python actor_critic.py --help`.
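
The environment setup described above can be reproduced with a few lines of the
standard API. The sketch below runs fixed-length episodes over a restricted
pass subset, with a random policy standing in for the learned actor-critic; the
pass list, benchmark, and episode length are illustrative.

```python
# Environment-loop sketch for the setup described above (not actor_critic.py):
# fixed-length episodes over a restricted LLVM pass subset, instruction-count
# reward, and a random policy in place of the learned actor-critic.
import random

import compiler_gym

PASS_SUBSET = ["-mem2reg", "-sroa", "-newgvn", "-simplifycfg", "-instcombine"]
EPISODE_LENGTH = 10  # every episode has the same, configurable length

with compiler_gym.make("llvm-ic-v0", benchmark="cbench-v1/dijkstra") as env:
    env.reset()
    # Map the chosen pass names to indices in the full action space.
    action_ids = [env.action_space.names.index(p) for p in PASS_SUBSET]
    total_reward = 0.0
    for _ in range(EPISODE_LENGTH):
        _, reward, done, _ = env.step(random.choice(action_ids))
        total_reward += reward
        if done:
            break
    print(f"Cumulative reward: {total_reward}")
```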


### Tabular Q learning

The [tabular_q](tabular_q.py) script contains a simple tabular Q-learning
example for the LLVM environment. Using selected features from the Autophase
observation space, it searches for the best action sequence for a specific
training program using online Q-learning.

For further details run: `python tabular_q.py --help`.
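
The sketch below shows the shape of such an approach, not the `tabular_q.py`
implementation: a handful of (hypothetically chosen) Autophase feature indices
form the state key, a small pass subset forms the action set, and a standard
one-step Q-learning update is applied online. All hyperparameters are
illustrative.

```python
# Minimal tabular Q-learning sketch (not the tabular_q.py implementation).
import random
from collections import defaultdict

import compiler_gym

ACTIONS = ["-mem2reg", "-sroa", "-newgvn", "-simplifycfg"]
FEATURES = [0, 5, 26, 33, 42]   # hypothetical subset of Autophase indices
ALPHA, GAMMA, EPSILON, EPISODES, STEPS = 0.1, 0.9, 0.2, 50, 5

Q = defaultdict(float)  # (state, action) -> estimated value


def state_key(autophase):
    return tuple(int(autophase[i]) for i in FEATURES)


with compiler_gym.make(
    "llvm-ic-v0", observation_space="Autophase", benchmark="cbench-v1/dijkstra"
) as env:
    action_ids = [env.action_space.names.index(a) for a in ACTIONS]
    for _ in range(EPISODES):
        s = state_key(env.reset())
        for _ in range(STEPS):
            # Epsilon-greedy selection over the restricted action set.
            if random.random() < EPSILON:
                a = random.choice(action_ids)
            else:
                a = max(action_ids, key=lambda x: Q[(s, x)])
            obs, reward, done, _ = env.step(a)
            s2 = state_key(obs)
            # One-step Q-learning update.
            Q[(s, a)] += ALPHA * (
                reward + GAMMA * max(Q[(s2, x)] for x in action_ids) - Q[(s, a)]
            )
            s = s2
            if done:
                break
```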


## Extending CompilerGym


### Example CompilerGym service

The [example_compiler_gym_service](example_compiler_gym_service) directory
demonstrates how to extend CompilerGym with support for new compiler problems.
The directory contains bare-bones implementations of backends in Python or C++
that can be used as the basis for adding new compiler environments. See the
[README.md](example_compiler_gym_service/README.md) file for further details.


### Example loop unrolling

The [example_unrolling_service](example_unrolling_service) directory
demonstrates how to implement support for a real compiler problem by integrating
with commandline loop unrolling flags for the LLVM compiler. See the
[README.md](example_unrolling_service/README.md) file for further details.


## Miscellaneous


### Exhaustive search of bounded action spaces

The [brute_force.py](brute_force.py) script runs a parallelized brute force of
an action space. It enumerates all possible combinations of actions up to a
finite episode length and evaluates them, logging the incremental rewards of
each. Example usage:

```
$ python brute_force.py --env=llvm-ic-v0 --benchmark=cbench-v1/dijkstra \
--episode_length=8 --brute_force_action_list=-sroa,-mem2reg,-newgvn

Enumerating all episodes of 3 actions x 8 steps
Started 24 brute force workers for benchmark benchmark://cbench-v1/dijkstra using reward IrInstructionCountOz.
=== Running 6,561 trials ===
Runtime: 8 seconds. Progress: 100.00%. Best reward found: 0.8571428571428572.
Ending jobs ... I1014 12:04:51.671775 3245811 CreateAndRunCompilerGymServiceImpl.h:128] Service "/dev/shm/compiler_gym_cec/s/1014T120451-646797-5770" listening on 37505, PID = 3245811
completed 6,561 of 6,561 trials (100.000%), best sequence -mem2reg -mem2reg -sroa -sroa -mem2reg -sroa -sroa -newgvn
```

For further details run: `python brute_force.py --help`.
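
Conceptually, the enumeration reduces to iterating over the Cartesian product
of the action list and replaying each sequence from a fresh reset. The serial
sketch below illustrates this, without the worker pool and progress reporting
of `brute_force.py`; the action list, episode length, and benchmark are
illustrative.

```python
# Serial sketch of the brute-force enumeration (brute_force.py parallelizes
# this across workers). Every sequence of `length` actions is replayed from a
# fresh reset and scored by its cumulative reward.
import itertools

import compiler_gym


def brute_force(action_names, length=3, benchmark="cbench-v1/dijkstra"):
    """Return (best cumulative reward, best action sequence by name)."""
    best_total, best_seq = float("-inf"), ()
    with compiler_gym.make("llvm-ic-v0", benchmark=benchmark) as env:
        action_ids = [env.action_space.names.index(a) for a in action_names]
        for seq in itertools.product(action_ids, repeat=length):
            env.reset()
            total = sum(env.step(a)[1] for a in seq)  # cumulative reward
            if total > best_total:
                best_total, best_seq = total, seq
        names = [env.action_space.names[a] for a in best_seq]
    return best_total, names


if __name__ == "__main__":
    print(brute_force(["-sroa", "-mem2reg", "-newgvn"]))
```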

The [explore.py](explore.py) script evaluates all possible combinations of
actions up to a finite limit, but partial sequences of actions that end up in
the same state are deduplicated, sometimes dramatically reducing the size of the
search space. This script can also be configured to do a beam search.

Example usage:

```
$ python explore.py --env=llvm-ic-v0 --benchmark=cbench-v1/dijkstra \
--episode_length=8 --explore_actions=-simplifycfg,-instcombine,-mem2reg,-newgvn

...

*** Processing depth 6 of 8 with 11 states and 4 actions.

unpruned self_pruned cross_pruned back_pruned dropped sum
added this depth 0 33 0 11 0 44
full nodes this depth 0 2,833 1,064 199 0 4,096
added across depths 69 151 23 34 0 277
full added across depths 69 3,727 1,411 254 0 5,461

Time taken for depth: 0.05 s
Top 3 sequence(s):
0.9694 -mem2reg, -newgvn, -simplifycfg, -instcombine
0.9694 -newgvn, -instcombine, -mem2reg, -simplifycfg
0.9694 -newgvn, -instcombine, -mem2reg, -simplifycfg, -instcombine


*** Processing depth 7 of 8 with 0 states and 4 actions.

There are no more states to process, stopping early.
```

For further details run: `python explore.py --help`.
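
The pruning idea can be illustrated with a short sketch: enumerate action
sequences breadth-first and merge partial sequences that reach the same
compiler state, fingerprinted here by hashing the LLVM IR. This is not the
`explore.py` implementation, and the action list and depth are illustrative.

```python
# Sketch of the deduplication behind explore.py: breadth-first enumeration of
# action sequences, pruning branches that reach an already-seen IR state.
import hashlib

import compiler_gym

ACTIONS = ["-simplifycfg", "-instcombine", "-mem2reg", "-newgvn"]
DEPTH = 3


def ir_fingerprint(env):
    return hashlib.sha1(env.observation["Ir"].encode()).hexdigest()


with compiler_gym.make("llvm-ic-v0", benchmark="cbench-v1/dijkstra") as env:
    action_ids = [env.action_space.names.index(a) for a in ACTIONS]
    frontier = [()]   # partial action sequences to expand
    seen = set()      # state fingerprints already visited
    for depth in range(DEPTH):
        next_frontier = []
        for prefix in frontier:
            for a in action_ids:
                env.reset()
                for prev in prefix:   # replay the prefix...
                    env.step(prev)
                env.step(a)           # ...then the new action
                fp = ir_fingerprint(env)
                if fp not in seen:    # prune duplicate states
                    seen.add(fp)
                    next_frontier.append(prefix + (a,))
        frontier = next_frontier
        print(f"depth {depth + 1}: {len(frontier)} unique states")
```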


### Estimating the immediate and cumulative reward of actions and benchmarks

The [sensitivity_analysis](sensitivity_analysis/) directory contains a pair of
scripts for estimating the sensitivity of the reward signal to different
environment parameters:

* [action_sensitivity_analysis.py](sensitivity_analysis/action_sensitivity_analysis.py):
  This script estimates the immediate reward of running a specific action by
  running trials. A trial is a random episode that ends with the chosen action
  (see the sketch after this list).
* [benchmark_sensitivity_analysis.py](sensitivity_analysis/benchmark_sensitivity_analysis.py):
This script estimates the cumulative reward for a random episode on a
benchmark by running trials. A trial is an episode in which a random number of
random actions are performed and the total cumulative reward is recorded.
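
A condensed version of the action-sensitivity idea looks like this (a sketch
only, not the `sensitivity_analysis` implementation; the trial count, prefix
length, benchmark, and action are illustrative): each trial runs a random
prefix of actions and records the reward of the final, fixed action.

```python
# Estimate the immediate reward of a single action by running random trials
# that all end with that action (sketch only).
import random
from statistics import mean, stdev

import compiler_gym


def immediate_reward(action_name, trials=10, max_prefix=10):
    rewards = []
    with compiler_gym.make("llvm-ic-v0", benchmark="cbench-v1/dijkstra") as env:
        action = env.action_space.names.index(action_name)
        for _ in range(trials):
            env.reset()
            for _ in range(random.randint(0, max_prefix)):  # random prefix
                env.step(env.action_space.sample())
            _, reward, _, _ = env.step(action)  # reward of the final action
            rewards.append(reward)
    return mean(rewards), stdev(rewards)


if __name__ == "__main__":
    print(immediate_reward("-mem2reg"))
```
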
6 changes: 4 additions & 2 deletions examples/RandomSearch.cc
@@ -57,6 +57,7 @@ class Environment {
}

StartSessionRequest startRequest;
startRequest.mutable_benchmark()->set_uri(benchmark_);
StartSessionReply startReply;
RETURN_IF_ERROR(service_.StartSession(nullptr, &startRequest, &startReply));
sessionId_ = startReply.session_id();
@@ -138,8 +139,9 @@ Status runSearch(const fs::path& workingDir, std::vector<int>* bestActions, int6
void runThread(std::vector<int>* bestActions, int64_t* bestCost) {
const fs::path workingDir = fs::unique_path();
fs::create_directories(workingDir);
if (!runSearch(workingDir, bestActions, bestCost).ok()) {
LOG(ERROR) << "Search failed";
const auto status = runSearch(workingDir, bestActions, bestCost);
if (!status.ok()) {
LOG(ERROR) << "ERROR " << status.error_code() << ": " << status.error_message();
}
fs::remove_all(workingDir);
}
20 changes: 10 additions & 10 deletions examples/brute_force.py
@@ -9,14 +9,15 @@

Example usage:

$ $ python -m compiler_gym.bin.brute_force \
--env=llvm-ic-v0 --benchmark=cbench-v1/dijkstra \
--episode_length=10 --actions=-sroa,-mem2reg,-newgvn
Enumerating all episodes of 3 actions x 10 steps
Started 24 brute force workers for benchmark cbench-v1/dijkstra using reward IrInstructionCountOz.
=== Running 59,049 trials ===
Runtime: 3 minutes. Progress: 100.00%. Best reward found: 101.1905%.
Ending jobs ... completed 59,049 of 59,049 trials (100.000%)
$ python brute_force.py --env=llvm-ic-v0 --benchmark=cbench-v1/dijkstra \
--episode_length=8 --brute_force_action_list=-sroa,-mem2reg,-newgvn

Enumerating all episodes of 3 actions x 8 steps
Started 24 brute force workers for benchmark benchmark://cbench-v1/dijkstra using reward IrInstructionCountOz.
=== Running 6,561 trials ===
Runtime: 8 seconds. Progress: 100.00%. Best reward found: 0.8571428571428572.
Ending jobs ... I1014 12:04:51.671775 3245811 CreateAndRunCompilerGymServiceImpl.h:128] Service "/dev/shm/compiler_gym_cec/s/1014T120451-646797-5770" listening on 37505, PID = 3245811
completed 6,561 of 6,561 trials (100.000%), best sequence -mem2reg -mem2reg -sroa -sroa -mem2reg -sroa -sroa -newgvn

Use --help to list the configurable options.
"""
@@ -306,8 +307,7 @@ def main(argv):

with env_from_flags(benchmark) as env:
env.reset()
benchmark = env.benchmark
sanitized_benchmark_uri = "/".join(benchmark.split("/")[-2:])
sanitized_benchmark_uri = "/".join(str(env.benchmark).split("/")[-2:])
logs_dir = Path(
FLAGS.output_dir or create_logging_dir(f"brute_force/{sanitized_benchmark_uri}")
)
28 changes: 9 additions & 19 deletions examples/random_walk.py
@@ -7,8 +7,8 @@
Example usage:

# Run a random walk on cBench example program using instruction count reward.
$ python3 examples/random_walk.py --env=llvm-v0 --step_min=100 --step_max=100 \
--benchmark=cbench-v1/dijkstra --reward=IrInstructionCount
$ python3 random_walk.py --env=llvm-v0 --step_min=100 --step_max=100 \
--benchmark=cbench-v1/dijkstra --reward=IrInstructionCount
"""
import random

@@ -39,7 +39,7 @@ def run_random_walk(env: CompilerEnv, step_count: int) -> None:
fewer steps will be performed if any of the actions lead the
environment to end the episode.
"""
rewards, actions = [], []
rewards = []

step_num = 0
with Timer() as episode_time:
@@ -48,14 +48,13 @@ def run_random_walk(env: CompilerEnv, step_count: int) -> None:
action_index = env.action_space.sample()
with Timer() as step_time:
observation, reward, done, info = env.step(action_index)
print(f"\n=== Step {humanize.intcomma(step_num)} ===")
print(
f"\n=== Step {humanize.intcomma(step_num)} ===\n"
f"Action: {env.action_space.names[action_index]} "
f"(changed={not info.get('action_had_no_effect')})"
f"(changed={not info.get('action_had_no_effect')})\n"
f"Reward: {reward}"
)
rewards.append(reward)
actions.append(env.action_space.names[action_index])
print(f"Reward: {reward}")
if env.observation_space:
print(f"Observation:\n{observation}")
print(f"Step time: {step_time}")
@@ -71,27 +70,18 @@ def reward_percentage(reward, rewards):

print(
f"\nCompleted {emph(humanize.intcomma(step_num))} steps in {episode_time} "
f"({step_num / episode_time.time:.1f} steps / sec)."
)
print(f"Total reward: {sum(rewards)}")
print(
f"({step_num / episode_time.time:.1f} steps / sec).\n"
f"Total reward: {sum(rewards)}\n"
f"Max reward: {max(rewards)} ({reward_percentage(max(rewards), rewards)} "
f"at step {humanize.intcomma(rewards.index(max(rewards)) + 1)})"
)

def remove_no_change(rewards, actions):
return [a for (r, a) in zip(rewards, actions) if r != 0]

actions = remove_no_change(rewards, actions)
print("Effective actions from trajectory: " + ", ".join(actions))


def main(argv):
"""Main entry point."""
assert len(argv) == 1, f"Unrecognized flags: {argv[1:]}"

benchmark = benchmark_from_flags()
with env_from_flags(benchmark) as env:
with env_from_flags(benchmark=benchmark_from_flags()) as env:
step_min = min(FLAGS.step_min, FLAGS.step_max)
step_max = max(FLAGS.step_min, FLAGS.step_max)
run_random_walk(env=env, step_count=random.randint(step_min, step_max))