Leaderboard implementation of tabular Q #141

JD-ETH · 2021-03-20T17:22:55Z

@ChrisCummins I am encountering this error during compilation. I am using the most recent development branch and the up-to-date pip package of CompilerGym.

Use --sandbox_debug to see verbose messages from the sandbox
Traceback (most recent call last):
  File "/private/var/tmp/_bazel_jdguo/7889602d80c1d2c23f0de266ae0d59dd/sandbox/darwin-sandbox/19/execroot/CompilerGym/bazel-out/host/bin/compiler_gym/envs/llvm/make_specs.runfiles/CompilerGym/compiler_gym/envs/llvm/make_specs.py", line 10, in <module>
    from compiler_gym.envs.llvm.llvm_env import LlvmEnv
  File "/private/var/tmp/_bazel_jdguo/7889602d80c1d2c23f0de266ae0d59dd/sandbox/darwin-sandbox/19/execroot/CompilerGym/bazel-out/host/bin/compiler_gym/envs/llvm/make_specs.runfiles/CompilerGym/compiler_gym/envs/llvm/llvm_env.py", line 15, in <module>
    from compiler_gym.envs.compiler_env import CompilerEnv
  File "/private/var/tmp/_bazel_jdguo/7889602d80c1d2c23f0de266ae0d59dd/sandbox/darwin-sandbox/19/execroot/CompilerGym/bazel-out/host/bin/compiler_gym/envs/llvm/make_specs.runfiles/CompilerGym/compiler_gym/envs/compiler_env.py", line 24, in <module>
    from compiler_gym.datasets.dataset import LegacyDataset, require
  File "/private/var/tmp/_bazel_jdguo/7889602d80c1d2c23f0de266ae0d59dd/sandbox/darwin-sandbox/19/execroot/CompilerGym/bazel-out/host/bin/compiler_gym/envs/llvm/make_specs.runfiles/CompilerGym/compiler_gym/datasets/dataset.py", line 15, in <module>
    from deprecated.sphinx import deprecated
ModuleNotFoundError: No module named 'deprecated'

Decided to make it a PR so that you have context as well.

ChrisCummins · 2021-03-20T17:26:21Z

The development branch has a new dependency that the version on pip does not have, so the module is missing. Run make init from the root of the repo to grab all of the python dependencies.

Cheers,
Chris
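A quick way to confirm this kind of mismatch before rebuilding is to check whether the module named in the traceback can be found by the active interpreter. The following is a minimal sketch, not part of CompilerGym: the helper name `missing_modules` is hypothetical, and the list of names to check is an assumption taken from the `ModuleNotFoundError` above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that the current interpreter cannot find.

    Hypothetical helper for illustration; it only looks modules up, it does not import them.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "deprecated" is the module reported missing in the traceback above.
    missing = missing_modules(["deprecated"])
    if missing:
        print("Missing modules:", ", ".join(missing), "(try `make init` from the repo root)")
    else:
        print("All listed modules can be found.")
```

If the module is reported missing, running `make init` as suggested above should install it along with the other Python dependencies.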

JD-ETH · 2021-03-20T17:36:24Z

Thanks, it works! Funny to call a package deprecated...

JD-ETH · 2021-03-22T21:46:44Z

I'm facing this error during the evaluation:

Resulting sequence:  -break-crit-edges,-newgvn,-simplifycfg,-sroa,-instcombine total reward 0.9352701325178389
 13%|██████████████████████████▊          | 30/230 [29:27<3:16:25, 58.93s/ benchmark]
Exception in thread Thread-6:
Traceback (most recent call last):
  File "/Users/jdguo/.conda/envs/CompilerGym/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/private/var/tmp/_bazel_jdguo/7889602d80c1d2c23f0de266ae0d59dd/execroot/CompilerGym/bazel-out/darwin-opt/bin/leaderboard/llvm_codesize/tabular_q/tabular_q_eval.runfiles/CompilerGym/leaderboard/llvm_codesize/eval_policy.py", line 139, in run
    self.policy(self.env)
  File "/private/var/tmp/_bazel_jdguo/7889602d80c1d2c23f0de266ae0d59dd/execroot/CompilerGym/bazel-out/darwin-opt/bin/leaderboard/llvm_codesize/tabular_q/tabular_q_eval.runfiles/CompilerGym/leaderboard/llvm_codesize/tabular_q/tabular_q_eval.py", line 19, in train_and_run
    train(q_table, training_env)
  File "/private/var/tmp/_bazel_jdguo/7889602d80c1d2c23f0de266ae0d59dd/execroot/CompilerGym/bazel-out/darwin-opt/bin/leaderboard/llvm_codesize/tabular_q/tabular_q_eval.runfiles/CompilerGym/examples/tabular_q.py", line 149, in train
    observation = env.observation["Autophase"]
  File "/private/var/tmp/_bazel_jdguo/7889602d80c1d2c23f0de266ae0d59dd/execroot/CompilerGym/bazel-out/darwin-opt/bin/leaderboard/llvm_codesize/tabular_q/tabular_q_eval.runfiles/CompilerGym/compiler_gym/views/observation.py", line 56, in __getitem__
    reply: StepReply = self._get_observation(request)
  File "/private/var/tmp/_bazel_jdguo/7889602d80c1d2c23f0de266ae0d59dd/execroot/CompilerGym/bazel-out/darwin-opt/bin/leaderboard/llvm_codesize/tabular_q/tabular_q_eval.runfiles/CompilerGym/compiler_gym/envs/compiler_env.py", line 279, in <lambda>
    get_observation=lambda req: self.service(self.service.stub.Step, req),
AttributeError: 'NoneType' object has no attribute 'stub'

Not sure why this is happening. The iterations also seem to slow down. Is there a way to identify the issue? @ChrisCummins

JD-ETH · 2021-03-23T21:04:16Z

This seems to happen consistently at the 30th evaluation.

ChrisCummins · 2021-03-23T22:24:58Z

> I'm facing this error during the evaluation:

Yep, that's a strange one. The fact that it happens consistently on the 30th evaluation implies that it's a problem with the third benchmark (the benchmarks are always evaluated in the same order, and the default --n flag runs each benchmark 10 times).

Can you confirm that you're running this via bazel on the latest development branch?

Cheers,
Chris
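The arithmetic behind that inference can be sketched as follows. This is an illustration only, assuming evaluations are counted from 1 and that all runs of one benchmark complete before the next benchmark starts; the function name `benchmark_position` is hypothetical and not part of the leaderboard code.

```python
def benchmark_position(evaluation: int, runs_per_benchmark: int = 10) -> int:
    """Map a 1-based evaluation number to the 1-based position of its benchmark.

    Assumes a fixed benchmark order with `runs_per_benchmark` consecutive runs each.
    """
    if evaluation < 1 or runs_per_benchmark < 1:
        raise ValueError("evaluation and runs_per_benchmark must be positive")
    return (evaluation - 1) // runs_per_benchmark + 1


if __name__ == "__main__":
    # The progress bar above stops at 30/230 with the default of 10 runs per benchmark.
    print(benchmark_position(30))  # 3, i.e. the third benchmark
    print(230 // 10)               # 23 benchmarks in total
```

Under the same assumptions, a failure reported at the 60th evaluation would point at the sixth benchmark.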

@JD-at-work

* [llvm] Temporarily disable polybench Mitigates facebookresearch#55.
* [validation] Add a flakiness retry loop around validation. Add a retry loop around the granular individual validation callbacks for cBench-v1. Mitigates facebookresearch#144.
* [validation] Catch timeouts in retry loop.
* [docs/faq] "I updated with 'git pull' and not it doesn't work
* [tests] Extend timeout on datasets test.
* [tests] Update regression tests.
* [tests] Reduce validation regression test retry counts.
* Call env.reset() just after creation - fixes facebookresearch#150
* [docs] Update LLVM actions table.
* Add a target to rename the manylinux file.
* Force UTF-8 on README decoding.
* [util] Improve runfiles docstrings.
* [llvm] Remove LLVM binaries from wheel. This patch removes the LLVM binaries from the shipped wheel. This is to reduce the package size to be under the 100MB default maximum imposed by PyPi. Instead of shipping the files in the wheel, the LLVM binaries are downloaded from an archive hosted by Facebook when needed. The circumstances for needing them are: (1) starting an LLVM service, (2) attempting to resolve the path to an LLVM binary.
* Defer evaluation of cBench runtime data directory.
* [tests] Remove tests that overwrite site data path. These no longer work now that site data requires LLVM binaries to be present.
* Release v0.1.5.
* Add a fast path check for downloaded LLVM files.
* [tests] Use full URI for benchmark.
* Correct retry count in error message.
* [env] Include last error on init failure.
* [env] Add a special error message for UNKNOWN errors.
* [rpc] Allow loglines() when logs directory does not exist.
* [rpc] Include service logs in error message on init failure.
* [rpc] Include final error message on retry loop failure.
* [rpc] Add decoded signal name on init error.
* [llvm] Replace DCHECK() with Status error.
* [tests] Remove tests that interfere with site data path. Site data directory is now a pre-requisite of the LLVM environment and cannot be moved.
* [tests] Fix caught exception type.
* [llvm] Add a check for runfile requirement.
* [llvm] Add a file existing check.
* [rpc] Disable logs buffering on debugging runs.
* [tests] Fix error message comparison tests
* [bin/manual_env] Update prompt after reset(). Running `reset()` with no benchmark set will select a random program, so the prompt must be updated.
* [tests] Add workaround for prompt issue.
* Release v0.1.6. This release focuses on hardening the LLVM environments, providing improved semantics validation, and improving the datasets. Many thanks to @JD-at-work, @bwasti, and @mostafaelhoushi for code contributions.
  - [llvm] Added a new `cBench-v1` dataset which changes the function attributes of the IR to permit inlining. `cBench-v0` is deprecated and will be removed no earlier than v0.1.6.
  - [llvm] Removed 15 passes from the LLVM action space: `-bounds-checking`, `-chr`, `-extract-blocks`, `-gvn-sink`, `-loop-extract-single`, `-loop-extract`, `-objc-arc-apelim`, `-objc-arc-contract`, `-objc-arc-expand`, `-objc-arc`, `-place-safepoints`, `-rewrite-symbols`, `-strip-dead-debug-info`, `-strip-nonlinetable-debuginfo`, `-structurizecfg`. Passes are removed if they are: irrelevant (e.g. used only debugging), if they change the program semantics (e.g. inserting runtimes bound checking), or if they have been found to have nondeterministic behavior between runs.
  - Extended `env.step()` so that it can take a list of actions that are all performed in a single batch. This improve efficiency.
  - Added default reward spaces for `CompilerEnv` that are derived from scalar observations (thanks @bwasti!)
  - Added a new Q learning example (thanks @JD-at-work!).
  - *Deprecation:* The next release v0.1.5 will introduce a new datasets API that is easier to use and more flexible. In preparation for this, the `Dataset` class has been renamed to `LegacyDataset`, the following dataset operations have been marked deprecated: `activate()`, `deactivate()`, and `delete()`. The `GetBenchmarks()` RPC interface method has also been marked deprecated..
  - [llvm] Improved semantics validation using LLVM's memory, thread, address, and undefined behavior sanitizers.
  - Numerous bug fixes and improvements.
* [tests] Add temporary workaround for flaky init benchmark.
* Add missing copyright header to make_specs.py.
* [util] Force string type in truncate().
* [bin/service]: Fix reporting of observation space shape.
* [bin/service]: Fix reporting of observation space shape.
* [util] Force string type in truncate().
* [llvm] Add an InstCount observation space. This adds new observation spaces that expose the -instcount pass values. The -instcount pass counts the number of instructions of each type in a program, along with the total number of instructions, total number of blocks, and total number of functions. There are four new observation spaces: `InstCount`, which returns the feature vector as a numpy array, `InstCountDict`, which returns the values as a dictionary of named features, and `InstCountNorm` and `InstCountNormDict`, which are the same as above but the counts are instead normalized to the total number of instructions in the program. Example usage:

      >>> import gym
      >>> import compiler_gym
      >>> env = gym.make("llvm-v0")
      >>> env.observation_space = "InstCountDict"
      >>> env.reset("cBench-v0/crc32")
      {'TotalInstsCount': 196, 'TotalBlocksCount': 29, 'TotalFuncsCount': 13, 'RetCount': 5, 'BrCount': 24, 'SwitchCount': 0, 'IndirectBrCount': 0, 'InvokeCount': 0, 'ResumeCount': 0, 'UnreachableCount': 0, 'CleanupRetCount': 0, 'CatchRetCount': 0, 'CatchSwitchCount': 0, 'CallBrCount': 0, 'FNegCount': 0, 'AddCount': 5, 'FAddCount': 0, 'SubCount': 0, 'FSubCount': 0, 'MulCount': 0, 'FMulCount': 0, 'UDivCount': 0, 'SDivCount': 0, 'FDivCount': 0, 'URemCount': 0, 'SRemCount': 0, 'FRemCount': 0, 'ShlCount': 0, 'LShrCount': 3, 'AShrCount': 0, 'AndCount': 3, 'OrCount': 1, 'XorCount': 8, 'AllocaCount': 24, 'LoadCount': 51, 'StoreCount': 38, 'GetElementPtrCount': 5, 'FenceCount': 0, 'AtomicCmpXchgCount': 0, 'AtomicRMWCount': 0, 'TruncCount': 1, 'ZExtCount': 5, 'SExtCount': 0, 'FPToUICount': 0, 'FPToSICount': 0, 'UIToFPCount': 0, 'SIToFPCount': 0, 'FPTruncCount': 0, 'FPExtCount': 0, 'PtrToIntCount': 0, 'IntToPtrCount': 0, 'BitCastCount': 0, 'AddrSpaceCastCount': 0, 'CleanupPadCount': 0, 'CatchPadCount': 0, 'ICmpCount': 10, 'FCmpCount': 0, 'PHICount': 0, 'CallCount': 13, 'SelectCount': 0, 'UserOp1Count': 0, 'UserOp2Count': 0, 'VAArgCount': 0, 'ExtractElementCount': 0, 'InsertElementCount': 0, 'ShuffleVectorCount': 0, 'ExtractValueCount': 0, 'InsertValueCount': 0, 'LandingPadCount': 0, 'FreezeCount': 0}

  The InstCount observation spaces are quick to compute and lightweight. They have similar computational complexity as Autophase. Fixes facebookresearch#149.
* [ci] Enable test workflows on Python 3.9. Issue facebookresearch#162.
* Bump grpcio from 1.34 to 1.36. Issue facebookresearch#162.
* Bump bazel requirement to 4.0.0. This is required to build grpcio 1.36.0. Issue facebookresearch#162.
* [ci] Reverse order of sudo in setup. Issue facebookresearch#162.
* Add libjpeg-dev to list of required linux packages. This to enable compiling Pillow from source on Python 3.9. Issue facebookresearch#162.
* Bump the gym dependency to 0.18.0. Issue facebookresearch#162.
* [examples] Fix initialization of temporary directory variable.
* Add zlib to macOS dependencies. This is to fix compilation of Pillow using Python 3.9. Issue facebookresearch#162.
* [readme] Recommend python 3.9 for conda environments. Issue facebookresearch#162.
* [ci] Use python 3.9 for continuous integration jobs. Issue facebookresearch#162.
* [setup.py] Add a list of supported python versions
* [setup.py] Bump development status to Alpha.
* [README] Use non-sudo instructions for linux setup.
* [README] Simplify table of contents. This adds annotations to some of the minor subheadings to keep the table of contents as simple as possible. This uses the "Markdown All in One" plugin for VSCode to automatically keep the table of contents up to date: https://marketplace.visualstudio.com/items?itemName=yzhang.markdown-all-in-one#table-of-contents
* [README] Use syntax highlighting for installation instructions.
* [README] Small tweak to wording.
* [README] Use -U in pip install example.
* [README] Don't use '$' prefix on shell commands. It makes it harder to copy and paste the commands.
* [README] Add explicit "proceed to all platforms" below.
* Add missing load() of bazel rules.
* [leaderboard] Move leaderboard utility into compiler_gym namespace. This adds a compiler_gym.leaderboard module that contains the LLVM codesize leaderboard helper code. New API docs provide improved explanation of how to use it. Issue facebookresearch#158.
* [leaderboard] Rename --logfile to --results_logfile. This is to break the duplicate flag error from //tests/benchmarks:parallelization_load_test.
* [leaderboard] Make it clear that users can set observation spaces. Issue facebookresearch#142.
* [CONTRIBUTING] Improve leaderboard submission instructions. Re-order the file so that leaderboard submissions appear directly below pull requests. Then provide more details about the submission review process.
* [leaderboard] Rename LLVM codesize to instruction count. Be clear that this leaderboard evaluates performance at reducing the instruction count of LLVM-IR, not the binary codesize.
* Add leaderboard package as a dependency of //compiler_gym.
* [CONTRIBUTING] Use random-agent PR as example for leaderboard.
* leaderboard implementation
* fails due to env selection
* fails at 60th evaluation
* Rebase Tabular Q leaderboard on latest development.
* Add load() for bazel symbol.

Co-authored-by: Bram Wasti <bwasti@fb.com>
Co-authored-by: Jiadong Guo <jdguo@fb.com>

ChrisCummins · 2021-04-14T14:58:23Z

LGTM, thanks for this @JD-ETH! Feel free to merge when happy.

Cheers,
Chris

@JD-at-work

* leaderboard implementation
* fails due to env selection
* fails at 60th evaluation
* Update to WIP Tabular Q leaderboard submission (facebookresearch#1)
* Add JD's tabular-q leaderboard submission
* updated smoke test and readme

Co-authored-by: Jiadong Guo <jdguo@fb.com>
Co-authored-by: Chris Cummins <chrisc.101@gmail.com>
Co-authored-by: Bram Wasti <bwasti@fb.com>
Co-authored-by: Chris Cummins <cummins@fb.com>

leaderboard implementation

ed24b9b

JD-ETH added the Bug label Mar 20, 2021

JD-ETH requested a review from ChrisCummins March 20, 2021 17:22

JD-ETH self-assigned this Mar 20, 2021

facebook-github-bot added the CLA Signed label Mar 20, 2021

fails due to env selection

ceb3d4a

ChrisCummins marked this pull request as draft March 22, 2021 10:54

JD-at-work and others added 4 commits March 24, 2021 21:00

fails at 60th evaluation

442f6af

merge dev

4df012a

Add JD's tabular-q leaderboard submission

5c58601

JD-at-work added 2 commits April 13, 2021 14:36

merge recent

87c59c9

updated smoke test and readme

ec9b44f

JD-ETH marked this pull request as ready for review April 13, 2021 15:07

ChrisCummins changed the title ~~[WIP] Leaderboard implementation of tabular Q~~ Leaderboard implementation of tabular Q Apr 14, 2021

JD-ETH merged commit e6a53d3 into facebookresearch:development Apr 15, 2021
