This repository has been archived by the owner on Mar 31, 2019. It is now read-only.
Development status
justheuristic edited this page May 4, 2016 · 4 revisions
The library is in active development, and much remains to be done.
Below is the list of library components and development directions. All of them are open to your ideas and contributions.
Legend: [priority] Component; a component with no priority tag is done.
-
Core components
-
Environment
-
Objective
-
Agent architecture
- MDP (RL) agent
- Generator
- Fully customizable agent
-
Experiment platform
- [high] Experiment setup zoo
- [medium] Pre-trained model zoo
- [medium] Quick experiment running
- experiment is defined as (environment, objective function, NN architecture, training algorithm)
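The four-part definition above could be bundled into a single record, roughly as sketched here. This is only an illustration: the `Experiment` class, its field names, and the `describe` helper are hypothetical and are not part of the library's API.

```python
from typing import Any, Callable, NamedTuple


class Experiment(NamedTuple):
    """Hypothetical container; field names are illustrative assumptions."""
    environment: Any         # produces observations and reacts to actions
    objective: Callable      # maps a session to a reward
    architecture: Any        # the neural network (agent) definition
    training_algorithm: str  # e.g. "q-learning" or "sarsa"


def describe(exp: Experiment) -> str:
    """Return a one-line summary of an experiment setup."""
    env_name = type(exp.environment).__name__
    return "{} trained with {}".format(env_name, exp.training_algorithm)
```

Keeping the four parts in one immutable record would make it easy to swap a single part, such as the training algorithm, while reusing the rest.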
-
Layers
-
Memory
- Simple RNN
- One-step GRU memory
- Custom GateLayer
- LSTM as GRU + output GateLayer
- Window augmentation (K last states)
- Stack Augmentation
- [low] List augmentation
- [low] Neural Turing Machine controller
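To give an idea of what window augmentation (K last states) means, here is a minimal plain-Python sketch. The `WindowMemory` class and its `pad` parameter are assumptions made for illustration; the library's own layer operates on Theano tensors and may behave differently.

```python
from collections import deque


class WindowMemory:
    """Hypothetical helper that remembers the K most recent states."""

    def __init__(self, k, pad=0.0):
        self.k = k
        # Pre-fill with padding so the window always has length k.
        self.buffer = deque([pad] * k, maxlen=k)

    def update(self, state):
        """Push a new state; the oldest one is dropped automatically."""
        self.buffer.append(state)
        return list(self.buffer)
```

For example, with `k=3` the window after observing states 1, 2, 3, 4 holds the last three of them, oldest first.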
-
Resolvers
- Greedy resolver (as BaseResolver)
- Epsilon-greedy resolver
- Probabilistic resolver
- [high] Adversarial resolver (test if it works)
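The first three resolvers in the list above differ only in how they turn action values into a chosen action. The sketch below shows the idea in plain Python; the function names and signatures are assumptions for illustration, not the library's resolver classes.

```python
import random


def greedy(q_values):
    """Index of the largest Q-value (ties go to the first one)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy(q_values)


def probabilistic(probs, rng=random):
    """Sample an action index in proportion to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Setting `epsilon=0` reduces the epsilon-greedy rule to the greedy one, which matches the idea of treating the greedy resolver as the base case.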
-
Learning objectives and algorithms
- Q-learning
- SARSA
- k-step Q-learning
- k-step Advantage Actor-critic methods
- Can use any Theano/Lasagne expressions for loss, gradients and updates
- Experience replay pool
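The value-based methods listed above differ mainly in the target they regress Q-values towards. Here is a minimal sketch of the textbook targets in plain Python; the function names and the `gamma` and `done` parameters are assumptions for illustration, and the library itself builds these as Theano expressions.

```python
def q_learning_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step Q-learning target: r + gamma * max over next actions of Q(s', a')."""
    return reward if done else reward + gamma * max(next_q_values)


def sarsa_target(reward, next_q_values, next_action, gamma=0.99, done=False):
    """SARSA target: bootstraps from the action actually taken next."""
    return reward if done else reward + gamma * next_q_values[next_action]


def k_step_target(rewards, bootstrap_value, gamma=0.99):
    """k-step return: discounted sum of k rewards plus gamma^k * bootstrap value."""
    target = bootstrap_value
    # Fold the rewards in from the last step back to the first.
    for r in reversed(rewards):
        target = r + gamma * target
    return target
```

For k-step Q-learning, `bootstrap_value` would be the maximum Q-value of the state reached after k steps; with a single reward the k-step target coincides with the one-step one.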
-
Experiment setups
- Boolean reasoning - a basic "tutorial" experiment about learning to exploit variable dependencies
- Wikicat - guessing a person's traits based on Wikipedia biographies
- [high, in progress] OpenAI Gym training/evaluation API and demos
- [medium] KSfinder - detecting particle decays in the Large Hadron Collider beauty (LHCb) experiment
- [medium] 2048-in-a-browser with Selenium
-
Visualization tools
- basic monitoring tools
- [medium] generic tunable session visualizer
-
Technical stuff
- [high] Ensuring Python3 compatibility
- including examples
- [medium] TensorFlow backend (manual or TensorFuse)
- [high] Making tests out of examples
-
Explanatory material
-
[medium] readthedocs pages
-
[global] MOAR sensible examples
-
[medium] report on prior basic research (optimizer comparison, training algorithm comparison, layers, etc.)