add readme
Signed-off-by: Gerald Shen <geshen@nvidia.com>
gshennvm committed Dec 2, 2023
1 parent 82d8e0e commit e26d4e5
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion README.md
@@ -14,6 +14,7 @@ The toolkit is currently in it's early stages, and we are committed to improving
* **Supervised Fine Tuning**
* **Reward Model Training**
* **Reinforcement Learning from Human Feedback using the PPO Algorithm**
+* **[Direct Preference Optimization](https://arxiv.org/pdf/2305.18290.pdf)**

## Learn More
* [Documentation](./docs/README.md)
@@ -44,7 +45,7 @@ To build your own, refer to the [Dockerfile](https://github.com/NVIDIA/NeMo/blob
For the list of changes within each release please see the [Changelog](CHANGELOG.md).

## Future work
-- Add DPO and Rejection Sampling support
+- Add Rejection Sampling support
- We will continue improving the stability of the PPO learning phase.
- Improve the performance of RLHF
