A repository of datasets in the domain of code for instruction fine-tuning.

Denilah/Instruction_Code_Datasets

Instruction_Code_Datasets

A repository of datasets in the domain of code for instruction fine-tuning.

Datasets

Note: The datasets below have only been collected; they have not been processed.

| Dataset | Release date | Scale | Language | Programming languages | Tasks |
| --- | --- | --- | --- | --- | --- |
| Instruct-to-Code | Mar 28, 2023 | 451k | Mul | python…et al. | et al. |
| godot_dodo_4x_60k | Apr 27, 2023 | 62533 | EN | GDScript | Code Generation |
| TSSB-3M-instructions | Apr 28, 2023 | 3M | EN | python…et al. | Code bugfix |
| Codegen | May 4, 2023 | 4535 | EN | C++, Node.js, Python, shell script, Java, JavaScript, et al. | Code Generation, Code Summary, QA, et al. |
| codealpaca | May 13, 2023 | 20k | EN | HTML, CSS, Java, SQL, Python, JavaScript, JSX, C++, Swift, Ruby, PHP, et al. | Code Generation, Code Search, et al. |
| CodeGPT | May 10, 2023 | 32k | CN | C#, C, C++, Go, Java, JavaScript, PHP, Python, Ruby, et al. | Code Generation, Code Search, QA, et al. |
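For picking a dataset from the list above, the rows can be held as plain records and filtered by task or size. The following is only a minimal sketch: `DatasetEntry`, `CATALOG`, `find_by_task`, and `parse_scale` are hypothetical names introduced here for illustration and are not part of this repository. The values are copied from the table (elided entries such as `python…et al.` are kept as-is), and the sketch assumes a `k` suffix means thousands and an `M` suffix means millions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """One row of the dataset table (hypothetical helper type)."""
    name: str
    released: str
    scale: str
    language: str
    programming_languages: str
    tasks: str


# Values copied from the table above; nothing here is downloaded or verified.
CATALOG = [
    DatasetEntry("Instruct-to-Code", "Mar 28, 2023", "451k", "Mul",
                 "python…et al.", "et al."),
    DatasetEntry("godot_dodo_4x_60k", "Apr 27, 2023", "62533", "EN",
                 "GDScript", "Code Generation"),
    DatasetEntry("TSSB-3M-instructions", "Apr 28, 2023", "3M", "EN",
                 "python…et al.", "Code bugfix"),
    DatasetEntry("Codegen", "May 4, 2023", "4535", "EN",
                 "C++, Node.js, Python, shell script, Java, JavaScript, et al.",
                 "Code Generation, Code Summary, QA, et al."),
    DatasetEntry("codealpaca", "May 13, 2023", "20k", "EN",
                 "HTML, CSS, Java, SQL, Python, JavaScript, JSX, C++, Swift, Ruby, PHP, et al.",
                 "Code Generation, Code Search, et al."),
    DatasetEntry("CodeGPT", "May 10, 2023", "32k", "CN",
                 "C#, C, C++, Go, Java, JavaScript, PHP, Python, Ruby, et al.",
                 "Code Generation, Code Search, QA, et al."),
]


def find_by_task(catalog, keyword):
    """Return names of entries whose task column contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [entry.name for entry in catalog if keyword in entry.tasks.lower()]


def parse_scale(scale):
    """Convert a scale string such as '451k', '3M', or '62533' to an integer.

    Assumes 'k' means thousands and 'M' means millions; the table does not
    say whether the numbers count samples, so treat the result as approximate.
    """
    multipliers = {"k": 1_000, "m": 1_000_000}
    suffix = scale[-1].lower()
    if suffix in multipliers:
        return int(float(scale[:-1]) * multipliers[suffix])
    return int(scale)


if __name__ == "__main__":
    print(find_by_task(CATALOG, "code generation"))
    largest = max(CATALOG, key=lambda entry: parse_scale(entry.scale))
    print(largest.name, parse_scale(largest.scale))
```

With the rows as listed, `find_by_task(CATALOG, "bugfix")` returns `["TSSB-3M-instructions"]`, and the largest entry by parsed scale is TSSB-3M-instructions.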
