🤗 TransformerTraining

Welcome! This project focuses on training 🤗 transformers using custom datasets for classification and segmentation, in my case particularly focusing on data from Minecraft environments. (minecraft-llm-sandbox)[https://github.com/MartinDeanMoriarty/minecraft-llm-sandbox].

The repository includes code for handling and processing such data, as well for model training and evaluation.

Warning

This project is just a starting point and i do NOT know how to make it work, because i got no time to deal with dataset creation. The code provided here is not tested and the documentation is incomplete. Use at your own risk. Maybe someone stops by and fixes it or leaves an explanation on how to create a dataset correctly.

Installation

Please ensure you have Python installed on your system.

Clone the repository: git clone
Navigate to the project directory.
Create a virtual environment (optional but recommended).
Start the virtual environment: source venv/bin/activate (on Windows use venv\Scripts\activate).
Install dependencies: pip install -r requirements.txt
Download or prepare your dataset and place it in the appropriate directory within the project structure.
Run main.py with aguments to start the script: python main.py --arg

Arguments

--prepareData , script is used for preparing data from Minecraft environments, converting it into a format suitable for training 🤗 transformers. This includes steps like cleaning, tokenization, and other preprocessing tasks.

--verifyData , script verifies the prepared data to ensure that it meets the required standards before proceeding with model training.

--train , script is used for training 🤗 transformer models using the processed data.

--evaluateModel , script evaluates the trained model's performance on a validation or test dataset to assess its effectiveness in handling specific tasks.

Contributing

Contributions are welcome! If you think something is missing, incorrect, or could be improved upon, please feel free to open an issue or submit a pull request.

License

This project is licensed under the "I DONT CARE License". Feel free to use it as you wish!

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
modules		modules
.gitignore		.gitignore
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🤗 TransformerTraining

Table of Contents

Warning

Installation

Arguments

Contributing

License

About

Languages

MartinDeanMoriarty/TransformerTraining

Folders and files

Latest commit

History

Repository files navigation

🤗 TransformerTraining

Table of Contents

Warning

Installation

Arguments

Contributing

License

About

Topics

Resources

Stars

Watchers

Forks

Languages