AutoFinetune

Fine-tune an OpenAI model with a single command.

TuneAI fine-tunes OpenAI models from YouTube video transcripts or plain-text input. It automates transcript cleaning, prompt-completion pair generation, and training, making it easier to adapt a model to a specific task.

Features

Automatically clean YouTube video transcripts
Generate prompt-completion pairs from cleaned transcripts
Fine-tune OpenAI models based on generated prompt-completion pairs
Support for both YouTube video links and text input
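To make the intermediate artifact concrete, here is a minimal sketch of what a prompt-completion file can look like. It assumes the JSONL layout used by OpenAI's legacy fine-tuning endpoint (one JSON object per line with "prompt" and "completion" keys). The helper names write_pairs_jsonl and read_pairs_jsonl are hypothetical and are not taken from this project's scripts.

```python
import json
from pathlib import Path


def write_pairs_jsonl(pairs, path):
    """Write (prompt, completion) tuples as one JSON object per line."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as handle:
        for prompt, completion in pairs:
            record = {"prompt": prompt, "completion": completion}
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


def read_pairs_jsonl(path):
    """Read the file back into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


if __name__ == "__main__":
    # Illustrative data only; real pairs come from the cleaned transcript.
    demo = [("What does the video cover?", " An overview of the topic.")]
    write_pairs_jsonl(demo, "demo_pairs.jsonl")
    print(read_pairs_jsonl("demo_pairs.jsonl"))
```

Reading the file back after writing it is a cheap way to confirm that every line is valid JSON before any training job is started.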

Installation

Prerequisites

Python 3.7 or later
An OpenAI API key

Steps

Clone the repository: git clone https://github.com/emmethalm/tuneAI.git
Change to the project directory: cd tuneAI
Install the required packages: pip install -r requirements.txt
Create a .env file in the project root directory containing your OpenAI API key: echo "OPENAI_API_KEY=your_api_key_here" > .env (alternatively, set the key directly in cleaner.py and prompt_completion_gen.py)
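For readers who want to see how a script could pick up the key from the .env file, the following is a standard-library sketch. It is an assumption about how loading might work, not a copy of what cleaner.py or prompt_completion_gen.py do (they may rely on a package such as python-dotenv instead). The function names load_env_file and require_api_key are hypothetical.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into os.environ.

    Variables that are already set in the environment are left untouched.
    Returns a dict of the values found in the file.
    """
    env_path = Path(path)
    if not env_path.is_file():
        return {}
    loaded = {}
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("\"'")  # drop optional surrounding quotes
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded


def require_api_key():
    """Return the OpenAI API key or raise a clear error if it is missing."""
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or export it")
    return key
```

Keeping the key in .env rather than editing it into the scripts makes it less likely to be committed by accident.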

Usage

Fine-tuning with a YouTube video transcript

./run_pipeline.sh "https://www.youtube.com/watch?v=your_video_id_here"

Fine-tuning with a text file

./run_pipeline.sh --text-file path/to/your/text_file.txt

Additional options

--epochs: Specify the number of training epochs (default: 1)
--batch-size: Specify the training batch size (default: 8)
--prompt-length: Specify the maximum prompt length (default: 150)
--response-length: Specify the maximum response length (default: 150)
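The options and defaults listed above can be mirrored in a small Python argument parser, which may help if you call the individual scripts from your own wrapper. This is only an illustration: run_pipeline.sh is a shell script and its actual parsing logic is not shown here, so the function name parse_args and the validation rule (exactly one input source) are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse pipeline options using the defaults documented above."""
    parser = argparse.ArgumentParser(prog="run_pipeline")
    parser.add_argument("video_url", nargs="?", help="YouTube video URL")
    parser.add_argument("--text-file", help="path to a text file to use instead of a video")
    parser.add_argument("--epochs", type=int, default=1, help="number of training epochs")
    parser.add_argument("--batch-size", type=int, default=8, help="training batch size")
    parser.add_argument("--prompt-length", type=int, default=150, help="maximum prompt length")
    parser.add_argument("--response-length", type=int, default=150, help="maximum response length")
    args = parser.parse_args(argv)
    # Assumed rule: exactly one input source must be given.
    if bool(args.video_url) == bool(args.text_file):
        parser.error("provide either a YouTube URL or --text-file, but not both")
    return args


if __name__ == "__main__":
    print(parse_args())
```

Called with only `--text-file notes.txt`, the parser leaves every training option at its documented default.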

Best Practices

You can run the whole fine-tuning process in one line with the pipeline, but for more precise results, run each script individually, check the output of each step, and tweak the context sentence in the prompt in prompt_completion_gen.py.

To run step by step:

(install dependencies)

tsc youtube_scraper.ts
node youtube_scraper.js
python3 cleaner.py
python3 prompt_completion_gen.py
export OPENAI_API_KEY=your_api_key_here
openai api fine_tunes.create -t prompt_completion_pairs.jsonl -m davinci

The quality of your fine-tuned model depends entirely on the quality of your data.
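One way to check the generated data before starting a training job is a short validation pass over prompt_completion_pairs.jsonl. The sketch below assumes each line is a JSON object with non-empty "prompt" and "completion" strings. The function name validate_pairs is hypothetical, and the checks are a starting point rather than a complete quality review.

```python
import json


def validate_pairs(path):
    """Return a list of (line_number, problem) tuples; an empty list means no issues were found."""
    problems = []
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((number, f"invalid JSON: {exc.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((number, "expected a JSON object"))
                continue
            for field in ("prompt", "completion"):
                value = record.get(field)
                if not isinstance(value, str) or not value.strip():
                    problems.append((number, f"missing or empty {field!r}"))
    return problems


if __name__ == "__main__":
    for line_number, problem in validate_pairs("prompt_completion_pairs.jsonl"):
        print(f"line {line_number}: {problem}")
```

A structural check like this only catches malformed records; reading a sample of the pairs by hand is still the best way to judge whether they say what you want the model to learn.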

Happy building!

License

This project is licensed under the MIT License; see the LICENSE file for details.
