`xeus-finetune`

Warning

This project is a work in progress.

This repository contains code for fine-tuning the XEUS model for Automatic Speech Recognition (ASR). It is a fork of https://github.com/pashanitw/xeus-finetune

Required software

python3.11, python3.11-dev
build-essential, cmake
uv
git-lfs

Note

Python 3.12 is not supported because one of ESPnet's dependencies relies on an outdated package.
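
If you want to confirm the interpreter before installing anything, a small check along these lines may help. It is only a sketch and not part of this repository: the helper name `is_supported_python` is made up for illustration, and it assumes nothing beyond what the note says, namely that 3.11 is the version to use.

```python
import sys


def is_supported_python(version_info=sys.version_info):
    """Return True when the given version is Python 3.11 (hypothetical helper)."""
    # Only major and minor are compared; the patch release is ignored.
    return tuple(version_info[:2]) == (3, 11)
```

Calling `is_supported_python()` with no arguments checks the running interpreter; passing a tuple such as `(3, 12, 0)` checks an arbitrary version.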

Install

uv venv --python 3.11

source .venv/bin/activate

# install espnet
git clone --branch ssl --depth 1 https://github.com/wanchichen/espnet espnet-code
cd espnet-code
git fetch --unshallow
uv pip install -e .

# return to the repository root
cd ..

# download XEUS checkpoint
git clone https://huggingface.co/espnet/XEUS

# install required packages
uv pip install -r requirements.txt

# for development, install the additional packages
uv pip install -r requirements-dev.txt

Fine-tuning

Authenticate with Hugging Face

huggingface-cli login

Copy a config file, then edit its dataset sections and hyperparameters

cp configs/hi_hf.yaml configs/uk_hf.yaml

Start fine-tuning

accelerate launch finetune.py --config configs/uk_hf.yaml

# if you want to use only one GPU
accelerate launch --num_processes 1 finetune.py --config configs/uk_hf.yaml

Inference

python inference.py --ckpt_path <checkpoint path> --audio audio.wav

# example
python inference.py --ckpt_path ./step_2000 --audio audio.wav
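
Before running inference it can be useful to inspect the audio file. The sketch below uses only the standard-library `wave` module to read the sample rate, channel count, and duration of a PCM WAV file. The 16 kHz mono expectation is an assumption based on how XEUS is usually described, not something this README states, and `inference.py` may well resample on its own; the helper names `describe_wav` and `looks_like_16k_mono` are hypothetical.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
        duration = wav_file.getnframes() / sample_rate
    return sample_rate, channels, duration


def looks_like_16k_mono(path, expected_rate=16000):
    """Assumption: the model expects 16 kHz mono input; adjust if inference.py differs."""
    sample_rate, channels, _ = describe_wav(path)
    return sample_rate == expected_rate and channels == 1
```

For example, `looks_like_16k_mono("audio.wav")` returns `False` for a 44.1 kHz stereo recording, which is a hint to resample the file first if the script does not do so itself.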

Evaluation

Run the following command to calculate Word Error Rate:

python eval.py --ckpt_path <checkpoint path> --dataset <dataset> --name <subset> --split <split>

# example
python eval.py --ckpt_path ./step_2000 --dataset mozilla-foundation/common_voice_17_0 --name uk --split test
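
For reference, Word Error Rate is conventionally defined as (substitutions + deletions + insertions) / number of reference words, computed from a word-level edit distance. The sketch below illustrates that definition in plain Python. It is not taken from `eval.py`, which may rely on a library and apply its own text normalization (casing, punctuation), so numbers from this helper are not guaranteed to match the script's output; the function name `word_error_rate` is chosen here for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current

    return previous[-1] / len(ref_words)
```

With the reference `"a b c d"` and the hypothesis `"a x c"`, there is one substitution and one deletion against four reference words, so the function returns `0.5`. Note that WER can exceed 1.0 when the hypothesis contains many inserted words.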

Development

Check and format the code:

ruff check
ruff format

TODO

Enable Flash-Attention for training
Set a cache_dir for load_dataset

