Creating the World’s Best Open-Source Speech Model on Bittensor
Check out the beta version of our Front-End!
- Introduction
- Roadmap
- Overview of Miner and Validator Functionality
- Running Miners and Validators
- Contributing
- License
Note: The following documentation assumes you are familiar with basic Bittensor concepts: Miners, Validators, and incentives. If you need a primer, please check out https://docs.bittensor.com/learn/bittensor-building-blocks.
Dippy is one of the world's leading AI companion apps with 1M+ users. The app has ranked #3 on the App Store in countries like Germany, has been covered by publications such as Wired magazine, and the average Dippy user spends over an hour on the app.
The Dippy team is also behind Bittensor's Subnet 11, which exists to create the world's best open-source roleplay LLM. Open-source miner models created on Subnet 11 are used to power the Dippy app. We also plan to integrate the models created from this speech subnet within the Dippy app.
The Dippy Empathetic Speech Subnet on Bittensor is dedicated to developing the world’s most advanced open-source Speech model for immersive, lifelike interactions. By leveraging the collaborative strength of the open-source community, this subnet meets the growing demand for genuine companionship through a speech-first approach. Our objective is to create a model that delivers personalized, empathetic speech interactions beyond the capabilities of traditional assistants and closed-source models.
Unlike existing models that depend on reference speech recordings, which limit creative flexibility, we use natural language prompting to manage speaker identity and style. This intuitive approach enables more dynamic and personalized roleplay experiences, fostering deeper and more engaging interactions.
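For a concrete sense of what prompt-based control looks like, here is a minimal generation sketch using the Parler TTS Mini v1 base model referenced later in this README; the description text, prompt, and output path are illustrative placeholders, not part of the subnet's pipeline:

import torch
import soundfile as sf
from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Load the base model and its tokenizer from Hugging Face.
model = ParlerTTSForConditionalGeneration.from_pretrained("parler-tts/parler-tts-mini-v1").to(device)
tokenizer = AutoTokenizer.from_pretrained("parler-tts/parler-tts-mini-v1")

# Speaker identity and style are controlled purely through a natural language description.
description = "A young female speaker with a warm, playful tone, speaking quickly with high energy."
prompt = "Hey! I missed you today. How was your afternoon?"

input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

# Generate the waveform and write it to disk.
generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
audio = generation.cpu().numpy().squeeze()
sf.write("roleplay_sample.wav", audio, model.config.sampling_rate)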
Given the complexity of creating a state-of-the-art speech model, we plan to divide the process into three distinct phases.
Phase 1:
- Launch a subnet with a robust pipeline for roleplay-specific TTS models, capable of interpreting prompts for speaker identity and stylistic speech description.
- Launch an infinitely scaling synthetic speech data pipeline.
- Implement a public model leaderboard ranked on the core evaluation metric.
- Introduce Human Likeness Score and Word Error Rate as live evaluation criteria for ongoing model assessment.
Phase 2:
- Refine TTS models toward producing more creatively expressive, highly human-like speech outputs.
- Showcase the highest-scoring models and make them accessible to the public through the front-end interface.
Phase 3:
- Advance toward an end-to-end Speech model that seamlessly generates and processes high-quality roleplay audio.
- Establish a comprehensive pipeline for evaluating new Speech model submissions against real-time performance benchmarks.
- Integrate the Speech model into the Dippy app.
- Drive the state of the art in Speech roleplay through iterative enhancements and ongoing data collection.
Miners would use existing frameworks to fine-tune models that improve upon the current state-of-the-art open-source TTS model. The fine-tuned weights would be submitted to a shared Hugging Face pool.
Validators would evaluate model performance via our protocol and rank submissions on various metrics (e.g., naturalness, emotion matching, and clarity). We will provide a suite of testing and benchmarking protocols with state-of-the-art datasets.
- Python 3.8+
- GPU with at least 24 GB of VRAM
To start, clone the repository and cd into it:
git clone https://github.com/impel-intelligence/dippy-speech-subnet.git
cd dippy-speech-subnet
pip install -e .
As a miner, you're responsible for leveraging all methods at your disposal to fine-tune the provided base model.
We outline the following criteria for Phase 1:
- Models should be a fine-tune of the 880M Parler-TTS model.
- Models MUST be in Safetensors format!
- Model: We currently use Parler TTS Mini v1 on Hugging Face as our base model.
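When you're ready to submit (see the next step), uploading the fine-tuned checkpoint might look like the following sketch; the local path and repository name are placeholders, and push_to_hub writes Safetensors weights by default in recent transformers releases, but do verify the uploaded repository contains .safetensors files:

from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer

# Load your fine-tuned checkpoint from a local directory (placeholder path).
model = ParlerTTSForConditionalGeneration.from_pretrained("./my-finetuned-parler-tts")
tokenizer = AutoTokenizer.from_pretrained("./my-finetuned-parler-tts")

# Upload both the weights and the tokenizer to your Hugging Face repository.
model.push_to_hub("YOUR_NAMESPACE/YOUR_REPO_NAME")
tokenizer.push_to_hub("YOUR_NAMESPACE/YOUR_REPO_NAME")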
Once you're happy with the model's performance for the roleplay use case, you can simply submit it to Hugging Face 🤗 and then use the following command:
git clone https://github.com/impel-intelligence/dippy-speech-subnet.git
cd dippy-speech-subnet
uv venv .miner
source .miner/bin/activate
uv pip install -r requirements.miner.txt
uv pip install -e .
python neurons/miner.py \
--repo_namespace REPO_NAMESPACE \ # Replace with the namespace of your repository (e.g., parler-tts)
--repo_name REPO_NAME \ # Replace with the name of your repository (e.g., parler-tts-mini-v1)
--config_template CONFIG_TEMPLATE \ # Replace with the miner configuration template (e.g., default)
--netuid NETUID \ # Replace with the unique network identifier (e.g., 231)
--subtensor.network NETWORK \ # Replace with the network (e.g., test or finney)
--online ONLINE \ # Set to True to enable mining
--model_hash MODEL_HASH \ # Replace with the hash of your model
--wallet.name WALLET_NAME \ # Replace with your wallet coldkey name
--wallet.hotkey HOTKEY \ # Replace with your wallet hotkey name
--wallet.path WALLET_PATH \ # Replace with the path to your wallet directory (e.g., "~/.bittensor/wallets/" )
--logging.debug DEBUG # Set to True for debug logging (or False for production)
python neurons/miner.py \
--repo_namespace parler-tts \
--repo_name parler-tts-mini-v1 \
--config_template default \
--netuid 231 \
--subtensor.network test \
--online True \
--model_hash 555 \
--wallet.name coldkey2 \
--wallet.hotkey hotkey2 \
--wallet.path "~/.bittensor/wallets/" \
--logging.debug True
- Use Python 3.11.5
- UV python package manager
git clone https://github.com/impel-intelligence/dippy-speech-subnet.git
cd dippy-speech-subnet/validator_updater
Request the log token from the moderator or relevant team member via the Discord channel.
Locate the fluent.conf file in the project structure:
.
├── Dockerfile.fluent
├── Dockerfile.validator
├── build.sh
├── docker-compose.yml
└── fluentd
└── fluent.conf
Update the following line in the fluent.conf file with the provided token:
source_token <ADD LOG TOKEN HERE>
Execute the auto-update validator script with your wallet keys and organization name:
bash build.sh \
--wallet.name WALLET_NAME \ # Replace with your cold wallet name
--wallet.hotkey HOTKEY \ # Replace with your hotkey name
--org.name ORGNAME # Replace with your organization name
bash build.sh --wallet.name Examplekey4 --wallet.hotkey Examplekey4 --org.name Dippy_EXAMPLE
To start, clone the repository and cd into it:
git clone https://github.com/impel-intelligence/dippy-speech-subnet.git
cd dippy-speech-subnet
uv venv .validator
source .validator/bin/activate
uv pip install -r requirements.validator.txt
uv pip install -e .
To run the evaluation, simply use the following command:
python neurons/validator.py \
--wallet.name WALLET_NAME \ # Replace with the name of your wallet coldkey (e.g., coldkey4)
--wallet.hotkey HOTKEY \ # Replace with your wallet hotkey name (e.g., hotkey4)
--device DEVICE \ # Replace with the device to use (e.g., cpu or cuda)
--netuid NETUID \ # Replace with the unique network identifier (e.g., 231)
--subtensor.network NETWORK \ # Replace with the network name (e.g., test or finney)
--wallet.path WALLET_PATH # Replace with the path to your wallet directory (e.g., "~/.bittensor/wallets/")
python neurons/validator.py \
--wallet.name coldkey4 \
--wallet.hotkey hotkey4 \
--device cuda \
--netuid 231 \
--subtensor.network finney \
--wallet.path "~/.bittensor/wallets/"
Please note that this validator will call the model validation service hosted by the Dippy subnet owners. If you wish to run the model validation service locally, please follow the instructions below.
Note: Currently (November 22, 2024) this is experimental. We recommend using the remote validation API for now.
Starting a validator with your local validator API requires starting the validator with the --use-local-validation-api flag.
Additionally, a "model_queue" and "worker" are required to push models to the validation API.
Note: The validator API needs to be installed in a separate venv from the validator due to a pydantic version conflict.
- Python 3.9+
- Linux
- UV python package manager
Install Git LFS if it is not already installed.
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs
If you are running on RunPod, you might also need to install netstat.
apt-get install net-tools
Build Evaluator Image
cd dippy-speech-subnet
docker build -f evaluator.Dockerfile -t speech .
Build Worker Image
docker build -f worker.Dockerfile -t worker-image .
Run Worker Image
docker run -d --name worker-container -v /var/run/docker.sock:/var/run/docker.sock worker-image
Stream Logs:
docker logs -f worker-container
Build Model Queue Image:
docker build -f modelq.Dockerfile -t modelq-image .
Run Model Queue Image:
docker run -d --name modelq-container modelq-image
Stream Logs:
docker logs -f modelq-container
Set the .env variable:
POSTGRES_URL=xxxxxxxxxx
python voice_validation_api/validation_api.py
# Make a separate venv for the validator because of pydantic version conflict
uv venv .validator
source .validator/bin/activate
uv pip install -e .
python neurons/validator.py --wallet.name WALLET_NAME --wallet.hotkey WALLET_HOT_NAME --use-local-validation-api
# Run model queue to push models to validation api to be evaluated
python neurons/model_queue.py --use-local-validation-api
python voice_validation_api/worker_queue.py
Ensure you have a .env file in the root directory of your project. This file should include the necessary environment variables for the application to function correctly. Below is an example .env file:
# Admin credentials
ADMIN_KEY=example_admin_key
# Supabase credentials
SUPABASE_KEY=example_supabase_key
SUPABASE_URL=https://example.supabase.co
# Hugging Face credentials
HF_ACCESS_TOKEN=hf_example_access_token
HF_USER=ExampleUser
DIPPY_KEY=example_dippy_key
# OpenAI API Key
OPENAI_API_KEY=sk-example_openai_api_key
# Dataset API Key
DATASET_API_KEY=example_dataset_api_key
POSTGRES_URL=postgresql://vapi:vapi@localhost:5432/vapi # For local dev db spun up by docker
To quickly spin up the model_queue, worker_queue, and validation_api, use the local-compose script.
docker compose -f local-compose.yml up -d --build
To monitor logs for a specific container, use the following command, replacing "Container name to see logs" with the desired container's name:
docker logs -f "<Container name to see logs>"
If the database runs on a remote server, forward the Postgres port to your local machine first:
ssh -L 5432:localhost:5432 <remote_server_ip>
If you prefer a graphical interface for inspecting the database (for example, to browse table entries):
- Open DBeaver (or any other database client of your choice).
- Configure a new connection:
- Host: localhost
- Port: 5432
- Database name, username, and password as per the local configuration.
- Access and query the database contents as needed.
This is only required if you need a tool like DBeaver to examine database contents interactively.
- The setup is configured to test netuid 231 and uses a local database provided by the Docker Compose file.
- Ensure Docker Compose is installed and running on your system before executing the commands.
Models are evaluated on how closely their vocal outputs resemble natural human speech, considering factors such as emotional expression, intonation, pauses, and excitation levels. The more natural and convincingly human the voice sounds, the higher the score.
Models that produce human-like speech with high clarity and coherence will achieve the highest scores. Lower word error rates indicate clearer, more accurate speech output, enhancing the model's overall evaluation.
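As a rough, illustrative sketch only (not the subnet's actual scoring code), a word error rate check can be computed with the open-source jiwer package, assuming the generated audio has already been transcribed by an ASR system; the combination weights below are hypothetical:

from jiwer import wer

def wer_score(reference_text: str, transcribed_text: str) -> float:
    # jiwer.wer returns the fraction of word-level errors (0.0 means a perfect match),
    # so we invert it: lower error rates yield higher scores.
    return max(0.0, 1.0 - wer(reference_text, transcribed_text))

def combined_score(human_likeness: float, clarity: float) -> float:
    # Hypothetical weighting of the two criteria described above;
    # the real evaluation protocol may weight them differently.
    return 0.7 * human_likeness + 0.3 * clarity

# Example: a transcript matching its reference exactly gives a perfect clarity score.
example = combined_score(human_likeness=0.9, clarity=wer_score("hello there", "hello there"))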
Our codebase is built upon Nous Research's and MyShell's Subnets.
The Dippy Bittensor subnet is released under the MIT License.