This project offers a privacy-focused solution for transcribing and summarizing audio recordings through fully local processing. Using OpenAI's Whisper for transcription and local LLMs via Ollama for summarization, it processes audio files (MP3/WAV) entirely on your machine, ensuring sensitive content never leaves your environment.
The tool automatically generates structured summaries including:
- Executive overview
- Detailed content breakdown
- Action items
- Meeting metadata
The project uses a configuration-based approach (config.yaml) for easy customization of output formats, model parameters, and summary structures, making it adaptable for various meeting types and organizational needs.
- I am using Python 3.10.11
NOTE: ffmpeg is required for Whisper to work and DOES NOT work from inside a virtual environment. A "[WinError 2] File not found" error appears when Whisper tries to invoke ffmpeg from within a virtual environment. Although it is not best practice, use your global environment instead!
- Follow the instructions HERE to install Chocolatey via an administrative PowerShell, then use it to install ffmpeg:
choco install ffmpeg
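Whisper invokes ffmpeg as an external process, so the binary must be discoverable on PATH. A quick standard-library check (a sketch; if it prints None, Whisper will fail with the "[WinError 2] File not found" error described above):

```python
# Verify Python can locate the ffmpeg binary on PATH.
import shutil

print(shutil.which("ffmpeg"))  # prints the full path, or None if not found
```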
python3.X -m pip install -r requirements.txt --no-warn-script-location
- From an administrative PowerShell, run the following to enable long path support:
New-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem" -Name "LongPathsEnabled" -Value 1 -PropertyType DWORD -Force
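To confirm the registry change took effect, you can read the value back with the standard-library winreg module (Windows only; a sketch):

```python
# Read back the LongPathsEnabled value set by the PowerShell command above.
import winreg

key = winreg.OpenKey(
    winreg.HKEY_LOCAL_MACHINE,
    r"SYSTEM\CurrentControlSet\Control\FileSystem",
)
value, _ = winreg.QueryValueEx(key, "LongPathsEnabled")
print("LongPathsEnabled =", value)  # expect 1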
- If you have an NVIDIA GPU, determine which compute platform is present:
nvidia-smi.exe
- Identify "CUDA Version"
- Navigate to: https://pytorch.org/get-started/locally/
- Select the options specific to your environment and run the install command it provides!
- Once installation is complete, run:
python3.X pytorch_verify.py
Example successful output:
- True
- NVIDIA GeForce RTX 3080 Ti Laptop GPU
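The contents of pytorch_verify.py are not shown here, but a minimal script that produces the example output above would be:

```python
# pytorch_verify.py (sketch): confirm PyTorch sees the CUDA GPU.
import torch

print(torch.cuda.is_available())           # True when the CUDA build detects a GPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # e.g. NVIDIA GeForce RTX 3080 Ti Laptop GPU
```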
python3.X transcribe.py
python3.X transcribe-args.py --mode single --input-file ./path/to/audio.mp3 --output-dir ./path/to/output --model large
python3.X transcribe-args.py --mode multiple --input-dir ./path/to/audio/files --output-dir ./path/to/output --model base
- `-h` or `--help`: See all available options and examples
- `--mode`: Choose between 'single' or 'multiple' (REQUIRED)
  - `single`: Transcribe one audio file
  - `multiple`: Transcribe all audio files in a directory
- `--input-file`: Path to the audio file (REQUIRED in single mode)
- `--input-dir`: Path to the directory containing audio files (REQUIRED in multiple mode)
- `--output-dir`: Directory where transcription files will be saved (REQUIRED)
- `--model`: Whisper model size (OPTIONAL; the script uses the 'base' model by default)
  - `tiny`: Fastest, lowest accuracy
  - `base`: Good balance of speed and accuracy
  - `small`: Better accuracy, slower than base
  - `medium`: High accuracy, slower
  - `large`: Highest accuracy, slowest
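For reference, here is a hedged sketch of what the core of transcribe-args.py might look like, assuming the openai-whisper package; everything beyond the documented flags is illustrative, not the project's actual implementation:

```python
# Sketch: argparse wiring plus Whisper transcription for single/multiple modes.
import argparse
from pathlib import Path

import whisper


def main() -> None:
    parser = argparse.ArgumentParser(description="Audio Transcription Tool")
    parser.add_argument("--mode", choices=["single", "multiple"], required=True)
    parser.add_argument("--input-file", help="Audio file (single mode)")
    parser.add_argument("--input-dir", help="Directory of audio files (multiple mode)")
    parser.add_argument("--output-dir", required=True)
    parser.add_argument("--model", default="base")  # tiny/base/small/medium/large
    args = parser.parse_args()

    model = whisper.load_model(args.model)
    out_dir = Path(args.output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    if args.mode == "single":
        files = [Path(args.input_file)]
    else:
        files = sorted(p for p in Path(args.input_dir).iterdir()
                       if p.suffix.lower() in (".mp3", ".wav"))

    for audio in files:
        result = model.transcribe(str(audio))
        (out_dir / f"{audio.stem}.txt").write_text(result["text"], encoding="utf-8")


if __name__ == "__main__":
    main()
```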
- Download Ollama and run your chosen model, following the process HERE.
- Configure and update the REQUIRED settings in `config.yaml` (excerpt below), then run:
python3.X summarize.py
llm:
  model_name: "YOUR_MODEL_NAME"                  # REQUIRED
  max_retries: 3                                 # OPTIONAL
  retry_delay: 2                                 # OPTIONAL
  api_url: "http://localhost:11434/api/generate"
  options:
    temperature: 0.7                             # OPTIONAL
    top_p: 0.9                                   # OPTIONAL
    max_tokens: 8000                             # OPTIONAL
output:
  format: "md"                                   # OPTIONAL
  log_file: "./data/transcript_processor.log"    # REQUIRED
paths:
  input_transcript: "./data/transcriptions/"     # REQUIRED
  output_directory: "./data/meeting summaries"   # REQUIRED
  audio_file: "./data/converted-audio/"          # REQUIRED
Shortened for Space...
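For context, a minimal sketch of how summarize.py might load these settings, assuming PyYAML is installed:

```python
# Load config.yaml into a dict; keys mirror the snippet above.
import yaml

with open("config.yaml", "r", encoding="utf-8") as f:
    config = yaml.safe_load(f)

model_name = config["llm"]["model_name"]
api_url = config["llm"]["api_url"]
options = config["llm"].get("options", {})
```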
- `model_name`: Choose your Ollama model
- `max_retries`: Number of API call attempts
- `retry_delay`: Delay between retries (seconds)
- `temperature`: Controls response creativity; higher = more creative, lower = more deterministic (0.0-1.0)
- `top_p`: Controls nucleus sampling, i.e. how much of the probability mass is considered when generating a response (0.1-1.0)
- `max_tokens`: Maximum response length
- `input_transcript`: Path to your transcript file
- `output_directory`: Directory for generated summaries
- `format`: Output format ('md' or 'txt')
- `log_file`: Location of the log file
The generated summary is structured into the following sections:
- `executive_summary`
- `detailed_summary`
- `action_items`
- Navigate to http://localhost:11434 to ensure Ollama is running.
- From a terminal, run the following to verify that your model downloaded successfully:
ollama list
ollama run YOUR_MODEL
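For reference, a sketch of a non-streaming call to the generate endpoint configured above, using the requests package. Note that Ollama's own option for maximum response length is num_predict, so mapping max_tokens to it is an assumption about this project's internals:

```python
# Minimal non-streaming request to Ollama's /api/generate endpoint.
import requests

payload = {
    "model": "YOUR_MODEL_NAME",
    "prompt": "Summarize the following transcript:\n...",
    "stream": False,
    "options": {"temperature": 0.7, "top_p": 0.9, "num_predict": 8000},
}
resp = requests.post("http://localhost:11434/api/generate", json=payload, timeout=300)
resp.raise_for_status()
print(resp.json()["response"])  # the generated summary text
```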
- Include audio-converter.py at the beginning of the pipeline to ensure audio is converted to either .mp3 or .wav format (see the sketch below).
- Separate audio-converter.py into its own project and have the current project reference it...
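A hedged sketch of what a standalone audio-converter.py could do, shelling out to ffmpeg; the function name and paths are illustrative:

```python
# Convert any ffmpeg-readable input to .mp3 for the transcription step.
import subprocess
from pathlib import Path


def convert_to_mp3(src: Path, out_dir: Path) -> Path:
    out_dir.mkdir(parents=True, exist_ok=True)
    dst = out_dir / f"{src.stem}.mp3"
    subprocess.run(["ffmpeg", "-y", "-i", str(src), str(dst)], check=True)
    return dst
```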
- Incorporate command-line overrides should manual config edits become a repetitive occurrence.
# Future implementation in summarize.py
import argparse

def parse_args():
    """Parse command-line arguments for path overrides."""
    parser = argparse.ArgumentParser(description='Transcript Summarization Tool')
    parser.add_argument(
        '--input-file',
        help='Override input transcript file path from config.yaml'
    )
    parser.add_argument(
        '--output-dir',
        help='Override output directory path from config.yaml'
    )
    return parser.parse_args()
python3.X summarize.py --input-file ./new/transcript.txt --output-dir ./new/output
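Once parsed, the overrides could be merged into the loaded config so the command line wins without editing config.yaml directly (a sketch, assuming config was loaded with yaml.safe_load as sketched earlier):

```python
# Command-line values take precedence over config.yaml entries.
args = parse_args()
if args.input_file:
    config["paths"]["input_transcript"] = args.input_file
if args.output_dir:
    config["paths"]["output_directory"] = args.output_dir
```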
- `config.yaml` should take command-line arguments rather than direct file manipulation.
- The user should be able to name the summary after the original audio file when it is placed inside "meeting summaries", for clarity.
- GOAL: Start with audio conversion (if needed) > transcription > summarization, all with one command...
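A rough sketch of what that single entry point might look like; convert_to_mp3, transcribe_file, and summarize_transcript are hypothetical helpers standing in for the existing scripts:

```python
# Hypothetical one-command pipeline: conversion -> transcription -> summarization.
from pathlib import Path


def run_pipeline(audio_path: str) -> None:
    audio = Path(audio_path)
    if audio.suffix.lower() not in (".mp3", ".wav"):
        audio = convert_to_mp3(audio, Path("./data/converted-audio"))  # audio-converter.py step
    transcript = transcribe_file(audio)    # transcribe.py / Whisper step
    summarize_transcript(transcript)       # summarize.py / Ollama step


if __name__ == "__main__":
    run_pipeline("./path/to/recording.m4a")
```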