GitHub - amsehili/auditok: An audio/acoustic activity detection and audio segmentation tool

auditok is an Audio Activity Detection tool that processes online data (from an audio device or standard input) and audio files. It can be used via the command line or through its API.

Full documentation is available on Read the Docs.

Installation

auditok requires Python 3.7 or higher.

To install the latest stable version, use pip:

pip install auditok

To install the latest development version from GitHub:

pip install git+https://github.com/amsehili/auditok

Alternatively, clone the repository and install it manually:

git clone https://github.com/amsehili/auditok.git
cd auditok
python setup.py install

Basic example

Here's a simple example of using auditok to detect audio events:

import auditok

# `split` returns a generator of AudioRegion objects
audio_events = auditok.split(
    "audio.wav",
    min_dur=0.2,     # Minimum duration of a valid audio event in seconds
    max_dur=4,       # Maximum duration of an event
    max_silence=0.3, # Maximum tolerated silence duration within an event
    energy_threshold=55 # Detection threshold
)

for i, r in enumerate(audio_events):
    # AudioRegions returned by `split` have defined 'start' and 'end' attributes
    print(f"Event {i}: {r.start:.3f}s -- {r.end:.3f}s")

    # Play the audio event
    r.play(progress_bar=True)

    # Save the event with start and end times in the filename
    filename = r.save("event_{start:.3f}-{end:.3f}.wav")
    print(f"Event saved as: {filename}")

Example output:

Event 0: 0.700s -- 1.400s
Event saved as: event_0.700-1.400.wav
Event 1: 3.800s -- 4.500s
Event saved as: event_3.800-4.500.wav
Event 2: 8.750s -- 9.950s
Event saved as: event_8.750-9.950.wav
Event 3: 11.700s -- 12.400s
Event saved as: event_11.700-12.400.wav
Event 4: 15.050s -- 15.850s
Event saved as: event_15.050-15.850.wav
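
To build intuition for how `min_dur` and `max_silence` shape the detected events, here is a simplified sketch. It is not auditok's actual implementation. It assumes the signal has already been reduced to one boolean per analysis window (True when the window counts as active), and it expresses both parameters in windows rather than seconds. For example, with a hypothetical 50 ms window, `max_silence=0.3` would correspond to `max_gap=6`. The name `group_events` is hypothetical, and `max_dur` handling is omitted.

```python
def group_events(active, min_len, max_gap):
    """Group per-window activity flags into (start, end) index pairs.

    active:  sequence of booleans, one per analysis window
    min_len: minimum event length in windows (plays the role of min_dur)
    max_gap: longest run of inactive windows tolerated inside an event
             (plays the role of max_silence)
    The returned end index is exclusive; trailing silence is not included.
    """
    events = []
    start = None  # first window of the event being built
    last = None   # most recent active window of that event
    # Pad with inactive windows so an event still open at the end is flushed
    padded = list(active) + [False] * (max_gap + 1)
    for i, is_active in enumerate(padded):
        if is_active:
            if start is None:
                start = i
            last = i
        elif start is not None and i - last > max_gap:
            # The silence grew too long: close the event, keep it if long enough
            if last - start + 1 >= min_len:
                events.append((start, last + 1))
            start = None
    return events

flags = [False, True, True, False, True, True, True,
         False, False, False, True, False]
print(group_events(flags, min_len=2, max_gap=1))  # [(1, 7)]
```

The single inactive window at index 3 is bridged because it does not exceed `max_gap`, while the isolated active window at index 10 is discarded for being shorter than `min_len`.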

Split and plot

Visualize the audio signal with detected events:

import auditok
region = auditok.load("audio.wav") # Returns an AudioRegion object
regions = region.split_and_plot(...) # Or simply use `region.splitp()`

Example output: (figure: plot of the audio signal with the detected events)

Split an audio stream and re-join (glue) audio events with silence

The following code detects audio events within an audio stream, then inserts 1 second of silence between them to create an audio stream with pauses:

# Create a 1-second silent audio region
# Audio parameters must match the original stream
from auditok import split, make_silence
silence = make_silence(duration=1,
                       sampling_rate=16000,
                       sample_width=2,
                       channels=1)
events = split("audio.wav")
audio_with_pauses = silence.join(events)

Alternatively, use split_and_join_with_silence:

from auditok import split_and_join_with_silence
audio_with_pauses = split_and_join_with_silence(silence_duration=1, input="audio.wav")
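
Because the silent region's audio parameters must match those of the stream, it can help to read them from the file rather than hard-code them. The sketch below does this for WAV input with Python's standard `wave` module. The helper name `wav_params` is hypothetical and is not part of auditok; its keys mirror the `make_silence` keyword arguments used above.

```python
import wave

def wav_params(path):
    """Return the audio parameters of a WAV file as keyword arguments."""
    with wave.open(path, "rb") as wav:
        return {
            "sampling_rate": wav.getframerate(),  # samples per second
            "sample_width": wav.getsampwidth(),   # bytes per sample
            "channels": wav.getnchannels(),
        }

# Intended use (assumes "audio.wav" exists and auditok is installed):
# silence = make_silence(duration=1, **wav_params("audio.wav"))
```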

Export an `AudioRegion` as a `numpy` array

from auditok import load, AudioRegion
audio = load("audio.wav") # or use `AudioRegion.load("audio.wav")`
x = audio.numpy()
assert x.shape[0] == audio.channels
assert x.shape[1] == len(audio)
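
The `(channels, samples)` layout checked by the assertions can be illustrated without auditok. The sketch below assumes 16-bit PCM in which multichannel samples are interleaved frame by frame (as in WAV files) and splits them into one row per channel using only the standard library. `deinterleave` is a hypothetical helper, not an auditok function.

```python
from array import array

def deinterleave(raw, channels):
    """Split interleaved 16-bit PCM bytes into one list of samples per channel."""
    samples = array("h", raw)  # "h" = signed 16-bit integers, native byte order
    return [list(samples[c::channels]) for c in range(channels)]

# Two stereo frames: (left=1, right=-1), then (left=2, right=-2)
raw = array("h", [1, -1, 2, -2]).tobytes()
rows = deinterleave(raw, channels=2)
print(rows)  # [[1, 2], [-1, -2]]
```

Here `len(rows)` plays the role of `x.shape[0]` (the channel count) and `len(rows[0])` that of `x.shape[1]` (the number of samples per channel).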

Limitations

The detection algorithm is based on audio signal energy. While it performs well in low-noise environments (e.g., podcasts, language lessons, or quiet recordings), performance may drop in noisy settings. Additionally, the algorithm does not distinguish between speech and other sounds, so it is not suitable for Voice Activity Detection in multi-sound environments.
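
To make "based on audio signal energy" concrete, here is a minimal sketch of energy thresholding over fixed-size analysis windows. It assumes energy is measured as the log of the mean squared amplitude on a decibel-like scale, which is a common convention; auditok's exact formula and windowing may differ. The names `log_energy` and `active_windows` are hypothetical.

```python
import math

def log_energy(samples):
    """Return 10 * log10 of the mean squared amplitude (dB-like scale)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(max(mean_square, 1e-10))  # floor avoids log(0)

def active_windows(samples, window_size, threshold):
    """Flag each full window whose log energy reaches the threshold."""
    flags = []
    for i in range(0, len(samples) - window_size + 1, window_size):
        flags.append(log_energy(samples[i:i + window_size]) >= threshold)
    return flags

# A quiet stretch followed by a loud one (16-bit sample values)
signal = [10, -10] * 4 + [1000, -1000] * 4
print(active_windows(signal, window_size=4, threshold=55))
# [False, False, True, True]
```

Nothing in this test looks at what kind of sound produced the energy, which is why any sufficiently loud sound, speech or not, is flagged as activity.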

License

MIT.
