PAM is a no-reference audio quality metric for audio generation tasks

PAM: Prompting Audio-Language Models for Audio Quality Assessment

[Paper] [data]

PAM is a no-reference metric for assessing audio quality for different audio processing tasks. It prompts Audio-Language Models (ALMs) using an antonym prompt strategy to calculate an audio quality score. It does not require reference data or task-specific models and correlates well with human perception.
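To make the antonym prompt idea concrete, here is a minimal sketch. It assumes the score is the softmax probability that an ALM assigns to a "good quality" prompt over its antonym "bad quality" prompt, given audio and text embeddings. The function names, the toy vectors, and the `scale` parameter are illustrative assumptions, not the repository's code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def antonym_prompt_score(audio_emb, good_emb, bad_emb, scale=1.0):
    """Softmax probability of the 'good' prompt versus its antonym.

    audio_emb, good_emb, bad_emb: embeddings that an audio-language model
    would produce (hypothetical inputs here). scale is an assumed
    temperature-like factor applied to the similarities.
    """
    sim_good = scale * cosine(audio_emb, good_emb)
    sim_bad = scale * cosine(audio_emb, bad_emb)
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(sim_good, sim_bad)
    e_good = math.exp(sim_good - m)
    e_bad = math.exp(sim_bad - m)
    return e_good / (e_good + e_bad)


# Toy vectors only; real embeddings would come from the model's encoders.
score = antonym_prompt_score([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
```

Under these assumptions the score lies between 0 and 1, and an audio embedding that is equally similar to both prompts scores 0.5.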

News

[Jul 24] Improved human correlation across tasks [commit]
[Mar 24] PAM is accepted at INTERSPEECH 2024

Setup

Open the Anaconda terminal and run:

> git clone https://github.com/soham97/PAM.git
> cd PAM 
> conda create -n pam python=3.10
> conda activate pam
> pip install -r requirements.txt

Compute PAM

Folder evaluation

To compute PAM on a folder containing audio files, run:

> python run.py --folder {folder_path}

The symbol {..} indicates user input.

Custom evaluation

To compute PAM on a hierarchy of folders or on multiple directories, we recommend creating a custom dataset.

  • In dataset.py, create a custom dataset by inheriting from AudioDataset, similar to ExampleDataset
  • Modify the get_filelist function to fit your directory structure
  • Update run.py to use your custom dataset, and adjust the evaluation if needed
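The file-listing step can be sketched as a recursive directory walk. The README does not show the signatures of AudioDataset or get_filelist, so the helper below is a standalone, hypothetical function (`collect_audio_files`, with an assumed set of extensions) whose logic could be adapted inside your own get_filelist.

```python
from pathlib import Path

# Assumed set of audio extensions; adjust to match your data.
AUDIO_EXTENSIONS = {".wav", ".flac", ".mp3"}


def collect_audio_files(root, extensions=AUDIO_EXTENSIONS):
    """Return a sorted list of audio file paths found anywhere under root."""
    root = Path(root)
    return sorted(
        str(p)
        for p in root.rglob("*")  # walks every subdirectory
        if p.is_file() and p.suffix.lower() in extensions
    )
```

To cover several unrelated directories, call the helper once per root and concatenate the resulting lists.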

Data

The manuscript uses data from multiple sources. It can be obtained as follows:

Paper reproduction

This section covers reproducing the numbers for text-to-audio and text-to-music. First, download the human listening test data by following the instructions in the Data section. The download should contain a folder titled human_eval.

Then run the following command:

> python pcc.py --folder {folder_path}

where {folder_path} points to the human_eval folder.
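The README does not spell out what pcc.py computes. Assuming "pcc" stands for the Pearson correlation coefficient between metric scores and human ratings, the calculation can be sketched as follows. The function name and the sample values are illustrative, not taken from the repository or the paper.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up numbers for illustration only, not results from the paper.
metric_scores = [0.2, 0.5, 0.7, 0.9]
human_ratings = [1.5, 2.8, 3.9, 4.6]
r = pearson_correlation(metric_scores, human_ratings)
```

A value of r close to 1 would indicate that the metric ranks and spaces the clips much as human listeners do.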

Citation

@article{deshmukh2024pam,
  title={PAM: Prompting Audio-Language Models for Audio Quality Assessment},
  author={Soham Deshmukh and Dareen Alharthi and Benjamin Elizalde and Hannes Gamper and Mahmoud Al Ismail and Rita Singh and Bhiksha Raj and Huaming Wang},
  journal={arXiv preprint arXiv:2402.00282},
  year={2024}
}
