# Speech-Prompts-Adapters

This repository surveys papers on adapter and prompting methods for speech processing.

## NEWS

- At ICASSP 2023, we will give a tutorial on Parameter-Efficient Learning for speech processing and natural language processing. I (Kai-Wei Chang) will cover the topics of adapters and prompts for speech processing.

### ICASSP 2023 Tutorial Information

- Title: Parameter-Efficient Learning for Speech and Language Processing: Adapters, Prompts, and Reprogramming
- Conference: ICASSP 2023
- Website: ICASSP 2023 - Tutorials
- Parameter-Efficient Learning for Speech Processing Slides

**Presenters:**

- Pin-Yu Chen (IBM Research)
- Hung-yi Lee (National Taiwan University)
- Chao-Han Huck Yang (Georgia Institute of Technology)
- Kai-Wei Chang (National Taiwan University)
- Cheng-Han Chiang (National Taiwan University)

## Adapters and Prompting for Speech Processing

### Adapters for Speech Processing

| Title | Authors | Modality | Task | Link |
| --- | --- | --- | --- | --- |
| Differentially Private Adapters for Parameter Efficient Acoustic Modeling | Chun-Wei Ho et al. | Speech | Keyword Spotting | Interspeech 2023 |
| Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition | Haoyu Tang et al. | Speech | ASR | arXiv 2023 |
| A Parameter-Efficient Learning Approach to Arabic Dialect Identification with Pre-Trained General-Purpose Speech Model | Srijith Radhakrishnan et al. | Speech | Dialect Identification | Interspeech 2023 |
| CHAPTER: Exploiting Convolutional Neural Network Adapters for Self-supervised Speech Models | Zih-Ching Chen et al. | Speech | [Multiple] | arXiv 2022 |
| Parameter Efficient Transfer Learning for Various Speech Processing Tasks | Shinta Otake et al. | Speech | [Multiple] | arXiv 2022 |
| Parameter-efficient transfer learning of pre-trained Transformer models for speaker verification using adapters | Junyi Peng et al. | Speech | Speaker Verification | arXiv 2022 |
| Exploring Efficient-tuning Methods in Self-supervised Speech Models | Zih-Ching Chen et al. | Speech | [Multiple] | SLT 2022 |
| DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children’s ASR | Ruchao Fan, Abeer Alwan | Speech | ASR | Interspeech 2022 |
| Speaker adaptation for Wav2vec2 based dysarthric ASR | Murali Karthick Baskar et al. | Speech | ASR | Interspeech 2022 |
| Adaptive multilingual speech recognition with pretrained models | Ngoc-Quan Pham et al. | Speech | ASR | Interspeech 2022 |
| An Adapter Based Pre-Training for Efficient and Scalable Self-Supervised Speech Representation Learning | Samuel Kessler et al. | Speech | ASR | ICASSP 2022 |
| Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition | Bethan Thomas et al. | Speech | ASR | ICASSP 2022 |
| Scaling End-to-End Models for Large-Scale Multilingual ASR | Bo Li et al. | Speech | ASR | ASRU 2021 |
| Meta-Adapter: Efficient Cross-Lingual Adaptation With Meta-Learning | Wenxin Hou et al. | Speech | ASR | ICASSP 2021 |
| Exploiting Adapters for Cross-Lingual Low-Resource Speech Recognition | Wenxin Hou et al. | Speech | ASR | TASLP 2021 |
| Lightweight Adapter Tuning for Multilingual Speech Translation | Hang Le et al. | Speech | Speech Translation | ACL-IJCNLP 2021 |
| Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech | Katrin Tomanek et al. | Speech | ASR | EMNLP 2021 |
| Multilingual Speech Recognition with Self-Attention Structured Parameterization | Yun Zhu et al. | Speech | ASR | Interspeech 2020 |
| Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model | Anjuli Kannan et al. | Speech | ASR | Interspeech 2019 |
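
Most of the adapter papers above share a common recipe: the pre-trained speech model is kept frozen, and small bottleneck modules inserted into (or after) its layers are the only trainable parameters. The PyTorch sketch below is a generic illustration of that idea, not the implementation of any listed paper; the hidden size, bottleneck width, and stand-in encoder layer are assumptions made for the example.

```python
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Residual bottleneck adapter: down-project, non-linearity, up-project, residual add."""

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()
        # Zero-initialize the up-projection so the adapter starts as an identity mapping.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))


# Stand-in for a frozen pre-trained speech encoder layer (dimensions are placeholders).
encoder_layer = nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True)
for p in encoder_layer.parameters():
    p.requires_grad = False  # backbone stays frozen; only the adapter is trained

adapter = BottleneckAdapter(hidden_dim=768)
frames = torch.randn(2, 100, 768)          # (batch, time, hidden) frame features
output = adapter(encoder_layer(frames))    # adapter applied after the frozen layer
```

In practice the adapter is typically inserted inside each Transformer block (for example after the feed-forward sub-layer) rather than applied once on the final output, but the frozen-backbone-plus-small-trainable-module pattern is the same.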

### Prompting for Speech Processing

| Title | Authors | Modality | Task | Link |
| --- | --- | --- | --- | --- |
| Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization | Puyuan Peng et al. | Speech | [Multiple] | Interspeech 2023 |
| From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition | Chao-Han Huck Yang et al. | Speech | ASR | ICASSP 2023 |
| SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks | Kai-Wei Chang et al. | Speech | [Multiple] | arXiv 2023 |
| Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision | Eugene Kharitonov et al. | Text & Speech | TTS | arXiv 2023 |
| Describing emotions with acoustic property prompts for speech emotion recognition | Hira Dhamyal et al. | Text & Speech | ER | arXiv 2022 |
| PromptTTS: Controllable Text-to-Speech with Text Descriptions | Zhifang Guo et al. | Text & Speech | TTS | arXiv 2022 |
| Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Classification | Hao Yen et al. | Speech | Spoken Command Recognition | arXiv 2022 |
| WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen Language Models | Heting Gao et al. | Text & Speech | SLU | Interspeech 2022 |
| An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks | Kai-Wei Chang et al. | Speech | [Multiple] | Interspeech 2022 |
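
The prompting approaches above typically leave the backbone entirely untouched and instead learn a small set of input-side vectors that steer the frozen model toward a downstream task. The sketch below illustrates generic input-level prompt tuning for a frozen encoder; it is not the method of any specific paper, and the prompt length, feature dimension, and stand-in encoder are illustrative assumptions.

```python
import torch
import torch.nn as nn


class PromptTunedEncoder(nn.Module):
    """Prepend trainable prompt vectors to the frame-level inputs of a frozen encoder."""

    def __init__(self, encoder: nn.Module, feature_dim: int, prompt_len: int = 10):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False          # only the prompt vectors are trained
        self.prompt = nn.Parameter(torch.randn(prompt_len, feature_dim) * 0.02)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, frames, feature_dim)
        batch_size = features.size(0)
        prompts = self.prompt.unsqueeze(0).expand(batch_size, -1, -1)
        return self.encoder(torch.cat([prompts, features], dim=1))


# Example with a stand-in frozen backbone (dimensions are placeholders).
frozen_encoder = nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True)
model = PromptTunedEncoder(frozen_encoder, feature_dim=768, prompt_len=10)
out = model(torch.randn(2, 100, 768))  # output has 10 extra prompt frames prepended
```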

### Reprogramming and Prompting

For more information about reprogramming and prompting for large pre-trained models, please refer to the "awesome-neural-reprogramming-acoustic-prompting" repository. This topic was also covered in the ICASSP 2022 tutorial by Dr. Pin-Yu Chen and Dr. Huck Yang.


## Parameter-Efficient Learning Methods

| Title | Authors | Link |
| --- | --- | --- |
| BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models | Elad Ben Zaken et al. | ACL 2022 |
| Towards a Unified View of Parameter-Efficient Transfer Learning | Junxian He et al. | ICLR 2022 |
| LoRA: Low-Rank Adaptation of Large Language Models | Edward J. Hu et al. | ICLR 2022 |
| Parameter-Efficient Transfer Learning for NLP | Neil Houlsby et al. | ICML 2019 |
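
To make the differences between these methods concrete: BitFit updates only the bias terms of an otherwise frozen model, while LoRA adds a trainable low-rank correction to frozen weight matrices. The PyTorch sketch below is a simplified, generic illustration of both ideas; the rank, scaling, and initialization values are assumptions for the example, not the reference implementations.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: y = W x + (alpha / r) * B A x."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the original weights stay frozen
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: update starts at zero
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)


def apply_bitfit(model: nn.Module) -> None:
    """BitFit-style setup: freeze every parameter except the biases."""
    for name, param in model.named_parameters():
        param.requires_grad = name.endswith("bias")


# Usage example with placeholder dimensions.
lora_layer = LoRALinear(nn.Linear(768, 768), rank=8)
y = lora_layer(torch.randn(2, 100, 768))
```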

## Acknowledgment

We thank Kuang-Chen Peng, Tzu-Han Lin, and Fabian Ritter for their invaluable contributions to the initial collection.

## Contact

This repository is maintained by Kai-Wei Chang (kaiwei.chang.tw@gmail.com) and Zih-Ching Chen. Feel free to contact us or open a pull request.