AirUI

Team Members: Aman Bhargava, Alice Zhou, Adam Carnaffan

Goal

We aim to create an audio classification system for various gestures (scratches, taps, swipes, etc.) made on a wooden surface by a user. This will form the basis for a type of natural user interface known as ‘reality user interface’ (hence the name Artificially Intelligent Reality User Interface, or AirUI).

Utility

This user interface method would offer an extremely low cost alternative for touch input in computer systems and could help to make technology more accessible, particularly for individuals who have difficulty using conventional user interfaces.

Rationale for Neural Network Model

Convolutional Neural Networks are a commonly used state-of-the-art method for audio classification, particularly when combined with time-frequency methods such as the short-time Fourier transform (STFT).

Todo List

Overall Project Architecture

The data will be pre-processed using audio production software Ableton Live (cropping), Scipy.io.wavfile (loading), Numpy (energy graph), and Scipy (spectrogram).
Processed data will then be passed into one or multiple conventional ML models such as KNN, SVM, and/or Logistic Regression as a benchmark in the baseline stage.
MLP will be used in the development stage.
CNN (convolutional layers followed by fully connected layers) will be used in the final stage.

Results

Baseline Model

Our baseline models included a multilayer perceptron (ReLU activation, 100 hidden neurons, adam solver) for classifying downsampled energy distributions (20 samples per feature vector). Classes included scratches, taps, and 'noise' (in this case, the audio was gathered from lecture to simulate 'real world' noise).

The results are as follows:

=================================================
=== RESULTS FOR INITIAL DATASET BASLINE MODEL ===
=================================================


Total Dataset Size: 		258 Samples
Total Training Set Size: 	193 Samples
Total Validation Set Size: 	65 Samples

Test Set Confusion Matrix: ['scratch', 'tap', 'silence/rose lecture']
[[23  1  0]
 [ 1 16  0]
 [ 0  1 23]]

Training Set Confusion Matrix: ['scratch', 'tap', 'silence/rose lecture']
[[61  1  0]
 [ 0 69  0]
 [ 0  0 62]]

OVERALL SCORE OF MLP on TEST SET: 	93.84615384615384%
OVERALL SCORE OF MLP on TRAIN SET: 	99.48186528497409%

Name		Name	Last commit message	Last commit date
Latest commit History 43 Commits
B_O_T_X_D		B_O_T_X_D
Baseline_Model		Baseline_Model
CNN_Models		CNN_Models
Feature_Extraction		Feature_Extraction
Parameter_Search		Parameter_Search
Proposal		Proposal
Prototype_Models		Prototype_Models
.DS_Store		.DS_Store
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AirUI

Goal

Utility

Rationale for Neural Network Model

Todo List

Overall Project Architecture

Results

Baseline Model

About

Releases

Packages

Contributors 4

Languages

ece324-2020/AirUI

Folders and files

Latest commit

History

Repository files navigation

AirUI

Goal

Utility

Rationale for Neural Network Model

Todo List

Overall Project Architecture

Results

Baseline Model

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 4

Languages

Packages