Novel Machine Learning for Speech Disorders

Producing novel machine learning research for those with motor, neurological and articulatory speech disorders

Preamble

In my "ASR for Dysarthria" project I outlined an interface solution to the problems that quadriplegics face.

Please navigate to that project from the drop-down as it was the preliminary work for this one.

That solution was constrained by the current state of the art in ASR, which remains ineffective for dysarthric speech.

Background Information: Pictorial grid showing that current ASR systems are lacking

Problem

Dysarthric speech spreads acoustic energy over more diffuse regions than non-disordered speech, so generic voice assistants like Alexa and Siri do not recognize voice commands.

Waveform image showing that dysarthric speech has more diffuse regions than non-disordered speech
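
To make "more diffuse" concrete, here is a small illustrative sketch (not part of the original analysis) that compares spectral spread between a dysarthric and a non-disordered recording; the file paths are placeholders.

```python
# Illustrative sketch: measure how "diffuse" a recording's energy is by
# computing mean spectral bandwidth and flatness. File names are placeholders.
import numpy as np
import librosa

def spectral_spread(path):
    """Return mean spectral bandwidth (Hz) and flatness (0..1) for one recording."""
    y, sr = librosa.load(path, sr=16000)
    bandwidth = librosa.feature.spectral_bandwidth(y=y, sr=sr)  # Hz per frame
    flatness = librosa.feature.spectral_flatness(y=y)           # 0..1 per frame
    return float(np.mean(bandwidth)), float(np.mean(flatness))

# Higher bandwidth/flatness suggests energy spread over more of the spectrum.
for label, path in [("dysarthric", "dysarthric_sample.wav"),
                    ("typical", "typical_sample.wav")]:
    bw, flat = spectral_spread(path)
    print(f"{label}: mean bandwidth {bw:.0f} Hz, mean flatness {flat:.3f}")
```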

Methods

Voice assistants use transformers trained on massive datasets, but the amount of training data available for dysarthric speech is insufficient.

Outline of training data provided by different datasets
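
One way to see the shortfall is simply to total the hours of audio each corpus provides. The sketch below assumes local copies of example corpora such as TORGO and UASpeech under hypothetical directory paths.

```python
# Rough sketch: total hours of audio per dataset. Directory names are placeholders.
from pathlib import Path
import soundfile as sf

def hours_of_audio(root):
    """Sum the duration of all .wav files under a directory, in hours."""
    seconds = 0.0
    for wav in Path(root).rglob("*.wav"):
        info = sf.info(str(wav))
        seconds += info.frames / info.samplerate
    return seconds / 3600.0

for name in ["TORGO", "UASpeech", "my_collected_data"]:  # hypothetical layout
    print(f"{name}: {hours_of_audio(f'datasets/{name}'):.1f} h")
```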

Process

Proposed a two-stage training process: fine-tuning a speech-based large language model (analogous to ChatGPT), followed by a wake-word detection stage.
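
As a rough illustration of the two-stage idea (not the project's actual code): a pretrained wav2vec 2.0 checkpoint stands in for the speech foundation model, and wake-word detection is framed as binary audio classification. The checkpoint name, labels, and hyperparameters here are assumptions.

```python
# Minimal sketch of the two-stage pipeline, under the assumptions above.
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2ForSequenceClassification

CHECKPOINT = "facebook/wav2vec2-base"          # assumed starting checkpoint
extractor = Wav2Vec2FeatureExtractor.from_pretrained(CHECKPOINT)
model = Wav2Vec2ForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def fine_tune_step(waveform, label):
    """Stage 1: one gradient step on a labelled dysarthric clip (16 kHz mono)."""
    inputs = extractor(waveform, sampling_rate=16000, return_tensors="pt")
    out = model(input_values=inputs.input_values, labels=torch.tensor([label]))
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return out.loss.item()

@torch.no_grad()
def wake_word_detected(waveform, threshold=0.5):
    """Stage 2: run the fine-tuned classifier over an incoming audio window."""
    inputs = extractor(waveform, sampling_rate=16000, return_tensors="pt")
    probs = model(input_values=inputs.input_values).logits.softmax(dim=-1)
    return probs[0, 1].item() > threshold      # class 1 = wake word
```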


Recommendations

Collected the largest dataset of dysarthric speech and used SpeechBrain to train the latest transformer architectures.
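
For concreteness, a minimal sketch of loading a pretrained SpeechBrain transformer ASR model and running it on a dysarthric clip; the checkpoint name and file path are assumptions, not the project's exact configuration.

```python
# Sketch: evaluate a pretrained SpeechBrain transformer ASR model on dysarthric audio.
from speechbrain.pretrained import EncoderDecoderASR

asr = EncoderDecoderASR.from_hparams(
    source="speechbrain/asr-transformer-transformerlm-librispeech",  # assumed checkpoint
    savedir="pretrained_models/asr-transformer",
)

# Transcribe one dysarthric recording (hypothetical path); fine-tuning on the
# collected dataset would start from these same pretrained weights.
print(asr.transcribe_file("data/dysarthric/keyword_clip.wav"))
```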

Training revealed that the transformer could not converge on keyword spotting.

As an alternative, tried merging samples to create longer tokens, but this reduced the number of observations, preventing the model from learning.
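
To illustrate the trade-off (this is a sketch, not the project's actual code): merging k short clips into one longer example divides the number of training observations by roughly k. File paths and the value of k are placeholders.

```python
# Sketch of the sample-merging idea: concatenate short keyword clips into
# longer utterances, at the cost of far fewer training examples.
import numpy as np
import soundfile as sf

def merge_clips(paths, k):
    """Group clips into batches of k and concatenate each batch into one example."""
    merged = []
    for i in range(0, len(paths) - k + 1, k):
        waves = [sf.read(p)[0] for p in paths[i:i + k]]
        merged.append(np.concatenate(waves))
    return merged  # len(merged) is about len(paths) / k

# Hypothetical list of single-word recordings sharing one sample rate.
clips = [f"data/keywords/clip_{i:04d}.wav" for i in range(1000)]
longer_examples = merge_clips(clips, k=4)   # 1000 clips -> ~250 examples
```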

Logo of Transformer library used

Publication 1

Conducted a literature review summarizing research and surveying the state of the art in Automatic Speech Recognition.

Publication 2

Academic publication disseminating my research, presented at the Young Researchers Consortium (ICCHP-AATE) 2022.