Noisy BiLSTM-Based Models for Disfluency Detection

Abstract

This paper describes BiLSTM-based models for disfluency detection in speech transcripts that use residual BiLSTM blocks, self-attention, and a noisy training approach. Our best model not only surpasses BERT on 4 non-Switchboard test sets, but is also 20 times smaller than the BERT-based model. Thus, we demonstrate that strong performance can be achieved without extensive use of very large training data. In addition, we show that the noisy training approach yields robustness across data sets, and we found insertion to be the most useful type of noise for augmenting the training data.
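
As a rough illustration of the insertion-noise idea mentioned in the abstract, the sketch below inserts randomly chosen words into a fluent sentence and labels the inserted positions as disfluent. The vocabulary, insertion probability, and label names here are illustrative assumptions, not details taken from the paper.

import random

# Hypothetical sketch of insertion-based noise augmentation: original tokens
# are labelled "O" (fluent) and randomly inserted tokens are labelled "D"
# (disfluent). Vocabulary, insertion rate, and labels are assumptions.
def add_insertion_noise(tokens, vocab, insert_prob=0.1, rng=random):
    noisy_tokens, labels = [], []
    for tok in tokens:
        # Possibly insert a random word before the current token.
        if rng.random() < insert_prob:
            noisy_tokens.append(rng.choice(vocab))
            labels.append("D")
        noisy_tokens.append(tok)
        labels.append("O")
    return noisy_tokens, labels

# Example: turn a fluent sentence into a pseudo-disfluent training pair.
vocab = ["uh", "um", "you", "know", "i", "mean"]
sent = "we should book the flight tomorrow".split()
print(add_insertion_noise(sent, vocab, insert_prob=0.3))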

Publication
In Proc. Interspeech 2019