The present paper investigates the effect of different inputs on the accuracy of a forced alignment tool built using deep neural networks. Raw audio samples and Mel-frequency cepstral coefficients were compared as network inputs. A set of experiments was performed using the TIMIT speech...
Poster for the paper "A comparison of input types to a deep neural network-based forced aligner," presented at Interspeech 2018. A typo in the alignment matrix (O[2,2] referenced O[1,2] instead of O[1,1]) was corrected on June 4, 2019. PAPER ABSTRACT: The present paper investigates the effect of different...