Title: Unsupervised Pre-Training for Voice Activation
Authors: Kolesau, Aliaksei; Šešok, Dmitrij
Keywords: voice activation; unsupervised learning; keyword spotter; neural network
Issue Date: 2020
Publisher: MDPI
Citation: Kolesau, A.; Šešok, D. Unsupervised Pre-Training for Voice Activation. Appl. Sci. 2020, 10, 8643.
Series/Report no.: 10;23
Abstract: The problem of voice activation is to detect a pre-defined word in an audio stream. Solutions such as the “Ok, Google” keyword spotter for Android devices or the “Alexa” keyword spotter for Amazon devices use tens of thousands to millions of keyword examples in training. In this paper, we explore the possibility of using pre-trained audio features to build a voice activation system with a small number of keyword examples. The contribution of this article consists of two parts. First, we investigate how the quality of a voice activation system depends on the number of training examples for English and Russian, and show that pre-trained audio features such as wav2vec increase the accuracy of the system by up to 10% when only seven examples are available for each keyword during training. At the same time, the benefit of such features diminishes and eventually disappears as the dataset size increases. Second, we prepare and release for general use a dataset for training and testing voice activation for the Lithuanian language. We also provide training results on this dataset.
Description: This article belongs to the Section Computing and Artificial Intelligence
ISSN: 2076-3417
Appears in Collections: Moksliniai straipsniai / Research articles

Files in This Item:

File: Unsupervised Pre-Training for Voice Activation.pdf
Size: 368.86 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

