RESEARCH EFFICIENCY FEATURES SPEAKERS RECOGNITION USING NEURAL NETWORKS COAGULATION

Authors

  • M. M. Bykov Vinnytsia National Technical University
  • V. V. Kovtun Vinnytsia National Technical University

Keywords:

automated speaker recognition system for critical applications, pattern recognition, digital signal processing, cepstral analysis, convolutional neural network

Abstract

The study presents the results of research into the effectiveness of spectral features of the speech signal in an automated critical decision-making system for speaker recognition. The system uses a deep-learning convolutional neural network as its classifier, which required the informative features to be presented in graphical form.

Author Biographies

M. M. Bykov, Vinnytsia National Technical University

Candidate of Technical Sciences, Associate Professor, Department of Computer Control Systems

V. V. Kovtun, Vinnytsia National Technical University

Candidate of Technical Sciences, Associate Professor, Department of Computer Control Systems

References

1. Critical system. Wikipedia [Electronic resource]. Available at: https://en.wikipedia.org/wiki/Critical_system.
2. Bykov M. M. Analysis of the efficiency of speaker identification by fundamental frequency [in Ukrainian] / M. M. Bykov, V. V. Kovtun. Herald of Khmelnytskyi National University. 2004. No. 2, Part 1, Vol. 2 (60). P. 20–23.
3. Rabiner L. Digital Processing of Speech Signals [in Russian] / L. Rabiner, R. Schafer. Moscow: Radio i Svyaz, 1981. 496 p.
4. Hermansky H. RASTA processing of speech / H. Hermansky, N. Morgan. IEEE Trans. Speech and Audio Processing. 1994. Vol. 2, No. 6. P. 578–589.
5. Hermansky H. Perceptual Linear Prediction (PLP) analysis of speech / H. Hermansky. J. Acoust. Soc. America. 1990. Vol. 87. P. 1738–1753.
6. rasta-plp speech analysis. ICSI [Electronic resource]. Available at: http://www.icsi.berkeley.edu/pubs/techreports/tr-91-069.pdf.
7. Perceptual Linear Predictive (PLP) Analysis of Speech [Electronic resource]. Available at: http://seed.ucsd.edu/mediawiki/images/5/5c/PLP.pdf.
8. CS231n: Convolutional Neural Networks for Visual Recognition [Electronic resource]. Available at: http://cs231n.github.io/convolutional-networks/.
9. Caffe | Deep Learning Framework [Electronic resource]. Available at: http://caffe.berkeleyvision.org/.
10. An overview of gradient descent optimization algorithms [Electronic resource]. Available at: http://sebastianruder.com/optimizing-gradient-descent/.
11. NOIZEUS: Noisy speech corpus. Univ. Texas-Dallas [Electronic resource]. Available at: http://ecs.utdallas.edu/loizou/speech/noizeus/.

Published

2017-04-13

How to Cite

[1]
M. M. Bykov and V. V. Kovtun, “RESEARCH EFFICIENCY FEATURES SPEAKERS RECOGNITION USING NEURAL NETWORKS COAGULATION”, Опт-ел. інф-енерг. техн., vol. 32, no. 2, pp. 22–28, Apr. 2017.

Section

Systems Of Technical Vision And Artificial Intelligence, Image Processing And Pattern Recognition
