Please use this identifier to cite or link to this item:
http://ricaxcan.uaz.edu.mx/jspui/handle/20.500.11845/1894
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor | 31249 | es_ES |
dc.contributor.other | https://orcid.org/0000-0002-7337-8974 | - |
dc.contributor.other | https://orcid.org/0000-0002-8060-6170 | - |
dc.coverage.spatial | Global | es_ES |
dc.creator | Becerra de la Rosa, Aldonso | - |
dc.creator | De la Rosa Vargas, José Ismael | - |
dc.creator | González Ramírez, Efrén | - |
dc.creator | Pedroza Ramírez, Ángel David | - |
dc.creator | Martínez, Juan Manuel | - |
dc.creator | Escalante, Nivia | - |
dc.date.accessioned | 2020-05-06T20:42:07Z | - |
dc.date.available | 2020-05-06T20:42:07Z | - |
dc.date.issued | 2017-11 | - |
dc.identifier | info:eu-repo/semantics/publishedVersion | es_ES |
dc.identifier.issn | 2573-0770 | es_ES |
dc.identifier.uri | http://ricaxcan.uaz.edu.mx/jspui/handle/20.500.11845/1894 | - |
dc.identifier.uri | https://doi.org/10.48779/9ds7-t936 | - |
dc.description.abstract | The aim of this paper is to present two new variations of the frame-level cost function for training a deep neural network in order to achieve better word error rates in speech recognition. The minimization functions of a neural network are a salient concern for researchers working on machine learning, and hence their improvement is a process of constant evolution. In the first proposed method, the conventional cross-entropy function is mapped to a non-uniform loss function based on its corresponding extropy (a complementary dual function), emphasizing the frames whose assignment to specific senones (tied-triphone states in a hidden Markov model) is ambiguous. The second proposal is a fusion of the proposed mapped cross-entropy and the boosted cross-entropy function, which emphasizes those frames with low target posterior probability. The developed approaches were evaluated on a personalized mid-vocabulary speaker-independent voice corpus. This dataset is employed for recognition of digit strings and personal name lists in Spanish from the northern central part of Mexico on a connected-words phone dialing task. Relative word error rate improvements of 12.3% and 10.7% are obtained with the two proposed approaches, respectively, with respect to the conventional, well-established cross-entropy objective function. | es_ES |
dc.language.iso | eng | es_ES |
dc.publisher | IEEE | es_ES |
dc.relation.uri | generalPublic | es_ES |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 United States | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/us/ | * |
dc.source | Proc. of the IEEE International Autumn Meeting on Power, Electronics and Computing (ROPEC2017), at Ixtapa, Mexico, pp. 1-6, 2017. | es_ES |
dc.subject.classification | ENGINEERING AND TECHNOLOGY [7] | es_ES |
dc.subject.other | Speech recognition | es_ES |
dc.subject.other | Deep neural network | es_ES |
dc.subject.other | Deep Learning | es_ES |
dc.title | Speech recognition using deep neural networks trained with non-uniform frame-level cost functions | es_ES |
dc.type | info:eu-repo/semantics/conferencePaper | es_ES |
Appears in Collections: | *Documentos Académicos*-- M. en Ciencias del Proc. de la Info. |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
72_Becerra_DelaRosa IEEEROPEC P1 2017.pdf | Becerra_DelaRosa IEEEROPEC P1 2017 | 373.94 kB | Adobe PDF |
This item is licensed under a Creative Commons License
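
The quantities named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions of frame-level cross-entropy for a one-hot target and of extropy, -Σ (1 - p_i) log(1 - p_i). The function names and the weight `1 - p[target]` are hypothetical, chosen only to show how low-posterior frames might be emphasized; they are not the formulas from the paper, and the paper's mapping of cross-entropy through extropy is not reproduced here.

```python
import math


def cross_entropy(posteriors, target):
    """Frame-level cross-entropy for a one-hot target: -log p[target]."""
    return -math.log(posteriors[target])


def extropy(posteriors):
    """Extropy of a distribution: -sum((1 - p) * log(1 - p)).

    Terms with p == 1 contribute 0 and are skipped to avoid log(0).
    """
    return -sum((1.0 - p) * math.log(1.0 - p) for p in posteriors if p < 1.0)


def emphasis_weight(posteriors, target):
    """Hypothetical weight (an assumption): grows as the target posterior falls."""
    return 1.0 - posteriors[target]


def weighted_frame_loss(posteriors, target):
    """Illustrative non-uniform loss: emphasis weight times cross-entropy."""
    return emphasis_weight(posteriors, target) * cross_entropy(posteriors, target)


# Two example frames over three senones; the target senone is index 0.
confident = [0.90, 0.05, 0.05]
ambiguous = [0.40, 0.35, 0.25]

for name, frame in (("confident", confident), ("ambiguous", ambiguous)):
    print(
        name,
        round(cross_entropy(frame, 0), 4),
        round(extropy(frame), 4),
        round(weighted_frame_loss(frame, 0), 4),
    )
```

In this sketch the ambiguous frame has both a higher extropy and a higher weighted loss than the confident frame, which matches the kind of frame the abstract says the proposed functions emphasize.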