Selección de metabolitos como características de un modelo de bosques aleatorios para el diagnóstico del COVID-19

Torres Pasillas, Hugo Alexis

Please use this identifier to cite or link to this item: http://ricaxcan.uaz.edu.mx/jspui/handle/20.500.11845/3575

Full metadata record

DC Field	Value	Language
dc.contributor	1186284	en_US
dc.contributor.advisor	José María Celaya Padilla	en_US
dc.contributor.advisor	Yamilé López Hernández	en_US
dc.contributor.advisor	Carlos Eric Galván Tejada	en_US
dc.contributor.advisor	Alejandra García Hernández	en_US
dc.contributor.advisor	Pedro Daniel Alaniz Lumbreras	en_US
dc.contributor.advisor	Jorge Alejandro Morgan Benita	en_US
dc.coverage.spatial	Global	en_US
dc.creator	Torres Pasillas, Hugo Alexis	-
dc.date.accessioned	2024-06-12T18:20:06Z	-
dc.date.available	2024-06-12T18:20:06Z	-
dc.date.issued	2023-06-01	-
dc.identifier	info:eu-repo/semantics/publishedVersion	en_US
dc.identifier.issn	1870-4069	en_US
dc.identifier.uri	http://ricaxcan.uaz.edu.mx/jspui/handle/20.500.11845/3575	-
dc.identifier.uri	http://dx.doi.org/10.48779/ricaxcan-394	-
dc.description	COVID-19 is a disease caused by a novel coronavirus that emerged in late 2019. Despite advances in virus research and the development of both vaccines and potential treatments, early and accurate diagnosis remains one of the best tools for combating the disease and its transmission. The aim of this study is to select the best set of metabolites to serve as potential diagnostic biomarkers and as features of a random forest model. To this end, four feature selection techniques frequently used in machine learning were applied to a dataset containing measurements of 110 metabolites from 158 suspected COVID-19 patients (121 confirmed positive and 37 confirmed healthy by RT-PCR tests). The results yield four different sets of metabolites capable of diagnosing COVID-19 with high performance across the six metrics used. The set with the best performance on the training set consists of 15 metabolites and achieves high performance in blind validation (F1=0.921, balanced accuracy=0.875, AUC=0.910), while the set with the fewest features (5) obtains the second-best performance on the training set but the best performance in blind validation (F1=0.931, balanced accuracy=0.896, AUC=0.858).	en_US
dc.description.abstract	El COVID-19 es una enfermedad reciente que surgió a finales de 2019 causada por un nuevo tipo de coronavirus. A pesar de los avances en la investigación del virus y el desarrollo tanto de vacunas como de posibles tratamientos, el diagnóstico de la enfermedad, especialmente de forma temprana, continúa siendo una de las mejores herramientas para combatir la enfermedad y su transmisión. El objetivo de este estudio es seleccionar el mejor conjunto de metabolitos como potenciales biomarcadores para el diagnóstico, que son utilizados como características de un modelo de bosques aleatorios. Para ello, se utilizaron 4 diferentes técnicas de selección de características que son utilizadas con frecuencia dentro del Aprendizaje Automático, y un conjunto de datos que contiene mediciones de 110 metabolitos de 158 pacientes sospechosos de COVID-19 (121 enfermos y 37 sanos confirmados por pruebas RT-PCR). Los resultados muestran cuatro distintos conjuntos de metabolitos capaces de diagnosticar el COVID-19 con un alto desempeño en 6 distintas métricas utilizadas. El conjunto con mejor rendimiento en el conjunto de entrenamiento consta de 15 metabolitos y logra tener un desempeño alto en la validación a ciegas (F1=0.921, exactitud balanceada=0.875, AUC=0.910), mientras que el conjunto con menor número de características (5) obtiene el segundo mejor rendimiento en el conjunto de entrenamiento pero el mejor desempeño en la validación a ciegas (F1=0.931, exactitud balanceada=0.896, AUC=0.858).	en_US
dc.language.iso	spa	en_US
dc.publisher	Research in Computing Science	en_US
dc.relation	https://rcs.cic.ipn.mx/2023_152_6/Seleccion%20de%20metabolitos%20como%20caracteristicas%20de%20un%20modelo%20de%20bosques%20aleatorios.pdf	en_US
dc.relation.uri	generalPublic	en_US
dc.rights	Attribution-NonCommercial-ShareAlike 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/3.0/us/	*
dc.source	Research in Computing Science Vol. 152, No. 6, pp. 161-174	en_US
dc.subject.classification	INGENIERIA Y TECNOLOGIA [7]	en_US
dc.subject.other	COVID-19	en_US
dc.subject.other	Aprendizaje Automático	en_US
dc.subject.other	Metabolitos	en_US
dc.subject.other	Selección de características	en_US
dc.subject.other	Diagnóstico	en_US
dc.title	Selección de metabolitos como características de un modelo de bosques aleatorios para el diagnóstico del COVID-19	en_US
dc.type	info:eu-repo/semantics/conferenceProceedings	en_US
Appears in Collections:	Documentos Académicos-- M. en Ciencias del Proc. de la Info.

Files in This Item:

File	Description	Size	Format
ARTICULO-MCPI_HugoAlexisTorresPasillas.pdf	Producto del programa de Maestría en Ciencias del Procesamiento de la Información, programa Categoría 1 del Sistema Nacional de Posgrados CONAHCYT.	3,38 MB	Adobe PDF

This item is licensed under a Creative Commons License