Title: Iterative variable selection for high-dimensional data: Prediction of pathological response in triple-negative breast cancer
Authors: Laria de la Cruz, Juan Carlos; Aguilera Morillo, María del Carmen; Álvarez Castillo, Enrique Luis; Lillo Rodríguez, Rosa Elvira; López Taruella, Sara; Del Monte Millán, María; Picornell, Antonio C.; Martín, Miguel; Romo, Juan
Record dates: 2021-03-18, 2021-03-18
Publication date: 2021-01-23
Citation: Laria, J.C.; Aguilera-Morillo, M.C.; Álvarez, E.; Lillo, R.E.; López-Taruella, S.; del Monte-Millán, M.; Picornell, A.C.; Martín, M.; Romo, J. Iterative Variable Selection for High-Dimensional Data: Prediction of Pathological Response in Triple-Negative Breast Cancer. Mathematics 2021, 9, 222.
ISSN: 2227-7390
Handle: https://hdl.handle.net/10016/32185
DOI: https://doi.org/10.3390/math9030222
Abstract: Over the last decade, regularized regression methods have offered alternatives for performing multi-marker analysis and feature selection in a whole genome context. The process of defining a list of genes that will characterize an expression profile remains unclear. It currently relies upon advanced statistics and can use an agnostic point of view or include some a priori knowledge, but overfitting remains a problem. This paper introduces a methodology to deal with the variable selection and model estimation problems in the high-dimensional set-up, which can be particularly useful in the whole genome context. Results are validated using simulated data and a real dataset from a triple-negative breast cancer study.
Language: eng
Rights: © 2021 by the authors
License: Attribution 3.0 Spain
Keywords: Variable selection; High-dimension; Regularization; Classification; Sparse-group Lasso
Type: research article
Subject area: Statistics
Access: open access
Journal: Mathematics
Volume: 9
Issue: 3
Pages: 1-14
Record identifier: AR/0000027093