Publication:
Band depth based initialization of K-means for functional data clustering

dc.affiliation.dpto: UC3M. Departamento de Matemáticas [es]
dc.affiliation.dpto: UC3M. Departamento de Estadística [es]
dc.affiliation.grupoinv: UC3M. Grupo de Investigación: Modelización, Simulación Numérica y Matemática Industrial [es]
dc.affiliation.instituto: UC3M. Instituto Universitario sobre Modelización y Simulación en Fluidodinámica, Nanociencia y Matemática Industrial Gregorio Millán Barbany [es]
dc.contributor.author: Albert Smet, Javier
dc.contributor.author: Torrente Orihuela, Ester Aurora
dc.contributor.author: Romo, Juan
dc.contributor.funder: Ministerio de Ciencia, Innovación y Universidades (España) [es]
dc.date.accessioned: 2023-05-23T07:27:58Z
dc.date.available: 2023-05-23T07:27:58Z
dc.date.issued: 2023-06
dc.description.abstract: The k-Means algorithm is one of the most popular choices for clustering data but is well known to be sensitive to the initialization process. A substantial number of methods aim to find optimal initial seeds for k-Means, though none of them is universally valid. This paper presents an extension to longitudinal data of one such method, the BRIk algorithm, which relies on clustering a set of centroids derived from bootstrap replicates of the data and on the use of the versatile Modified Band Depth. In our approach we improve the BRIk method by adding a step where we fit appropriate B-splines to our observations and a resampling process that makes the computation feasible and handles issues such as noise or missing data. We have derived two techniques for providing suitable initial seeds, stressing, respectively, the multivariate and the functional nature of the data. Our results with simulated and real data sets indicate that our Functional Data Approach to the BRIk method (FABRIk) and our Functional Data Extension of the BRIk method (FDEBRIk) are more effective than previous proposals at providing seeds to initialize k-Means in terms of clustering recovery. [en]
dc.description.sponsorship: Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This work was partially supported by the Spanish Ministry of Education [collaboration grant in university departments, Archive ID 18C01/003730] and the Spanish Ministry of Science, Innovation and Universities [grant numbers PID2020-116567GB-C22 and PID2020-112796RB-C22]. [en]
dc.format.extent: 22
dc.identifier.bibliographicCitation: Albert-Smet, J., Torrente, A., & Romo, J. (2023). Band depth based initialization of K-means for functional data clustering. Advances in Data Analysis and Classification, 17(2), 463–484. [en]
dc.identifier.doi: http://dx.doi.org/10.1007/s11634-022-00510-w
dc.identifier.issn: 1862-5347
dc.identifier.publicationfirstpage: 463
dc.identifier.publicationissue: 2
dc.identifier.publicationlastpage: 484
dc.identifier.publicationtitle: Advances in Data Analysis and Classification [en]
dc.identifier.publicationvolume: 17
dc.identifier.uri: https://hdl.handle.net/10016/37332
dc.identifier.uxxi: AR/0000032919
dc.language.iso: eng
dc.publisher: Springer [en]
dc.relation.projectID: Gobierno de España. PID2020-116567GB-C22 [es]
dc.relation.projectID: Gobierno de España. PID2020-112796RB-C22 [es]
dc.relation.projectID: AT-2022 [es]
dc.rights: © The Author(s) 2022 [en]
dc.rights: Atribución 3.0 España [*]
dc.rights.accessRights: open access [en]
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/es/ [*]
dc.subject.eciencia: Estadística [es]
dc.subject.eciencia: Matemáticas [es]
dc.subject.other: K-means [en]
dc.subject.other: Modified band depth [en]
dc.subject.other: B-spline [en]
dc.subject.other: Functional data [en]
dc.subject.other: Bootstrap [en]
dc.title: Band depth based initialization of K-means for functional data clustering [en]
dc.type: research article [*]
dc.type.hasVersion: VoR [*]
dspace.entity.type: Publication
Files
Original bundle
Name: Band_ADAC_2023.pdf
Size: 782.06 KB
Format: Adobe Portable Document Format