An adaptive algorithm for clustering cumulative probability distribution functions using the Kolmogorov–Smirnov two-sample test

Mora-López, Llanos; Mora-López, Juan

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/53044

Full metadata record
DC Field	Value	Language
dc.contributor	Economía Laboral y Econometría (ELYE)	es
dc.contributor.author	Mora-López, Llanos	-
dc.contributor.author	Mora-López, Juan	-
dc.contributor.other	Universidad de Alicante. Departamento de Fundamentos del Análisis Económico	es
dc.date.accessioned	2016-02-11T13:01:17Z	-
dc.date.available	2016-02-11T13:01:17Z	-
dc.date.issued	2015-05-15	-
dc.identifier.citation	Expert Systems with Applications. 2015, 42(8): 4016-4021. doi:10.1016/j.eswa.2014.12.027	es
dc.identifier.issn	0957-4174 (Print)	-
dc.identifier.issn	1873-6793 (Online)	-
dc.identifier.uri	http://hdl.handle.net/10045/53044	-
dc.description.abstract	This paper proposes an adaptive algorithm for clustering cumulative probability distribution functions (c.p.d.f.’s) of a continuous random variable, observed in different populations, into the minimum number of homogeneous clusters, making no parametric assumptions about the c.p.d.f.’s. The proposed distance function for clustering c.p.d.f.’s is based on the Kolmogorov–Smirnov two-sample statistic. The associated test is able to detect differences in position, dispersion or shape of the c.p.d.f.’s. In our context, this statistic allows us to cluster the recorded data with a homogeneity criterion based on the whole distribution of each data set, and to decide whether or not more clusters are needed. In this sense, the proposed algorithm is adaptive: it automatically increases the number of clusters only as necessary, so there is no need to fix the number of clusters in advance. The outputs of the algorithm are, for each cluster, the common c.p.d.f. of all observed data in the cluster (the centroid) and the Kolmogorov–Smirnov statistic between the centroid and the most distant c.p.d.f. The proposed algorithm has been applied to a large data set of solar global irradiation spectra distributions. The results make it possible to reduce the information in more than 270,000 c.p.d.f.’s to only 6 clusters, corresponding to 6 different c.p.d.f.’s.	es
dc.description.sponsorship	This research has been partially supported by the Spanish Consejería de Economía, Innovación y Ciencia of the Junta de Andalucía under projects TIC-6441 and P11-RNM7115, and the Spanish MEC under project ECO2011-29751.	es
dc.language	eng	es
dc.publisher	Elsevier	es
dc.rights	© 2014 Elsevier Ltd.	es
dc.subject	Adaptive clustering	es
dc.subject	Cumulative probability distribution functions	es
dc.subject	Kolmogorov–Smirnov two-sample test	es
dc.subject.other	Fundamentos del Análisis Económico	es
dc.title	An adaptive algorithm for clustering cumulative probability distribution functions using the Kolmogorov–Smirnov two-sample test	es
dc.type	info:eu-repo/semantics/article	es
dc.peerreviewed	yes	es
dc.identifier.doi	10.1016/j.eswa.2014.12.027	-
dc.relation.publisherversion	http://dx.doi.org/10.1016/j.eswa.2014.12.027	es
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es
dc.relation.projectID	info:eu-repo/grantAgreement/MICINN//ECO2011-29751	-
Appears in collections:	INV - ELYE - Artículos de Revistas
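The abstract describes the approach only at a high level, so the following is a minimal sketch of the idea rather than the authors' procedure. It assumes a greedy single pass over the samples, uses the pooled data of a cluster as its centroid, and decides whether a new cluster is needed by comparing the two-sample Kolmogorov–Smirnov statistic with the standard large-sample critical value. The function names (`ks_statistic`, `ks_critical`, `adaptive_cluster`, `max_distance`) and the default `alpha=0.05` are illustrative assumptions and do not come from the paper.

```python
import bisect
import math


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: sup_x |F_a(x) - F_b(x)|."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    # The supremum of the difference of two empirical c.d.f.'s is reached
    # at one of the observed points, so it is enough to check those.
    for x in a + b:
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


def ks_critical(n, m, alpha=0.05):
    """Large-sample critical value for the two-sample KS test (assumption:
    the asymptotic approximation is adequate for the sample sizes used)."""
    return math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))


def adaptive_cluster(samples, alpha=0.05):
    """Greedy sketch: put each sample in the closest cluster that the KS
    test does not reject; open a new cluster only when every one is rejected."""
    clusters = []  # each cluster: {"members": [sample indices], "pooled": [values]}
    for i, sample in enumerate(samples):
        best, best_d = None, None
        for cluster in clusters:
            d = ks_statistic(sample, cluster["pooled"])
            accepted = d <= ks_critical(len(sample), len(cluster["pooled"]), alpha)
            if accepted and (best is None or d < best_d):
                best, best_d = cluster, d
        if best is None:
            clusters.append({"members": [i], "pooled": list(sample)})
        else:
            best["members"].append(i)
            best["pooled"].extend(sample)
    return clusters


def max_distance(cluster, samples):
    """KS statistic between the centroid and its most distant member."""
    return max(ks_statistic(samples[i], cluster["pooled"]) for i in cluster["members"])


# Illustrative data: three nearly identical samples on [0, 1) and three on [5, 6).
group_a = [[(j + 0.1 * k) / 100 for j in range(100)] for k in range(3)]
group_b = [[5 + x for x in s] for s in group_a]
data = group_a + group_b
for cluster in adaptive_cluster(data):
    print(cluster["members"], round(max_distance(cluster, data), 4))
```

The number of clusters is never passed in; it grows only when a sample is rejected by every existing centroid, which mirrors the adaptive behaviour described in the abstract. A single greedy pass depends on the order of the samples, and a library routine such as SciPy's `scipy.stats.ks_2samp` could replace the hand-written statistic.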

Files in this item:
File	Description	Size	Format
2015_Mora_ESWA_final.pdf	Final version (restricted access)	1.03 MB	Adobe PDF
2015_Mora_ESWA_accepted.pdf	Revised version (open access)	278.62 kB	Adobe PDF