Covariance Based Unsupervised Classification in Functional Data Analysis


Statistical learning
Health Analytics
Friday 13th June 2014
Ieva, F., Paganoni, A.M., Tarabelloni, N.
In this paper we propose a new algorithm to perform unsupervised classification of multivariate and functional data when the difference between the two populations lies in their covariances, rather than in their means. The algorithm relies on a proper quantification of the distance between the estimated covariance operators of the populations, and identifies as clusters those groups maximising the distance between their induced covariances. The naive implementation of such an algorithm is computationally prohibitive, so we propose a heuristic formulation with much lower complexity, and we study its convergence properties and its computational cost. We also propose to estimate the discrete covariances of functional data with an enhanced estimator, namely a linear shrinkage estimator, in order to improve the precision of the classification. We establish the effectiveness of our algorithm through applications to both synthetic data and a real dataset from a biomedical context, and show how the use of shrinkage estimation may lead to substantially better results.
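The idea in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' algorithm: the greedy pairwise-swap search, the Frobenius distance, the shrinkage target (a scaled identity) and the hand-picked shrinkage weight are all assumptions made for illustration, and every function name is hypothetical. Each row of `data` stands for one curve evaluated on a common grid.

```python
import random


def covariance(rows, shrink=0.0):
    """Sample covariance, linearly shrunk towards a scaled identity.

    shrink=0 gives the plain sample covariance; shrink=1 gives mu * I,
    where mu is the average sample variance. The weight is hand-picked
    here (an assumption), not estimated from the data.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    S = [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
          for j in range(p)] for i in range(p)]
    mu = sum(S[i][i] for i in range(p)) / p
    return [[(1 - shrink) * S[i][j] + (shrink * mu if i == j else 0.0)
             for j in range(p)] for i in range(p)]


def frobenius_distance(A, B):
    """Frobenius norm of A - B (one possible distance between covariances)."""
    return sum((a - b) ** 2
               for ra, rb in zip(A, B) for a, b in zip(ra, rb)) ** 0.5


def group_distance(data, g1, g2, shrink):
    """Distance between the covariances induced by two index groups."""
    return frobenius_distance(covariance([data[i] for i in g1], shrink),
                              covariance([data[i] for i in g2], shrink))


def greedy_swap_split(data, n_trials=300, shrink=0.1, seed=0):
    """Hypothetical heuristic: random balanced split, then random pair swaps.

    A swap of one unit from each group is kept only if it increases the
    distance between the two induced covariances; otherwise it is undone.
    Returns the two groups plus the starting and final distances.
    """
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    half = len(idx) // 2
    g1, g2 = idx[:half], idx[half:]
    start = best = group_distance(data, g1, g2, shrink)
    for _ in range(n_trials):
        a, b = rng.randrange(len(g1)), rng.randrange(len(g2))
        g1[a], g2[b] = g2[b], g1[a]
        d = group_distance(data, g1, g2, shrink)
        if d > best:
            best = d
        else:
            g1[a], g2[b] = g2[b], g1[a]  # undo a swap that did not help
    return g1, g2, start, best


# Synthetic example: same (zero) mean, different variances.
gen = random.Random(1)
low = [[gen.gauss(0, 1) for _ in range(3)] for _ in range(30)]
high = [[gen.gauss(0, 3) for _ in range(3)] for _ in range(30)]
g1, g2, start, best = greedy_swap_split(low + high)
print(f"distance before: {start:.3f}, after: {best:.3f}")
```

Because a swap is accepted only when it increases the distance, the final distance is never smaller than the initial one, but nothing in this sketch guarantees that the search reaches the true two-population split. An exhaustive search over all balanced splits would be the naive alternative, and its cost grows combinatorially with the sample size.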
This report, or a modified version of it, has also been submitted to, or published in:
Accepted for publication: "Journal of Machine Learning Research" (2016)