On Data Robustification in Functional Data Analysis


Statistical learning
Health Analytics
Friday 1st January 2016
Tarabelloni, N.; Ieva, F.
Outlier detection in high-dimensional settings is a crucial step in many statistical analyses. Outliers are often regarded as errors or noise, yet they may carry important information on the phenomenon under study. If not properly identified, they may lead to model misspecification, biased parameter estimates and incorrect results, especially in contexts where the number of available statistical units is lower than the number of parameters (for example, Functional Data Analysis). In this paper we introduce a robustly adjusted version of the functional boxplot, the most common tool for outlier detection in Functional Data Analysis. A crucial element of the functional boxplot is the inflation factor of the fences, which controls the proportion of observations flagged as outliers. After an overview of the methods and tools currently available in the literature, we describe a robust method to compute a data-driven value for this inflation factor. In doing so, we make use of robust estimators of variance-covariance operators and of the corresponding eigenvalues and eigenfunctions. Two simulation studies give direct insight into the use of the proposed functional boxplot and test both the robustness and accuracy of the robust variance-covariance estimators, together with the performance of the functional boxplot in recognising truly outlying observations.
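To make the role of the inflation factor concrete, the following is a minimal sketch of a standard functional boxplot, not the robustly adjusted version proposed in the paper. It assumes curves observed on a common grid, uses the modified band depth as the ordering criterion (one common choice), and takes the inflation factor as a fixed input with the conventional value 1.5 rather than the data-driven value the paper derives. The function names and the `inflation` parameter are illustrative only.

```python
import numpy as np


def modified_band_depth(X):
    """Modified band depth (bands formed by pairs of curves).

    X has shape (n_curves, n_points). Assumes no ties between curves
    at any grid point, so that ranks are well defined.
    """
    n = X.shape[0]
    # Rank of each curve (1..n) at every grid point.
    ranks = X.argsort(axis=0).argsort(axis=0) + 1
    # Pairs of curves whose band contains the curve at that point:
    # one curve strictly below and one strictly above, plus the
    # (n - 1) pairs that include the curve itself.
    pairs = (ranks - 1) * (n - ranks) + (n - 1)
    return pairs.mean(axis=1) / (n * (n - 1) / 2)


def functional_boxplot_outliers(X, inflation=1.5):
    """Return indices of curves that leave the fences at any grid point."""
    n = X.shape[0]
    depth = modified_band_depth(X)
    # Central region: pointwise envelope of the 50% deepest curves.
    central = np.argsort(depth)[::-1][: int(np.ceil(n / 2))]
    lower = X[central].min(axis=0)
    upper = X[central].max(axis=0)
    # Fences: central envelope inflated by `inflation` times its width.
    width = upper - lower
    outside = (X < lower - inflation * width) | (X > upper + inflation * width)
    return np.flatnonzero(outside.any(axis=1))
```

A larger inflation factor widens the fences and flags fewer curves, which is why the choice of its value determines the proportion of observations declared outlying.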