Detecting Outliers Using Modified Recursive PCA Algorithm For Dynamic Streaming Data
Date
2023-12-31
Authors
Dani, Yasi
Gunawan, Agus Yodi
Khodra, Masayu Leylia
Indratno, Sapto Wahyu
Publisher
Institute of Automation and Computer Science, Brno University of Technology
Abstract
Outlier analysis has been widely studied and has produced many methods. However, methods for detecting outliers in dynamically streaming batch data (online learning) remain rare. This research proposes a novel online algorithm for detecting outliers in such datasets. Data points are processed by applying a modified recursive PCA that sequentially estimates the model parameters; the eigenvalues and eigenvectors of the statistical detection model are updated recursively using approximate values obtained by perturbation methods. More specifically, the recursive eigenstructure is obtained from the covariance matrix using the first-order perturbation technique. The Mahalanobis distance is then used as the outlier score. The algorithm's performance is evaluated using accuracy, precision, recall, F1-score, AUC-PR, and execution time. Results show that the proposed online outlier detection is computationally efficient in time and that its detection performance is comparable to that of offline outlier detection via classical PCA.
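The pipeline described in the abstract (recursive eigenstructure update, then Mahalanobis scoring) can be illustrated with a minimal sketch. This is not the authors' modified algorithm, whose details are not given in this record; it is a generic textbook first-order perturbation update, assuming a running (non-forgetting) population covariance, one point arriving at a time, and distinct eigenvalues. The function names `mahalanobis_score` and `recursive_update` and the synthetic data are hypothetical.

```python
import numpy as np

def mahalanobis_score(x, mean, eigvals, eigvecs):
    """Mahalanobis distance of x, computed in the covariance eigenbasis."""
    z = eigvecs.T @ (x - mean)
    return float(np.sqrt(np.sum(z ** 2 / eigvals)))

def recursive_update(x, n, mean, eigvals, eigvecs):
    """Fold one new point into the model with a first-order perturbation step.

    Assumes the covariance update C_new = n/(n+1) * C + n/(n+1)^2 * d d^T
    (running population covariance) and distinct eigenvalues.
    """
    d = x - mean
    z = eigvecs.T @ d
    # Perturbation E = C_new - C, expressed in the current eigenbasis.
    B = -np.diag(eigvals) / (n + 1) + n * np.outer(z, z) / (n + 1) ** 2
    # First order: lambda_i' = lambda_i + v_i^T E v_i.
    new_vals = eigvals + np.diag(B)
    # First order: v_i' = v_i + sum_{j != i} B_ji / (lambda_i - lambda_j) * v_j.
    gaps = eigvals[None, :] - eigvals[:, None]  # gaps[j, i] = lambda_i - lambda_j
    np.fill_diagonal(gaps, np.inf)              # removes the j == i term
    new_vecs, _ = np.linalg.qr(eigvecs + eigvecs @ (B / gaps))  # re-orthonormalize
    return n + 1, mean + d / (n + 1), new_vals, new_vecs

# Hypothetical stream: fit offline on the first 200 points, then go online.
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 3)) * np.array([3.0, 2.0, 1.0])
n = 200
mean = data[:n].mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(data[:n], rowvar=False, bias=True))

scores = []
for x in data[n:]:
    scores.append(mahalanobis_score(x, mean, eigvals, eigvecs))  # score first
    n, mean, eigvals, eigvecs = recursive_update(x, n, mean, eigvals, eigvecs)
```

Each point is scored against the model before it is folded in, so a large score flags a candidate outlier without a full eigendecomposition per step. How a threshold on the score is chosen, and how batches or forgetting factors are handled, is left open here.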
Citation
Mendel. 2023, vol. 29, no. 2, pp. 237-244. ISSN 1803-3814
https://mendel-journal.org/index.php/mendel/article/view/276
Document type
Peer-reviewed
Document version
Published version
Language of document
en
Document licence
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license
http://creativecommons.org/licenses/by-nc-sa/4.0