oddělení-TKO-SIX

Recent Submissions

  • Item
    Changes in Phonation and Their Relations with Progress of Parkinson’s Disease
    (MDPI, 2019-01-22) Galáž, Zoltán; Mekyska, Jiří; Zvončák, Vojtěch; Mucha, Ján; Kiska, Tomáš; Smékal, Zdeněk; Eliášová, Ilona; Mráčková, Martina; Košťálová, Miroslava; Rektorová, Irena; Faúndez Zanuy, Marcos; Alonso-Hernandez, Jesus; Gomez-Vilda, Pedro
    Hypokinetic dysarthria, which is associated with Parkinson’s disease (PD), affects several speech dimensions, including phonation. Although the scientific community has dealt with quantitative analysis of phonation in PD patients, comprehensive research revealing probable relations between phonatory features and the progress of PD is missing. Therefore, the aim of this study is to explore these relations and model them mathematically in order to estimate the progress of PD during a two-year follow-up. We enrolled 51 PD patients who were assessed by three commonly used clinical scales. In addition, we quantified eight possible phonatory disorders in five vowels. To identify the relationship between baseline phonatory features and changes in clinical scores, we performed a partial correlation analysis. Finally, we trained XGBoost models to predict the changes in clinical scores during a two-year follow-up. Over the two years, the patients’ voices became more aperiodic, with increased microperturbations of frequency and amplitude. Next, the XGBoost models were able to predict changes in clinical scores with an error in the range of 11–26%. Although we identified some significant correlations between changes in phonatory features and clinical scores, they are less interpretable. This study suggests that it is possible to predict the progress of PD based on the acoustic analysis of phonation. Moreover, it recommends utilizing the sustained vowel /i/ instead of /a/.
  • Item
    Identification and Monitoring of Parkinson’s Disease Dysgraphia Based on Fractional-Order Derivatives of Online Handwriting
    (MDPI, 2019-01-11) Mucha, Ján; Mekyska, Jiří; Galáž, Zoltán; Faúndez Zanuy, Marcos; Lopez-de-Ipina, Karmele; Zvončák, Vojtěch; Kiska, Tomáš; Smékal, Zdeněk; Brabenec, Luboš; Rektorová, Irena
    Parkinson’s disease dysgraphia affects the majority of Parkinson’s disease (PD) patients and is the result of handwriting abnormalities mainly caused by motor dysfunctions. Several effective approaches to quantitative PD dysgraphia analysis, such as online handwriting processing, have been utilized. In this study, we aim to explore in depth the impact of advanced online handwriting parameterization based on fractional-order derivatives (FD) on PD dysgraphia diagnosis and monitoring. For this purpose, we used data from 33 PD patients and 36 healthy controls from the PaHaW (PD handwriting database). Partial correlation analysis (Spearman’s and Pearson’s) was performed to investigate the relationship between the newly designed features and patients’ clinical data. Next, the discrimination power of the FD features was evaluated by a binary classification analysis. Finally, regression models were trained to explore the new features’ ability to assess the progress and severity of PD. These results were compared to a baseline based on conventional online handwriting features. In comparison with the conventional parameters, the FD handwriting features correlated more significantly with the patients’ clinical characteristics and provided a more accurate assessment of PD severity (error around 12%). On the other hand, the highest classification accuracy (ACC = 97.14%) was obtained by the conventional parameters. The results of this study suggest that utilization of FD in combination with properly selected tasks (continuous and/or repetitive, such as the Archimedean spiral) could improve computerized PD severity assessment.
  • Item
    Gabor frames and deep scattering networks in audio processing
    (MDPI, 2019-09-26) Bammer, Roswitha; Dörfler, Monika; Harár, Pavol
    This paper introduces Gabor scattering, a feature extractor based on Gabor frames and Mallat's scattering transform. Using a simple signal model for audio signals, specific properties of Gabor scattering are studied. It is shown that, for each layer, specific invariances to certain signal characteristics occur. Furthermore, deformation stability of the coefficient vector generated by the feature extractor is derived by using a decoupling technique which exploits the contractivity of general scattering networks. Deformations are introduced as changes in spectral shape and frequency modulation. The theoretical results are illustrated by numerical examples and experiments. Evaluation on a synthetic and a "real" data set provides numerical evidence that the invariances encoded by the Gabor scattering transform lead to higher performance than using the Gabor transform alone, especially when few training samples are available.
  • Item
    Towards robust voice pathology detection: Investigation of supervised deep learning, gradient boosting, and anomaly detection approaches across four databases
    (Springer, 2020-10-02) Harár, Pavol; Galáž, Zoltán; Alonso-Hernandez, Jesus; Mekyska, Jiří; Burget, Radim; Smékal, Zdeněk
    Automatic, objective, non-invasive detection of pathological voice based on computerized analysis of acoustic signals can play an important role in early diagnosis, progression tracking, and even effective treatment of pathological voices. In search of such a robust voice pathology detection system, we investigated 3 distinct classifiers within supervised learning and anomaly detection paradigms. We conducted a set of experiments using a variety of input data such as raw waveforms, spectrograms, mel-frequency cepstral coefficients (MFCC), and conventional acoustic (dysphonic) features (AF). In comparison with previously published works, this article is the first to utilize a combination of 4 different databases comprising normophonic and pathological recordings of sustained phonation of the vowel /a/, unrestricted to a subset of vocal pathologies. Furthermore, to the best of our knowledge, this article is the first to explore gradient boosted trees and deep learning for this application. The following best classification performances, measured by F1 score on a dedicated test set, were achieved: XGBoost (0.733) using AF and MFCC, DenseNet (0.621) using MFCC, and Isolation Forest (0.610) using AF. Even though these results are of an exploratory character, the conducted experiments show the promising potential of gradient boosting and deep learning methods to robustly detect voice pathologies.
  • Item
    Enhancement of Conventional Beat Tracking System Using Teager–Kaiser Energy Operator
    (MDPI, 2020-01-04) Ištvánek, Matěj; Smékal, Zdeněk
    Beat detection systems are widely used in the music information retrieval (MIR) research field for the computation of tempo and beat time positions in audio signals. One of the most important parts of these systems is usually onset detection. There is an understandable tendency to employ the most accurate onset detector. However, there are options to increase the global tempo (GT) accuracy and also the detection accuracy of beat positions at the expense of less accurate onset detection. The aim of this study is to introduce an enhancement of a conventional beat detector. The enhancement is based on the Teager–Kaiser energy operator (TKEO), which pre-processes the input audio signal before the spectral flux calculation. The proposed approach is first evaluated in terms of its ability to estimate the GT and beat positions of given audio tracks, compared to the same conventional system without the proposed enhancement. The accuracy of the GT and average beat differences (ABD) estimation is tested on a manually labelled reference database. Finally, this system is used for analysis of a string quartet music database. Results suggest that the presence of the TKEO lowers onset detection accuracy but improves the accuracy of GT and ABD estimation. The average deviation from the reference GT in the reference database is 9.99 BPM (11.28%), an improvement over the conventional methodology, where the average deviation is 18.19 BPM (17.74%). This study has a pilot character and provides some suggestions for improving the beat tracking system for music analysis.
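
The last abstract above says that the TKEO pre-processes the audio signal before the spectral flux calculation, but it does not spell out the operator. As a minimal sketch, the standard discrete-time definition is psi[n] = x[n]^2 - x[n-1] * x[n+1]. The function name `tkeo`, the zero-padding of the two edge samples, and the test tone are assumptions made for illustration; they are not taken from the paper, and the authors' actual pre-processing chain may differ.

```python
import math


def tkeo(x):
    """Discrete Teager-Kaiser energy operator.

    psi[n] = x[n]^2 - x[n-1] * x[n+1] for interior samples.
    The first and last samples have no two-sided neighbourhood,
    so they are set to 0.0 here (an illustrative choice).
    """
    n = len(x)
    psi = [0.0] * n
    for i in range(1, n - 1):
        psi[i] = x[i] * x[i] - x[i - 1] * x[i + 1]
    return psi


# Illustrative input: a pure tone A * cos(omega * n).
amplitude, omega = 2.0, 0.3
signal = [amplitude * math.cos(omega * n) for n in range(50)]
energy = tkeo(signal)

# For a pure sinusoid the operator output is constant:
# psi[n] = A^2 * sin^2(omega).
expected = amplitude ** 2 * math.sin(omega) ** 2
```

For a pure tone the interior output equals A^2 * sin^2(omega), so the operator responds to both amplitude and frequency. That sensitivity to sudden changes in either is presumably what makes it a candidate pre-processing step ahead of onset detection.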