MAGERKOVÁ, T. Development of Automated Emotion Recognition System through Voice using Python [online]. Brno: Vysoké učení technické v Brně. Fakulta informačních technologií. 2024.

Reviews

Supervisor's review

Hussain, Yasir

Partial evaluation
Criterion | Grade | Points | Verbal evaluation
Assignment information | | | The bachelor's thesis, titled "Development of Automated Emotion Recognition System through Voice using Python," focuses on designing and implementing a deep learning model for speech emotion recognition (SER). The project was challenging due to the complexity of accurately identifying emotions from voice data and the need to create a robust and generalizable model. The thesis built on existing methods in the field, using datasets such as RAVDESS and EmoDB for training and testing. The model achieved notable accuracy, and the results were satisfactory given the project's scope. However, there is room for improvement, especially in cross-corpus validation and in detecting mixed emotions, both of which are limitations of the current implementation.
Work with literature | | | The student was diligent in obtaining and using study materials and conducted a comprehensive literature review of existing emotion recognition techniques and relevant datasets. The sources included academic journals, previous studies, and publicly available datasets such as RAVDESS and EmoDB. This extensive background research was crucial for successfully designing and implementing the emotion recognition model.
Activity during the work, consultations, communication | | | The student was highly active and responsive throughout the project. Although frequent face-to-face meetings were not possible, regular online meetings ensured continuous discussion and feedback. The student was well prepared for consultations, presenting progress and addressing shortcomings in the model. The agreed deadlines were consistently met, and the student effectively incorporated the supervisor's suggestions for improving the system design and implementation.
Activity during completion | | | The work was completed according to the planned schedule, with ample time for final review and consultation. The student ensured that the final content was thoroughly discussed with the supervisor, incorporating feedback and making the adjustments needed to improve the quality and accuracy of the thesis. This consultation took place last week, but the SER-related literature review is still missing; it has not been done properly.
Publications, awards | | | No awards or formal publications were associated with this work at the time of submission. However, the student plans to write a conference paper based on the thesis for future publication.
Proposed grade
B
Points
84

Opponent's review

Malik, Aamir Saeed

Overall, the work presented in the thesis meets the requirements for a bachelor's thesis. The main shortcoming of the thesis is the literature review related to SER. This also limited the results section, as no comparison was provided with existing SER methods. However, the student did study the problem at hand, proposed a model for it, implemented the model, and reported the results with a sufficient number of quality metrics.

Partial evaluation
Criterion | Grade | Points | Verbal evaluation
Difficulty of the assignment | | | It is a difficult project because it requires not only knowledge of signal processing and machine learning but also the incorporation of psychological concepts of human emotions. The challenge of recognizing different emotions from voice, irrespective of age and gender, makes it a difficult project.
Extent to which the assignment requirements were met | | | The main shortcoming of the thesis is the literature review related to SER (Speech Emotion Recognition). Since the student did not provide details of the SER-relevant articles in the thesis, it was not possible for the student to provide a comparison of the results for the proposed method. Overall, it is satisfactory because the student did study the problem, conducted a shallow literature review, proposed a model for the problem, implemented it, and provided the results with a good number of quality metrics.
Length of the technical report | | | The thesis does not meet the minimum requirement of 40 standard pages. It falls short by a few pages. Although it is just shy of the requirement, the student could easily have expanded the thesis to 40 pages by including a literature review relevant to SER, a comparison of the results with existing SER methods, and a separate chapter on conclusions and future work. This information is lacking in the thesis. The student may have studied it, but this is not reflected in the thesis.
Presentation quality of the technical report | | 70 | Overall, the structure of the thesis, in terms of chapter organization, is satisfactory. Chapter 1 provides the introduction, followed by a chapter on speech emotion recognition that discusses the theory of emotions as well as sound, its features, and classification. Chapters 3 and 4 provide details of the proposed method and its implementation. Results are discussed in Chapter 5. One minor observation is that the conclusions, future work, and limitations of this work are included only as a subsection of the last chapter; they should have formed a separate chapter. The major observation is the missing flow between the sections within the chapters, which makes the thesis harder to read.
Formal layout of the technical report | | 80 | In terms of language, the thesis is readable and the language is satisfactory. There are minor typos and grammatical mistakes, but they are few, and in general the thesis is fine. In addition, many tables and figures are not referenced in the text. Also, figures and tables should have been placed within the relevant sections and subsections.
Work with literature | | 70 | The student has provided a good overview of the existing public datasets, features, and classifiers. However, a literature review specifically related to SER is missing. Also missing is a critical analysis highlighting the limitations and gaps in the existing work related to SER. This is the major weakness of this chapter. Further, the chapter ends abruptly without any conclusions.
Implementation output | | 80 | The student has provided clear details of the proposed methodology. The feature extraction, feature selection, and the corresponding use of a deep learning model are explained satisfactorily. However, the rationale for using a CNN-based model is not clear. The corresponding implementation is also described clearly. The description of the results is also satisfactory. However, the student did not compare the results with existing SER methods, and this is the major weakness of the thesis. Overall, the proposed model, its implementation, training, and validation are satisfactory for a bachelor's thesis.
Usability of results | | | This thesis involves the application of a deep learning model to SER (Speech Emotion Recognition). The student did not provide a comparison with the state of the art in SER. Hence, it is not possible to assess whether the results could be used in practice, though this is possible in general for such a thesis.
Proposed grade
C
Points
75

eVSKP id 153453