GALBA, Š. Emotion Recognition from Analysis of a Person’s Speech using Deep Learning [online]. Brno: Vysoké učení technické v Brně. Fakulta informačních technologií. 2024.

Reviews

Supervisor's review

Malik, Aamir Saeed

The student has completed all the objectives specified in the project description. He has developed a deep learning-based approach to the speech emotion recognition (SER) problem. He has tested it on three different datasets, and the results are promising. The main weakness of the thesis is the lack of a thorough comparison with other methods. The work is publishable as a journal article, and the student has agreed to continue working on it and to provide the details needed for writing the paper.

Partial evaluation
Criterion Grade Points Verbal evaluation
Assignment information The thesis addresses the detection of human emotions from speech signals using a deep learning approach. The work was challenging because it required not only knowledge of signal processing and machine learning but also an understanding of emotions from the perspective of human psychology. Recognizing different emotions from voice, irrespective of age and gender, makes this a difficult project. The student optimized a deep learning model to achieve this objective and tested it on three different databases. The results are good, and the reported accuracy is above 90%.
Use of literature The student has done a good literature review. Chapter 2 discusses various emotion theories and emotion models and also introduces audio signal processing. Chapter 3 starts with a description of the various Speech Emotion Recognition (SER) datasets, with Table 3.1 summarizing the available datasets. It then discusses traditional machine learning techniques for solving the SER problem, followed by deep learning techniques. It is a well-written chapter with a detailed description of the published work. In addition, the student has provided clear tables at the end of every section summarizing the important articles and their limitations.
Activity during the work, consultations, communication The student had a very positive attitude and started working on the topic in the winter semester of 2023. I held weekly meetings with the student, and he attended them regularly. He worked hard, and I found him to be focused. He was generally prepared for the meetings.
Activity during completion The student regularly consulted me before submitting the thesis. The work was challenging because three datasets were involved. The student completed all the development in time and submitted the thesis within the given timeframe.
Publications, awards The thesis is well written and contains publishable content. The student is willing to assist in preparing the draft of a journal paper. The paper is being prepared for submission to IEEE Transactions on Affective Computing.
Proposed grade
B
Points
88

Opponent's review

Kekely, Lukáš

The student tackled a complex assignment and achieved promising results. He designed and implemented his own interesting extension to existing models. Based on his initial evaluation, the extension seems to improve the precision of mood recognition from speech. Overall, a very good (B) achievement for a master's thesis.

Partial evaluation
Criterion Grade Points Verbal evaluation
Assignment difficulty The assignment deals with a rather complex topic. Understanding current deep learning methods for speech processing to such an extent that the precision of mood/emotion recognition can be improved requires broad insight into the field. Practical training and testing of these models also require a lot of computational time and even the use of supercomputers. This adds another layer of complexity.
Extent of fulfilment of the assignment requirements The author achieved all of the stated objectives. There is a slight deviation from point 5, as the student repurposed and fine-tuned an existing model rather than creating his own from scratch. This deviation is fully within the thesis's main goal and seems to actually benefit the overall results. Furthermore, I would highlight the evaluation required by point 6. The presented scope goes beyond the typical level of a master's thesis.
Length of the technical report The technical report is on the longer side of the required/standard length.
Presentation quality of the technical report 85 The author provides a thorough overview of the technical background and a clear description of his own contributions built on top of it. The text contains only relevant information at a reasonable level of detail. The continuity of the chapters is easy to follow, and the reader is guided to the logical conclusions made by the author. However, I would suggest longer Introduction and Conclusion chapters to better highlight the purpose, structure, and results of the thesis. Similarly, some sections in the Implementation and Results chapters seem to have been finished in haste, with unnecessary brevity. More detailed and careful descriptions could be beneficial here.
Formal quality of the technical report 80 The technical report is written entirely in English of sufficient quality. The overall readability and flow of the text could be improved by less frequent use of sub-headers and paragraph headers (on some pages there is a new header every few lines). An additional spell-check and/or proofreading pass would also help, especially in the second half. Typographically, the text is well formatted; only minor imperfections can be pointed out (single-word lines, i.e. runts, small font size in figure and graph labels, etc.).
Use of literature 95 The author uses only relevant primary sources. The number of referenced works is above the standard for a master's thesis.
Implementation output 85 The implementation is realised as an extension of existing models. The code structure and formatting seem reasonable; however, comments are rather rare. At least a README.md file is provided by the author. It contains instructions for installation, execution, and testing.
Usability of results The work extends existing deep learning models. It shows their usability for voice recognition and enhances their classification precision. With some additional testing and evaluation, the achieved results might be presentable as a poster at a relevant international conference.
Proposed grade
B
Points
87

Questions

eVSKP id 153400