PÁLKA, P. Analýza a optimalizace klastrování embeddingů v diarizačním systému DiariZen [online]. Brno: Vysoké učení technické v Brně. Fakulta informačních technologií. 2025.

Reviews

Supervisor's review

Burget, Lukáš

In agreement with the student, we believe it would be unfortunate to proceed with the defense while the thesis remains in its unfinished state. I am confident that with additional time and effort, the work can be completed to a high standard and successfully defended during the August state exams.

Partial evaluation
Criterion | Grade | Points | Verbal evaluation
Information on the assignment | | | Petr actively collaborates with our Speech@FIT research group, and his diploma thesis is closely related to ongoing research and the development of our diarization system, DiariZen. The assignment required him to engage deeply with the system's second-stage speaker embedding clustering module, and to evaluate and integrate an alternative clustering approach. To do this, Petr first had to understand the fairly complex diarization pipeline, including modern neural network architectures for end-to-end diarization and speaker embedding extraction. Additionally, he needed to master the VBx method, which involves non-trivial variational inference in a Bayesian HMM framework. He successfully tackled all of these challenges. Through careful analysis of the clustering stage in our existing system, Petr managed to significantly improve its performance. The updated system now achieves state-of-the-art results on several challenging speaker diarization datasets, making this work highly relevant and valuable to the research community. I am very satisfied with the experimental results achieved. In this respect, the thesis fully meets, and in many ways exceeds, the original assignment.
Work with literature | | | Petr actively searched for and studied relevant literature, as well as the implementation details of the diarization system he was working on.
Activity during the work, consultations, communication | | | Petr regularly discussed his progress, ideas, and results, both during our weekly research group meetings and in individual discussions with me. He maintained consistent communication throughout the project and was always well prepared for consultations, which allowed for steady and focused development of the thesis.
Activity during completion | | | Unfortunately, the student significantly underestimated the time required for writing the thesis. As a result, the work was not completed in a timely manner, and there was insufficient opportunity to consult on and refine the final version. The submitted document feels unfinished and does not reflect the otherwise high quality of the technical work carried out during the project.
Publications, awards | | | Petr has co-authored two accepted publications, although they are not directly related to the topic of this thesis. However, a journal paper that includes results from this thesis is already in preparation, and I believe this work has strong potential to contribute to additional future publications. Despite the shortcomings in the written document, the underlying research is of high quality and relevance.
Proposed grade
F
Points
49

Opponent's review

Diez Sánchez, Mireia

This is a solid experimental work, which covers more than the proposed objectives and delivers a relevant set of results. The proposed modifications and extensive experimentation have led to a diarization system that achieves state-of-the-art results across several diarization datasets. Unfortunately, although the work is well structured, some of the chapters are obviously rushed or incomplete, and the report needs to be improved. I believe that a good version of the report can easily be submitted in the August term.

Partial evaluation
Criterion | Grade | Points | Verbal evaluation
Difficulty of the assignment | | | The thesis is demanding in the level of understanding it requires of non-trivial diarization theory and systems. The author also needs to work with a system pipeline that consists of several independent, complex modules. The work has been performed on a compound database, built out of 8 individual datasets, and analysis of the results is provided for the individual databases. Experimentation is quite extensive.
Extent to which the assignment requirements were fulfilled | | | The author has fulfilled all the objectives and extended the work further. Not only has AHC been replaced by VBx diarization, with extensive experimentation on several datasets, but variants of AHC have also been explored ("continuous clustering") and different window sizes have been considered to enhance the local diarization performance. Moreover, the findings on the implications of the "pyannote constraint" and the proposed alternative method ("just constraint" + reassignment of the DZE embeddings) are a very interesting extra contribution of this work.
Scope of the technical report | | | The report is lacking in some respects: there is a considerable imbalance in quality between the later chapters and the earlier ones. It seems some chapters were left unfinished, and they should be completed. As an example, Chapter 4, where the pipeline is described, should be improved. There are several typos in this chapter. Some of the descriptions given are confusing or incomplete. For example, in the "(re)assignment of global speaker labels", it is unclear, at this stage, what "constraint" means. Even if it is going to be better covered in the results section, an overview should be provided here. It is also unclear how the "stitching" works. A good description of Figure 4.3.b should be given. In Chapter 6, the explanations in Section 6.2 should be improved. In general, descriptions of figures and comments on the results tables should be extended.
Presentation level of the technical report | | 40 | The work is well organized and provides a good introduction to the theory, diarization systems, evaluation metrics and databases used. As mentioned above, the comprehensibility of the work could be improved by extending some of the later chapters: the description of the pipeline and the comments on the results.
Formal quality of the technical report | | 75 | The work is written in good English, although several typos can be found in some of the chapters.
Work with literature | | 90 | The literature, systems and databases are in general well referenced.
Implementation output | | 90 | The work is mainly experimental, although some extensions of the code have been implemented. The code seems well organized. I do miss, though, a clear statement of which parts are extensions and which are part of the original (baseline) code. The thesis contains relevant experimental insight into the clustering stage of the DiariZen pipeline. It studies the integration of VBx as an alternative to AHC and proposes an alternative implementation of the clustering method of the popular pyannote diarization tool. The system has been evaluated across several standard diarization databases.
Usability of the results | | | The presented modifications can be easily introduced into the mentioned state-of-the-art systems to improve their performance.
Proposed grade
F
Points
40

eVSKP id 164934