Reviews of the final qualification thesis

Overall, this is an interesting and important piece of work, well conducted and analysed. The student was very punctual in consultations and showed discipline and interest throughout the work. My only minor concern was regarding the literature and some parts of the writing in the thesis, which could have been explained in much simpler, more straightforward language.

Partial evaluation
Criterion	Grade	Points	Verbal evaluation
Assignment information			The thesis work is mainly an industry-oriented application. I rate the work as moderately difficult because it requires the use of state-of-the-art LLMs for a real-world problem - automatic analysis of user-agent conversations. The thesis work has met all the requirements specified in the assignment. In addition, the student has managed to submit a conference paper to SIGDial 2025.
Work with literature			The student has worked fairly well with the literature, although I would have preferred a little more effort in this regard.
Activity during the work, consultations, communication			The student was very punctual in consultations and kept track of the meetings and activities to be done.
Activity during completion			The work was completed well within the allotted time.
Publications, awards			The student has submitted a conference paper to SIGDial 2025. The implementations of the annotation template and other components were submitted as part of the thesis.

Opponent's review

Sedláček, Šimon

The thesis presents a methodology for both applying and analyzing the usage of LLMs for user/agent conversation analysis in the medical spa domain. The work has some problems in terms of the presentation, but the presented methodology and experiments are sound, reasonably rigorous, and applicable to new domains.

Partial evaluation
Criterion	Points	Verbal evaluation
Assignment difficulty		This assignment could be categorized as more difficult, since it requires a thorough understanding of the challenges that come with applying modern LLMs to structured evaluation tasks, their shortcomings in this regard, and how one can compensate for them and address them in the evaluations.
Extent of fulfillment of assignment requirements		The assignment was completed in full.
Length of the technical report		The thesis is of standard length.
Presentation level of the technical report	68	Overall, though the thesis presents a sound methodology and rigorous experiments, there are some problems with the presentation that unfortunately drag it down. First off, I find it commendable that the thesis does not needlessly spend time going through theory. The thesis provides the necessary references to prior and related work in the introduction, nicely setting the stage for the rest of the text, which, starting from chapter 2, is devoted wholly to the student's original work. At the same time, this is where the references to prior work end; throughout the rest of the thesis, no further connections to related work are made. Making such connections would, in my view, make the text clearer and help the reader better understand the choices and steps taken when developing the presented methodology. Overall, the presentation of the thesis is systematic, and the chapters and sections have a good logical flow. However, one writing element is used throughout the whole text that unfortunately makes certain passages and even whole chapters very difficult to digest and breaks the flow of reading: the author's choice of using itemized lists for almost every presented concept in the thesis. In chapter 2 alone, these lists make up all but two sections, each following the same pattern of a brief introduction, then the list itself (with each item further explained), then two or three closing sentences. The problem is twofold. First, most of the lists could simply be converted to plain text paragraphs, as not every concept has to have its "characteristics" listed and explicitly named; this only puts strain on the reader, as such categorizations are not very natural to read and think about. Second, the language used in these itemizations is often too complex even when describing very simple concepts, making the lists rich in words but sparse in useful information, which further impedes clarity. It is not clear whether the use of lists was a deliberate stylistic choice by the student or an artifact of using AI tools (as declared by the student) for editing, though I suspect the latter. In contrast, chapter 4, which discusses the experimental results, is much easier to read and digest, as it mostly breaks away from the list paradigm. Chapter 5 is unfortunately again very difficult to read: the LLM error taxonomy presented there (along with the category names) is probably easier to understand with all the insights that the author gathered during writing, but for the reader it is very difficult to follow. In my opinion, the chapter would benefit from a more careful introduction and a more systematic approach to defining the individual categories, perhaps with some visualizations, which would allow the reader to better follow the author's thought process.
Formal quality of the technical report	77	The thesis is generally written in very good English. The student declares the use of AI tools for grammar correction and paraphrasing, though one can still find a few mistakes here and there (such as missing articles). The language unfortunately also gets needlessly complex at times, possibly due to the edits made by the AI tools, which impedes clarity. The thesis has some typographical errors that should have been addressed. Figures 3.1 and 3.3 are not referenced from the text. There are some trailing single characters at the end of lines, and for some reason, from page 24 to the end of the thesis, the font size is smaller than the default. Also, all footnote links to external resources are only usable in the PDF, as they do not show the target URL, only the name tag.
Work with literature	78	The references cited in the thesis are relevant overall, and the student makes sure to cite the actual publications (proceedings, journals, not just arXiv preprints), which is commendable. Most of the cited papers also have the date of citation in the bibliography, which is not necessary as those are not online sources. I also have to reiterate what was written above: more related works should be cited throughout the thesis, not just in the introduction. Many of the techniques, methods and categorizations presented in the thesis have likely already been explored to various extents in other works, and it would be of great value to have a clear distinction of what the thesis builds on and what it improves, so that direct comparisons could be made.
Implementation output	95	The methodology and experiments conducted in this work are sound and yield interesting results that are analyzed and reflected on in a systematic way; only minor objections could be raised. The delivered code is structured and well documented.
Usability of results		Even though this work was done on proprietary data, the presented framework is generic and could be applied to analyzing user/agent conversations from virtually any domain, provided that there is some expert supervision. The results provide interesting insights into how LLMs can be used to automate key tasks in the analysis and where they might fail, while proposing potential points of improvement for future work. The work was also submitted as a paper to SIGDIAL 25 and is currently under review.

Reviews

Supervisor's review

Kesiraju, Santosh