KOČÍ, J. Automatické Ověřování Pravdivosti Dokumentů [online]. Brno: Vysoké učení technické v Brně. Fakulta informačních technologií. 2023.
The student worked thoroughly and continuously, respected deadlines, and even experimented with source reliability estimation, which went beyond the scope of the assignment. I therefore propose an A grade. There is minor room for improvement in the use of more recent NLP-oriented explainability methods, or in building an interpretable system that tries to explain its own decisions.
Criterion | Grade | Points | Verbal evaluation |
---|---|---|---|
Assignment information | | | The assignment was of average difficulty. It is a well-known result that, despite their simplicity, content-based classifiers achieve strong results in fact-checking and fake-news detection scenarios because they exploit simple textual cues. In this work, we decided to study this phenomenon while focusing on documents with less trivial lexical cues (selected either by domain or through filtering methods). The work complements retrieval-based, document-grounded fact-checking systems, such as the one we developed for the MASAPI project. |
Work with literature | | | The student actively studied the recommended literature. He also surveyed the available datasets and interpretability methods, together with the related literature. |
Activity during the work, consultations, communication | | | The student discussed his progress on the thesis weekly, either in person or through text reports. A few weeks were missed, but he kept up his working tempo and respected the planned deadlines. |
Activity during completion | | | The work was finished in advance and read by the supervisor. We iterated over some formal concepts several times to avoid introducing confusion. |
Publications, awards | | | The student collected his own small test set of fake news from many domains, including fake text generated by large language models, and covering domains such as football, where fake news is unlikely to occur. This test set will be released publicly. |
The thesis fulfills the assignment and includes a significant extension; I therefore propose grading it with an A.
Criterion | Grade | Points | Verbal evaluation |
---|---|---|---|
Assignment difficulty | | | I consider the assignment moderately difficult because it requires an understanding and evaluation of advanced natural language processing techniques. |
Extent to which the assignment requirements were fulfilled | | | The author completed every point of the assignment. In addition, he proposed and evaluated approaches to predicting the credibility of sources (Chapter 8), which goes beyond the scope of the assignment. |
Extent of the technical report | | | |
Presentation level of the technical report | | 85 | The work is logically structured into chapters and is understandable to the reader. However, I would appreciate a more comprehensive overview of state-of-the-art content-based fact-checking systems, as their description is vague. |
Formal aspects of the technical report | | 90 | The work contains several typographical and linguistic errors, especially missing spaces or redundant punctuation around references in the text. |
Work with literature | | 85 | The bibliography contains 50 items, mostly relevant scientific articles. Unfortunately, some references lack important information (conference name, journal, link, ...) [4, 6, 7, ..] and are not in a uniform style. |
Implementation output | | 95 | The author collected a dataset and conducted a comprehensive case study of two selected baseline models. In addition, the author proposed two simple approaches that use these models to predict the credibility of sources. The prediction is based solely on the style in which the sources' articles are written, without any extra information. |
Usability of results | | | The work forms a good basis for future research, especially in source credibility prediction (Chapter 8). However, the thesis contains only initial experiments, and much work remains before the results can be published. |
eVSKP id 144936