HRKĽOVÁ, Z. Odstraňování reklam z videosekvencí [online]. Brno: Vysoké učení technické v Brně. Fakulta informačních technologií. 2025.
Overall, this is a thesis that fulfilled the assignment in both its written part and its implementation part. The assignment offered room for creative development, and that room was not used to any great extent, but the implementation and the written part are nevertheless both of a solid standard. I therefore award the grade "good" (dobře).
| Criterion | Grade | Points | Verbal evaluation |
|---|---|---|---|
| Assignment information | | | The diploma thesis assignment was moderately difficult; beyond the standard curriculum, the student had to acquire knowledge of video sequence processing and commercial detection. The implementation output itself was not that difficult, but it offered room for creative development. The student made only partial use of this and ultimately implemented two commercial detection algorithms, which are by and large functional. The assignment was therefore fulfilled, but the student could have implemented more methods and extended them creatively. |
| Work with literature | | | The student used the recommended study sources and also actively and creatively sought out literature beyond the scope of the assignment. |
| Activity during the work, consultations, communication | | | During the work the student was intermittently active. She attended consultations, although not entirely regularly, but she kept the agreed deadlines and came to consultations prepared. |
| Activity during completion | | | Owing to various problems, the thesis was completed rather late, and the student also requested an extension of the submission deadline. Nevertheless, it was possible to consult on the content of the thesis and to make corrections to it. |
| Publications, awards | | | - |
Based on careful study, the author designed and developed a system for detecting commercials in TV broadcasts using image and audio information. In the theoretical part, the author discusses several topics at length, but the knowledge gained is not directly used in the solution. Conversely, the methods that are actually used are not studied or understood in depth. The resulting implementation makes heavy use of external solutions and tools, and the author's own creative contribution is rather small. The author explains well why the methods were chosen, but the methods are applied without a deeper understanding. A diploma thesis should go into depth in the chosen area and demonstrate a good understanding of it. The choice of topics studied, the way the solution is implemented, and the evaluation all indicate a rather superficial understanding of image processing and machine learning.
| Criterion | Grade | Points | Verbal evaluation |
|---|---|---|---|
| Difficulty of the assignment | | | The assignment is inherently quite difficult, but the difficulty depends on the methods and the particular functionality chosen. The work is carried out using common procedures. More complex procedures are used only as applied tools, without a deeper or more creative approach. |
| Extent to which the assignment requirements were met | | | |
| Scope of the technical report | | | The scope is at the minimum, which in general need not be a bad thing; there is no need to write unnecessarily long treatises. The problem with this thesis is that the theory covers few topics relevant to the actual solution and neglects to properly study the topics that are then used in the solution, e.g. machine learning or the YOLO model. Figure 6.2 is somewhat redundant; Figure 6.4 would have sufficed. Although the author repeatedly claims in the theoretical part that the knowledge will be used, this is not the case. An extreme example of this discrepancy is that she describes the sampling of audio signals in detail (Section 2.4), but in the implementation simply calls ffmpeg with the silencedetect filter. |
| Presentation quality of the technical report | | 65 | The text has a logical structure and is written in a clear and understandable way. The theoretical part is well prepared, but in the implementation part the studied information is used only at the level of function calls. The essential information about the author's specific solution does not become clear until Chapter 6. Until then, the text only repeats the general claim that the system should be robust and general, and the reader still does not know how the author intends to achieve this. The YOLO model and key information on its use, retraining, etc., are not mentioned in the theory or elsewhere. Only in Section 6.1 is the model described, extremely briefly, as being retrained and used. There is no description of the parameters, no information on the training set, etc. Unexplained magic constants are used for monochrome frame detection, for silent frame detection and in the classification part. The methods in Eq. 3.1 and 3.3 are the same. The fingerprinting, template matching, classification and clustering methods are described only in very general terms. The method names FLANN, SURF, BIC, DTW and EEUPC appear but are not explained at all. |
| Formal quality of the technical report | | 85 | The text is written in good-quality English; it is clear and typographically very good. Figure 2.4 has an inappropriately low resolution, and although the image is reproduced, it would be useful to replace the text in it with a vector rendering. Similarly, Figs. 6.1-2 and 6.4 should be in vector, not raster, format. |
| Work with literature | | 85 | The thesis draws on an extensive list of high-quality professional sources, mostly scientific publications. The general discussion of works and methods for commercial content detection in TV broadcasting is relevant and of high quality. |
| Implementation output | | 55 | The solution is based on extensive use of external tools. It reads the images using OpenCV, detects the TV channel logo with the YOLO model, and detects monochrome images. The results are exported to a txt file. Using the ffmpeg tool, it extracts silent frames to a txt file. It then reads everything back from the txt files and classifies segments as commercials when 2/3 of the detections are positive in a sliding window with hysteresis (with magic constants). The way the partial detectors are combined within segments, and the way their outputs are preprocessed, is confusing and unmethodical. The implementation is small, and the parts written by the author herself are limited in scope. The source code has a logical directory structure and is well commented. The author created custom datasets for retraining the YOLO model (Image Dataset) and for validating the classification (Video Dataset). The experiments do not use standard metrics (precision, recall). In places the discussion of results presents absolute values instead of percentages, and without knowing the extent of the validation dataset (in seconds), the relevant metrics cannot be calculated. It is not clear why the author presents results for each video separately (Tables 7.1-3) and why she then does not analyse the characteristics of hard samples more thoroughly and summarise her findings, e.g. by video genre or by problematic samples. Graphs 7.1-2 would benefit from a much better and more professional interpretation. The author concludes that 75% precision is promising for such lightweight unsupervised methods. This cannot be refuted, but neither can it be confirmed, because the author does not provide a comparison with any existing method. |
| Usability of results | | | Without a proper evaluation, it is difficult to assess the quality of the proposed solution. |
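
To illustrate the kind of evaluation the review finds missing, the following is a minimal sketch of how precision and recall could be computed from per-second labels. It is not taken from the thesis: the function name `precision_recall`, the one-label-per-second granularity and the toy data are assumptions made purely for illustration.

```python
from typing import Sequence, Tuple


def precision_recall(predicted: Sequence[bool], truth: Sequence[bool]) -> Tuple[float, float]:
    """Compute precision and recall for the 'commercial' class.

    Both sequences hold one label per second of video (an assumed
    granularity): True means 'commercial', False means 'programme'.
    """
    if len(predicted) != len(truth):
        raise ValueError("label sequences must cover the same number of seconds")

    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)        # correctly detected commercial seconds
    fp = sum(1 for p, t in pairs if p and not t)    # programme seconds flagged as commercial
    fn = sum(1 for p, t in pairs if not p and t)    # commercial seconds that were missed

    # Fall back to 0.0 when a denominator is zero (no detections / no commercials).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: 10 seconds of video, commercial at seconds 2-5.
truth = [False, False, True, True, True, True, False, False, False, False]
predicted = [False, True, True, True, True, False, False, False, False, False]

precision, recall = precision_recall(predicted, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With one label per second, the true-positive, false-positive and false-negative counts are durations in seconds, which is why absolute values alone are hard to interpret when the total length of the validation dataset is not reported.
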
eVSKP id 161460