Ústav počítačové grafiky a multimédií (Department of Computer Graphics and Multimedia)
Recent Submissions
- Automatic 3D-Display-Friendly Scene Extraction from Video Sequences and Optimal Focusing Distance Identification
  (Springer Nature, 2024-02-16) Chlubna, Tomáš; Milet, Tomáš; Zemčík, Pavel
  This paper proposes a method for automatic detection of 3D-display-friendly scenes in video sequences. Manual selection of such scenes by a human user would be extremely time consuming and would require additional evaluation of the result on a 3D display. The input videos can be intentionally captured or taken from other sources, such as films. First, the input video is analyzed and the camera trajectory is estimated. The optimal frame sequence that follows defined rules, based on optical attributes of the display, is then extracted. This ensures the best visual quality and viewing comfort. The subsequent identification of a correct focusing distance is an important step in producing a sharp and artifact-free result on a 3D display. Two novel and equally efficient focus metrics for 3D displays are proposed and evaluated. Further scene enhancements are proposed to correct unsuitably captured video. Multiple image analysis approaches used in the proposal are compared in terms of both quality and time performance. The proposal is experimentally evaluated on a state-of-the-art 3D display by Looking Glass Factory and is also suitable for other multi-view devices. The problem of optimal scene detection, which includes input frame extraction, resampling, and focusing, has not been addressed in any previous research. Separate stages of the proposal were compared with existing methods, and the results show that the proposed scheme is optimal and cannot be replaced by other state-of-the-art approaches.
- Analysis and interpretation of joint source separation and sound event detection in domestic environments
  (PUBLIC LIBRARY SCIENCE, 2024-07-05) de Benito Gorron, Diego; Žmolíková, Kateřina; Torre Toledano, Doroteo
  In recent years, the relation between Sound Event Detection (SED) and Source Separation (SSep) has received growing interest, in particular with the aim of enhancing the performance of SED by leveraging the synergies between both tasks. In this paper, we present a detailed description of JSS (Joint Source Separation and Sound Event Detection), our joint-training scheme for SSep and SED, and we measure its performance in the DCASE Challenge for SED in domestic environments. Our experiments demonstrate that JSS can improve SED performance, in terms of Polyphonic Sound Detection Score (PSDS), even without additional training data. Additionally, we conduct a thorough analysis of JSS's effectiveness across different event classes and in scenarios with severe event overlap, where it is expected to yield further improvements. Furthermore, we introduce an objective measure to assess the diversity of event predictions across the estimated sources, shedding light on how different training strategies impact the separation of sound events. Finally, we provide graphical examples of the Source Separation and Sound Event Detection steps, aiming to facilitate the interpretation of the JSS methods.
- LMVSegRNN and Poseidon3D: Addressing Challenging Teeth Segmentation Cases in 3D Dental Surface Orthodontic Scans
  (MDPI, 2024-10-01) Kubík, Tibor; Španěl, Michal
  The segmentation of teeth in 3D dental scans is difficult due to variations in teeth shapes, misalignments, occlusions, or the presence of dental appliances. Existing methods consistently adhere to geometric representations, omitting the perceptual aspects of the inputs. In addition, current works often lack evaluation on anatomically complex cases due to the unavailability of such datasets. We present a projection-based approach towards accurate teeth segmentation that operates in a detect-and-segment manner locally on each tooth in a multi-view fashion. Information is spatially correlated via recurrent units. We show that a projection-based framework can precisely segment teeth in cases with anatomical anomalies with negligible information loss. It outperforms point-based, edge-based, and Graph Cut-based geometric approaches, achieving an average weighted IoU score of 0.97122 ± 0.038 and a Hausdorff distance at the 95th percentile of 0.49012 ± 0.571 mm. We also release Poseidon's Teeth 3D (Poseidon3D), a novel dataset of real orthodontic cases with various dental anomalies like teeth crowding and missing teeth.
- Orbis Pictus: Zpřístupnění netextových dat z digitálních knihoven [Orbis Pictus: Making Non-Textual Data from Digital Libraries Accessible]
  (Slovak Centre of Scientific and Technical Information, 2024-10-25) Lehečka, Dalibor; Jebavý, Filip; Kersch, Filip; Pavčík, Filip; Hrzinová, Jana; Fremrová, Květa; Kišš, Martin; Lhoták, Martin; Dvořáková, Martina; Bežová, Michaela; Hradiš, Michal; Žabička, Petr; Jiroušek, Václav
  Purpose - The project "Orbis Pictus - oživení knihy pro kulturní a kreativní odvětví" (Orbis Pictus: reviving the book for the cultural and creative sectors) aims to make accessible the non-textual content of Czech digital libraries, which, compared with textual data, is hard to reach and cannot be searched. This article gives an overview of the project's planned outputs, with emphasis on the key results achieved in the first two years. Methods - Making non-textual objects in digitized documents accessible can be divided into three tasks: detection, description, and retrieval. Identification, localization, and categorization of objects will be handled by the AnnoPage tool, which will enable the extraction of object descriptions and their storage in a standardized format. In later phases of the project, AnnoPage will be followed by the PeopleGator tool, which identifies persons in photographs or drawings and makes it possible to link documents depicting the same person and to build a database of identified persons. The project will conclude with a software solution integrating all the developed tools. Results - In the first two years of the project, a methodology for processing image documents was created. It describes how non-textual objects are detected, how they are divided into 25 categories, and how the information is recorded using international standards, thereby laying the foundation for the AnnoPage tool. Objects are detected with a detector trained on the project's own dataset. Detected objects are described by vector representations and textual descriptions. Originality/value - The project outputs will be integrated into the Czech Digital Library (Česká digitální knihovna), allowing the wide range of libraries aggregated by the platform to use the developed tools. Orbis Pictus is a unique project in the digital humanities thanks to its extensive collection of non-textual data. The results will find use not only in the identification of objects and metadata, but also in research and in the cultural and creative industries, where the accessible objects can serve as inspiration for marketing, education, gamification, or artificial intelligence.
- Exploring the benefits and challenges of AI-driven large language models in gastroenterology: Think out of the box
  (PALACKY UNIV, MEDICAL FAC, 2024-12-01) Král, Jan; Hradiš, Michal; Bužga, Marek; Kunovský, Lumír
  Artificial Intelligence (AI) has evolved significantly over the past decades, from its early concepts in the 1950s to the present era of deep learning and natural language processing. Advanced large language models (LLMs), such as Chatbot Generative Pre-Trained Transformer (ChatGPT), are trained to generate human-like text responses. This technology has the potential to revolutionize various aspects of gastroenterology, including diagnosis, treatment, education, and [...]. The benefits of using LLMs in gastroenterology could include accelerating diagnosis and treatment, providing personalized care, enhancing education and training, assisting in decision-making, and improving communication with patients. However, drawbacks and challenges such as limited AI capability, training on possibly biased data, data errors, security and privacy concerns, and implementation costs must be addressed to ensure the responsible and effective use of this technology. The future of LLMs in gastroenterology relies on the ability to process and analyse large amounts of data, identify patterns, and summarize information, and thus assist physicians in creating personalized treatment plans. As AI advances, LLMs will become more accurate and efficient, allowing for faster diagnosis and treatment of gastroenterological conditions. Ensuring effective collaboration between AI developers, healthcare professionals, and regulatory bodies is essential for the responsible and effective use of this technology. By finding the right balance between AI and human expertise and addressing the limitations and risks associated with its use, LLMs can play an increasingly significant role in gastroenterology, contributing to better patient care and supporting doctors in their work.