TEMBO, M. Transformer Neural Networks for Natural Language Processing [online]. Brno: Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií. 2025.
The thesis explores the use of retrieval-augmented generation (RAG) to improve the alignment of large language models with technical documentation, addressing the issue of hallucinated command syntax. The student proposed a modular architecture and developed a pre-processing and annotation framework to convert network configuration manuals into semantically segmented, retrievable units. The system was evaluated on tasks such as question answering and code generation, where it showed improved factual consistency compared to non-retrieval baselines. While the technical implementation is solid and the concept relevant, the work remains at an initial proof-of-concept stage. The scope of experimentation is somewhat limited, the annotation pipeline requires significant manual effort, and the proposed solution was verified on only a few tasks. Additionally, more critical reflection on the system's generalizability and deployment challenges would have strengthened the analysis. Overall, the thesis demonstrates good understanding and effort, though there is room for deeper exploration and refinement. The student regularly consulted with supervisors throughout the development process and incorporated feedback effectively. From a stylistic perspective, the thesis is clearly written, with only minor formal inconsistencies. The literature review is sufficient, though some references are outdated and there is limited engagement with prior or similar approaches, which somewhat weakens the contextual grounding of the work. In summary, the thesis is methodologically sound and makes a meaningful contribution to the field of applied NLP in documentation-constrained environments. Final Grade: Good (C)
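For illustration only, the retrieval step that the pre-processing framework feeds can be sketched roughly as follows; the chunking routine, bag-of-words similarity, and sample manual text are hypothetical stand-ins and do not correspond to the student's actual implementation.

    # Illustrative sketch of the retrieval step in a RAG pipeline for network
    # configuration manuals. chunk_manual, retrieve, and the sample text are
    # hypothetical; a real system would use a trained embedding model and a
    # vector store instead of this bag-of-words similarity.
    import math
    import re
    from collections import Counter

    def chunk_manual(text: str, max_words: int = 80) -> list[str]:
        """Split a manual into small, self-contained units
        (here simply by paragraph, capped at max_words per chunk)."""
        chunks = []
        for para in re.split(r"\n\s*\n", text.strip()):
            words = para.split()
            for i in range(0, len(words), max_words):
                chunks.append(" ".join(words[i:i + max_words]))
        return chunks

    def _vector(text: str) -> Counter:
        return Counter(re.findall(r"[a-z0-9\-]+", text.lower()))

    def _cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[t] * b[t] for t in a)
        norm = (math.sqrt(sum(v * v for v in a.values()))
                * math.sqrt(sum(v * v for v in b.values())))
        return dot / norm if norm else 0.0

    def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
        """Return the k chunks most similar to the query; these would be
        prepended to the LLM prompt to ground the generated configuration."""
        q = _vector(query)
        return sorted(chunks, key=lambda c: _cosine(q, _vector(c)), reverse=True)[:k]

    manual = """
    Interface configuration. Use 'interface Ethernet1' to enter interface mode,
    then 'switchport access vlan 10' to assign the access VLAN.

    BGP configuration. Use 'router bgp 65001' to enter BGP mode and
    'neighbor 10.0.0.1 remote-as 65002' to define a peer.
    """

    chunks = chunk_manual(manual)
    print(retrieve("how do I assign an access vlan to an interface", chunks, k=1))

In a production pipeline the bag-of-words similarity would be replaced by dense embeddings and a vector index, but the contract of chunk_manual and retrieve would stay the same.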
The thesis focuses on practical applications of Large Language Models (LLMs) to network engineering tasks. Specifically, it addresses the problem of enhancing LLM output quality for configuring Arista EOS network devices using Retrieval-Augmented Generation (RAG). The work convincingly demonstrates improved performance and reduced hallucination through a novel preprocessing approach applied to the input data, evaluated within a real-world networking context, which is a significant strength highlighting its practical utility. The presented results clearly showcase the benefits of RAG in this domain, offering a promising solution for automating network management tasks. However, while technically sound, the textual part would benefit from a greater focus on the main objective of the thesis, which should be articulated more clearly, especially in the first two chapters. The introduction would be more impactful if it prioritized an in-depth description of the problem addressed in the thesis rather than diving into general NLP. Furthermore, Chapter 2's historical overview of NLP feels somewhat disconnected from the core transformer-based implementation; a more focused discussion of the transformer architecture and RAG would better bridge the theoretical and practical parts of the work. Additionally, several sections, such as 2.3 (Deep Learning), lack sufficient literature citations considering the depth of the deep learning field, and Chapter 3 entirely lacks references to existing state-of-the-art solutions for similar problems. Finally, the evaluation pipeline feels limited by its reliance on a single LLM + RAG configuration (GPT-4 + RAG). Expanding the evaluation to include more LLMs (e.g., recent open-source alternatives) and comparing RAG performance across different models would provide a more robust assessment of the proposed method.
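To make the latter suggestion concrete, a model-agnostic evaluation harness along the following lines could compare answers with and without retrieved context across several backends; the stand-in backend, task format, and exact-match metric below are illustrative assumptions only and are not part of the thesis.

    # Sketch of a model-agnostic evaluation loop comparing answers with and
    # without retrieved context across several LLM backends. The backend here
    # is a stand-in callable; in practice each backend would wrap an API or a
    # local model (e.g. GPT-4 or an open-source instruction-tuned model), and
    # exact match on the expected command is only one possible metric of
    # factual consistency.
    from typing import Callable, Dict, List, Tuple

    Backend = Callable[[str], str]  # prompt -> generated answer

    def exact_match(answer: str, expected: str) -> bool:
        return expected.strip().lower() in answer.strip().lower()

    def evaluate(backends: Dict[str, Backend],
                 tasks: List[Tuple[str, str, str]]) -> Dict[str, Dict[str, float]]:
        """tasks: (question, retrieved_context, expected_command)."""
        scores: Dict[str, Dict[str, float]] = {}
        for name, generate in backends.items():
            hits = {"no_rag": 0, "rag": 0}
            for question, context, expected in tasks:
                if exact_match(generate(question), expected):
                    hits["no_rag"] += 1
                prompt = f"Context:\n{context}\n\nQuestion: {question}"
                if exact_match(generate(prompt), expected):
                    hits["rag"] += 1
            scores[name] = {k: v / len(tasks) for k, v in hits.items()}
        return scores

    # Minimal stand-in backend for demonstration only: it "answers" correctly
    # only when the relevant command already appears in the prompt.
    def toy_model(prompt: str) -> str:
        return "switchport access vlan 10" if "switchport" in prompt else "vlan set 10"

    tasks = [("How do I assign access VLAN 10 to an interface on Arista EOS?",
              "Use 'switchport access vlan 10' under the interface.",
              "switchport access vlan 10")]

    print(evaluate({"toy-model": toy_model}, tasks))

Keeping generation behind a simple callable interface would let the same task set and metric be reused unchanged for GPT-4 and open-source alternatives alike, which is what a broader comparison of RAG gains across models would require.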
eVSKP id 168320