A Multi-Dimensional DNS Domain Intelligence Dataset for Cybersecurity Research

dc.contributor.authorHranický, Radekcs
dc.contributor.authorOndryáš, Ondřejcs
dc.contributor.authorHorák, Adamcs
dc.contributor.authorPouč, Petrcs
dc.contributor.authorJeřábek, Kamilcs
dc.contributor.authorEbert, Tomášcs
dc.contributor.authorPolišenský, Jancs
dc.coverage.issueOctobercs
dc.coverage.volume62cs
dc.date.accessioned2025-10-30T17:05:14Z
dc.date.available2025-10-30T17:05:14Z
dc.date.issued2026-01-01cs
dc.description.abstractThe escalating sophistication and frequency of cyber threats require advanced solutions in cybersecurity research. Particularly, phishing and malware detection have become increasingly reliant on data-driven approaches. This paper presents a unique dataset precisely curated to bolster research in network security, focusing on the classification and analysis of internet domains. This dataset contains information for over a million internet domains with detailed labels distinguishing between phishing, malware, and benign traffic. Our dataset is distinctive due to its comprehensive compilation of metainformation derived from multiple sources, including DNS records, TLS handshakes and certificates, WHOIS and RDAP services, IP-related data, and geolocation details. Such rich, multi-dimensional data allows for a deeper analysis and understanding of domain characteristics that are critical in identifying and categorizing cyber threats. The integration of information from diverse sources enhances the dataset's utility, providing a holistic view of each domain's footprint and its potential security implications. The data is formatted in JSON, ensuring versatility, accessibility for researchers, and easy integration into various analytical tools and platforms, facilitating ease of use in statistical analysis, machine learning, and other computational analyses. Our dataset's extensive volume and variety surpass any known publicly available resources in this field, making it an invaluable asset for both academic and practical development and testing of cybersecurity solutions. This paper thoroughly describes the value of the data, details the comprehensive methodology employed in the collection process, and provides a clear description of the data structure. Such documentation is crucial for ensuring that the dataset can be effectively utilized and reapplied in a variety of research contexts. 
Its structured format and the broad range of included features are critical for developing robust cybersecurity solutions and can be adapted for emerging threats.en
dc.formattextcs
dc.format.extent1-13cs
dc.format.mimetypeapplication/pdfcs
dc.identifier.citationData in Brief. 2026, vol. 62, issue October, p. 1-13.en
dc.identifier.doi10.1016/j.dib.2025.112062cs
dc.identifier.issn2352-3409cs
dc.identifier.orcid0000-0001-6315-8137cs
dc.identifier.orcid0009-0007-5400-8584cs
dc.identifier.orcid0000-0002-5317-9222cs
dc.identifier.orcid0009-0000-8525-3194cs
dc.identifier.other194220cs
dc.identifier.researcheridKRR-2050-2024cs
dc.identifier.researcheridJFA-4159-2023cs
dc.identifier.scopus57189302660cs
dc.identifier.scopus59536362400cs
dc.identifier.scopus57208510810cs
dc.identifier.urihttps://hdl.handle.net/11012/255608
dc.language.isoencs
dc.relation.ispartofData in Briefcs
dc.relation.urihttps://www.sciencedirect.com/science/article/pii/S235234092500784Xcs
dc.rightsCreative Commons Attribution 4.0 Internationalcs
dc.rights.accessopenAccesscs
dc.rights.sherpahttp://www.sherpa.ac.uk/romeo/issn/2352-3409/cs
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/cs
dc.subjectDomainen
dc.subjectDNSen
dc.subjectTLSen
dc.subjectWHOISen
dc.subjectRDAPen
dc.subjectIPen
dc.subjectGeolocationen
dc.subjectMalwareen
dc.subjectPhishingen
dc.titleA Multi-Dimensional DNS Domain Intelligence Dataset for Cybersecurity Researchen
dc.type.driverarticleen
dc.type.statusPeer-revieweden
dc.type.versionpublishedVersionen
sync.item.dbidVAV-194220en
sync.item.dbtypeVAVen
sync.item.insts2025.10.30 18:05:13en
sync.item.modts2025.10.30 09:33:12en
thesis.grantorVysoké učení technické v Brně. Fakulta informačních technologií. Ústav informačních systémůcs
Files
Original bundle
Name:
1s2.0S235234092500784Xmain.pdf
Size:
1.67 MB
Format:
Adobe Portable Document Format
Description:
file 1s2.0S235234092500784Xmain.pdf