Semi-supervised deep learning approach to break common CAPTCHAs

Date
2021-04-12
Authors
Boštík, Ondřej
Horák, Karel
Kratochvíla, Lukáš
Zemčík, Tomáš
Bilík, Šimon
Publisher
Springer
Abstract
Manual data annotation is a time-consuming activity. This paper presents a novel strategy for automatically training a CAPTCHA-breaking system with no manual dataset creation. We demonstrate the feasibility of an attack against a text-based CAPTCHA scheme using network infrastructure similar to that used for Denial of Service attacks. The main goal of our research is to present a possible vulnerability in CAPTCHA systems when a brute-force attack is combined with transfer learning. The classification step uses a simple convolutional neural network with 15 layers. The training stage uses an automatically prepared dataset, created without any human intervention, and transfer learning to fine-tune the deep neural network classifier. The designed system achieved 80% classification accuracy after 6 fine-tuning steps against a 5-digit text-based CAPTCHA scheme. The results presented in this paper suggest that even a simple attack carried out with a large number of attacking computers can be an effective alternative to current CAPTCHA-breaking systems.
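The abstract gives no implementation details, so the following is only a minimal sketch of why a brute-force labelling step needs many attempts. It assumes a 5-character CAPTCHA drawn uniformly from the digits 0 to 9 and a service that reports only whether a guess was accepted. The function names (`random_guess`, `collect_labeled`, `expected_attempts`) and the simulated answer list are hypothetical stand-ins, not the authors' code.

```python
import random

DIGITS = "0123456789"
LENGTH = 5  # 5-digit CAPTCHA, as stated in the abstract


def random_guess(rng):
    """Blind brute-force guess: LENGTH digits drawn uniformly."""
    return "".join(rng.choice(DIGITS) for _ in range(LENGTH))


def collect_labeled(answers, guesser, rng):
    """Return (sample_id, label) pairs for guesses the simulated server accepted.

    `answers` stands in for the remote CAPTCHA service: the attacker never
    reads it directly, only learns whether a submitted guess matched.
    """
    labeled = []
    for sample_id, truth in enumerate(answers):
        guess = guesser(rng)
        if guess == truth:  # simulated accept/reject response
            labeled.append((sample_id, guess))
    return labeled


def expected_attempts(n_samples, p_success):
    """Expected number of submissions needed to harvest n_samples labels."""
    return n_samples / p_success


rng = random.Random(0)
answers = [random_guess(rng) for _ in range(50_000)]  # simulated challenges
labeled = collect_labeled(answers, random_guess, rng)

# Under the uniform-digit assumption, a blind guess succeeds with probability 10^-5.
p_blind = (1 / len(DIGITS)) ** LENGTH
print(len(labeled), "accepted out of", len(answers))
print("expected attempts for 1000 labels:", expected_attempts(1000, p_blind))
```

Under these assumptions, every accepted guess yields a correctly labelled sample at no annotation cost, but a blind guess is accepted only about once in 100,000 attempts. That is consistent with the abstract's emphasis on a large number of attacking computers and on fine-tuning to raise the acceptance rate in later rounds.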
Citation
NEURAL COMPUTING & APPLICATIONS. 2021, vol. 33, issue 20, p. 13333-13343.
https://link.springer.com/article/10.1007%2Fs00521-021-05957-0
Document type
Peer-reviewed
Document version
Accepted version
Language of document
en
Document licence
(C) Springer