Semi-supervised deep learning approach to break common CAPTCHAs

Manual data annotation is a time-consuming activity. This paper presents a novel strategy for automatically training a CAPTCHA-breaking system with no manual dataset creation. We demonstrate the feasibility of the attack against a text-based CAPTCHA scheme using network infrastructure similar to that used for Denial of Service attacks. The main goal of our research is to expose a possible vulnerability in CAPTCHA systems when a brute-force attack is combined with transfer learning. The classification step uses a simple convolutional neural network with 15 layers. The training stage uses an automatically prepared dataset, created without any human intervention, and transfer learning to fine-tune the deep neural network classifier. The designed system achieved 80% classification accuracy after 6 fine-tuning steps on a 5-digit text-based CAPTCHA system. The results suggest that even a simple attack with a large number of attacking computers can be an effective alternative to current CAPTCHA-breaking systems.

Keywords

CAPTCHA, Semi-supervised learning, Convolutional Neural Networks

Citation

NEURAL COMPUTING & APPLICATIONS. 2021, vol. 33, issue 20, p. 13333-13343.
https://link.springer.com/article/10.1007%2Fs00521-021-05957-0

Document type

Peer-reviewed

Document version

Accepted version

Language of document

en

DOI

10.1007/s00521-021-05957-0

URI

http://hdl.handle.net/11012/203005

Collections

Ústav automatizace a měřicí techniky
