GitHub - DFGANDP/Rotnet-Captcha-Solver: TikTok Captcha, RotNet, Pytorch, OpenCV, PIL

Neural network and statistical analysis for solving the TikTok captcha (NOT FINISHED)

🚩 Table of Contents

Environment
Features
Examples
Pipeline explained
Changes
TODO
License

📦 Environment

Python version and libraries

Version	Info
Python 3.11.7	Jupyterlab code

Main libraries used in project

Name	Description
numpy	Tensors
torch, torchvision	Built for CUDA 12.4 (cu124)
pandas	Dataframes
sklearn	EDA and threshold
matplotlib	Visualize charts
PIL, opencv	Image preprocessing

For more info look at requirements.txt and torch_version.txt
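
A quick way to confirm that the libraries in the table above are importable before opening the notebook is sketched below. The module list and the helper name `missing_modules` are illustrative assumptions and not part of the repository; requirements.txt remains the authoritative list.

```python
from importlib import util

# Import names for the libraries listed above (assumed mapping:
# sklearn -> scikit-learn, PIL -> Pillow, cv2 -> opencv-python).
REQUIRED_MODULES = ["numpy", "torch", "torchvision", "pandas",
                    "sklearn", "matplotlib", "PIL", "cv2"]


def missing_modules(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

The check only looks the modules up and does not import them, so it runs quickly even when torch is installed.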

🎨 Features

🤖 What is RotNet captcha solver?

It is a prototype of a tool for solving captchas. Whether you want to preprocess images, build and train neural networks, or run EDA on the model output, the tool has you covered. Future updates will add features such as a real-time worker with Selenium and better performance.

Available now:

Image preprocessing
Builders and trainers for different models (including vision transformers)
EDA of network output
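
As one example of what EDA of the network output can look like, the sketch below measures how coverage and accuracy change when low-confidence predictions are rejected. The function name, the input format (one confidence score and one correctness flag per image), and the sample values are assumptions for illustration; the notebook may organise this analysis differently.

```python
def accuracy_at_threshold(confidences, correct, threshold):
    """Return (coverage, accuracy) for predictions with confidence >= threshold.

    confidences: list of floats in [0, 1], one per image
    correct:     list of bools, True where the prediction matched the label
    """
    kept = [ok for conf, ok in zip(confidences, correct) if conf >= threshold]
    coverage = len(kept) / len(confidences) if confidences else 0.0
    accuracy = sum(kept) / len(kept) if kept else 0.0
    return coverage, accuracy


# Made-up values, only to show the call pattern.
confidences = [0.95, 0.80, 0.55, 0.40]
correct = [True, True, False, True]
for t in (0.0, 0.5, 0.9):
    coverage, accuracy = accuracy_at_threshold(confidences, correct, t)
    print(f"threshold={t:.1f} coverage={coverage:.2f} accuracy={accuracy:.2f}")
```

Sweeping the threshold this way shows how much accuracy can be gained by declining to answer on uncertain images, and at what cost in coverage.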

🐾 Examples

Preprocessing of images

This section provides an example output demonstrating the preprocessed images.

Image examples

Output data

Currently available models: EfficientNet-B3, ResNet (any variant), SwinV2-T

🌏 Pipeline explained

INPUT

The model was trained on a subset of ImageNet.
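
RotNet-style training usually needs no manual labels: an upright image is rotated by a known amount and that amount becomes the label. The sketch below illustrates the idea on a tiny nested list with rotations in 90-degree steps. The 4-class setup and the helper names are assumptions for illustration; the notebooks may use different angles and operate on real images with PIL or OpenCV.

```python
import random


def rotate90(grid, k):
    """Rotate a 2D list counter-clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counter-clockwise turn.
        grid = [list(row) for row in zip(*grid)][::-1]
    return grid


def make_rotation_sample(image, rng):
    """Rotate an upright image by a random multiple of 90 degrees.

    Returns (rotated_image, label), where label k means k * 90 degrees.
    """
    label = rng.randrange(4)
    return rotate90(image, label), label


upright = [[1, 2],
           [3, 4]]
rotated, label = make_rotation_sample(upright, random.Random(0))
print(label, rotated)
```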

Files

File	Description
1. images/version1/RotNetDataGenerator.ipynb	Old version
2. RotNetSecondVersion.ipynb	Current version

Process

See the notebooks listed under Files for the full process.

Output

For any given image, after preprocessing, the model outputs a classification prediction of whether it was rotated.
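
To make the output concrete, the sketch below turns a vector of raw class scores into a predicted rotation angle and the correction needed to undo it. It assumes the classes are evenly spaced angles covering 360 degrees (for example 4 classes of 90 degrees each), which is a common RotNet setup but is not confirmed by this README; `decode_rotation` is a hypothetical helper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_rotation(logits):
    """Map class scores to (predicted_angle, correction_angle, confidence).

    Assumption: class i stands for a rotation of i * (360 / num_classes) degrees.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    step = 360 / len(probs)
    angle = best * step
    correction = (360 - angle) % 360  # further rotation that restores the image
    return angle, correction, probs[best]


# Four classes: 0, 90, 180, 270 degrees; the second score is the largest.
angle, correction, confidence = decode_rotation([0.1, 2.5, 0.3, -1.0])
print(angle, correction, round(confidence, 3))
```

The confidence value returned here is the kind of score that the EDA step can use to flag uncertain predictions.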

🔧 Changes

Improvements over the first version

Accuracy: using EfficientNet-B3 and a more complex classifier head improved accuracy from 78% to 85% (without fine-tuning)
Code is refactored, easier to read, and entirely in English
Fixed the problem with incorrect color channels
Model was trained on much more data (over 100k images)
Vision transformers can now be used
Faster inference
EDA of the output to better understand the limitations and the path of development
Explanation of how to use it with Selenium

📈 Result

The first test model was trained with ResNet-18 and reached:
* HIGHEST ACCURACY: 0.7892497518082542

The newest model was trained with EfficientNet and reached:
* HIGHEST ACCURACY: ~85%

Example image

Preprocessed Image

Angle = 90

Train/Test loss

Accuracy of model

💬 TODO

Real-time solver with Selenium
Augmentation, such as slight distortion of the image

Based on the EDA (see the code for more info), it seems reasonable to use two networks: a small one that classifies most images, and a second one that handles only the very hard examples, or something along those lines. There is surely a paper that has addressed a similar issue
Fine-tuning with W&B (an easy improvement)
Train with different models
Training can be improved with a different loss function and a different way of feeding data to the model. After initial training, it may be best to continue training only on very hard examples, similar to online mining in FaceNet (based on loss and confidence score)
Visualization of original/preprocessed bad examples to better understand the limitations
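
The idea of continuing training only on very hard examples can be sketched as follows: compute a per-sample loss, then keep the fraction with the highest loss for the next round. This is a simplified, framework-free illustration of hard-example selection, not FaceNet's triplet mining and not code from the repository; the function names and the `keep_fraction` parameter are assumptions.

```python
import math


def cross_entropy(probs, label):
    """Cross-entropy loss of one sample given class probabilities and the true label."""
    return -math.log(max(probs[label], 1e-12))  # clamp to avoid log(0)


def select_hard_examples(batch_probs, labels, keep_fraction=0.25):
    """Return indices of the highest-loss samples, hardest first."""
    losses = [cross_entropy(p, y) for p, y in zip(batch_probs, labels)]
    keep = max(1, math.ceil(len(losses) * keep_fraction))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return ranked[:keep]


# Made-up probabilities for four samples over two classes.
batch_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.05, 0.95]]
labels = [0, 0, 1, 1]
print(select_hard_examples(batch_probs, labels, keep_fraction=0.5))
```

The same ranking could also feed the two-network idea above, with the selected indices marking the images that a second, specialised model would handle.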

📜 License
