Awesome

What is pycapt

GitHub Welcome to submit PRs, if there are bugs or new requests please feedback in issues

pycapt is a collection of image processing algorithms I created for handling CAPTCHAs. You can use it to denoise images, remove interference lines, and segment CAPTCHAs. pycapt encapsulates methods for manipulating image matrices, such as splitting images into standardized matrices and generating the required training images, which is helpful for using deep learning in image recognition. In 2024, pycapt released a new version that added some logo generation methods to generate Android or iOS logos with one click.

pycapt includes both CAPTCHA processing and generation. Special thanks to my friends exqlnet and ZhouYingSASA for their support in releasing pycapt version 1.0.1.

Dependencies and Installation

pip3 install Pillow numpy pycapt

or use poetry to install:

poetry add Pillow numpy pycapt

Directory Structure

frcc0

Using pycapt for CAPTCHA Image Processing

Importing

import pycapt
from PIL import Image

Image Binarization

two_value: This method binarizes the image. The required parameter img is the image, and the optional parameter Threshold is the gray threshold, where you can choose an appropriate value (default is 100). Returns a newly processed image.

img = Image.open('./img/frcc0.png')
img = pycapt.two_value(img, Threshold=100)
img.show()

frcc0

frcc1

Noise Reduction

dele_noise: This method removes noise using an eight-neighborhood denoising technique. N is the number of neighborhood outliers, and Z is the number of processing iterations; more iterations will result in a smoother image.

img = pycapt.dele_noise(img, N=5, Z=2)
img.show()

frcc2

Removing Interference Lines

dele_line: This method removes interference lines by deleting N consecutive vertical pixels. It works best when used in conjunction with the dele_noise method.

img = pycapt.dele_line(img, N=4)
img.show()

For better results, you can first transpose the image using the tran_90(img) method, then apply the line removal method, and finally transpose it back.

img = pycapt.tran_90(img)
img.show()
img = pycapt.dele_line(img, 3)
img = pycapt.dele_line(img, 2)
img = pycapt.dele_line(img, 1)
img = pycapt.tran_90(img)
img.show()

frcc2

Slant Correction

The purpose of slant correction is to improve segmentation and recognition. The principle involves shifting each row left or right by different distances to create a correction effect. The pans list contains the shift values, where positive numbers shift left and negative numbers shift right. The number of elements in the pans list must equal the image height.

rectify_img(img, pans) returns a new image.

pan = [18, 18, 18, 18, 17, 17, 17,
        16, 16, 16, 15, 15, 15, 15, 14,
        14, 14, 14, 13, 13, 10, 10,
        10, 9, 9, 8, 7, 6, 5, 5, 4,
        4, 4, 4, 4, 3, 1, 0, 0, 0]
img = pycapt.rectify_img(img, pans=pan)
img.show()

frcc2

If you find it too unappealing, you can apply correction first and then use dele_line and dele_noise. Of course, addressing issues later is also acceptable.

img = pycapt.rectify_img(img, pans=pan)
img = pycapt.dele_line(img, 3)
img = pycapt.dele_line(img, 2)
img = pycapt.dele_line(img, 1)
img.show()

frcc2

Image Segmentation

cut_img_to_img_list sets a suitable length for the single image before cutting, returning the segmented image. The length can be set relatively large, and this method will pad the cut images on both sides. You can use this as a method for standardizing images.

img = Image.open('1.png')
img_list = pycapt.cut_img_to_img_list(img, max_width=30, background=255)
for i in img_list:
    i.show()

frcc2

When using deep learning, you can also use cut_img_to_mode_list(image, max_width) to obtain a standardized array.

Image Cropping

When your image height can be compressed, you can use small_img(img, box) to crop the image, reducing the computational load for later learning.

Using pycapt to Generate CAPTCHA Training Sets

do_captcha for Generating CAPTCHA Training Sets

width is the length of the CAPTCHA image, height is the height, num_of_str is the number of characters in the CAPTCHA (default is 4), font is the font size (default is 30), gray_value is the background gray value (default is 255), and font_family is the font file. You can choose the thickness, style, etc., but the font must be installed on your computer.

If you're unsure about which fonts are installed on your computer, please click here.

name, img = pycapt.do_captcha(
    my_str_list=['A', 'B', 'C', 'D', '1', '2', '3'],
    width=160,
    height=40,
    num_of_str=4,
    font=30,
    gray_value=255,
    font_family=None
)

print(name)
img.show()

# output: ['C', 'D', '2', 'A']

frcc2

Adding Noise

more_noise: N is the noise rate (0 < N < 1), and Z is the number of processing iterations.

img = pycapt.more_noise(img, N=0.5, Z=2)

frcc2

Panning

img = pycapt.img_pan(img, 10, 3)

frcc2

Inclining

As before, use rectify_img.

pan = [18, 18, 18, 18, 17, 17, 17,
        16, 16, 16, 15, 15, 15, 15, 14,
        14, 14, 14, 13, 13, 10, 10,
        10, 9, 9, 8, 7, 6, 5, 5, 4,
        4, 4, 4, 4, 3, 1, 0, 0, 0]
img = pycapt.rectify_img(img, pans=pan)

frcc2

Denoising for Smoothness

clear_train_img effectively applies dele_line(line, N) sequentially for N=4, 3, 2, 1, smoothing the image vertically.

img = pycapt.show_noise_img(img, 0.1, 1)
img = pycapt.dele_noise(img, 5, 2)
img = pycapt.clear_train_img(img)

frcc2

Here, you can fully utilize pycapt to generate a training set for CAPTCHA with deep learning.

If you want something more convenient, please see below.

Directly Generating Training Set Method

easy_train_img returns training set images. my_str_list is your character set list, width and height are the dimensions, and num_of_str is the number of characters displayed in the CAPTCHA image, which will be randomly selected from your my_str_list.

filename, img = pycapt.easy_train_img(
    my_str_list=['A', 'B', 'C', 'D', 'E'],
    width=30,
    height=32,
    num_of_str=1,
    font=30,
    xpan=3,
    ypan=2,
    rotate=15,
    noise_N=0.3,
    noise_Z=2,
    gray_value=255,
    font_family=None
)

You just need to write a loop like img.save('train_img/{}.png'.format(file_name)) to generate thousands of training images

, and you can obtain the label simply as name = file_name[0].

frcc2

2024 Update: One-click Generation of Android or iOS Logos

current_dir = os.path.dirname(os.path.abspath(__file__))

# Android
pycapt.generate_android_icon_assets(
    f"{current_dir}/appstore.png", f"{current_dir}/output_directory"
)

# iOS
pycapt.generate_ios_icon_assets(
    f"{current_dir}/appstore.png",
    f"{current_dir}/Assets.xcassets/AppIcon.appiconset",
)

The second parameter is your icon output directory, which will be created by default if it doesn't exist.

Conclusion

Theoretically, as long as you use pycapt to process images, call various methods, and use the easy_train_img method, you can solve 90% of CAPTCHA processing and generation problems. Feel free to star, PR, and submit issues. If you want to learn more about the underlying principles, click here. I look forward to hearing your thoughts or PR.

Small Donations

If you found this helpful, buy me a cup of tea~