Keep Moving Forward

Sunday, April 23, 2017

Building a simple SUDOKU Solver from scratch - Part 2: Digit Number Recognition using SVM

11:30 PM Posted by Cáp Hữu Quân , , 1 comment
Hello everyone, in this post, we will continue the previous tutorial to recognize digital number were extracted from SUDOKU board. See part 1
In part 1 of this tutorial, we learned how to detect lines and extract the digital number from the SUDOKU board. I think we can easily extract digital number and save it as images by using Python code from part 1. (Uncomment the code from Part 1 to save your own digit blocks). Now, we will talk about training those extracted digital numbers with SVM

With the code in Part 1, we can easily save more than 1 thousand digit blocks in a second. Maybe looks like this:

To train the dataset with a supervised machine learning method like SVM, we need to label the dataset beforehand. With this dataset, we have 10 label (from 0 -> 9). So, the idea is that we will create 10 folders with the name from 0 to 9 and put each image into the corresponding folder.
It's quite cost the time because we have a lot of images and we need to process those images by hand :(
Don't worry! I already created a labeled dataset for you. Check out my GitHub repository to download them: https://github.com/huuquan1994/Sudoku-Solver 
I divided it into 2 parts, training set and testing set. 5000 images in training set (500 images in each folder) and ~3000 test images in testing set.
The training set folder will look like this:

Folder with label as 1 

We have all we need! Now, let's train SVM to recognize digital number.
We will use LinearSVM in Sklearn to train the dataset, using joblib to save the SVM model after training.
For training, first, we resize the size of each training image to [36x36]. After that, we train those resized images with LinearSVM. You can see the code below


import numpy as np
from sklearn.svm import LinearSVC
import os
import cv2
import joblib

# Generate training set
TRAIN_PATH = "Dataset\Train"
list_folder = os.listdir(TRAIN_PATH)
trainset = []
for folder in list_folder:
    flist = os.listdir(os.path.join(TRAIN_PATH, folder))
    for f in flist:
        im = cv2.imread(os.path.join(TRAIN_PATH, folder, f))
        im = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY )
        im = cv2.resize(im, (36,36))
        trainset.append(im)
# Labeling for trainset
train_label = []
for i in range(0,10):
    temp = 500*[i]
    train_label += temp

# Generate testing set
TEST_PATH = "Dataset\Test"
list_folder = os.listdir(TEST_PATH)
testset = []
test_label = []
for folder in list_folder:
    flist = os.listdir(os.path.join(TEST_PATH, folder))
    for f in flist:
        im = cv2.imread(os.path.join(TEST_PATH, folder, f))
        im = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY )
        im = cv2.resize(im, (36,36))
        testset.append(im)
        test_label.append(int(folder))
trainset = np.reshape(trainset, (5000, -1))

# Create an linear SVM object
clf = LinearSVC()

# Perform the training
clf.fit(trainset, train_label)
print("Training finished successfully")

# Testing
testset = np.reshape(testset, (len(testset), -1))
y = clf.predict(testset)
print("Testing accuracy: " + str(clf.score(testset, test_label)))

joblib.dump(clf, "classifier.pkl", compress=3)

After training, we will have a SVM model named: "classifier.pkl". We can use this file to recognize block images.

See this code for more details:

import cv2
import numpy as np
import joblib

font = cv2.FONT_HERSHEY_SIMPLEX
ratio2 = 3
kernel_size = 3
lowThreshold = 30

clf = joblib.load('classifier.pkl')

is_print = True

cv2.namedWindow("SUDOKU Solver")
vc = cv2.VideoCapture(0)
if vc.isOpened(): # try to get the first frame
    rval, frame = vc.read()
else:
    rval = False
while rval:
    sudoku1 = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    sudoku1 = cv2.blur(sudoku1, (1,1))
    edges = cv2.Canny(sudoku1, lowThreshold, lowThreshold*ratio2, kernel_size)
    lines = cv2.HoughLines(edges, 2, cv2.cv.CV_PI /180, 300, 0, 0)
    if (lines is not None):
        lines = lines[0]
        lines = sorted(lines, key=lambda line:line[0])
        diff_ngang = 0
        diff_doc = 0
        lines_1=[]
        Points=[]
        for rho,theta in lines:
            a = np.cos(theta)
            b = np.sin(theta)
            x0 = a*rho
            y0 = b*rho
            x1 = int(x0 + 1000*(-b))
            y1 = int(y0 + 1000*(a))
            x2 = int(x0 - 1000*(-b))
            y2 = int(y0 - 1000*(a))
            if (b>0.5):
                if(rho-diff_ngang>10):
                    diff_ngang=rho
                    lines_1.append([rho,theta, 0])
            else:
                if(rho-diff_doc>10):
                    diff_doc=rho
                    lines_1.append([rho,theta, 1])
        for i in range(len(lines_1)):
            if(lines_1[i][2] == 0):
                for j in range(len(lines_1)):
                    if (lines_1[j][2]==1):
                        theta1=lines_1[i][1]
                        theta2=lines_1[j][1]
                        p1=lines_1[i][0]
                        p2=lines_1[j][0]
                        xy = np.array([[np.cos(theta1), np.sin(theta1)], [np.cos(theta2), np.sin(theta2)]])
                        p = np.array([p1,p2])
                        res = np.linalg.solve(xy, p)
                        Points.append(res)
        if(len(Points)==100):
            sudoku1 = cv2.adaptiveThreshold(sudoku1, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY_INV, 101, 1)
            for i in range(0,9):
                for j in range(0,9):
                    y1=int(Points[j+i*10][1]+5)
                    y2=int(Points[j+i*10+11][1]-5)
                    x1=int(Points[j+i*10][0]+5)
                    x2=int(Points[j+i*10+11][0]-5)
                    X = sudoku1[y1:y2,x1:x2]
                    if(X.size!=0):
                        X = cv2.resize(X, (36,36))
                        num = clf.predict(np.reshape(X, (1,-1)))
                        if (num[0] != 0):
                            cv2.putText(frame,str(num[0]),(int(Points[j+i*10+10][0]+10), 
                                                                     int(Points[j+i*10+10][1]-30)),font,1,(225,0,0),2)
                        else:
                            cv2.putText(frame,str(num[0]),(int(Points[j+i*10+10][0]+10), 
                                                                 int(Points[j+i*10+10][1]-15)),font,1,(225,0,0),2)
    cv2.imshow("SUDOKU Solver", frame)
    rval, frame = vc.read()
    key = cv2.waitKey(20)
    if key == 27: # exit on ESC
        break
vc.release()
cv2.destroyAllWindows()

Yeahhhh! Now we know how to train a machine learning method and recognize a digital image. The next article, we will discuss how to solve a SUDOKU matrix by backtracking algorithm.
If you have any questions or comments, please let me know. See ya :)

1 comment: