In Part 1 of this tutorial, we learned how to detect lines and extract the digits from the SUDOKU board. Using the Python code from Part 1, we can easily extract the digits and save them as images (uncomment the code in Part 1 to save your own digit blocks). Now we will talk about training an SVM on those extracted digits.
With the code in Part 1, we can easily save more than a thousand digit blocks in a second. They might look like this:
To train on the dataset with a supervised machine learning method like SVM, we need to label the dataset beforehand. This dataset has 10 labels (0 to 9). So the idea is to create 10 folders named 0 to 9 and put each image into the corresponding folder.
This takes quite a lot of time, because we have a lot of images and need to sort them by hand :(
Don't worry! I have already created a labeled dataset for you. Check out my GitHub repository to download it: https://github.com/huuquan1994/Sudoku-Solver
I divided it into 2 parts: a training set and a testing set. The training set has 5000 images (500 in each folder), and the testing set has about 3000 images.
The training set folder will look like this:
(Figure: the folder with label 1)
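To make the folder-per-label idea concrete, here is a minimal sketch that walks such a folder tree and takes each image's label from its folder name. The helper name `list_labeled_images` and the throwaway demo folder are assumptions made for illustration; they are not part of the repository.

```python
import os
import tempfile


def list_labeled_images(root):
    """Return (path, label) pairs, taking each label from its folder name."""
    pairs = []
    for folder in sorted(os.listdir(root)):
        folder_path = os.path.join(root, folder)
        # Skip stray files and folders whose name is not a digit label
        if not os.path.isdir(folder_path) or not folder.isdigit():
            continue
        for name in sorted(os.listdir(folder_path)):
            pairs.append((os.path.join(folder_path, name), int(folder)))
    return pairs


# Build a tiny throwaway dataset: 10 label folders with 2 empty "images" each
demo_root = tempfile.mkdtemp()
for label in range(10):
    os.makedirs(os.path.join(demo_root, str(label)))
    for k in range(2):
        open(os.path.join(demo_root, str(label), f"img_{k}.png"), "w").close()

pairs = list_labeled_images(demo_root)
counts = {label: sum(1 for _, l in pairs if l == label) for label in range(10)}
print(counts)
```

Running the same count on the real training folder is a quick way to check that every label really has 500 images before training.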
We have all we need! Now, let's train an SVM to recognize the digits.
We will use LinearSVC from scikit-learn to train on the dataset, and joblib to save the SVM model after training.
For training, we first resize each training image to 36x36 pixels. After that, we train LinearSVC on those resized images. You can see the code below:
    import os

    import cv2
    import joblib
    import numpy as np
    from sklearn.svm import LinearSVC


    def load_dataset(path):
        """Read every image under path/<label>/ as a 36x36 grayscale block."""
        images = []
        labels = []
        for folder in os.listdir(path):
            flist = os.listdir(os.path.join(path, folder))
            for f in flist:
                im = cv2.imread(os.path.join(path, folder, f))
                # cv2.imread returns BGR, so convert with COLOR_BGR2GRAY
                im = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
                im = cv2.resize(im, (36, 36))
                images.append(im)
                # The folder name is the label, so the listing order does not matter
                labels.append(int(folder))
        return images, labels


    # Generate training set (os.path.join avoids backslash escapes in the path)
    TRAIN_PATH = os.path.join("Dataset", "Train")
    trainset, train_label = load_dataset(TRAIN_PATH)

    # Generate testing set
    TEST_PATH = os.path.join("Dataset", "Test")
    testset, test_label = load_dataset(TEST_PATH)

    # Flatten each 36x36 image into one row of 1296 features
    trainset = np.reshape(trainset, (len(trainset), -1))

    # Create a linear SVM object
    clf = LinearSVC()

    # Perform the training
    clf.fit(trainset, train_label)
    print("Training finished successfully")

    # Testing
    testset = np.reshape(testset, (len(testset), -1))
    y = clf.predict(testset)
    print("Testing accuracy: " + str(clf.score(testset, test_label)))

    # Save the trained model
    joblib.dump(clf, "classifier.pkl", compress=3)
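The reshape step is easy to miss, so here is a small sketch of what it does. It uses three blank fake images instead of the real dataset (an assumption for illustration): each 36x36 block becomes one row of 36 * 36 = 1296 pixel values, which is the feature vector the SVM sees.

```python
import numpy as np

# Three fake 36x36 grayscale blocks standing in for resized digit images
images = [np.zeros((36, 36), dtype=np.uint8) for _ in range(3)]
images[1][0, 5] = 255  # mark row 0, column 5 of the second image

# Same idea as the reshape in the training code, with the count taken from the data
features = np.reshape(images, (len(images), -1))

print(features.shape)
print(features[1, 5])  # the marked pixel ends up at flat index 0 * 36 + 5
```

Because pixels are flattened row by row, the pixel at (row, col) lands at index row * 36 + col of the feature vector.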
After training, we will have an SVM model file named "classifier.pkl". We can use this file to recognize block images.
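As a minimal sketch of that save, load, and predict round trip, the snippet below trains a throwaway two-class LinearSVC on synthetic dark and bright blocks (not the real digit data), saves it, loads it back, and classifies one 36x36 block. The toy data and the file name `toy_classifier.pkl` are assumptions made only for this illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.svm import LinearSVC

# Synthetic stand-in for digit blocks: label 0 is dark, label 1 is bright
rng = np.random.default_rng(0)
dark = rng.integers(0, 50, size=(20, 36, 36))
bright = rng.integers(200, 256, size=(20, 36, 36))
images = np.concatenate([dark, bright]).astype(np.float64)
labels = np.array([0] * 20 + [1] * 20)

# Train on flattened images, then save the model the same way as the tutorial
clf = LinearSVC()
clf.fit(images.reshape(len(images), -1), labels)
model_path = os.path.join(tempfile.mkdtemp(), "toy_classifier.pkl")
joblib.dump(clf, model_path, compress=3)

# Later (for example in another script): load the model and classify one block
loaded = joblib.load(model_path)
block = np.full((36, 36), 255.0)                  # an all-bright 36x36 block
predicted = loaded.predict(block.reshape(1, -1))  # shape (1, 1296) in, 1 label out
print(predicted[0])
```

The important detail is the `reshape(1, -1)`: `predict` expects a 2D array with one row per sample, even when there is only one block.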
See this code for more details:
    import cv2
    import joblib
    import numpy as np

    font = cv2.FONT_HERSHEY_SIMPLEX
    ratio2 = 3
    kernel_size = 3
    lowThreshold = 30

    clf = joblib.load('classifier.pkl')

    is_print = True
    cv2.namedWindow("SUDOKU Solver")
    vc = cv2.VideoCapture(0)
    if vc.isOpened():  # try to get the first frame
        rval, frame = vc.read()
    else:
        rval = False

    while rval:
        sudoku1 = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        sudoku1 = cv2.blur(sudoku1, (1, 1))
        # apertureSize must be passed by keyword in the Python binding
        edges = cv2.Canny(sudoku1, lowThreshold, lowThreshold * ratio2,
                          apertureSize=kernel_size)
        # np.pi replaces cv2.cv.CV_PI, which no longer exists in OpenCV 3+
        lines = cv2.HoughLines(edges, 2, np.pi / 180, 300)
        if lines is not None:
            # Flatten to a list of (rho, theta) pairs, whatever the OpenCV version
            lines = lines.reshape(-1, 2)
            lines = sorted(lines, key=lambda line: line[0])
            diff_ngang = 0  # rho of the last kept horizontal line
            diff_doc = 0    # rho of the last kept vertical line
            lines_1 = []
            Points = []
            for rho, theta in lines:
                b = np.sin(theta)
                if b > 0.5:
                    # Horizontal line: keep it if it is far enough from the last one
                    if rho - diff_ngang > 10:
                        diff_ngang = rho
                        lines_1.append([rho, theta, 0])
                else:
                    # Vertical line
                    if rho - diff_doc > 10:
                        diff_doc = rho
                        lines_1.append([rho, theta, 1])
            # Intersect every horizontal line with every vertical line
            for i in range(len(lines_1)):
                if lines_1[i][2] == 0:
                    for j in range(len(lines_1)):
                        if lines_1[j][2] == 1:
                            theta1 = lines_1[i][1]
                            theta2 = lines_1[j][1]
                            p1 = lines_1[i][0]
                            p2 = lines_1[j][0]
                            xy = np.array([[np.cos(theta1), np.sin(theta1)],
                                           [np.cos(theta2), np.sin(theta2)]])
                            p = np.array([p1, p2])
                            res = np.linalg.solve(xy, p)
                            Points.append(res)
            # 10 x 10 lines give 100 corner points for the 9 x 9 board
            if len(Points) == 100:
                sudoku1 = cv2.adaptiveThreshold(sudoku1, 255,
                                                cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                                cv2.THRESH_BINARY_INV, 101, 1)
                for i in range(0, 9):
                    for j in range(0, 9):
                        # Crop the cell between its top-left and bottom-right corners
                        y1 = int(Points[j + i * 10][1] + 5)
                        y2 = int(Points[j + i * 10 + 11][1] - 5)
                        x1 = int(Points[j + i * 10][0] + 5)
                        x2 = int(Points[j + i * 10 + 11][0] - 5)
                        X = sudoku1[y1:y2, x1:x2]
                        if X.size != 0:
                            X = cv2.resize(X, (36, 36))
                            num = clf.predict(np.reshape(X, (1, -1)))
                            # Draw the recognized digit on the frame
                            if num[0] != 0:
                                cv2.putText(frame, str(num[0]),
                                            (int(Points[j + i * 10 + 10][0] + 10),
                                             int(Points[j + i * 10 + 10][1] - 30)),
                                            font, 1, (225, 0, 0), 2)
                            else:
                                cv2.putText(frame, str(num[0]),
                                            (int(Points[j + i * 10 + 10][0] + 10),
                                             int(Points[j + i * 10 + 10][1] - 15)),
                                            font, 1, (225, 0, 0), 2)
        cv2.imshow("SUDOKU Solver", frame)
        rval, frame = vc.read()
        key = cv2.waitKey(20)
        if key == 27:  # exit on ESC
            break

    vc.release()
    cv2.destroyAllWindows()
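Two small pieces of the code above are worth isolating: how two Hough lines are intersected, and how a cell's corners are found in the list of 100 points. The sketch below assumes the same conventions as the code (a line is the set of points with x*cos(theta) + y*sin(theta) = rho, and the points are stored row by row, 10 per row). The helper names `intersect` and `cell_corner_indices` are made up for this illustration.

```python
import numpy as np


def intersect(rho1, theta1, rho2, theta2):
    """Intersection of two lines given in Hough form x*cos(t) + y*sin(t) = rho."""
    coeffs = np.array([[np.cos(theta1), np.sin(theta1)],
                       [np.cos(theta2), np.sin(theta2)]])
    rhos = np.array([rho1, rho2])
    return np.linalg.solve(coeffs, rhos)  # array [x, y]


def cell_corner_indices(row, col):
    """Indices into the 100-point list for a cell's top-left and bottom-right corners."""
    top_left = col + row * 10
    # +10 moves one grid line down, +1 moves one grid line right
    return top_left, top_left + 11


# Horizontal line y = 5 (theta = 90 degrees) and vertical line x = 3 (theta = 0)
point = intersect(5, np.pi / 2, 3, 0)
print(np.round(point, 6))

print(cell_corner_indices(0, 0))  # (0, 11)
print(cell_corner_indices(8, 8))  # (88, 99)
```

Note that `np.linalg.solve` raises an error if the two lines are parallel, so a more defensive version would catch `np.linalg.LinAlgError`.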
Yeahhhh! Now we know how to train a machine learning model and recognize a digit image. In the next article, we will discuss how to solve a SUDOKU matrix with a backtracking algorithm.
If you have any questions or comments, please let me know. See ya :)