今回は、点数計算の実アプリケーション作成に向けて、以下の処理を検討します。

画像1枚を受け取ってからの一連の処理
点数計算

処理フロー

点数計算アプリケーションは、ストリーミングされている画像データから、1枚取り出して計算処理を実施・表示、というのを繰り返すイメージで考えます。
処理の流れは、

画像を1枚取得
点数文字の候補となる輪郭を検出
各輪郭について、一致度ベクトルを計算
SVMに通して、どの数字なのか、それとも数字の輪郭でないのか判定
判定結果の数字を足して、総点数を計算
結果の表示

最後の結果の表示ですが、おそらく誤認識は発生するので、総点数を表示するだけではなく、どこの輪郭を何点と認識したか、ということも表示したいと思います。

下準備

今までは、作成した処理関数の定義をJupyter notebookが変わるたびにやっていましたが、前回スクリプト(harupan.py)にまとめたので、必要な準備はだいぶ減ります。

スクリプトの実行は、importを使えばいいとのこと。

https://note.nkmk.me/python-import-usage/

スクリプトのディレクトリと実行ディレクトリが分かれている場合の対応:

https://techacademy.jp/magazine/23279

実行するのは、

スクリプト読み込み
テンプレートデータ、SVMデータの読み込み
処理対象画像の読み込み

といったところです。

from harupan_data.harupan import *

svm = load_svm('harupan_data/harupan_svm.dat')
templates2019 = load_templates('harupan_data/templates2019.json')
templates2020 = load_templates('harupan_data/templates2020.json')
templates2021 = load_templates('harupan_data/templates2021.json')

img1 = cv2.imread('harupan_190428_1.jpg')
img2 = cv2.imread('harupan_190428_2.jpg')
img3 = cv2.imread('harupan_200317_1.jpg')
img4 = cv2.imread('harupan_210227_2.jpg')
img5 = cv2.imread('harupan_210402_1.jpg')
img6 = cv2.imread('harupan_210402_2.jpg')
img7 = cv2.imread('harupan_210414_1.jpg')

一連の計算処理実装

実際に処理を作っていきます。

この中で点数計算も行っていきます。

"1"、"2"、"3"の数字を検出したら、その数をそのまま点数として足す
"5"を検出したら、0.5点を足す

def calc_harupan(img, templates, svm):
    ctrs, resized_img = detect_candidate_contours(img)
    print('Number of candidates: ', len(ctrs))
    subctr_datasets = [contour_dataset(create_contour_area_image(resized_img, ctr)[1]) for ctr in ctrs]
    ########
    #### Simple code
    # similarities = [get_similarities(d, templates)[0] for d in subctr_datasets]
    #### Code printing progress
    similarities = []
    for i,d in enumerate(subctr_datasets):
        print(i, end=' ')
        similarities += [get_similarities(d, templates)[0]]
    print('')
    ########
    _, result = svm.predict(np.array(similarities, 'float32'))
    result = result.astype('int')
    score = 0.0
    texts = {0:'0', 1:'1', 2:'2', 3:'3', 5:'.5'}
    font = cv2.FONT_HERSHEY_SIMPLEX
    for res, ctr in zip(result, ctrs):
        if res[0] == 5:
            score += 0.5
        elif res[0] != -1:
            score += res[0]
        
        # Annotating recognized numbers for confirmation
        if res[0] != -1:
            resized_img = cv2.drawContours(resized_img, [ctr], -1, (0,255,0), 3)
            x,y,_,_ = cv2.boundingRect(ctr)
            resized_img = cv2.putText(resized_img, texts[res[0]], (x,y), font, 1, (230,230,0), 5)
    return score, resized_img

処理実施

処理時間計測も合わせて実施します。

import time

t0 = time.time()
score, result_img = calc_harupan(img1, templates2019, svm)
t1 = time.time()
print('Score: ', score)
print('Elapsed time: ', t1 - t0)
plt.figure(figsize=(6.4,4.8), dpi=200)
plt.imshow(cv2.cvtColor(result_img, cv2.COLOR_BGR2RGB)), plt.xticks([]), plt.yticks([])
plt.show()

    Number of candidates:  38
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 
    Score:  26.0
    Elapsed time:  10.60127878189087

f:id:nokixa:20220328023900p:plain:w400

まあまあな感じです。この画像では、シールのてかりで点数シール(0.5点)が1つ認識に失敗しています。また、点数ではない"5"の文字を計算に入れてしまっています。

たまたま計算結果は合ってしまったが…

処理時間は10秒近くかかっているので、あまり実用に堪える感じではないかな…

他の画像もやってみます。

t0 = time.time()
score, result_img = calc_harupan(img2, templates2019, svm)
t1 = time.time()
print('Score: ', score)
print('Elapsed time: ', t1 - t0)
plt.figure(figsize=(6.4,4.8), dpi=200)
plt.imshow(cv2.cvtColor(result_img, cv2.COLOR_BGR2RGB)), plt.xticks([]), plt.yticks([])
plt.show()

    Number of candidates:  46
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 
    Score:  26.0
    Elapsed time:  11.16847562789917

f:id:nokixa:20220328023905p:plain:w400

t0 = time.time()
score, result_img = calc_harupan(img3, templates2020, svm)
t1 = time.time()
print('Score: ', score)
print('Elapsed time: ', t1 - t0)
plt.figure(figsize=(6.4,4.8), dpi=200)
plt.imshow(cv2.cvtColor(result_img, cv2.COLOR_BGR2RGB)), plt.xticks([]), plt.yticks([])
plt.show()

    Number of candidates:  48
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 
    Score:  25.0
    Elapsed time:  7.951741933822632

f:id:nokixa:20220328023910p:plain:w400

t0 = time.time()
score, result_img = calc_harupan(img4, templates2021, svm)
t1 = time.time()
print('Score: ', score)
print('Elapsed time: ', t1 - t0)
plt.figure(figsize=(6.4,4.8), dpi=200)
plt.imshow(cv2.cvtColor(result_img, cv2.COLOR_BGR2RGB)), plt.xticks([]), plt.yticks([])
plt.show()

    Number of candidates:  41
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 
    Score:  20.0
    Elapsed time:  6.7907874584198

f:id:nokixa:20220328023915p:plain:w400

t0 = time.time()
score, result_img = calc_harupan(img5, templates2021, svm)
t1 = time.time()
print('Score: ', score)
print('Elapsed time: ', t1 - t0)
plt.figure(figsize=(6.4,4.8), dpi=200)
plt.imshow(cv2.cvtColor(result_img, cv2.COLOR_BGR2RGB)), plt.xticks([]), plt.yticks([])
plt.show()

    Number of candidates:  35
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 
    Score:  28.0
    Elapsed time:  6.287672996520996

f:id:nokixa:20220328023920p:plain:w400

t0 = time.time()
score, result_img = calc_harupan(img6, templates2021, svm)
t1 = time.time()
print('Score: ', score)
print('Elapsed time: ', t1 - t0)
plt.figure(figsize=(6.4,4.8), dpi=200)
plt.imshow(cv2.cvtColor(result_img, cv2.COLOR_BGR2RGB)), plt.xticks([]), plt.yticks([])
plt.show()

    Number of candidates:  33
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 
    Score:  28.0
    Elapsed time:  4.226221799850464

f:id:nokixa:20220328023926p:plain:w400

t0 = time.time()
score, result_img = calc_harupan(img7, templates2021, svm)
t1 = time.time()
print('Score: ', score)
print('Elapsed time: ', t1 - t0)
plt.figure(figsize=(6.4,4.8), dpi=200)
plt.imshow(cv2.cvtColor(result_img, cv2.COLOR_BGR2RGB)), plt.xticks([]), plt.yticks([])
plt.show()

    Number of candidates:  24
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 
    Score:  24.0
    Elapsed time:  6.231278896331787

f:id:nokixa:20220328023931p:plain:w400

1つ目の画像以外は、点数計算は合っていそうです。
処理時間はそれほど変わらず、やっぱりかなりかかっています。

スクリプト更新

点数計算はこれでいいとして、この点数計算処理をスクリプト(harupan.py)に追加しておきます。
以下に全部再掲します。


######################################################
# Importing libraries
######################################################
import cv2
import numpy as np
from matplotlib import pyplot as plt
import math
import copy
import random
import json

######################################################
# Detecting contours
######################################################
def detect_candidate_contours(image, res_th=800):
    h, w, chs = image.shape
    if h > res_th or w > res_th:
        k = float(res_th)/h if w > h else float(res_th)/w
    else:
        k = 1.0
    img = cv2.resize(image, None, fx=k, fy=k, interpolation=cv2.INTER_AREA)
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    # Convert hue value (rotation, mask by saturation)
    hsv[:,:,0] = np.where(hsv[:,:,0] < 50, hsv[:,:,0]+180, hsv[:,:,0])
    hsv[:,:,0] = np.where(hsv[:,:,1] < 100, 0, hsv[:,:,0])
    # Thresholding with cv2.inRange()
    th_hue = cv2.inRange(hsv[:,:,0], 135, 190)
    # Retrieve all points on the contours (cv2.CHAIN_APPROX_NONE)
    contours, hierarchy = cv2.findContours(th_hue, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    indices0 = [i for i,hier in enumerate(hierarchy[0,:,:]) if hier[3] == -1]
    indices1 = [i for i,hier in enumerate(hierarchy[0,:,:]) if hier[3] in indices0]
    contours1 = [contours[i] for i in indices1]
    contours1_filtered = [ctr for ctr in contours1 if cv2.contourArea(ctr) > float(res_th)*float(res_th)/4000]
    return contours1_filtered, img


######################################################
# Auxiliary functions
######################################################
def create_contour_area_image(img, ctr):
    x,y,w,h = cv2.boundingRect(ctr)
    rtn_img = img[y:y+h,x:x+w,:].copy()
    rtn_ctr = ctr.copy()
    origin = np.array([x,y])
    for c in rtn_ctr:
        c[0,:] -= origin
    return rtn_img, rtn_ctr

# ctr: Should be output of create_contour_area_image() (Origin of points is the origin of bounding box)
# img_shape: Optional, tuple of (image_height, image_width), if omitted, calculated from ctr
def create_solid_contour(ctr, img_shape=(int(0),int(0))):
    if img_shape == (int(0),int(0)):
        _,_,w,h = cv2.boundingRect(ctr)
    else:
        h,w = img_shape
    img = np.zeros((h,w), 'uint8')
    img = cv2.drawContours(img, [ctr], -1, 255, -1)
    return img

# ctr: Should be output of create_contour_area_image() (Origin of points is the origin of bounding box)
def create_upright_solid_contour(ctr):
    (cx,cy),(w,h),angle = cv2.minAreaRect(ctr)
    M = cv2.getRotationMatrix2D((cx,cy), angle, 1)
    for i in range(ctr.shape[0]):
        ctr[i,0,:] = ( M @ np.array([ctr[i,0,0], ctr[i,0,1], 1]) ).astype('int')
    rect = cv2.boundingRect(ctr)
    img = np.zeros((rect[3],rect[2]), 'uint8')
    ctr -= rect[0:2]
    M[:,2] -= rect[0:2]
    img = cv2.drawContours(img, [ctr], -1, 255,-1)
    return img, M, ctr


######################################################
# Dataset classes
######################################################
class contour_dataset:
    def __init__(self, ctr):
        self.ctr = ctr.copy()
        self.rrect = cv2.minAreaRect(ctr)
        self.box = cv2.boxPoints(self.rrect)
        self.solid = create_solid_contour(ctr)
        self.pts = np.array([p for p in ctr[:,0,:]])

class template_dataset:
    def __init__(self, ctr, num, selected_idx=[0]):
        self.ctr = ctr.copy()
        self.num = num
        self.rrect = cv2.minAreaRect(ctr)
        self.box = cv2.boxPoints(self.rrect)
        if num == 0:
            self.solid,_,_ = create_upright_solid_contour(ctr)
        else:
            self.solid = create_solid_contour(ctr)
        self.pts = np.array([ctr[idx,0,:] for idx in selected_idx])


######################################################
# ICP
######################################################
# pts: list of 2D points, or ndarray of shape (n,2)
# query: 2D point to find nearest neighbor
def find_nearest_neighbor(pts, query):
    min_distance_sq = float('inf')
    min_idx = 0
    for i, p in enumerate(pts):
        d = np.dot(query - p, query - p)
        if(d < min_distance_sq):
            min_distance_sq = d
            min_idx = i
    return min_idx, np.sqrt(min_distance_sq)

# src, dst: ndarray, shape is (n,2) (n: number of points)
def estimate_affine_2d(src, dst):
    n = min(src.shape[0], dst.shape[0])
    x = dst[0:n].flatten()
    A = np.zeros((2*n,6))
    for i in range(n):
        A[i*2,0] = src[i,0]
        A[i*2,1] = src[i,1]
        A[i*2,2] = 1
        A[i*2+1,3] = src[i,0]
        A[i*2+1,4] = src[i,1]
        A[i*2+1,5] = 1
    M = np.linalg.inv(A.T @ A) @ A.T @ x
    return M.reshape([2,3])

# Find optimum affine matrix using ICP algorithm
# src_pts: ndarray, shape is (n_s,2) (n_s: number of points)
# dst_pts: ndarray, shape is (n_d,2) (n_d: number of points, n_d should be larger or equal to n_s)
# initial_matrix: ndarray, shape is (2,3)
def icp(src_pts, dst_pts, max_iter=20, initial_matrix=np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])):
    default_affine_matrix = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    if dst_pts.shape[0] < src_pts.shape[0]:
        print("icp: Insufficient destination points")
        return default_affine_matrix, False
    if initial_matrix.shape != (2,3):
        print("icp: Illegal shape of initial_matrix")
        return default_affine_matrix, False
    M = initial_matrix
    # Store indices of the nearest neighbor point of dst_pts to the converted point of src_pts
    nn_idx = []
    for i in range(max_iter):
        nn_idx_tmp = []
        dst_pts_list = [p for p in dst_pts]
        idx_list = list(range(0,dst_pts.shape[0]))
        for p in src_pts:
            p2 = M @ np.array([p[0], p[1], 1])
            idx, d = find_nearest_neighbor(dst_pts_list, p2)
            nn_idx_tmp += [idx_list[idx]]
            del dst_pts_list[idx]
            del idx_list[idx]
        if nn_idx != [] and nn_idx == nn_idx_tmp:
            break
        dst_pts2 = np.zeros_like(src_pts)
        for j,idx in enumerate(nn_idx_tmp):
            dst_pts2[j,:] = dst_pts[idx,:]
        M = estimate_affine_2d(src_pts, dst_pts2)
        nn_idx = nn_idx_tmp
        if i == max_iter -1:
            return M, False
    return M, True


######################################################
# Calculating similarity and determining the number
######################################################
def binary_image_similarity(img1, img2):
    if img1.shape != img2.shape:
        print('binary_image_similarity: Different image size')
        return 0.0
    xor_img = cv2.bitwise_xor(img1, img2)
    return 1.0 - np.float(np.count_nonzero(xor_img)) / (img1.shape[0]*img2.shape[1])

# src, dst: contour_dataset or template_dataset (holding member variables box, solid)
def get_transform_by_rotated_rectangle(src, dst):
    # Rotated patterns are created when starting index is slided
    dst_box2 = np.vstack([dst.box, dst.box])
    max_similarity = 0.0
    max_converted_img = np.zeros((dst.solid.shape[1], dst.solid.shape[0]), 'uint8')
    for i in range(4):
        M = cv2.getAffineTransform(src.box[0:3], dst_box2[i:i+3])
        converted_img = cv2.warpAffine(src.solid, M, dsize=(dst.solid.shape[1], dst.solid.shape[0]), flags=cv2.INTER_NEAREST)
        similarity = binary_image_similarity(converted_img, dst.solid)
        if similarity > max_similarity:
            M_rtn = M
            max_similarity = similarity
            max_converted_img = converted_img
    return M_rtn, max_similarity, max_converted_img

def get_similarity_with_template(target_data, template_data, sim_th_high=0.95, sim_th_low=0.7):
    _,(w1,h1), _ = target_data.rrect
    _,(w2,h2), _ = template_data.rrect
    r = w1/h1 if w1 < h1 else h1/w1
    r = r * h2/w2 if w2 < h2 else r * w2/h2
    M, sim_init, _ = get_transform_by_rotated_rectangle(template_data, target_data)
    if sim_init > sim_th_high or sim_init < sim_th_low or r > 1.4 or r < 0.7:
        dsize = (template_data.solid.shape[1], template_data.solid.shape[0])
        flags = cv2.INTER_NEAREST|cv2.WARP_INVERSE_MAP
        converted_img = cv2.warpAffine(target_data.solid, M, dsize=dsize, flags=flags)
        return sim_init, converted_img
    M, _ = icp(template_data.pts, target_data.pts, initial_matrix=M)
    Minv = cv2.invertAffineTransform(M)
    converted_ctr = np.zeros_like(target_data.ctr)
    for i in range(target_data.ctr.shape[0]):
        converted_ctr[i,0,:] = (Minv[:,0:2] @ target_data.ctr[i,0,:]) + Minv[:,2]
    converted_img = create_solid_contour(converted_ctr, img_shape=template_data.solid.shape)
    val = binary_image_similarity(converted_img, template_data.solid)
    return val, converted_img

def get_similarity_with_template_zero(target_data, template_data):
    dsize = (template_data.solid.shape[1], template_data.solid.shape[0])
    converted_img = cv2.resize(target_data.solid, dsize=dsize, interpolation=cv2.INTER_NEAREST)
    val = binary_image_similarity(converted_img, template_data.solid)
    return val, converted_img

def get_similarities(target, templates):
    similarities = []
    converted_imgs = []
    for tmpl in templates:
        if tmpl.num == 0:
            sim,converted_img = get_similarity_with_template_zero(target, tmpl)
        else:
            sim,converted_img = get_similarity_with_template(target, tmpl)
        similarities += [sim]
        converted_imgs += [converted_img]
    return similarities, converted_imgs

def calc_harupan(img, templates, svm):
    ctrs, resized_img = detect_candidate_contours(img)
    print('Number of candidates: ', len(ctrs))
    subctr_datasets = [contour_dataset(create_contour_area_image(resized_img, ctr)[1]) for ctr in ctrs]
    ########
    #### Simple code
    # similarities = [get_similarities(d, templates)[0] for d in subctr_datasets]
    #### Code printing progress
    similarities = []
    for i,d in enumerate(subctr_datasets):
        print(i, end=' ')
        similarities += [get_similarities(d, templates)[0]]
    print('')
    ########
    _, result = svm.predict(np.array(similarities, 'float32'))
    result = result.astype('int')
    score = 0.0
    texts = {0:'0', 1:'1', 2:'2', 3:'3', 5:'.5'}
    font = cv2.FONT_HERSHEY_SIMPLEX
    for res, ctr in zip(result, ctrs):
        if res[0] == 5:
            score += 0.5
        elif res[0] != -1:
            score += res[0]
        
        # Annotating recognized numbers for confirmation
        if res[0] != -1:
            resized_img = cv2.drawContours(resized_img, [ctr], -1, (0,255,0), 3)
            x,y,_,_ = cv2.boundingRect(ctr)
            resized_img = cv2.putText(resized_img, texts[res[0]], (x,y), font, 1, (230,230,0), 5)
    return score, resized_img

######################################################
# Loading template data and SVM model
######################################################
def load_svm(filename):
    return cv2.ml.SVM_load(filename)

def load_templates(filename):
    with open(filename, mode='r') as f:
        load_data = json.load(f)
        templates_rtn = []
        for d in load_data:
            templates_rtn += [template_dataset(np.array(d['ctr']), d['num'], d['pts'])]
    return templates_rtn