Sunday, February 12, 2017

Measuring the dimension of an object using OpenCV and Python: Solving the Bugs!

Just a quick review: here is the list of parameters that we need to pass to our program manually. If any of them is missing, the program will not run. Here they are:
  1. working directory
  2. a set of image filename extensions
  3. the coin's diameter and the measurement unit
  4. width of the resized image
  5. rotation angle
  6. Gaussian kernel size
  7. minimum and maximum values of the Canny hysteresis thresholds
  8. iterations of the morphological transformation
Now we need to write them down in a .txt file. Here are the parameters that I wrote down in mine.

directory = D:\arnold\blog\pydonesia.blogspot.co.id\2017\02. Feb\week2\sample_images
image_file_extensions = jpg, jpeg, tif, tiff, bmp, png
coin_diameter = 24
unit = mm
resize_width = 700
rotate_angle = 0
blur = 5
cannyMin = 50
cannyMax = 100
edge_iterations = 1

Now we feed this .txt file (the script below expects it to be named parameters.txt) into the top-level script of our program, measure_object_dimension.py. Here is the modified version.

###############################################
# usage: (python) measure_object_dimension.py #
###############################################

from pydonesia import ComputerVision
import os

def get_parameters_from_txt(txt_file):
    d = dict() # initializing an empty dictionary
    with open(txt_file) as f:
        content = f.readlines()
        for s in content:
            if '=' not in s: continue # skip blank or malformed lines
            key, value = s.split('=', 1) # split at the first '=' only
            if not ',' in value: d[key.strip()] = value.strip()
            else:
                # a comma-separated value becomes a list of stripped strings
                lnew = []
                for item in value.split(','):
                    lnew.append(item.strip())
                d[key.strip()] = lnew
    return d # return a dictionary with the predetermined parameters

cv = ComputerVision()
d = get_parameters_from_txt('parameters.txt')

wd = os.path.join(d['directory']) # new on Feb 3rd, 2017
file_all = os.listdir(wd) # new on Feb 3rd, 2017

# a single extension is parsed as a string, so wrap it in a list first
extensions = d['image_file_extensions']
if not isinstance(extensions, list): extensions = [extensions]
extensions = tuple('.' + e.lower().lstrip('.') for e in extensions)

images = []
for f in file_all:
    # match the end of the filename only, not any substring of it
    if f.lower().endswith(extensions): images.append(f) # new on Feb 3rd, 2017

for i in images:
    image = os.path.join(d['directory'], i) # new on Feb 3rd, 2017
    cv.measure_object_dimension(image, coin_diameter = int(d['coin_diameter']), unit = d['unit'],
                             resize_width=int(d['resize_width']), rotate_angle=int(d['rotate_angle']), blur=(int(d['blur']),int(d['blur'])),
                                cannyMin=int(d['cannyMin']), cannyMax=int(d['cannyMax']), edge_iterations=int(d['edge_iterations'])) # new on Feb 3rd, 2017

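We need to add a new function to this Python script. I named it get_parameters_from_txt (lines 8-22). Basically, it parses our .txt file and builds a dictionary out of it: each line is split at the '=' sign, the left side becomes the key, and the right side becomes the value. A comma-separated right side is stored as a list.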
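To make the result of the parsing easier to picture, here is a minimal sketch of the same idea. Note that parse_parameters is a hypothetical helper that I made up for this illustration (it takes a list of lines instead of a file name, so you can try it without creating a file), and the sample lines are only an excerpt of the real parameters.

```python
def parse_parameters(lines):
    """Hypothetical helper: same idea as get_parameters_from_txt, fed with a list of lines."""
    params = {}
    for line in lines:
        if '=' not in line:
            continue  # skip blank lines and anything that is not "key = value"
        key, value = line.split('=', 1)  # split at the first '=' only
        items = [item.strip() for item in value.split(',')]
        # a single item stays a plain string, several items become a list
        params[key.strip()] = items[0] if len(items) == 1 else items
    return params


# a small excerpt of the parameter file, written as a list of lines
sample = [
    "image_file_extensions = jpg, jpeg, png\n",
    "coin_diameter = 24\n",
    "\n",
    "unit = mm\n",
]

params = parse_parameters(sample)
print(params['image_file_extensions'])  # ['jpg', 'jpeg', 'png']
print(int(params['coin_diameter']))     # 24
```

Every value comes out of the parser as a string (or a list of strings), which is why the main script wraps the numeric ones in int() before passing them on.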

Next, let us take a look at the code in the pydonesia.py script. Here is the full code of pydonesia.py.

###############################
# last updated: Feb 3rd, 2017 #
###############################

from cv_utilities import Utilities as utils
import cv2
import os

class ComputerVision:
    def __init__(self):
        self.utils = utils()
        self.pixelsPerMetric = None
        
    def measure_object_dimension(self, image, coin_diameter, unit,
                                 resize_width, rotate_angle, blur, cannyMin, cannyMax, edge_iterations): # updated on Feb 3rd, 2017
        
        utils = self.utils
        pixelsPerMetric = self.pixelsPerMetric
        
        # I. GET ALL OBJECTS IN THE IMAGE
        # step I.1: load the image, convert it to grayscale, and blur it slightly
        resized, blurred = utils.optimize_image(image, resize_width, rotate_angle, blur)
        
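        # step I.2: perform edge detection, then perform a dilation + erosion to close gaps in between object edges
        # (note: edge_iterations is accepted above but not forwarded; detect_edge always uses iterations=1)
        edge = utils.detect_edge(blurred, cannyMin, cannyMax)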
        
        # step I.3: find and sort objects (sort from left-to-right)
        objs = utils.detect_and_sort_objects(edge)
        
        # II. LOOP OVER THE OBJECTS IDENTIFIED
        for obj in objs:
            # step II.1: compute the bounding box of the object and draw the box (rectangle)
            box, original_image = utils.create_bounding_box(resized, obj)
            
            # step II.2: mark the corners of the box
            utils.mark_corners(box, original_image)
            
            # step II.3: compute the midpoints and mark them
            tltrX, tltrY, blbrX, blbrY, tlblX, tlblY, trbrX, trbrY = utils.get_midpoints(box, original_image)
            
            # step II.4: compute the Euclidean distance between the midpoints
            dA, dB = utils.get_distances(tltrX, tltrY, blbrX, blbrY, tlblX, tlblY, trbrX, trbrY)
            
            # step II.5: perform the calibration pixel to millimeters if the pixels per metric has not been initialized
            if pixelsPerMetric is None: pixelsPerMetric = dB / coin_diameter
                
            # step II.6: compute the dimension of the object and show them on the image
            utils.get_dimensions(dA, dB, pixelsPerMetric, original_image, unit, tltrX, tltrY, trbrX, trbrY)
            
            cv2.imshow(image, original_image)
            cv2.waitKey(0)
        
        cv2.destroyAllWindows()

Not much has changed. We only had to update a few lines so that the values come from the .txt file instead of being hard-coded in the script. Thanks to object-oriented programming!
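Steps II.5 and II.6 are just two divisions, so a tiny worked example may help. This is only a sketch: the function names calibrate and to_real_units are hypothetical, and the pixel values are numbers I made up, not measurements from the sample images.

```python
def calibrate(reference_width_px, reference_width_real):
    """Step II.5: how many pixels correspond to one unit (e.g. 1 mm)."""
    return reference_width_px / reference_width_real


def to_real_units(length_px, pixels_per_metric):
    """Step II.6: convert a length in pixels to the calibrated unit."""
    return length_px / pixels_per_metric


# assumed numbers: the coin (24 mm) spans 120 px in the resized image
pixels_per_metric = calibrate(120.0, 24)
print(pixels_per_metric)  # 5.0 pixels per mm

# another object whose bounding box measures 300 px by 150 px
print(to_real_units(300.0, pixels_per_metric))  # 60.0
print(to_real_units(150.0, pixels_per_metric))  # 30.0
```

Keep in mind that the objects are sorted from left to right and the first one is used for the calibration, so the coin has to be the leftmost object in the image. If something else (noise, for example) is detected first, every measurement after it will be wrong.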

Okay, so far so good! Let us test our new code!
Oops... The bugs are still there! If you recall how the objects in the image are detected, we first need to blur the original image slightly. The purpose of the Gaussian blur is to avoid false-positive detections, or in other words to prevent noise from being detected as an object. In the edge window you can see a white dot that I circled in red. This is the noise.
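Why does a bigger kernel help against noise? Here is a rough one-dimensional sketch in plain Python. It is my own simplified model, not what OpenCV does internally: the signals are made up, the borders are handled by repeating the edge value, and sigma is derived from the kernel size with the formula that the OpenCV documentation gives for a sigma of 0 (the kernels OpenCV really uses for small sizes may differ a bit).

```python
import math


def gaussian_kernel(ksize):
    """Normalized 1D Gaussian weights for an odd kernel size."""
    # assumption: sigma derived from ksize as documented by OpenCV for sigma = 0
    sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8
    half = ksize // 2
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(signal, ksize):
    """Smooth a list of values; borders are handled by repeating the edge value."""
    kernel = gaussian_kernel(ksize)
    half = ksize // 2
    blurred = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), len(signal) - 1)
            acc += w * signal[j]
        blurred.append(acc)
    return blurred


def max_gradient(signal):
    """Largest difference between two neighbouring samples."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))


# a single bright pixel (noise) and a real object boundary (a step)
spike = [0.0] * 41
spike[20] = 255.0
step = [0.0] * 20 + [255.0] * 21

spike_gradient = {}
step_gradient = {}
for ksize in (5, 9, 13):
    spike_gradient[ksize] = max_gradient(blur_1d(spike, ksize))
    step_gradient[ksize] = max_gradient(blur_1d(step, ksize))
    print(ksize, round(spike_gradient[ksize], 1), round(step_gradient[ksize], 1))
```

In this toy model the gradient of the noise pixel shrinks as the kernel grows, so it becomes less likely to pass the edge detector. The gradient of the real boundary shrinks too, only more slowly, which is why too much blur can eventually make a real object disappear as well.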

So what should we do? I usually play with the Gaussian kernel size first. Previously I set it to 5, but the noise was still there. So now I will set it to 9 and leave the other parameters at their default values.

directory = D:\arnold\blog\pydonesia.blogspot.co.id\2017\02. Feb\week2\sample_images
image_file_extensions = jpg, jpeg, tif, tiff, bmp, png
coin_diameter = 24
unit = mm
resize_width = 700
rotate_angle = 0
blur = 9
cannyMin = 50
cannyMax = 100
edge_iterations = 1

Great! Now the noise is gone, so the first object detected (from left to right) is the coin, which is the one used for calibration. No more misdetected objects. But what happens if we keep increasing the Gaussian kernel size? Let us give it a try!

The coin is gone because the image is too smooth
Oh come on! Where is the coin? In a situation like this, you may want to undo the increase in the Gaussian kernel size. However, in some cases we need to keep the kernel size relatively large, even though the image tends to become oversmoothed. One reason is to significantly reduce the noise in the image, for example if the image was taken at a relatively high ISO.

If that is the case, we can lower the minimum and maximum Canny hysteresis thresholds. So here are my parameters now.

directory = D:\arnold\blog\pydonesia.blogspot.co.id\2017\02. Feb\week2\sample_images
image_file_extensions = jpg, jpeg, tif, tiff, bmp, png
coin_diameter = 24
unit = mm
resize_width = 700
rotate_angle = 0
blur = 13
cannyMin = 30
cannyMax = 70
edge_iterations = 1

Now the coin is visible to our program again. However, the mechanoreceptors on the second glass slide are gone. So for our current case it is better to keep the Canny hysteresis thresholds at their default values and adjust only the Gaussian kernel size. But please keep in mind that in some cases you need to adjust both the Canny hysteresis thresholds and the Gaussian kernel size to get the optimum outcome.
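If you wonder what the two thresholds actually do, here is a very simplified one-dimensional sketch of hysteresis thresholding. The real Canny detector works in 2D and also thins the edges first, so please treat this as an illustration only. The function hysteresis_1d and the gradient values are my own assumptions, not part of OpenCV.

```python
def hysteresis_1d(gradient, low, high):
    """Simplified 1D hysteresis: keep strong samples, plus weak ones connected to them."""
    strong = [g >= high for g in gradient]
    weak = [low <= g < high for g in gradient]
    keep = list(strong)
    changed = True
    while changed:  # repeat until no more weak samples can be attached
        changed = False
        for i, is_weak in enumerate(weak):
            if is_weak and not keep[i]:
                left = i > 0 and keep[i - 1]
                right = i < len(gradient) - 1 and keep[i + 1]
                if left or right:
                    keep[i] = True
                    changed = True
    return keep


# made-up gradient magnitudes: a blurred edge (peak 80) and one isolated bump (55)
gradient = [0, 40, 60, 80, 60, 40, 0, 0, 55, 0]

default_result = hysteresis_1d(gradient, 50, 100)
lowered_result = hysteresis_1d(gradient, 30, 70)

print(sum(default_result))  # 0, nothing reaches the high threshold of 100
print(sum(lowered_result))  # 5, the whole blurred edge is kept
print(lowered_result[8])    # False, the isolated bump is not connected to a strong sample
```

So after heavy blurring, an edge whose gradient no longer reaches the maximum threshold is dropped completely, and lowering the thresholds brings it back. That is the idea behind changing cannyMin and cannyMax together with the kernel size.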

Sounds troublesome? Well, it is indeed a trial-and-error process. That is why we created the .txt file: it lets the user adjust the values above without changing anything in the script. So once you freeze the program (for example by converting it into an .exe), you can still adjust the parameters to tweak or optimize it, even though it is not an "on the fly" process. From my perspective, this is more practical than changing the values right in your scripts.
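Since the values are now typed by hand, a typo is easy to make. Below is a small sketch of a checker you could run before the measurement starts. validate_parameters is a hypothetical function (it is not part of the scripts in this post) and it only covers three checks that I am sure about: cv2.GaussianBlur needs a positive odd kernel size, a minimum threshold that is not below the maximum is most likely a typo, and the coin diameter is used as a divisor so it must be greater than zero.

```python
def validate_parameters(params):
    """Hypothetical checker: returns a list of problems found in the parsed parameters."""
    problems = []

    blur = int(params['blur'])
    if blur <= 0 or blur % 2 == 0:
        problems.append('blur must be a positive odd number, got %d' % blur)

    if int(params['cannyMin']) >= int(params['cannyMax']):
        problems.append('cannyMin should be smaller than cannyMax')

    if float(params['coin_diameter']) <= 0:
        problems.append('coin_diameter must be greater than zero')

    return problems


# values as they come out of the .txt file (all strings)
good = {'blur': '9', 'cannyMin': '50', 'cannyMax': '100', 'coin_diameter': '24'}
bad = {'blur': '8', 'cannyMin': '50', 'cannyMax': '100', 'coin_diameter': '24'}

print(validate_parameters(good))  # []
print(validate_parameters(bad))   # ['blur must be a positive odd number, got 8']
```

Returning a list of messages instead of raising on the first error lets the user fix everything in the .txt file in one go.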

Our lowest-level Python script, cv_utilities.py, has no changes at all, so I will just copy and paste it.

##################################
# last updated: February 3rd 2017 #
##################################

from __future__ import print_function, division
from builtins import input
import imutils
import numpy as np
import cv2

class Utilities:
    def __init__(self):
        print('for more details kindly visit www.pydonesia.blogspot.co.id')
        
    def optimize_image(self, filename, resize_width, rotate_angle, blur):
        image = cv2.imread(filename)
        image = imutils.resize(image, width = resize_width)
        image = imutils.rotate(image, angle = rotate_angle)
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        gray = cv2.GaussianBlur(gray, blur ,0)
        return image, gray
    
    def detect_edge(self, image, cannyMin, cannyMax):
        edged = cv2.Canny(image, cannyMin, cannyMax)
        edged = cv2.dilate(edged, None, iterations=1)
        edged = cv2.erode(edged, None, iterations=1)
        return edged
    
    def detect_and_sort_objects(self, image):
        from imutils import contours
        
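        cnts = cv2.findContours(image.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        # findContours returns (contours, hierarchy) on OpenCV 2.4 and (image, contours, hierarchy) on OpenCV 3;
        # OpenCV 4 returns two values again, so use imutils.grab_contours(cnts) there instead of the line below
        cnts = cnts[0] if imutils.is_cv2() else cnts[1]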
        (cnts, _) = contours.sort_contours(cnts)
        return cnts
    
    def create_bounding_box(self, image, target_object, draw=True):
        from imutils import perspective
        
        orig = image.copy()
        box = cv2.minAreaRect(target_object)
        box = cv2.cv.BoxPoints(box) if imutils.is_cv2() else cv2.boxPoints(box)
        box = np.array(box, dtype = 'int')

        '''
        order the points in the object such that they appear in top-left, top-right,
        bottom-right, and bottom-left order, then draw the outline of the rotated
        bounding box
        '''
        box = perspective.order_points(box)
        if draw==True: cv2.drawContours(orig, [box.astype('int')], -1, (0, 255, 0), 1)
        return box, orig
    
    def mark_corners(self, box, image):
        for (x, y) in box:
            cv2.circle(image, (int(x), int(y)), 3, (0,0,255), -1)
            
    def get_midpoints(self, box, image, draw=True):
        def midpoint(ptA, ptB):
            return ((ptA[0] + ptB[0]) * 0.5, (ptA[1] + ptB[1]) * 0.5)
        
        # unpack the ordered bounding box
        (tl, tr, br, bl) = box
        
        # compute the midpoint between the top-left and top-right, followed by the midpoint between bottom-left and bottom-right
        (tltrX, tltrY) = midpoint(tl, tr)
        (blbrX, blbrY) = midpoint(bl, br)
        
        # compute the midpoint between the top-left and bottom-left points, followed by the midpoint between the top-right and bottom-right
        (tlblX, tlblY) = midpoint(tl, bl)
        (trbrX, trbrY) = midpoint(tr, br)
        
        if draw:
            # draw the midpoints on the image
            cv2.circle(image, (int(tltrX), int(tltrY)), 3, (255, 0, 0), -1)
            cv2.circle(image, (int(blbrX), int(blbrY)), 3, (255, 0, 0), -1)
            cv2.circle(image, (int(tlblX), int(tlblY)), 3, (255, 0, 0), -1)
            cv2.circle(image, (int(trbrX), int(trbrY)), 3, (255, 0, 0), -1)
            
            # draw lines between the midpoints
            cv2.line(image, (int(tltrX), int(tltrY)), (int(blbrX), int(blbrY)), (255, 0, 255), 1)
            cv2.line(image, (int(tlblX), int(tlblY)), (int(trbrX), int(trbrY)), (255, 0, 255), 1)
            
        return tltrX, tltrY, blbrX, blbrY, tlblX, tlblY, trbrX, trbrY
    
    def get_distances(self, tltrX, tltrY, blbrX, blbrY, tlblX, tlblY, trbrX, trbrY):
        from scipy.spatial import distance as dist
        dA = dist.euclidean((tltrX, tltrY), (blbrX, blbrY))
        dB = dist.euclidean((tlblX, tlblY), (trbrX, trbrY))
        return dA, dB
    
    def get_dimensions(self, dA, dB, ratio, image, unit, tltrX, tltrY, trbrX, trbrY):
        dimA = dA / ratio
        dimB = dB / ratio
        # draw the dimensions on the image
        cv2.putText(image, "{:.1f}{}".format(dimA, unit), (int(tltrX - 15), int(tltrY - 10)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)
        cv2.putText(image, "{:.1f}{}".format(dimB, unit), (int(trbrX + 10), int(trbrY)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)

Well, that is all for this week. I hope you enjoyed reading this post. As always, thanks for reading. If you have any comments or questions, please leave them in the comment section below. See you in my next post!

2 comments:

  1. Hello, first of all thanks for taking the time to write out this beautiful stuff. However, when I used the code above to calculate the dimensions, I was not getting correct results.

    For example, for a 600 ml Coke bottle I should get length = 250 mm and width = 64 mm, but I am not getting this.

    Could you please help me out with this?
    Thanks in advance

  2. Is this only working on a coin or I can use it for any flat object?
