I always like pseudocode. It makes writing the complete code much easier. Just give it a try.
So here is the code; I named the Python file measure_object_dimension.py.
###############################################
# usage: (python) measure_object_dimension.py #
###############################################

from pydonesia import ComputerVision
import os

cv = ComputerVision()

# collect every '.jpg' file in the current working directory
cwd = os.getcwd()
file_all = os.listdir(cwd)

images = []
for f in file_all:
    if f.lower().endswith('.jpg'):
        images.append(f)

# measure the objects in every image, using a 24 mm coin as the reference
for image in images:
    cv.measure_object_dimension(image, coin_diameter=24, unit='mm')
Basically, this script is the highest level of my program. It passes every image in a given folder (here I only treat files with the suffix '.jpg' as images; you might want to add other suffixes such as '.png', '.bmp', etc.) to the method called measure_object_dimension in the ComputerVision class, which lives in a Python script called pydonesia.py. What is the output of this code? Our final goal: each object's dimensions drawn on the image.
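If you want to pick up more image types, note that str.endswith also accepts a tuple of suffixes. A minimal sketch (the extension list is just an illustration):

import os

# accept several common image extensions instead of only '.jpg'
allowed = ('.jpg', '.jpeg', '.png', '.bmp')
cwd = os.getcwd()
images = [f for f in os.listdir(cwd) if f.lower().endswith(allowed)]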
How measure_object_dimension.py works
Now let us take a look at the Python script called pydonesia.py.
###############################
# last updated: Jan 3rd, 2017 #
###############################

from cv_utilities import Utilities as utils
import cv2
import os


class ComputerVision:

    def __init__(self):
        self.utils = utils()
        self.pixelsPerMetric = None

    def measure_object_dimension(self, image, coin_diameter, unit,
                                 resize_width=700, rotate_angle=0, blur=(5, 5),
                                 cannyMin=50, cannyMax=100, edge_iterations=1):
        utils = self.utils
        pixelsPerMetric = self.pixelsPerMetric

        # I. GET ALL OBJECTS IN THE IMAGE
        # step I.1: load the image, convert it to grayscale, and blur it slightly
        resized, blurred = utils.optimize_image(image, resize_width, rotate_angle, blur)

        # step I.2: perform edge detection, then perform a dilation + erosion to close gaps in between object edges
        edge = utils.detect_edge(blurred, cannyMin, cannyMax)

        # step I.3: find and sort objects (sort from left-to-right)
        objs = utils.detect_and_sort_objects(edge)

        # II. LOOP OVER THE OBJECTS IDENTIFIED
        for obj in objs:
            # step II.1: compute the bounding box of the object and draw the box (rectangle)
            box, original_image = utils.create_bounding_box(resized, obj)

            # step II.2: mark the corners of the box
            utils.mark_corners(box, original_image)

            # step II.3: compute the midpoints and mark them
            tltrX, tltrY, blbrX, blbrY, tlblX, tlblY, trbrX, trbrY = utils.get_midpoints(box, original_image)

            # step II.4: compute the Euclidean distance between the midpoints
            dA, dB = utils.get_distances(tltrX, tltrY, blbrX, blbrY, tlblX, tlblY, trbrX, trbrY)

            # step II.5: perform the pixel-to-millimeter calibration if the pixels-per-metric ratio has not been initialized
            if pixelsPerMetric is None:
                pixelsPerMetric = dB / coin_diameter

            # step II.6: compute the dimensions of the object and show them on the image
            utils.get_dimensions(dA, dB, pixelsPerMetric, original_image, unit, tltrX, tltrY, trbrX, trbrY)

            cv2.imshow(image, original_image)
            cv2.waitKey(0)

        cv2.destroyAllWindows()
Now let us take a look at the output of each step. I started by creating a new class called ComputerVision. Every time this class is instantiated, it initializes the utilities from cv_utilities.py as utils and sets the pixels-per-metric ratio to None. In my case the metric is millimeters.
class ComputerVision:

    def __init__(self):
        self.utils = utils()
        self.pixelsPerMetric = None
Next, in order to make the computation less expensive, we convert the image to grayscale. Furthermore, to avoid false-positive objects, the image is slightly blurred.
# I. GET ALL OBJECTS IN THE IMAGE
# step I.1: load the image, convert it to grayscale, and blur it slightly
resized, blurred = utils.optimize_image(image, resize_width, rotate_angle, blur)
The output of the above code: the image is now grayscale and slightly blurred.
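For reference, the conversion and blur inside optimize_image (shown in full further down) boil down to two OpenCV calls. A minimal standalone sketch, with 'example.jpg' as a placeholder filename:

import cv2

image = cv2.imread('example.jpg')               # placeholder filename
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # grayscale is cheaper to process
gray = cv2.GaussianBlur(gray, (5, 5), 0)        # a light blur suppresses noise and false edges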
Now we are trying to get all the objects in the image.
# step I.2: perform edge detection, then perform a dilation + erosion to close gaps in between object edges
edge = utils.detect_edge(blurred, cannyMin, cannyMax)
Here are the so-called objects in the image.
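Under the hood (see detect_edge in cv_utilities.py below), the edge map is a Canny pass followed by a dilation and an erosion. A standalone sketch, again with a placeholder filename:

import cv2

gray = cv2.imread('example.jpg', cv2.IMREAD_GRAYSCALE)  # placeholder filename
edged = cv2.Canny(gray, 50, 100)                         # 50/100 match the cannyMin/cannyMax defaults
edged = cv2.dilate(edged, None, iterations=1)            # close small gaps between edge segments
edged = cv2.erode(edged, None, iterations=1)             # shrink the edges back to their original width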
The last step in order to get all the objects in the image is as follows.
# step I.3: find and sort objects (sort from left-to-right)
objs = utils.detect_and_sort_objects(edge)
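detect_and_sort_objects (shown in full below) wraps cv2.findContours and imutils' contour sorting. A standalone sketch of the same idea; note that newer imutils versions provide grab_contours, which handles the different findContours return shapes across OpenCV versions (on OpenCV 4, the cnts[1] index used in my utility would pick the hierarchy instead of the contours):

import cv2
import imutils
from imutils import contours

gray = cv2.imread('example.jpg', cv2.IMREAD_GRAYSCALE)   # placeholder filename
edge_map = cv2.Canny(gray, 50, 100)

cnts = cv2.findContours(edge_map.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)          # works across OpenCV 2/3/4 return shapes
(cnts, _) = contours.sort_contours(cnts)    # left-to-right, so the reference coin comes first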
Now we have all the objects in the image. Time to loop over the objects one by one from left to right, which means starting from the coin and then moving on to the other objects. While looping over the coin, we perform the calibration to get the pixels-per-metric ratio. In other words, we want to know how many pixels correspond to one millimeter.
# II. LOOP OVER THE OBJECTS IDENTIFIED
for obj in objs:
    # step II.1: compute the bounding box of the object and draw the box (rectangle)
    box, original_image = utils.create_bounding_box(resized, obj)
The above code draws the bounding box around the object with a green line on the original (resized) image, not on the image that was converted to grayscale and blurred.
This step is optional; you may skip it if you want. The code below draws a "dot" at each corner of the bounding box.
# step II.2: mark the corners of the box
utils.mark_corners(box, original_image)
Again, this is optional.
Now we have to determine the midpoints of the bounding box's edges; the dimensions of the object are then the distances between the two pairs of opposite midpoints.
# step II.3: compute the midpoints and mark them
tltrX, tltrY, blbrX, blbrY, tlblX, tlblY, trbrX, trbrY = utils.get_midpoints(box, original_image)
Corners are in red and midpoints are in blue.
The next step is to compute the distances between the midpoints (in pixels) and then convert them to the metric unit (millimeters). If pixelsPerMetric is still None, we use the already-known dimension (the coin's diameter, in this case) as the baseline. The code below does exactly that.
# step II.4: compute the Euclidean distance between the midpoints
dA, dB = utils.get_distances(tltrX, tltrY, blbrX, blbrY, tlblX, tlblY, trbrX, trbrY)

# step II.5: perform the pixel-to-millimeter calibration if the pixels-per-metric ratio has not been initialized
if pixelsPerMetric is None:
    pixelsPerMetric = dB / coin_diameter

# step II.6: compute the dimensions of the object and show them on the image
utils.get_dimensions(dA, dB, pixelsPerMetric, original_image, unit, tltrX, tltrY, trbrX, trbrY)
Here we go. We get the object's dimensions by counting the pixels it spans in the picture and then converting them to millimeters.
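To make the calibration concrete, here is a worked example with made-up numbers:

# suppose the coin spans 96 pixels across and its real diameter is 24 mm
pixelsPerMetric = 96 / 24             # -> 4.0 pixels per millimeter
# an object measured at 200 x 120 pixels is then
width_mm = 200 / pixelsPerMetric      # -> 50.0 mm
height_mm = 120 / pixelsPerMetric     # -> 30.0 mm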
The last piece of code in pydonesia.py produces a window containing the resulting image and then waits for a key to be pressed. Once any key on the keyboard is pressed, it moves on to measure the next object, looping over all the objects.
cv2.imshow(image, original_image)
cv2.waitKey(0)
Once all the objects have been looped over, the script moves on to the next image. Before it does so, the window is closed automatically.
cv2.destroyAllWindows()
Well, it is simple, no? I hope that up to this point my explanation has been clear enough. Here is the lower-level code in cv_utilities.py.
##################################
# last updated: January 3rd 2017 #
##################################

from __future__ import print_function, division
from builtins import input
import imutils
import numpy as np
import cv2


class Utilities:

    def __init__(self):
        print('for more details kindly visit www.pydonesia.blogspot.co.id')

    def optimize_image(self, filename, resize_width, rotate_angle, blur):
        image = cv2.imread(filename)
        image = imutils.resize(image, width=resize_width)
        image = imutils.rotate(image, angle=rotate_angle)
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        gray = cv2.GaussianBlur(gray, blur, 0)
        return image, gray

    def detect_edge(self, image, cannyMin, cannyMax):
        edged = cv2.Canny(image, cannyMin, cannyMax)
        edged = cv2.dilate(edged, None, iterations=1)
        edged = cv2.erode(edged, None, iterations=1)
        return edged

    def detect_and_sort_objects(self, image):
        from imutils import contours
        cnts = cv2.findContours(image.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        cnts = cnts[0] if imutils.is_cv2() else cnts[1]
        (cnts, _) = contours.sort_contours(cnts)
        return cnts

    def create_bounding_box(self, image, target_object, draw=True):
        from imutils import perspective
        orig = image.copy()
        box = cv2.minAreaRect(target_object)
        box = cv2.cv.BoxPoints(box) if imutils.is_cv2() else cv2.boxPoints(box)
        box = np.array(box, dtype='int')
        '''
        order the points in the object such that they appear in top-left, top-right,
        bottom-right, and bottom-left order, then draw the outline of the rotated bounding box
        '''
        box = perspective.order_points(box)
        if draw == True:
            cv2.drawContours(orig, [box.astype('int')], -1, (0, 255, 0), 1)
        return box, orig

    def mark_corners(self, box, image):
        for (x, y) in box:
            cv2.circle(image, (int(x), int(y)), 3, (0, 0, 255), -1)

    def get_midpoints(self, box, image, draw=True):
        def midpoint(ptA, ptB):
            return ((ptA[0] + ptB[0]) * 0.5, (ptA[1] + ptB[1]) * 0.5)

        # unpack the ordered bounding box
        (tl, tr, br, bl) = box

        # compute the midpoint between the top-left and top-right, followed by the midpoint between bottom-left and bottom-right
        (tltrX, tltrY) = midpoint(tl, tr)
        (blbrX, blbrY) = midpoint(bl, br)

        # compute the midpoint between the top-left and bottom-left points, followed by the midpoint between the top-right and bottom-right
        (tlblX, tlblY) = midpoint(tl, bl)
        (trbrX, trbrY) = midpoint(tr, br)

        if draw:
            # draw the midpoints on the image
            cv2.circle(image, (int(tltrX), int(tltrY)), 3, (255, 0, 0), -1)
            cv2.circle(image, (int(blbrX), int(blbrY)), 3, (255, 0, 0), -1)
            cv2.circle(image, (int(tlblX), int(tlblY)), 3, (255, 0, 0), -1)
            cv2.circle(image, (int(trbrX), int(trbrY)), 3, (255, 0, 0), -1)

            # draw lines between the midpoints
            cv2.line(image, (int(tltrX), int(tltrY)), (int(blbrX), int(blbrY)), (255, 0, 255), 1)
            cv2.line(image, (int(tlblX), int(tlblY)), (int(trbrX), int(trbrY)), (255, 0, 255), 1)

        return tltrX, tltrY, blbrX, blbrY, tlblX, tlblY, trbrX, trbrY

    def get_distances(self, tltrX, tltrY, blbrX, blbrY, tlblX, tlblY, trbrX, trbrY):
        from scipy.spatial import distance as dist
        dA = dist.euclidean((tltrX, tltrY), (blbrX, blbrY))
        dB = dist.euclidean((tlblX, tlblY), (trbrX, trbrY))
        return dA, dB

    def get_dimensions(self, dA, dB, ratio, image, unit, tltrX, tltrY, trbrX, trbrY):
        dimA = dA / ratio
        dimB = dB / ratio

        # draw the dimensions on the image
        cv2.putText(image, "{:.1f}{}".format(dimA, unit), (int(tltrX - 15), int(tltrY - 10)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)
        cv2.putText(image, "{:.1f}{}".format(dimB, unit), (int(trbrX + 10), int(trbrY)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)
That is all for this time. I hope you enjoyed it. As always, thanks for reading. Should you have any comments, please leave them in the comment section. See you in the next post!
P.S. Here are my sources of inspiration: PyImageSearch, Sentdex, and Learn Python The Hard Way.
I am a beginner learning OpenCV. Can we run this application through VideoCapture and apply it to every single frame from a video?
Yes, you can take each frame and run the whole script in a while loop, with a waitKey at the end.
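A rough sketch of that idea (since measure_object_dimension expects a filename, the frame is written to a temporary file first; alternatively you could adapt optimize_image to accept an array directly):

import cv2
from pydonesia import ComputerVision

cv = ComputerVision()
cap = cv2.VideoCapture(0)          # 0 = default webcam, or pass a video file path

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # measure_object_dimension reads its input with cv2.imread, so save the frame first
    cv2.imwrite('frame.jpg', frame)
    cv.measure_object_dimension('frame.jpg', coin_diameter=24, unit='mm')
    # note: measure_object_dimension already calls waitKey(0) per object,
    # so this loop advances one keypress at a time rather than in real time
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()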
Hello, very useful information. I am unable to run it on Windows; any suggestions will be highly appreciated.
Thanks in advance.
There is no library called pydonesia in the PyCharm module installer... please help!
It's a .py file (pydonesia.py) from which you import ComputerVision.