Simple segmentation in computer vision
Image segmentation is used in many applications including extracting text, medical imaging, and industrial imaging.
In AI, especially in autonomous driving, image segmentation supports tasks such as object detection and obstacle recognition.
This article explains how to visualize segmentation, how to generate a segmented image, and, through a case study, why segmentation matters in computer vision.
A segmentation application
Astronomy enthusiasts like to look up at the stars and are especially interested in stars that are particularly bright.
I once participated in a deep learning project where the organizers wanted to use a model to automatically identify what they considered to be valuable stars, and one of the conditions for identification was brightness (there were dozens of other conditions, of course, which I won't go into here).
The input images captured by the telescope look like the following
The organizers are interested in bright highlights such as these
The task of the model is to identify these highlights automatically.
Label definition
This article doesn't have much to do with deep learning, so why use this case? Because it is a place where image segmentation can be used. Let me explain.
- The training of a supervised deep learning model requires target labels.
- Our prediction is made for each pixel of the image, based on its brightness.
Based on the above, we have to assign a label to each pixel. If we define only two classes (bright enough: 1, otherwise: 0), then the label on each pixel is either 0 or 1.
Let's define a toy image as an example.
Suppose we have the following digital image
[[0, 2, 2],
[1, 1, 1],
[1, 1, 2]]
Following the rule above, the label is set to 1 for every pixel whose value is > 1, and to 0 otherwise
[[0, 1, 1],
[0, 0, 0],
[0, 0, 1]]
To plot this label easily, we transform it into a grayscale image
[[0, 255, 255],
[0, 0, 0],
[0, 0, 255]]
So we need an algorithm that gives us both the label and the grayscale image.
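The toy example above can be checked in a few lines of NumPy. This is only a sketch of the arithmetic, not the algorithm itself; the names toy_image, toy_label and toy_gray are made up for illustration.

```python
import numpy as np

# The toy image from the text (values are arbitrary brightness levels).
toy_image = np.array([[0, 2, 2],
                      [1, 1, 1],
                      [1, 1, 2]])

# Label: 1 where the pixel value is > 1, otherwise 0.
toy_label = (toy_image > 1).astype(np.uint8)

# Grayscale version for plotting: label 1 becomes 255, label 0 stays 0.
toy_gray = toy_label * 255

print(toy_label)
print(toy_gray)
```

The two printed arrays should match the label and grayscale matrices written out above.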
Threshold, basic idea and algorithm
Thresholding an image takes a threshold: if a particular pixel (i, j) is greater than the threshold, it is set to some value, usually 1 or 255; otherwise, it is set to another value, usually 0. We can write a Python function that performs thresholding and outputs a new image given an input grayscale image:
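import numpy as np

def thresholding(input_img, threshold, max_value=255, min_value=0):
    # input_img is a 2-D grayscale array with N rows and M columns
    N, M = input_img.shape
    image_out = np.zeros((N, M), dtype=np.uint8)  # grayscale image for plotting
    label_out = np.zeros((N, M), dtype=np.uint8)  # 0/1 label for each pixel
    for i in range(N):
        for j in range(M):
            if input_img[i, j] > threshold:
                # bright enough: highlight the pixel and label it 1
                image_out[i, j] = max_value
                label_out[i, j] = 1
            else:
                # not bright enough: darken the pixel and label it 0
                image_out[i, j] = min_value
                label_out[i, j] = 0
    return image_out, label_out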
That's it. Using this function, we can generate labels for our DL training. With 1000 images as training samples, we call it 1000 times, once per image, to get 1000 label_out arrays as training labels, plus 1000 grayscale images for visualization (which can be ignored if they are not needed).
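Calling the function once per image works, but if all images share the same shape, one possible shortcut is to stack them into a single array and threshold the whole stack at once. The sketch below assumes that setup; threshold_batch is a hypothetical helper name, and the random batch is only a stand-in for real telescope images.

```python
import numpy as np

def threshold_batch(images, threshold, max_value=255, min_value=0):
    """Threshold an array of grayscale images of any shape, e.g. (K, N, M)."""
    images = np.asarray(images)
    mask = images > threshold              # True where the pixel is bright enough
    label_out = mask.astype(np.uint8)      # 1/0 label for each pixel
    image_out = np.where(mask, max_value, min_value).astype(np.uint8)
    return image_out, label_out

# Hypothetical stand-in for a training set: 1000 random 32x32 8-bit images.
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(1000, 32, 32), dtype=np.uint8)

gray, labels = threshold_batch(batch, threshold=200)
print(gray.shape, labels.shape)
```

The comparison is applied element-wise, so the per-pixel rule is the same as in the loop version; only the looping is left to NumPy.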
Check out the code
Do segmentation from scratch.
Use OpenCV threshold to do the segmentation and plot the intensity histogram.
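An intensity histogram is a quick way to see where a reasonable threshold might sit. The sketch below is not the linked code: it uses plain NumPy rather than OpenCV, a synthetic image invented for illustration, and an assumed threshold of 128.

```python
import numpy as np

# Hypothetical 8-bit star-field stand-in: dark sky with a few bright pixels.
rng = np.random.default_rng(0)
image = rng.integers(0, 40, size=(64, 64), dtype=np.uint8)  # background values 0..39
image[10:13, 20:23] = 250  # a bright 3x3 "star"
image[40:42, 50:52] = 230  # a second, 2x2 "star"

# Count how many pixels take each intensity value 0..255.
hist = np.bincount(image.ravel(), minlength=256)

# An assumed threshold placed in the empty gap between background and stars.
threshold = 128
bright_pixels = int(hist[threshold + 1:].sum())
print(bright_pixels)  # 13: the 9 + 4 star pixels
```

Here hist[v] is the number of pixels with intensity v, so summing the bins above the threshold counts exactly the pixels that thresholding would label 1.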