Simple segmentation in computer vision

Feb 18, 2022

Image segmentation is used in many applications including extracting text, medical imaging, and industrial imaging.

In the field of AI, especially in autonomous driving, image segmentation is mostly used for object detection, obstacle recognition, and so on.

This article uses a case study to explain how to generate and visualize a segmented image, and to show why segmentation matters in computer vision.

A segmentation application

Astronomy enthusiasts tend to look up at the stars and are especially interested in stars that are particularly bright.

I once participated in a deep learning project where the organizers wanted to use a model to automatically identify what they considered to be valuable stars, and one of the conditions for identification was brightness (there are dozens of other conditions, of course, which I won’t go into here).

Our input images, captured by the telescope, look as follows:

The organizers are interested in the brightest highlights, such as these:

The task of the model is to identify these highlights automatically.

Label definition

This article doesn’t have much to do with deep learning, so why use this case? Because image segmentation is exactly what produces the training labels here. Let me explain.

- Training a supervised deep learning model requires target labels.
- Our prediction is made for each pixel of the image, based on its brightness.

Based on the above, we have to assign a label to each pixel. If we define only two classes (bright enough: 1, otherwise: 0), then the label of each pixel is either 0 or 1.

Let’s define a toy image as an example.

Suppose we have the following digital image:

[[0, 2, 2],
[1, 1, 1],
[1, 1, 2]]

According to the definition above 👆, the label is set to 1 for every pixel whose value is greater than 1, and to 0 otherwise:

[[0, 1, 1],
[0, 0, 0],
[0, 0, 1]]

To plot this label easily, we transform it into a grayscale image by mapping 1 to 255:

[[0, 255, 255],
[0, 0, 0],
[0, 0, 255]]

So we need an algorithm that gives us both the label and the grayscale image.

Threshold, basic idea and algorithm

Thresholding an image takes a threshold: if a particular pixel (i, j) is greater than that threshold, it is set to some value, usually 1 or 255; otherwise it is set to another value, usually 0. We can write a Python function that performs thresholding and outputs a new image and its label, given some input grayscale image:

import numpy as np

def thresholding(input_img, threshold, max_value=255, min_value=0):
    # input_img is a 2-D grayscale array of shape (N, M)
    N, M = input_img.shape
    image_out = np.zeros((N, M), dtype=np.uint8)  # image used for plotting
    label_out = np.zeros((N, M), dtype=np.uint8)  # 0/1 label per pixel
    for i in range(N):
        for j in range(M):
            if input_img[i, j] > threshold:
                image_out[i, j] = max_value
                label_out[i, j] = 1
            else:
                image_out[i, j] = min_value
                label_out[i, j] = 0
    return image_out, label_out
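
The explicit double loop is easy to read but slow on large images. As a rough sketch of the same rule, here is a vectorized variant checked against the toy image from the previous section. The name `thresholding_vectorized` is introduced here for illustration only, and the sketch assumes the input is a NumPy array.

```python
import numpy as np

def thresholding_vectorized(input_img, threshold, max_value=255, min_value=0):
    """Vectorized variant of the thresholding rule (hypothetical helper name)."""
    mask = input_img > threshold  # True where the pixel is bright enough
    image_out = np.where(mask, max_value, min_value).astype(np.uint8)
    label_out = mask.astype(np.uint8)  # True/False becomes 1/0
    return image_out, label_out

# Toy image from the "Label definition" section
toy = np.array([[0, 2, 2],
                [1, 1, 1],
                [1, 1, 2]])

toy_image, toy_label = thresholding_vectorized(toy, threshold=1)
print(toy_label)
print(toy_image)
```

With a threshold of 1, the printed label and grayscale arrays should match the two matrices written out by hand earlier.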

That’s it. Using this method, we can generate labels for our DL training. When we have 1000 images as training samples, we call this method 1000 times to get 1000 label_out arrays as training labels, plus 1000 grayscale images for visualization (which can be ignored if they are not needed).
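
If all training images have the same size and fit into a single array, the 1000 calls can be collapsed into one operation on a stacked array. The sketch below is only an illustration under that assumption: `label_batch` is a hypothetical name, the threshold of 200 is arbitrary, and the random data stands in for real telescope images.

```python
import numpy as np

def label_batch(images, threshold):
    """Return (gray_images, labels) for a stack of shape (num_images, H, W)."""
    labels = (images > threshold).astype(np.uint8)  # 0/1 label per pixel
    gray_images = labels * np.uint8(255)            # 0/255 image for plotting
    return gray_images, labels

# Synthetic stand-in for 1000 grayscale training images of size 8x8
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(1000, 8, 8), dtype=np.uint8)

gray_images, labels = label_batch(images, threshold=200)
print(labels.shape)
```

Each slice `labels[k]` is then the training label for image `k`, and `gray_images[k]` is the matching picture for visual inspection.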

Check out the code

Do segmentation from scratch.

Use OpenCV threshold to do the segmentation and plot the intensity histogram.
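
The linked code is not reproduced here. As a dependency-free stand-in, the sketch below counts pixel intensities with NumPy, which is the data an intensity histogram plot would show and a common way to pick a threshold by eye. The helper name `intensity_histogram` is an assumption made for this example, and it expects an 8-bit grayscale image. In OpenCV, the thresholding step itself corresponds to `cv2.threshold` with the `cv2.THRESH_BINARY` flag.

```python
import numpy as np

def intensity_histogram(gray_img):
    """Count how many pixels take each intensity value 0..255."""
    return np.bincount(gray_img.ravel(), minlength=256)

# Toy image from the "Label definition" section, stored as 8-bit grayscale
img = np.array([[0, 2, 2],
                [1, 1, 1],
                [1, 1, 2]], dtype=np.uint8)

hist = intensity_histogram(img)
print(hist[:3])  # counts for intensities 0, 1 and 2 -> [1 5 3]
```

For the toy image, three of the nine pixels have a value above 1, which is exactly the set of pixels labeled 1 earlier.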