What is Image Segmentation?
To understand image segmentation, let’s first understand segmentation itself. What is segmentation?
Segmentation is the process of dividing an image into a variety of regions, called segments; which regions you create depends on what you are actually trying to do. We rarely need to process the entire image at once, because there will be regions in the image that do not contain any useful information.
Why do we need Image Segmentation?
Object detection draws a bounding box around each object in the image, but it tells us nothing about the shape of the object. Image segmentation, in contrast, creates a pixel-wise mask for each object in the image. This technique gives us a far more granular understanding of the object(s) in the image.
Image segmentation is therefore important wherever the shape of the object makes a major difference, such as detecting cancerous cells, where the exact shape plays a vital role in assessing the severity of the cancer. Plain object detection is not very useful here, while image segmentation makes a massive impact. There are many other applications where image segmentation shows great impact as well.
The Different Types of Image Segmentation
We can divide image segmentation techniques into two types. Consider the below images:
In image 1, all the pixels belonging to a particular class are represented by the same color. This is an example of semantic segmentation.
In image 2, each pixel is also assigned a class, but different objects of the same class are given different colors (person 1 as red, person 2 as green, the background as black, etc.). This is called instance segmentation.
So what we learnt is this: if there are 5 people in an image, semantic segmentation classifies all the people as a single group, while instance segmentation identifies each person individually.
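The difference can be made concrete with a tiny toy example (the 4x4 masks below are invented purely for illustration):

```python
import numpy as np

# Toy 4x4 "image" containing two separate objects of the same class
# (e.g. two people). 0 = background.
semantic_mask = np.array([
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
])  # every "person" pixel gets the same class label 1

instance_mask = np.array([
    [1, 1, 0, 2],
    [1, 1, 0, 2],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
])  # each person gets its own id: person 1 -> 1, person 2 -> 2

print(np.unique(semantic_mask))  # [0 1]   -> one class, no instances
print(np.unique(instance_mask))  # [0 1 2] -> one class, two instances
```

Both masks cover exactly the same pixels; only the labelling differs, which is the whole distinction between the two task types.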
We will combine these concepts with implementing them in Python. Let’s dive in.
Region-Based Segmentation
This technique determines regions directly: it separates the image into regions according to a given set of rules and then performs operations on each region.
Advantages
a. Simple calculations.
b. Fast operation speed.
c. This method performs really well when the object and background have high contrast.
Let’s implement what we’ve learned in this section. Download the image and then run the code.
First, we will import the libraries and plot the image.
It is a three-channel image (RGB). We convert it into grayscale so that we only have a single channel.
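Since the downloaded image isn’t bundled here, the sketch below builds a synthetic 192x263 RGB array as a stand-in (swap in `plt.imread('your_image.jpeg')` for the real file) and converts it to grayscale with the standard luma weights:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

# Self-contained stand-in for the downloaded image: a dark background
# with a bright rectangular "object".
rng = np.random.default_rng(0)
image = rng.uniform(0.0, 0.1, size=(192, 263, 3))
image[60:130, 90:180, :] = 0.9

print(image.shape)  # (192, 263, 3) -> three channels (RGB)

# Grayscale = weighted sum of the R, G, B channels (luma weights).
gray = image @ np.array([0.2125, 0.7154, 0.0721])
print(gray.shape)   # (192, 263) -> single channel

plt.imshow(gray, cmap="gray")
plt.savefig("gray.png")
```

The weighted sum collapses the last axis, leaving one intensity value per pixel.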
Now, apply a threshold to this image. The threshold separates the image into two parts: the foreground and the background. First, let’s check the shape of the image:
The height and width are 192 and 263 respectively. We use the mean pixel value as the threshold: if a pixel’s value is greater than the threshold, we say it belongs to an object; if it is less, it is treated as background. Let’s code this:
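A minimal sketch of the mean-threshold idea, again using a synthetic grayscale array in place of the downloaded image:

```python
import numpy as np

# Synthetic 192x263 grayscale stand-in: dark background, bright object.
rng = np.random.default_rng(0)
gray = rng.uniform(0.0, 0.1, size=(192, 263))
gray[60:130, 90:180] = 0.9

# Threshold = mean pixel value of the whole image.
mean_thresh = gray.mean()

# Pixels above the threshold -> object (1), below -> background (0).
segmented = (gray > mean_thresh).astype(float)

print(mean_thresh)
print(segmented[100, 130], segmented[0, 0])  # object pixel vs background pixel
```

Plotting `segmented` with `plt.imshow(segmented, cmap='gray')` would show the binary foreground/background mask.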
By applying more than one threshold value in this way, the image can be divided into multiple segments — here, four. Try setting different threshold values and check how the segments change. As noted above, this method is simple and fast, and performs really well when the object and background have high contrast.
Edge Detection Segmentation
What divides two objects in an image? An edge. Edges can be considered the discontinuous local features of an image.
We can make use of this discontinuity to detect edges and hence define a boundary of the object. How can we detect these edges? This is where we can make use of filters and convolutions.
The step-by-step process of how this works:
Take the weight matrix
Put it on top of the image
Perform element-wise multiplication and get the output
Move the weight matrix across the image as per the stride chosen
Continue convolving until all the pixels of the input have been covered
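The steps above can be sketched as a naive convolution loop (the 4x4 image and the [1, -1] kernel are made-up examples):

```python
import numpy as np

def convolve2d(image, kernel, stride=1):
    """Slide the weight matrix over the image and, at every position,
    take the element-wise product and sum it (valid padding)."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)  # element-wise multiply, sum
    return out

# A 4x4 image whose left half is bright (1) and right half dark (0).
img = np.array([[1., 1., 0., 0.],
                [1., 1., 0., 0.],
                [1., 1., 0., 0.],
                [1., 1., 0., 0.]])

# A tiny weight matrix that responds to a left-to-right intensity drop.
k = np.array([[1., -1.]])

result = convolve2d(img, k)
print(result)  # nonzero only where the intensity changes (the edge)
```

Every row of the output is `[0, 1, 0]`: the filter fires exactly at the column where the bright half meets the dark half.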
The output of the convolution is determined by the values of the weight matrix, and it is typically used to detect edges. The Sobel operator has two weight matrices: one for detecting horizontal edges and the other for detecting vertical edges. Now let’s implement them in Python.
It should be simple to see how the edges are detected in this image. Let’s convert it to grayscale and define the Sobel filters (both horizontal and vertical) that will be convolved over the image:
Let’s plot these results:
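A small sketch of the Sobel filters in action, using `scipy.ndimage.convolve` on a synthetic image with one sharp vertical edge (substitute your own grayscale array for `gray`):

```python
import numpy as np
from scipy import ndimage

# Synthetic 8x8 grayscale image: dark left half, bright right half,
# i.e. a single vertical edge down the middle.
gray = np.zeros((8, 8))
gray[:, 4:] = 1.0

# Sobel weight matrices: one responds to horizontal edges,
# the other to vertical edges.
sobel_h = np.array([[ 1.,  2.,  1.],
                    [ 0.,  0.,  0.],
                    [-1., -2., -1.]])
sobel_v = np.array([[-1., 0., 1.],
                    [-2., 0., 2.],
                    [-1., 0., 1.]])

edges_h = ndimage.convolve(gray, sobel_h, mode="constant")
edges_v = ndimage.convolve(gray, sobel_v, mode="constant")

# The vertical-edge filter fires along the column where intensity
# jumps; the horizontal-edge filter stays silent there.
print(abs(edges_v[4, 3]), abs(edges_h[4, 3]))
```

You can then plot `edges_h` and `edges_v` side by side with `plt.imshow(..., cmap='gray')` to see each filter picking up its own edge orientation.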
Mask R-CNN
Mask R-CNN (Mask Region-based Convolutional Neural Network) is an instance segmentation model. In this tutorial, we’ll see how it can be implemented in Python with the help of the OpenCV library.
Mask R-CNN is an extension of the Faster R-CNN object detection architecture: it adds a mask branch on top of the Faster R-CNN outputs. The Faster R-CNN part generates two things for each object in the image:
Its class
The bounding box coordinates
The image is taken as input and passed to a ConvNet, which returns the feature map for that image.
A region proposal network (RPN) is applied to these feature maps and returns object proposals along with their objectness scores.
A RoI pooling layer is applied to these proposals to bring them all down to the same size.
Finally, the proposals are passed to a fully connected layer that classifies them and outputs the bounding boxes for the objects; a parallel branch returns the mask for each proposal.
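As a rough illustration of that last step, here is a toy version of how a small predicted mask is pasted back onto the full image (the 15x15 mask size and nearest-neighbor resize are simplifications chosen for this sketch; the real model predicts a soft mask per RoI and uses bilinear resizing):

```python
import numpy as np

def paste_instance_mask(mask, box, image_shape, threshold=0.5):
    """Toy Mask R-CNN post-processing: resize a small soft mask to the
    predicted box (nearest neighbor), binarize it, and paste it into a
    full-image boolean mask."""
    x1, y1, x2, y2 = box
    h, w = y2 - y1, x2 - x1
    # Nearest-neighbor resize in pure numpy: pick source row/col indices.
    rows = (np.arange(h) * mask.shape[0] / h).astype(int)
    cols = (np.arange(w) * mask.shape[1] / w).astype(int)
    resized = mask[np.ix_(rows, cols)]
    full = np.zeros(image_shape, dtype=bool)
    full[y1:y2, x1:x2] = resized > threshold  # keep confident pixels only
    return full

# Pretend the mask branch output 0.9 everywhere for one 50x50 proposal.
soft = np.full((15, 15), 0.9)
full_mask = paste_instance_mask(soft, (10, 20, 60, 70), (100, 100))
print(full_mask.sum())  # 2500 object pixels inside the 50x50 box
```

Each detected instance gets its own such boolean mask, which is what makes the output pixel-wise rather than just a box.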
Article By: Ekta shakya