K-Means is a widely used algorithm for data clustering. It is an unsupervised machine learning technique that we can apply to find new patterns in our data. What's interesting about this algorithm is that we can also use it for image processing tasks: in the same manner as with other types of data, we can find pixel patterns in our images that allow us to process them faster and more efficiently.
In this article we are going to use the k-means clustering algorithm to perform image segmentation on a picture. To make it more fun and personal, I'm going to use an image of my cat. First I'll run k-means on that image to segment it, then I'll apply Canny edge detection to process it further, and finally I'll find the contours in the result in order to search for my cat.
Interested in more stories like this? Follow me on Twitter at @b_dmarius and I'll post there every new article.
- K-Means clustering explained
- What is image segmentation
- Python k-means image segmentation with opencv
- Canny edge detection in opencv
- Finding contours using opencv
K-Means clustering explained
K-Means is a data clustering algorithm that tries to assign every data point in a dataset to exactly one of K possible clusters, hence the name. The main idea is that the algorithm tries to build the clusters in such a way that two data points from the same cluster are as similar as possible, while two data points from different clusters are as different as possible.
With this double restriction, the k-means algorithm will iterate through the dataset many times, and we need to find a proper condition for it to stop. We can either say "hey, let's stop when you've gone through the dataset 100 times!" or we can stop when we see that no significant change has been made between the current iteration and the last one; since the dataset remains the same, there's little chance that the algorithm will discover a new way to cluster the data even if we run it thousands of times. For the purpose of this article, we are going to use a combination of the two conditions.
An important notion related to k-means, and one we are going to take advantage of today, is the centroid. Mathematically speaking, the centroid of a cluster is the mean of all the values present in that cluster. Working on the assumption that we have obtained a homogeneous cluster, we can consider all values in that cluster to be located somewhere around the centroid. If we wanted to thoroughly simplify our dataset, we could replace all the values in the cluster with the centroid. You'll see in a minute why we'd want that.
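To make the loop and the two stopping conditions more concrete, here is a minimal NumPy sketch of k-means. It is only an illustration under my own assumptions: the function name kmeans, its parameters (max_iter, eps, seed) and the two toy "dark" and "light" blobs are made up for this example, and this is not how OpenCV implements the algorithm internally.

```python
import numpy as np

def kmeans(points, k, max_iter=100, eps=0.1, seed=0):
    """Plain k-means sketch: returns (labels, centroids, iterations_used)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest centroid.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: each centroid becomes the mean of its points.
        new_centroids = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centroids - centroids, axis=1).max()
        centroids = new_centroids
        # Stop early once no centroid moved by more than eps.
        if shift <= eps:
            break
    return labels, centroids, iteration

# Toy data: 50 dark and 50 light "pixels" with 3 channels each.
rng = np.random.default_rng(42)
dark = rng.normal(loc=20.0, scale=5.0, size=(50, 3))
light = rng.normal(loc=200.0, scale=5.0, size=(50, 3))
points = np.vstack([dark, light])

labels, centroids, iterations = kmeans(points, k=2)

# Simplify the dataset: replace every point with the centroid of its cluster.
simplified = centroids[labels]
```

The loop ends on whichever condition is met first: the iteration cap or a centroid shift below eps. The last line is the "replace every value with its centroid" idea, done with a single indexing operation.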
Before we move on, I strongly encourage you to read more in-depth information about k-means here.
What is image segmentation
Image segmentation is the process of transforming an image so that we can partition it into simpler regions of similar pixels. The goal here is to take a very complicated image and reduce it to a much simpler form. This is an intermediate step that prepares an image for more advanced machine learning tasks like object detection, object tracking, object recognition and so on.
There are various approaches to image segmentation, but the main goal is to eliminate the less important features of an image so that an object detection algorithm can focus only on the more important ones. Good segmentation leads to better results, but also to faster and less computationally expensive training and testing phases.
How can clustering help us with segmentation then? Another way to look at image segmentation is that we try to group the pixels in an image so that more similar pixels are in the same group, while more different pixels are placed in different groups. If that sounds familiar to you, it is because it sounds exactly like data clustering.
Think of it this way: I want to find my cat in the picture. Most of my cat looks roughly the same: black or shades of black and gray. I don't need to analyze every pixel in that photo and classify it as cat or not cat. If I can cluster the pixels in the photo in a proper way, then most of the black (or almost black) pixels will fall into the same category. From there, I can replace all the pixels in that category with a perfectly black color. Now I'll know where the cat is in the photo, and fur detection (sorry, object detection) will be much easier.
Python k-means image segmentation with opencv
The last thing we need to do before we can start writing code is to install the dependencies for this project. The only package we need to install is opencv-python, because it also installs numpy for us.
pip3 install opencv-python
Ready when you are! Now let's load our image and transform the encoding to RGB.
import cv2 as cv
import numpy as np

# Load original image
originalImage = cv.imread("cat.jpg")
originalImage = cv.cvtColor(originalImage, cv.COLOR_BGR2RGB)
Our image has a width w and a height h, and we need to transform the shape of the image into an Nx3 shape, where N is the w*h product and 3 is for the 3 color channels. This is needed so that we can pass the image to the kmeans function of opencv, which expects one sample per row, stored as 32-bit floats.
reshapedImage = np.float32(originalImage.reshape(-1, 3))
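If the reshape feels a bit abstract, here is a small sketch on a toy 2x3 "image" (a made-up array standing in for the photo) showing that the Nx3 layout is just one row per pixel, and that we can get back to the original shape without losing anything.

```python
import numpy as np

h, w = 2, 3
# Toy stand-in for a photo: h x w pixels, 3 channels each.
image = np.arange(h * w * 3, dtype=np.uint8).reshape(h, w, 3)

# One row per pixel, one column per channel, stored as 32-bit floats.
pixels = np.float32(image.reshape(-1, 3))

# Going back: cast to 8-bit and restore the original height x width x 3 shape.
restored = np.uint8(pixels).reshape(image.shape)
```

Row i*w + j of pixels holds the pixel found at row i, column j of the image, which is why the later reshape back to the original shape puts every pixel where it belongs.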
The next thing we need to do is define the number of clusters we want to find. I'll go with 2 clusters for this one.
Next, we also have to define when to stop our clustering algorithm. As mentioned earlier, we can either stop when there's no significant change after an iteration or when a large number of iterations has passed. We are going to combine these two conditions and settle for whichever happens first.
numberOfClusters = 2
# Stop after 100 iterations or once the centers move by less than 0.1, whichever comes first
stopCriteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 100, 0.1)
It's now time to apply the k-means clustering algorithm. The call returns three things: a compactness score for the clustering, a label for every pixel (the index of the cluster that pixel was assigned to), and the list of centroids.
From here, we can go on and replace all the values in a cluster with the value of the centroid of that cluster. We then need to reshape our current image to the shape of the original image and save our progress.
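# Run k-means 10 times with random initial centers and keep the best run
ret, labels, clusters = cv.kmeans(reshapedImage, numberOfClusters, None, stopCriteria, 10, cv.KMEANS_RANDOM_CENTERS)
# Replace every pixel with the (8-bit) centroid of its cluster
clusters = np.uint8(clusters)
intermediateImage = clusters[labels.flatten()]
clusteredImage = intermediateImage.reshape((originalImage.shape))
# imwrite expects BGR channel order, so convert back from RGB before saving
cv.imwrite("clusteredImage.jpg", cv.cvtColor(clusteredImage, cv.COLOR_RGB2BGR))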
Let's take a look at our intermediate result: our clustered image.
The image looks a lot simpler now, but we haven't finished yet.
Canny edge detection in opencv
Now we need to apply the Canny edge detection algorithm (using the utility function provided by the opencv library). We will also remove all the values from one cluster (make them perfectly black) before applying the algorithm, so that our image becomes even simpler.
Finally, we will save our new progress in a different file.
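# Remove 1 cluster from image and apply canny edge detection
removedCluster = 1
cannyImage = np.copy(originalImage).reshape((-1, 3))
cannyImage[labels.flatten() == removedCluster] = [0, 0, 0]
# Restore the height x width x 3 layout and convert to grayscale before detecting edges
gray = cv.cvtColor(cannyImage.reshape(originalImage.shape), cv.COLOR_RGB2GRAY)
edges = cv.Canny(gray, 100, 200)
# Canny returns a single-channel edge map; expand it to 3 channels so we can draw on it in color later
cannyImage = cv.cvtColor(edges, cv.COLOR_GRAY2BGR)
cv.imwrite("cannyImage.jpg", cannyImage)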
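One thing to keep in mind: because the initial centers are random, the index of a given cluster can change from run to run, so a hard-coded removedCluster may not always point at the same region. Here is a small sketch of one way to pick the cluster by color instead. The helper darkest_cluster is hypothetical (it is not part of opencv), the arrays are toy stand-ins for the clusters and labels returned by k-means, and I'm assuming the cluster we want to blank out is the darker one, as in the black cat discussion earlier.

```python
import numpy as np

def darkest_cluster(centroids):
    """Index of the centroid with the lowest mean channel value."""
    return int(np.argmin(centroids.mean(axis=1)))

# Toy stand-ins for the `clusters` and `labels` arrays returned by k-means.
clusters = np.array([[30, 28, 25], [210, 200, 190]], dtype=np.uint8)
labels = np.array([[0], [1], [1], [0]])
pixels = np.array(
    [[31, 27, 26], [205, 198, 192], [215, 202, 188], [29, 29, 24]],
    dtype=np.uint8,
)

removedCluster = darkest_cluster(clusters)

# Paint every pixel of the chosen cluster perfectly black.
masked = pixels.copy()
masked[labels.flatten() == removedCluster] = [0, 0, 0]
```

If you would rather blank out the bright background, swapping np.argmin for np.argmax does the opposite.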
Let's now take a look at our new result.
My cat definitely looks scarier now, but we are getting closer to our result.
Finding contours using opencv
The next thing we need to do is to find all the contours in the image. The findContours method from opencv-python will help us get the coordinates of the contours. Next, we will draw all these contours in red so that we can save our progress and see what we've got so far.
initialContoursImage = np.copy(cannyImage)
imgray = cv.cvtColor(initialContoursImage, cv.COLOR_BGR2GRAY)
# Binarize: pixels brighter than 50 become 255, everything else becomes 0
_, thresh = cv.threshold(imgray, 50, 255, cv.THRESH_BINARY)
# OpenCV 4.x returns (contours, hierarchy); 3.x returns three values
contours, hierarchy = cv.findContours(thresh, cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)
# Draw all contours (-1) in red (BGR order) with a thickness of 2 pixels
cv.drawContours(initialContoursImage, contours, -1, (0, 0, 255), 2)
cv.imwrite("initialContoursImage.jpg", initialContoursImage)
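In case the threshold step looks like magic, here is a tiny NumPy sketch of what a binary threshold computes. The helper binary_threshold is made up for illustration; in the real code cv.threshold does this work for us.

```python
import numpy as np

def binary_threshold(gray, thresh=50, max_value=255):
    """Binary threshold: values above thresh become max_value, the rest 0."""
    return np.where(gray > thresh, max_value, 0).astype(np.uint8)

# Toy grayscale image with values on both sides of the threshold.
gray = np.array([[0, 50, 51], [120, 255, 10]], dtype=np.uint8)
mask = binary_threshold(gray)
```

Note that a pixel exactly equal to the threshold (50 here) ends up black, since the comparison is strictly greater than.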
Let's look at the image.
We can see there's a lot of red here, and we need to focus on our furry target! We need to ignore all these small contours and focus on the biggest ones. By looking at the image, I'd say we need to extract the biggest contour and that's our cat, right?
Well, not really. If you look closely, the image above has a red border. That's right, the findContours method also finds the contour of the entire image.
So what we want to do is try to find the second biggest contour in the image and hope that it is my cat.
# Skip the first contour (assumed to be the image border) and keep the largest of the rest
cnt = None
largest_area = 0
for index, contour in enumerate(contours):
    if index > 0:
        area = cv.contourArea(contour)
        if area > largest_area:
            largest_area = area
            cnt = contour
# originalImage is RGB; convert to BGR so drawing and saving use the expected channel order
biggestContourImage = cv.cvtColor(originalImage, cv.COLOR_RGB2BGR)
if cnt is not None:
    cv.drawContours(biggestContourImage, [cnt], -1, (0, 0, 255), 3)
cv.imwrite("biggestContourImage.jpg", biggestContourImage)
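The code above relies on the image border being the first contour in the list. If you'd rather not depend on that ordering, another option is to rank the contours by area and take the second one. Below is a minimal sketch of that idea. The helpers polygon_area, second_largest and rectangle are hypothetical names I made up, and the three rectangles are toy contours shaped (N, 1, 2), the layout findContours uses.

```python
import numpy as np

def polygon_area(contour):
    """Shoelace area of a contour shaped (N, 1, 2)."""
    pts = np.asarray(contour, dtype=np.float64).reshape(-1, 2)
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def second_largest(contours, area=polygon_area):
    """Contour with the second-largest area, or None if there are fewer than two."""
    ranked = sorted(contours, key=area, reverse=True)
    return ranked[1] if len(ranked) > 1 else None

def rectangle(x0, y0, x1, y1):
    """Build a rectangular toy contour from two opposite corners."""
    return np.array([[[x0, y0]], [[x1, y0]], [[x1, y1]], [[x0, y1]]], dtype=np.int32)

border = rectangle(0, 0, 99, 99)  # stands in for the whole-image contour
cat = rectangle(20, 30, 70, 80)   # hypothetical large object
speck = rectangle(5, 5, 8, 8)     # small noise contour

picked = second_largest([speck, border, cat])
```

With real contours you could pass area=cv.contourArea instead of the toy area function. Keep in mind this still assumes the biggest contour is the image border, which is the same hope the article's approach relies on.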
And for the last time, let's take a look at the result.
There it is! We've found the cat! And we've also reached the end of this article.
In this article we've discussed image segmentation and how important it is nowadays, with all the image processing that needs to be done. Next we saw how clustering, and specifically k-means clustering, can help us when performing image segmentation. Then we took a more fun, practical approach with a simple opencv implementation. I hope you've had as much fun as I've had!
Thank you so much for reading this!