K-Nearest Neighbours, also called KNN, is a supervised learning algorithm that is easy to implement and can be used for both classification and regression problems. A supervised learning algorithm is one that learns from labelled data: for example, images labelled as cats or dogs, or movie reviews labelled by whether the movie was liked or disliked.
Supervised learning algorithms are used for two kinds of problems: classification and regression. In classification problems the output is discrete, such as whether an image shows a dog or a cat, or whether you like or dislike a movie. In regression problems the output is continuous.
For example, in a dataset that maps heights to weights, the output (weight) is continuous rather than discrete. Height is the independent variable and weight is the dependent variable, because it depends on the height of the person.
K-Nearest Neighbours
The KNN algorithm assumes that similar things are near each other. Suppose we have a dataset with two classes, class A and class B, and a new test point that we want to classify.
Now assume the value of k (the number of neighbouring data points to consider) is 3. We calculate the distance from the test point (the black star in our case) to the data points of both classes and pick the 3 nearest ones. The test point is assigned to the class that has the most points among those nearest neighbours. Here the test point (black star) is assigned to class B, because 2 of its 3 nearest points belong to class B.
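The k = 3 vote described above can be sketched in a few lines of Python. The coordinates below are hypothetical values made up for illustration (they are not taken from the original figure), and Euclidean distance is assumed as the distance measure.

```python
import math
from collections import Counter

# Hypothetical 2-D training points, invented for illustration only.
training = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((2.0, 1.0), "A"),
    ((5.0, 5.0), "B"),
    ((5.5, 4.5), "B"),
    ((6.0, 5.5), "B"),
]
test_point = (3.5, 3.5)  # plays the role of the black star
k = 3


def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Order the training points by their distance to the test point, nearest first.
by_distance = sorted(training, key=lambda item: euclidean(item[0], test_point))

# Keep the labels of the k nearest points and count the votes per class.
nearest_labels = [label for _, label in by_distance[:k]]
votes = Counter(nearest_labels)
predicted = votes.most_common(1)[0][0]

print(nearest_labels, predicted)
```

With these made-up points, two of the three nearest neighbours carry the label B, so the test point is assigned to class B.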
Steps used in the KNN algorithm
- Load the training data
- Initialize K to your chosen number of neighbours (an odd number helps avoid tied votes when there are two classes)
- For every example in the data:
  - Calculate the distance between the query point and the current example
  - Add the distance and the index of the example to a collection
- Sort the collection of distances and indices from smallest to largest (ascending order) by distance
- Pick the first K entries from the sorted collection
- Get the labels of the selected K entries
- If regression, return the mean of the K labels
- If classification, return the mode of the K labels
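The steps above can be sketched as one small Python function. This is a minimal illustration rather than a definitive implementation: the name `knn_predict`, its parameters, the use of Euclidean distance, and both sample datasets are assumptions made for this example.

```python
import math
from collections import Counter
from statistics import mean


def knn_predict(data, query, k, mode="classification"):
    """Predict a label for `query` from `data`, a list of (features, label) pairs."""
    if not 1 <= k <= len(data):
        raise ValueError("k must be between 1 and the number of examples")

    # For every example: compute the distance to the query point and
    # store it together with the index of the example.
    distances = []
    for index, (features, _) in enumerate(data):
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(features, query)))
        distances.append((d, index))

    # Sort by distance (ascending) and keep the first k entries.
    distances.sort()
    k_labels = [data[index][1] for _, index in distances[:k]]

    # Regression returns the mean of the k labels, classification the mode.
    if mode == "regression":
        return mean(k_labels)
    return Counter(k_labels).most_common(1)[0][0]


# Hypothetical labelled points for classification (illustrative values only).
points = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((5.0, 5.0), "B"), ((5.5, 4.5), "B"), ((6.0, 5.5), "B"),
]
predicted_class = knn_predict(points, (3.5, 3.5), k=3)

# Hypothetical (height in cm, weight in kg) pairs for regression.
heights_weights = [
    ((150,), 50.0), ((160,), 56.0), ((165,), 60.0),
    ((170,), 64.0), ((180,), 72.0),
]
predicted_weight = knn_predict(heights_weights, (168,), k=3, mode="regression")

print(predicted_class, predicted_weight)
```

For the made-up data, the classification call returns B, and the regression call averages the weights of the three people closest in height to 168 cm (64.0, 60.0 and 56.0), giving 60.0. If two classes tie in the vote, this sketch returns the class whose neighbour appears first in the sorted list, which is one possible tie-breaking choice among several.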
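Advantages of the KNN algorithm
Easy to implement, and can be used for classification as well as regression.
Disadvantages of the KNN algorithm
The algorithm gets slower as the dataset grows, because every prediction requires calculating the distance from the query point to every training example.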
Written By: Amit Gupta
Reviewed By: Savya Sachi