knn algorithm in machine learning

8
Machine learning Knn algorithm for superstore data analysis Yash Asti [email protected]

Upload: student

Post on 09-Feb-2017

269 views

Category:

Education


11 download

TRANSCRIPT

Page 1: Knn algorithm in Machine learning

Machine learning Knn algorithm for superstore data analysis

Yash [email protected]

Page 2: Knn algorithm in Machine learning

Abstract

Machine learning is a branch in computer science that studies the design of algorithms that can learn.

Typical machine learning tasks are concept learning, function learning or “predictive modeling”, clustering and finding predictive patterns.

In this project we will use R to work with machine learning algorithms called “KNN” or k-nearest neighbors. 

R has a wide variety of functions for cluster analysis. In my project I will test machine learning cluster algorithm K-nn.

Page 3: Knn algorithm in Machine learning

What is Machine learning?

Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed. 

Machine learning focuses on the development of computer programs that can teach themselves to grow and change when exposed to new data.

The process of machine learning is similar to that of data mining. Both systems search through data to look for patterns.

The primary goal of machine learning research is to develop general purpose algorithms of practical value. Such algorithms should be efficient. As usual, as computer scientists, we care about time and space efficiency. But in the context of learning, we also care a great deal about another precious resource, namely, the amount of data that is required by the learning algorithm.

Page 4: Knn algorithm in Machine learning

What is knn algorithm?

In pattern recognition, the k-Nearest Neighbors algorithm (or k-NN for short) is a non-parametric method used for classification and regression. In both cases, the input consists of the k closest training examples in the feature space.

Page 5: Knn algorithm in Machine learning

How does the KNN algorithm work? Let’s take a simple case to understand this algorithm. Following is a

spread of red circles (RC) and green squares (GS):

Page 6: Knn algorithm in Machine learning

You intend to find out the class of the blue star (BS) . BS can either be RC or GS and nothing else. The “K” is KNN algorithm is the nearest neighbors we wish to take vote from. Let’s say K = 3. Hence, we will now make a circle with BS as center just as big as to enclose only three data points on the plane. Refer to following diagram for more details:

The three closest points to BS is all RC. Hence, with good confidence level we can say that the BS should belong to the class RC. Here, the choice became very obvious as all three votes from the closest neighbor went to RC. The choice of the parameter K is very crucial in this algorithm. Next we will understand what are the factors to be considered to conclude the best K.

Page 7: Knn algorithm in Machine learning

Result and conclusion

KNN algorithm is one of the simplest classification algorithm.  Even with such simplicity, it can give highly competitive results. KNN

algorithm can also be used for regression problems.    The only difference from the discussed methodology will be using

averages of nearest neighbors rather than voting from nearest neighbors.   KNN can be coded in a single line on R.

Page 8: Knn algorithm in Machine learning