K-Nearest Neighbors

TeeTracker
1 min read · Jan 2, 2022

A supervised machine learning algorithm.

Given a query target, find the data points closest to it within a finite data set. The same idea handles “regression”, “classification”, or even “recommendation” at small scale.

Idea

The algorithm is very intuitive: calculate the Euclidean distance between the query target and every data point in the data set.
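
As a rough illustration of that distance calculation, here is a minimal sketch in plain Python. The function name `euclidean_distance` and the toy points are assumptions made up for this example; they are not taken from the linked gist.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two points with the same number of features."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of features")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy data: one query target and three stored points.
query = (0.0, 0.0)
data = [(3.0, 4.0), (1.0, 1.0), (6.0, 8.0)]

# One distance per stored point, in the same order as `data`.
distances = [euclidean_distance(query, point) for point in data]
print(distances)
```

The second point, (1.0, 1.0), has the smallest distance, so it is the nearest neighbor of the query.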

Algorithm

  1. Calculate the distance between the query target and every data point.
  2. Sort the distances from smallest to largest.
  3. Using a global value K, return the first K data points after sorting.
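
The three steps above can be sketched roughly as follows. This is only one possible reading of them: the helper name `k_nearest` and the sample points are hypothetical, and `math.dist` (Python 3.8+) stands in for the distance calculation.

```python
import math


def k_nearest(query, points, k):
    """Return the k closest points as (distance, index) pairs, nearest first."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Step 1: distance from the query target to every data point.
    scored = [(math.dist(query, point), index) for index, point in enumerate(points)]
    # Step 2: sort from smallest to largest distance (ties fall back to index order).
    scored.sort()
    # Step 3: keep only the first K entries.
    return scored[:k]


# Hypothetical toy data.
points = [(3.0, 4.0), (1.0, 1.0), (6.0, 8.0), (0.0, 2.0)]
neighbors = k_nearest((0.0, 0.0), points, k=2)
nearest_indices = [index for _, index in neighbors]
print(nearest_indices)
```

Sorting the whole data set is the simplest approach and is fine for small data; for larger sets a partial sort or a spatial index would likely be a better fit.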

Regression:

Calculate the mean of the values of the K returned data points.
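
A small sketch of the regression case, assuming each data point carries a numeric target value. The name `knn_regress` and the numbers are invented for illustration.

```python
import math
from statistics import mean


def knn_regress(query, features, targets, k):
    """Predict a number as the mean target of the k nearest data points."""
    # Indices of all data points, ordered from nearest to farthest.
    order = sorted(range(len(features)), key=lambda i: math.dist(query, features[i]))
    return mean(targets[i] for i in order[:k])


# Hypothetical one-feature data set with one outlier at x = 10.
features = [(1.0,), (2.0,), (3.0,), (10.0,)]
targets = [10.0, 20.0, 30.0, 100.0]

prediction = knn_regress((2.2,), features, targets, k=3)
print(prediction)
```

With K = 3 the far-away point at x = 10 is ignored, so the prediction is the mean of 10.0, 20.0, and 30.0.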

Classification:

The label that occurs most often among the K returned data points becomes the predicted label.
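
The majority vote could look something like this. `knn_classify` and the colour labels are assumptions for the example; `Counter.most_common` does the counting.

```python
import math
from collections import Counter


def knn_classify(query, features, labels, k):
    """Predict the label that occurs most often among the k nearest data points."""
    # Indices of all data points, ordered from nearest to farthest.
    order = sorted(range(len(features)), key=lambda i: math.dist(query, features[i]))
    votes = Counter(labels[i] for i in order[:k])
    # On a tie, Counter keeps first-seen order, so the nearer label wins.
    return votes.most_common(1)[0][0]


# Hypothetical two-feature data set with two classes.
features = [(1.0, 1.0), (1.0, 2.0), (5.0, 5.0), (6.0, 5.0), (2.0, 1.0)]
labels = ["red", "red", "blue", "blue", "red"]

near_red = knn_classify((1.5, 1.5), features, labels, k=3)
near_blue = knn_classify((5.5, 5.0), features, labels, k=3)
print(near_red, near_blue)
```

An odd K is a common choice for two-class problems because it avoids tied votes.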

Recommendation:

The most basic use is to present the K returned data points directly to the client.
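
For recommendation, a bare-bones sketch might describe every item by a feature vector and simply return the K items nearest to the query. The catalog, its (action, romance) scores, and the function name `recommend` are all hypothetical.

```python
import math


def recommend(query, catalog, k):
    """Return the names of the k catalog items closest to the query vector."""
    ranked = sorted(catalog, key=lambda name: math.dist(query, catalog[name]))
    return ranked[:k]


# Hypothetical catalog: item name -> (action score, romance score).
catalog = {
    "Movie A": (0.9, 0.1),
    "Movie B": (0.8, 0.3),
    "Movie C": (0.1, 0.9),
    "Movie D": (0.5, 0.5),
}

# Query target: a taste profile that leans fully toward action.
suggestions = recommend((1.0, 0.0), catalog, k=2)
print(suggestions)
```

The returned names are shown to the client as they are, nearest first, with no averaging or voting step.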

I’m not good at writing out a lot of detail, so here’s a copy of the code; the comments in it say it all.

https://gist.github.com/XinyueZ/91aee5bf6f7aeae5c5ead76f5ff5cff7
