k-Means Clustering (Python)
This section is a simple worked example for the Unsupervised Learning section. I recommend reading the theory there before continuing.
K-means clustering is a form of unsupervised learning, used when your data is unlabeled (i.e., it has no defined categories or groups). The algorithm's objective is to find groups in the data, where the variable K sets the number of groups. Using the supplied features, the algorithm iteratively assigns each data point to one of the K groups, so that points with similar features end up in the same group. The K-means clustering algorithm yields the following results:
- The centroids of the K clusters, which can be used to label new data
- Labels for the training data (each data point is assigned to a single cluster)
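The iterative assignment described above can be sketched in a few lines of NumPy. This is only a minimal illustration, assuming Euclidean distance and initial centroids drawn at random from the data; the function name `kmeans_sketch`, its parameters, and the toy points are made up for this example and are not taken from any library or from the tutorial's dataset.

```python
import numpy as np

def kmeans_sketch(points, k, n_iter=100, seed=0):
    """Illustrative k-means loop; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data points chosen at random
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: each centroid moves to the mean of its assigned points
        # (a centroid with no assigned points is left where it is)
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer move
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Toy data (made up): two well-separated groups of three points each
toy = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])
centroids, labels = kmeans_sketch(toy, k=2)
print(centroids)
print(labels)
```

The two return values correspond to the two results listed above: `centroids` can be used to label new points by nearest distance, and `labels` holds the cluster index of each training point.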
Let’s now walk through how the KMeans algorithm works in practice, keeping the explanation as simple as possible.
1. Importing libraries
import numpy as np
import pandas as pd
import os
import matplotlib.pyplot as plt
import seaborn as sns
2. Importing the dataset
Download the dataset file, Mall_Customers.csv, and place it in your working directory.
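# Load the customer data; this copy of the file is read as semicolon-delimited
data_set = pd.read_csv('Mall_Customers.csv', sep=";")
data_set  # in a notebook, this displays the DataFrame
data_set.info()  # column names, dtypes, and non-null counts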
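If `data_set.info()` reports a single column, the `sep` value probably does not match the file's actual delimiter. As a rough sketch, the standard-library `csv.Sniffer` can guess the delimiter from the first few lines of a file. The helper name `detect_delimiter` and the sample rows below are hypothetical, used only so the example runs without the real file.

```python
import csv

def detect_delimiter(sample_text, candidates=",;\t"):
    """Guess the field delimiter from a short text sample of a CSV file."""
    return csv.Sniffer().sniff(sample_text, delimiters=candidates).delimiter

# Made-up sample rows standing in for the first lines of a CSV file
sample = "CustomerID;Age;Score\n1;19;39\n2;21;81\n3;35;6\n"
delimiter = detect_delimiter(sample)
print(repr(delimiter))
```

With the real file, the sample could be read with `open('Mall_Customers.csv').read(2048)`, and the detected character passed as `sep` to `pd.read_csv`.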