k-Means Clustering (Python)

Patrizia Castagno
4 min read · Dec 27, 2022

This section is a simple worked example for the Unsupervised Learning section. I recommend reading the theory there before continuing.

K-means clustering is a form of unsupervised learning that you can use when your data is unlabeled (i.e., data without defined categories or groups). The algorithm’s objective is to identify groups in the data, where K is the number of groups. Using the supplied features, it iteratively assigns each data point to one of the K groups, so that points with similar features end up in the same group. The K-means clustering algorithm yields two results: the centroids of the K clusters, which can be used to label new data, and labels for the training data (each data point is assigned to a single cluster).
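To make the assign-and-update loop concrete, here is a minimal NumPy sketch of K-means (Lloyd's algorithm). It is an illustration under stated assumptions, not the implementation used in this article: the function names `kmeans` and `assign`, the random initialization from data points, and the single-feature toy values are all made up for the example.

```python
import numpy as np


def assign(points, centroids):
    """Label each point with the index of its nearest centroid (Euclidean)."""
    # distances has shape (n_samples, k)
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)


def kmeans(points, k, n_iters=100, seed=0):
    """Minimal K-means sketch for an array of shape (n_samples, n_features).

    Returns (centroids, labels).
    """
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        labels = assign(points, centroids)
        # Move each centroid to the mean of its assigned points;
        # keep the old centroid if a cluster ends up empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving
        centroids = new_centroids
    return centroids, assign(points, centroids)


# Toy data (made up): one feature, two well-separated groups.
toy = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
centroids, labels = kmeans(toy, k=2)

# Label a previously unseen point using the fitted centroids.
new_label = assign(np.array([[9.0]]), centroids)[0]
```

Here `centroids` and `labels` correspond to the two results described above, and the last line shows how the centroids can label a new point. The same functions work unchanged with more feature columns. For real work, a library implementation such as scikit-learn's `KMeans` is the usual choice.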

Let’s now look at how the K-means algorithm works in practice. The goal is to keep the explanation as simple as possible.

1. Import libraries

import numpy as np
import pandas as pd
import os
import matplotlib.pyplot as plt
import seaborn as sns

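2. Import the dataset

Download the dataset file, Mall_Customers.csv, and place it in your working directory. The call below reads it with a semicolon as the field separator.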

data_set = pd.read_csv('Mall_Customers.csv', sep=";")
data_set
[Image by the author: preview of the data_set DataFrame]
data_set.info()
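The `sep` argument matters: if it does not match the delimiter actually used in the file, pandas still loads the data but packs every row into a single column. The short sketch below shows the difference on an in-memory stand-in for the CSV. The column names and values are invented for illustration and are not taken from Mall_Customers.csv.

```python
import io

import pandas as pd

# Made-up stand-in for a semicolon-delimited file (hypothetical columns).
raw = "CustomerID;Age;Annual_Income\n1;19;15\n2;21;15\n3;20;16\n"

# With the matching delimiter, each field becomes its own column.
parsed = pd.read_csv(io.StringIO(raw), sep=";")

# With the default comma delimiter, each row collapses into one column.
collapsed = pd.read_csv(io.StringIO(raw))

print(parsed.shape)     # (3, 3)
print(collapsed.shape)  # (3, 1)

# info() lists column names, non-null counts, and dtypes.
parsed.info()
```

A quick look at the shape and at the output of `info()` is therefore a cheap way to confirm that the file was parsed as intended before clustering.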
