Email Spam Detection using Natural Language Processing with Python

Humans master millions of words, but, computationally speaking, how can we manipulate large amounts of text using programming techniques?

The idea that computers can understand ordinary language and hold conversations with human beings has been a staple of science fiction. However, it goes back to the first half of the twentieth century and was envisaged in a classic paper by Alan Turing (1950) as a hallmark of computational intelligence.

This article focuses on how computer systems can analyze and interpret text using Natural Language Processing (NLP). For that, you should install the Natural Language Toolkit; you can do it from …
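
As a rough illustration of what "manipulating text" can mean in practice, here is a minimal sketch that splits messages into words and counts them. It uses only the Python standard library rather than the Natural Language Toolkit, and the two messages and the `tokenize` helper are hypothetical, invented for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


# Hypothetical messages, invented for illustration only.
messages = [
    "Win a FREE prize now! Click to claim your free reward.",
    "Hi, are we still meeting for lunch tomorrow?",
]

# One word-frequency table per message.
word_counts = [Counter(tokenize(message)) for message in messages]

# Show the three most frequent words of the first message.
print(word_counts[0].most_common(3))
```

Word counts like these are a common starting point for text classification tasks such as spam detection, although a real pipeline would need far more preprocessing.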

Before you read this article, you should know that I am using Python in Jupyter Notebook.

NumPy is one of the most important libraries in Python because it lets you perform operations on n-dimensional arrays. It is also very useful for fundamental scientific computing tasks such as Machine Learning, Linear Algebra, Fourier Transforms, etc.

If you haven't installed NumPy yet, please refer to “Installing Numpy”.

For more details about NumPy, you can check the NumPy Documentation.

import numpy as np

Arrays in Numpy

Array in 1D

Suppose that we have the following list in Python (if you are not familiar…
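
A minimal sketch of turning a Python list into a 1D NumPy array is shown below. The list `values` is hypothetical and chosen only for illustration; it is not the list used in the original article.

```python
import numpy as np

# Hypothetical list, chosen only for illustration.
values = [1, 2, 3, 4]

# Convert the list into a one-dimensional NumPy array.
arr = np.array(values)

print(arr.ndim)   # number of dimensions of the array
print(arr.shape)  # size of the array along each dimension

# Arithmetic on an array is applied element by element.
doubled = arr * 2
print(doubled)
```

Unlike a plain list, where `values * 2` would repeat the list, multiplying the array doubles every element.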

In this section, we will see some basic concepts in Python. We use Jupyter Notebook, a free and open-source tool for running Python interactively.

So, let’s start learning and have some fun!

Arithmetic Operations

The arithmetic operators in Python are used to perform math operations, such as addition, subtraction, multiplication, and division. For example:
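
A short sketch of these operators, using the hypothetical operands `a` and `b`, could look like this:

```python
# Hypothetical operands, chosen only for illustration.
a, b = 7, 2

total = a + b            # addition
difference = a - b       # subtraction
product = a * b          # multiplication
quotient = a / b         # true division, always returns a float
floor_quotient = a // b  # floor division, drops the fractional part
power = a ** b           # exponentiation

print(total, difference, product, quotient, floor_quotient, power)
```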

What does 4%2=0 in In[6] mean? Let me explain:

The symbol % in Python is called the modulus operator, and it returns the remainder of a division. So, in this case, dividing 4 by 2 leaves a remainder of 0.
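
The remainder behaviour can be checked with a few lines. The helper `is_even` is a hypothetical name introduced here to show a typical use of `%`.

```python
# 4 divided by 2 leaves no remainder.
print(4 % 2)

# 7 divided by 3 is 2 with a remainder of 1.
print(7 % 3)


def is_even(n):
    """Return True when n is divisible by 2 (hypothetical helper)."""
    return n % 2 == 0


# divmod returns the floor quotient and the remainder together.
q, r = divmod(7, 3)
print(q, r)
```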

Now, suppose that…

In the previous article, we studied the Autoregressive Model (AR) and saw a simple example of AR. Now we will see what a Moving Average model is and how it differs from AR.

What is the difference between AR and MA?

The Moving Average model, or MA, is a linear regression model like AR; however, instead of regressing on lagged observations, it regresses on lagged residual errors.

The Moving Average model is represented by MA(q), where q is the order, i.e.:
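
In the usual textbook notation, an MA(q) process can be written as follows. The symbols are an assumption and not necessarily the ones used in the original article: $\mu$ is the mean of the series, $\varepsilon_t$ is the error term at time $t$, and $\theta_1, \dots, \theta_q$ are the model coefficients.

```latex
\[
X_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q}
\]
```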



Machine Learning (ML)

Artificial Intelligence (AI)

According to William B. Gevarter in the book “Intelligent Machines: An Introductory Perspective of Artificial Intelligence and Robotics”, Artificial Intelligence (AI) is an area of computer science that emphasizes the creation of intelligent machines that work and react like humans. In recent years, it has been advancing rapidly.

AI has various subfields that contribute to it, such as Machine Learning, Deep Learning, NLP, Expert Systems, etc.

Figure 1: Deep Learning is a subset of Machine Learning, and Machine Learning is a subset of AI.

In other words, we can think of Deep Learning, Machine Learning and Artificial Intelligence as a set of Russian dolls nested within each other, beginning with the smallest, that is, Deep…

For those who don’t know it yet, Akinator is a computer game and mobile app created by the French company Elokence.

Akinator’s goal is to guess a real or fictional character. To guess the character the player is thinking of, Akinator asks a series of questions, and the player can answer ‘Yes’, ‘Don’t know’, ‘No’, ‘Probably’ or ‘Probably not’; the program then determines the best question to ask next.

This section gives a simple example of an AR(2) model and follows the section Autoregressive Model (AR). Therefore, I recommend reading the theory first before moving on to this section.

Preliminary Steps

From Shiller’s webpage, download “U.S. Stock Markets 1871-Present and CAPE Ratio” and take the stock price index (column B). Import the data in the following form:

Then, we transform the series to get log returns, using the function diff(log). In general, to obtain the log returns of a time series S(t), we compute log(S(t)) − log(S(t − 1)). In Matlab, this difference can be obtained with diff(log(S)). …
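
The same transformation can be sketched in Python with NumPy, where `np.diff(np.log(...))` plays the role of the Matlab expression. The price series below is hypothetical and invented for illustration; it is not the Shiller data.

```python
import numpy as np

# Hypothetical price series S(t), invented for illustration only.
prices = np.array([100.0, 102.0, 101.0, 105.0])

# log(S(t)) - log(S(t - 1)) for every consecutive pair of prices.
log_returns = np.diff(np.log(prices))

print(log_returns)
```

The result has one element fewer than the price series, and the log returns add up to the log of the ratio between the last and the first price.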

The autoregressive model, or AR model, is a representation of a type of random process. This model is useful for predicting the future based on past behaviour. For example, it can be used to describe certain time-varying processes in nature, economics, etc.

The Autoregressive Model is represented by AR(p), where p is the order, that is:
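
In the usual textbook notation, an AR(p) process can be written as follows. The symbols are an assumption and not necessarily the ones used in the original article: $c$ is a constant, $\varphi_1, \dots, \varphi_p$ are the model coefficients, and $\varepsilon_t$ is the error term at time $t$.

```latex
\[
X_t = c + \varphi_1 X_{t-1} + \varphi_2 X_{t-2} + \dots + \varphi_p X_{t-p} + \varepsilon_t
\]
```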


Example: Compute the Impurity using Entropy and Gini Index.

Tip: This article is the continuation of Tree Models. Therefore, I recommend that you read that article carefully first.

The following example is taken from the book “Machine Learning: The Art and Science of Algorithms that Make Sense of Data” by Peter Flach.

Suppose you come across a number of sea animals that you suspect belong to the same species. You observe their length in metres, whether they have gills, whether they have a prominent beak and whether they have few or many teeth. …
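
Before working through the example, here is a minimal sketch of the two impurity measures, assuming the common definitions: entropy as the negative sum of p·log2(p) and the Gini index as one minus the sum of squared class proportions. The class counts and the helper `split_impurity` are hypothetical and are not taken from the book's dataset.

```python
import math


def entropy(counts):
    """Entropy in bits of a class distribution given as counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return sum(-p * math.log2(p) for p in probs)


def gini(counts):
    """Gini index: one minus the sum of squared class proportions."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def split_impurity(children, impurity):
    """Weighted average impurity of the child nodes of a split."""
    n = sum(sum(child) for child in children)
    return sum(sum(child) / n * impurity(child) for child in children)


# Hypothetical class counts (positives, negatives) per node.
balanced = [5, 5]           # maximally impure node
pure = [10, 0]              # node containing a single class
children = [[4, 1], [1, 4]]  # the two children of a hypothetical split

print(entropy(balanced), gini(balanced))
print(entropy(pure), gini(pure))
print(split_impurity(children, gini))
```

A split is considered good when the weighted impurity of its children is much lower than the impurity of the parent node.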

Tree models are among the most popular models in Machine Learning because they are expressive and easy to understand, and they are also attractive to computer scientists. These models rely on an algorithm of a “Divide and Conquer” nature, that is, an algorithm that divides the data into subsets, builds a tree for each of those, and then combines those subtrees into a single tree.

It is important to note that Tree Models are not limited to classification, but can be used to solve almost any machine learning task, including Classification, Probability Estimation, Regression and Clustering. …
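
The divide-and-conquer idea can be sketched as a small recursive function for classification. This is only one possible illustration, not the algorithm from the original article: it assumes categorical features, chooses splits by the lowest weighted Gini index, and uses a tiny dataset whose values are invented here. All function names are hypothetical.

```python
from collections import Counter


def label_gini(labels):
    """Gini index of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, features):
    """Return the feature whose split has the lowest weighted Gini index."""
    def weighted_impurity(feature):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[feature], []).append(label)
        return sum(len(g) / len(labels) * label_gini(g)
                   for g in groups.values())
    return min(features, key=weighted_impurity)


def grow_tree(rows, labels, features):
    """Divide the data, build a subtree per subset, combine them."""
    # Leaf: the node is pure or there is nothing left to split on.
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]
    feature = best_split(rows, labels, features)
    remaining = [f for f in features if f != feature]
    # Divide: partition rows and labels by the chosen feature's values.
    partitions = {}
    for row, label in zip(rows, labels):
        part_rows, part_labels = partitions.setdefault(
            row[feature], ([], []))
        part_rows.append(row)
        part_labels.append(label)
    # Conquer and combine: one subtree per subset, joined under this node.
    children = {value: grow_tree(r, l, remaining)
                for value, (r, l) in partitions.items()}
    return (feature, children)


def predict(tree, row):
    """Follow the branches until a leaf label is reached."""
    while isinstance(tree, tuple):
        feature, children = tree
        tree = children[row[feature]]
    return tree


# Invented data for illustration only (not the book's dataset).
rows = [
    {"gills": "no", "beak": "yes", "teeth": "many"},
    {"gills": "no", "beak": "yes", "teeth": "few"},
    {"gills": "no", "beak": "no", "teeth": "many"},
    {"gills": "yes", "beak": "yes", "teeth": "many"},
    {"gills": "yes", "beak": "no", "teeth": "few"},
    {"gills": "no", "beak": "no", "teeth": "few"},
]
labels = ["pos", "pos", "pos", "neg", "neg", "neg"]
features = ["gills", "beak", "teeth"]

tree = grow_tree(rows, labels, features)
print(tree)
```

Note that `predict` in this sketch raises a `KeyError` for a feature value that never appeared during training; a fuller implementation would fall back to the majority class of the node.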

Patrizia Castagno

Physics. Data Science. This is my space where I can contribute my knowledge and also learn from others.
