How to Normalize a List of Numbers in Python
- The Formula for Normalization
- Normalize a List of Numbers Using the MinMaxScaler() Function in Python sklearn
- Normalize a List of Numbers Manually in Python
- Conclusion
Normalization is a crucial data preprocessing technique that involves converting data into a standardized scale. The goal is to rescale the data to fit within a specified range, often [0, 1], for optimal performance in various applications, such as machine learning algorithms.
This article will delve into the concept of normalization, its formula, and methods for achieving it, both using built-in functions and manually.
The Formula for Normalization
Normalization is the process of transforming data into a specific scale, typically between two defined values, like 0 and 1. The motivation is to improve the performance of machine learning algorithms: many of them, especially those based on gradient descent or distance measures, tend to work better when all features share a small, comparable scale.
For example, consider a simple list of numbers: {1, 2, 3}. After normalizing to a scale of 0 to 1, the list becomes {0, 0.5, 1}. We can also customize the normalization scale, like normalizing to a range between 2 and 6, resulting in {2, 4, 6}.
To understand how normalization works, let’s look at its formula. We subtract the minimum value from each number and divide the result by the range, i.e., max - min. The output is the normalized value of that number.
\[ x_{\text{norm}} = \frac{x - x_{\text{min}}}{x_{\text{max}} - x_{\text{min}}} \]
Where:
- \(x\): The original value.
- \(x_{\text{min}}\): The minimum value in the dataset.
- \(x_{\text{max}}\): The maximum value in the dataset.
This formula is a fundamental representation of how min-max normalization is calculated, ensuring that the scaled values fall within the range [0, 1].
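As a quick check of the formula, the sketch below applies it to the list [1, 2, 3] from the earlier example. The helper name min_max_scale and its new_min and new_max parameters are assumptions made for illustration and do not come from any library. The custom range is obtained by stretching the [0, 1] result and shifting it, which reproduces the {2, 4, 6} result quoted earlier.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Rescale values to [new_min, new_max] using min-max normalization.

    Hypothetical helper for illustration; assumes values is non-empty
    and contains at least two distinct numbers.
    """
    x_min = min(values)
    x_max = max(values)
    value_range = x_max - x_min
    return [
        new_min + (x - x_min) / value_range * (new_max - new_min)
        for x in values
    ]


print(min_max_scale([1, 2, 3]))        # [0.0, 0.5, 1.0]
print(min_max_scale([1, 2, 3], 2, 6))  # [2.0, 4.0, 6.0]
```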
We can normalize a list in two ways: use the built-in MinMaxScaler, which is available in the preprocessing module of the sklearn package, or write our own logic based on the formula discussed above.
Normalize a List of Numbers Using the MinMaxScaler() Function in Python sklearn
The MinMaxScaler class in the preprocessing module of the scikit-learn library is a convenient tool for normalizing a list of numbers. It scales the values to a specified range, which is (0, 1) by default and can be customized to suit your needs. It expects a 2D array as input, which is why the examples below reshape the list into a single column.
The syntax for creating an instance of MinMaxScaler is as follows:
sklearn.preprocessing.MinMaxScaler(feature_range=(min, max), copy=True)
Parameters:
- feature_range: The desired range for the transformed features. By default, it’s set to (0, 1). You can specify a different range by providing a tuple of minimum and maximum values (e.g., (new_min, new_max)).
- copy: A boolean (True by default) indicating whether a copy of the original array should be created or not.
Once you have created an instance of MinMaxScaler with the desired parameters, you can use its fit() and transform() methods to scale the data accordingly. The fit() method computes the minimum and maximum values needed for scaling, while the transform() method applies the scaling based on the computed minimum and maximum values.
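To make the split between the two methods concrete, here is a minimal sketch that calls fit() and transform() separately. The arrays train and new_values are made-up illustrative data. The point is that transform() reuses the minimum and maximum learned by fit(), so a value outside the fitted range (10 here, with a fitted maximum of 8) maps outside [0, 1], in this case to 1.25.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Illustrative data (not from the article): one feature, values as a column.
train = np.array([0.0, 2.0, 4.0, 8.0]).reshape(-1, 1)
new_values = np.array([1.0, 8.0, 10.0]).reshape(-1, 1)

scaler = MinMaxScaler()
scaler.fit(train)  # learns the minimum (0) and maximum (8) of train

# The learned statistics are stored on the fitted scaler.
print(scaler.data_min_, scaler.data_max_)

scaled_train = scaler.transform(train)
scaled_new = scaler.transform(new_values)  # reuses the min and max from train

print(scaled_train.ravel())
print(scaled_new.ravel())
```

Calling fit_transform() on a single array is equivalent to calling fit() and then transform() on that same array.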
Let’s take a look at an example demonstrating the usage of MinMaxScaler:
import numpy as np
from sklearn import preprocessing
numbers = np.array([6, 1, 0, 2, 7, 3, 8, 1, 5]).reshape(-1, 1)
print("Original List:", numbers)
scaler = preprocessing.MinMaxScaler()
normalized_numbers = scaler.fit_transform(numbers)
print("Normalized List:", normalized_numbers)
The output will be:
Original List: [[6]
 [1]
 [0]
 [2]
 [7]
 [3]
 [8]
 [1]
 [5]]
Normalized List: [[0.75 ]
 [0.125]
 [0.   ]
 [0.25 ]
 [0.875]
 [0.375]
 [1.   ]
 [0.125]
 [0.625]]
In the example above, we first import the necessary libraries: numpy for numerical array handling and the preprocessing module from scikit-learn, which provides the MinMaxScaler class.
Next, we create a sample list of numbers using NumPy and reshape it into a column vector, because MinMaxScaler expects a 2D array with one row per sample and one column per feature. The original list of numbers is printed to provide a reference.
We then create an instance of MinMaxScaler. Using the fit_transform method of the scaler instance, we normalize the original list of numbers, and the resulting normalized list is printed for examination.
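The result of fit_transform is a 2D column array rather than a list. If a flat Python list is needed, one possible approach (a sketch, with variable names chosen here for illustration) is to flatten the array with ravel() and convert it with tolist():

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

numbers = [6, 1, 0, 2, 7, 3, 8, 1, 5]

# MinMaxScaler expects a 2D array, so the list becomes a single column.
column = np.array(numbers, dtype=float).reshape(-1, 1)
scaled_column = MinMaxScaler().fit_transform(column)

# Flatten the (9, 1) result and convert it to plain Python floats.
normalized_list = scaled_column.ravel().tolist()
print(normalized_list)
```

The printed values should match the normalized values in the output shown above, only without the nested brackets.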
Customizing the Normalization Range
If you want to define a specific range for the normalization, you can achieve this by specifying the feature_range parameter in MinMaxScaler(). By default, the range is set to 0 and 1, but you have the flexibility to tailor it to your requirements.
Here’s an example where we set the range to 0 and 3:
import numpy as np
from sklearn import preprocessing
numbers = np.array([6, 1, 0, 2, 7, 3, 8, 1, 5]).reshape(-1, 1)
print("Original List:", numbers)
scaler = preprocessing.MinMaxScaler(feature_range=(0, 3))
normalized_numbers = scaler.fit_transform(numbers)
print("Normalized List:", normalized_numbers)
Output:
Original List: [[6]
 [1]
 [0]
 [2]
 [7]
 [3]
 [8]
 [1]
 [5]]
Normalized List: [[2.25 ]
 [0.375]
 [0.   ]
 [0.75 ]
 [2.625]
 [1.125]
 [3.   ]
 [0.375]
 [1.875]]
In this second example, we import the same libraries, create the same sample list of numbers, and print it.
To customize the normalization range, we create a MinMaxScaler instance with a custom range defined as 0 to 3 using the feature_range parameter.
We then normalize the list with the fit_transform method and print the result, which now lies between 0 and 3.
Normalize a List of Numbers Manually in Python
Normalizing a list of numbers can also be achieved manually using a simple formula.
Recall the following formula for Min-Max scaling:
\[ x_{\text{norm}} = \frac{x - x_{\text{min}}}{x_{\text{max}} - x_{\text{min}}} \]
Where \(x\) is the original value, \(x_{\text{min}}\) is the minimum value in the dataset, and \(x_{\text{max}}\) is the maximum value in the dataset.
Let’s demonstrate how to manually normalize a list of numbers in Python using the given formula.
numbers = [6, 1, 0, 2, 7, 3, 8, 1, 5]
print("Original List:", numbers)
xmin = min(numbers)
xmax = max(numbers)
normalized_numbers = [(x - xmin) / (xmax - xmin) for x in numbers]
print("Normalized List:", normalized_numbers)
Output:
Original List: [6, 1, 0, 2, 7, 3, 8, 1, 5]
Normalized List: [0.75, 0.125, 0.0, 0.25, 0.875, 0.375, 1.0, 0.125, 0.625]
Here, we define the original list of numbers, numbers. We then determine the minimum (xmin) and maximum (xmax) values in the list.
Using a list comprehension, we normalize each number using the provided formula. The normalized values are stored in the list normalized_numbers, and we print both the original and normalized lists.
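One caveat of the manual approach is that the formula divides by xmax - xmin, which is zero when every value in the list is the same, and min() raises an error on an empty list. The sketch below guards against both cases. The function name normalize is hypothetical, and returning 0.0 for a constant list is a choice made for this example rather than a universal convention.

```python
def normalize(values):
    """Min-max normalize values to [0, 1], tolerating edge cases.

    Hypothetical helper: returning 0.0 for a constant list is a
    choice made for this sketch, not a universal convention.
    """
    if not values:
        return []  # nothing to normalize
    xmin = min(values)
    xmax = max(values)
    if xmax == xmin:
        # Every value is identical, so the range is zero and the
        # formula would divide by zero.
        return [0.0 for _ in values]
    return [(x - xmin) / (xmax - xmin) for x in values]


print(normalize([6, 1, 0, 2, 7, 3, 8, 1, 5]))
print(normalize([5, 5, 5]))  # [0.0, 0.0, 0.0]
print(normalize([]))         # []
```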
We can also manually normalize a list of numbers in Python using NumPy. Let’s demonstrate this with an example:
import numpy as np
def min_max_normalization(data):
    xmin = np.min(data)
    xmax = np.max(data)
    normalized_data = [(x - xmin) / (xmax - xmin) for x in data]
    return normalized_data
numbers = [6, 1, 0, 2, 7, 3, 8, 1, 5]
print("Original List:", numbers)
normalized_numbers = min_max_normalization(numbers)
print("Normalized List:", normalized_numbers)
Output:
Original List: [6, 1, 0, 2, 7, 3, 8, 1, 5]
Normalized List: [0.75, 0.125, 0.0, 0.25, 0.875, 0.375, 1.0, 0.125, 0.625]
Here, we first import the NumPy library to utilize its functions for numerical operations. Then, we create a function, min_max_normalization, to perform the Min-Max scaling.
Inside the function, we use the NumPy functions np.min and np.max to find the minimum and maximum values in the list. After this, we use a list comprehension to apply the Min-Max scaling formula to normalize each number based on the minimum and maximum values.
Lastly, we print both the original and normalized lists to showcase the effect of the normalization.
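Since NumPy applies arithmetic element-wise, the list comprehension can also be replaced with array operations. The following is a sketch of that variant. The function name min_max_normalization_vectorized is an assumption made for this example, and it assumes the input is non-empty and not constant. Calling tolist() on the result gives plain Python floats.

```python
import numpy as np


def min_max_normalization_vectorized(data):
    """Hypothetical vectorized variant of min_max_normalization.

    Assumes data is non-empty and not constant.
    """
    arr = np.asarray(data, dtype=float)
    xmin = arr.min()
    xmax = arr.max()
    # Arithmetic on the array is applied element-wise.
    return (arr - xmin) / (xmax - xmin)


numbers = [6, 1, 0, 2, 7, 3, 8, 1, 5]
normalized = min_max_normalization_vectorized(numbers)
print(normalized.tolist())
# [0.75, 0.125, 0.0, 0.25, 0.875, 0.375, 1.0, 0.125, 0.625]
```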
Conclusion
Normalization is a powerful tool in data preprocessing, aiding in achieving consistency and optimal performance in various data-driven applications. Whether using built-in tools like MinMaxScaler or implementing custom normalization, understanding and applying this technique is crucial for effective data analysis and modeling.
I am Fariba Laiq from Pakistan. An android app developer, technical content writer, and coding instructor. Writing has always been one of my passions. I love to learn, implement and convey my knowledge to others.