How to Normalize the Values in an R Matrix
This article will introduce how to normalize the values in an R matrix.
Use the sweep Function to Sweep Out Arrays in R
The sweep function sweeps a summary statistic out of an array. It takes the input array as its first argument and the summary statistic as its third. The second argument, MARGIN, is a vector of indices naming the dimensions of the array that the statistic corresponds to; the value 2 used here means columns. The fourth argument, FUN, is the function used to carry out the sweep. In this case, we pass the division operator, which can be supplied in quoted form as "/". The function returns an array with the same shape as the input array. We use the colSums function to calculate the column sums of the input array and pass the result as the summary statistic, so each element is divided by the sum of its column.
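require(stats)  # attached by default in a standard R session
v1 <- c(1.1, 1.2, 4.3, 1.3, 3.9, 2.1, 5.3, 3.8, 7.7, 8.8, 6.7, 2.6)
m1 <- matrix(v1, ncol = 4)  # filled column by column: 3 rows, 4 columns
# Divide every column (MARGIN = 2) by its column sum.
sweep(m1, 2, colSums(m1), FUN = "/")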
Output:
          [,1]      [,2]      [,3]      [,4]
[1,] 0.1666667 0.1780822 0.3154762 0.4861878
[2,] 0.1818182 0.5342466 0.2261905 0.3701657
[3,] 0.6515152 0.2876712 0.4583333 0.1436464
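As a quick check of the behavior described above, the following sketch confirms that the column-normalized matrix has columns summing to 1, and shows the same idea applied to rows by switching the second argument to 1. The names col_norm and row_norm are illustrative and not part of the original example.

```r
# Same matrix as in the example above.
v1 <- c(1.1, 1.2, 4.3, 1.3, 3.9, 2.1, 5.3, 3.8, 7.7, 8.8, 6.7, 2.6)
m1 <- matrix(v1, ncol = 4)

# Column normalization: MARGIN = 2 pairs each column with its own sum.
col_norm <- sweep(m1, 2, colSums(m1), FUN = "/")
print(colSums(col_norm))  # each column should now sum to 1

# Row normalization: MARGIN = 1 pairs each row with its own sum.
row_norm <- sweep(m1, 1, rowSums(m1), FUN = "/")
print(rowSums(row_norm))  # each row should now sum to 1
```

Note that the length of the statistic has to match the chosen dimension: four column sums for MARGIN = 2, three row sums for MARGIN = 1.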
Note that sweep can also be called with the default value of the FUN parameter. If the user does not supply it explicitly, the subtraction operator is used. Keep in mind that when a custom function object is passed, it should take two arguments: the array and the summary statistic. The following code snippet subtracts the median of each column from the elements in the corresponding column of the matrix.
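require(stats)  # provides median; attached by default in a standard R session
v1 <- c(1.1, 1.2, 4.3, 1.3, 3.9, 2.1, 5.3, 3.8, 7.7, 8.8, 6.7, 2.6)
m1 <- matrix(v1, ncol = 4)
# Median of each column (MARGIN = 2).
med.att <- apply(m1, 2, median)
# FUN is omitted, so sweep falls back to subtraction.
sweep(m1, 2, med.att)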
Output:
     [,1] [,2] [,3] [,4]
[1,] -0.1 -0.8  0.0  2.1
[2,]  0.0  1.8 -1.5  0.0
[3,]  3.1  0.0  2.4 -4.1
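To illustrate the point about custom functions, here is a minimal sketch that passes a two-argument function to sweep. The helper rel_dev is hypothetical and introduced only for this example; it expresses each element as a deviation relative to its column median.

```r
v1 <- c(1.1, 1.2, 4.3, 1.3, 3.9, 2.1, 5.3, 3.8, 7.7, 8.8, 6.7, 2.6)
m1 <- matrix(v1, ncol = 4)
med.att <- apply(m1, 2, median)

# Hypothetical helper: the first argument receives the array, the second
# receives the statistic expanded to the same shape as the array.
rel_dev <- function(x, stat) (x - stat) / stat

res <- sweep(m1, 2, med.att, FUN = rel_dev)
print(res)
```

Elements equal to their column median come out as 0. This sketch assumes that no column median is zero, since the helper divides by it.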
Use the scale Function to Normalize the Values in R Matrix
Another useful function for normalizing matrix data is scale, which divides each column of the input matrix by the corresponding value of its third argument, also named scale. Note that scale also takes a center argument that controls column centering (see the scale documentation for details). In this case, we set center to FALSE, indicating that no column centering should be done. The colSums function calculates the sum of each column of the input matrix, and the result is passed as the scale argument. The returned matrix also carries a scaled:scale attribute that records the divisors used.
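require(stats)  # attached by default in a standard R session
v1 <- c(1.1, 1.2, 4.3, 1.3, 3.9, 2.1, 5.3, 3.8, 7.7, 8.8, 6.7, 2.6)
m1 <- matrix(v1, ncol = 4)
c1 <- colSums(m1)  # one divisor per column
# Skip centering and divide each column by its sum.
scale(m1, center = FALSE, scale = c1)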
Output:
          [,1]      [,2]      [,3]      [,4]
[1,] 0.1666667 0.1780822 0.3154762 0.4861878
[2,] 0.1818182 0.5342466 0.2261905 0.3701657
[3,] 0.6515152 0.2876712 0.4583333 0.1436464
attr(,"scaled:scale")
[1] 6.6 7.3 16.8 18.1
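The sketch below compares the two approaches from this article and then shows what scale does when both center and scale are left at their defaults (TRUE), which is to subtract each column mean and divide by the column standard deviation. The variable names by_sweep, by_scale and z are illustrative.

```r
v1 <- c(1.1, 1.2, 4.3, 1.3, 3.9, 2.1, 5.3, 3.8, 7.7, 8.8, 6.7, 2.6)
m1 <- matrix(v1, ncol = 4)

by_sweep <- sweep(m1, 2, colSums(m1), FUN = "/")
by_scale <- scale(m1, center = FALSE, scale = colSums(m1))

# scale() attaches a "scaled:scale" attribute, so compare the values only.
print(all.equal(by_sweep, by_scale, check.attributes = FALSE))

# Defaults: center each column on its mean, then divide by its
# standard deviation (column-wise z-scores).
z <- scale(m1)
print(colMeans(z))      # approximately 0 for every column
print(apply(z, 2, sd))  # 1 for every column
```

If the extra attributes are unwanted, comparing with check.attributes = FALSE as above, or using the sweep version, avoids them.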
Founder of DelftStack.com. Jinku has worked in the robotics and automotive industries for over 8 years. He sharpened his coding skills when he needed to do the automatic testing, data collection from remote servers and report creation from the endurance test. He is from an electrical/electronics engineering background but has expanded his interest to embedded electronics, embedded programming and front-/back-end programming.