How to Remove Nan Values From a NumPy Array
- Remove Nan Values Using logical_not() and isnan() Methods in NumPy
- Remove Nan Values Using the isfinite() Method in NumPy
- Remove Nan Values Using the math.isnan Method
- Remove Nan Values Using the pandas.isnull Method
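This article discusses several ways to remove nan values from a NumPy array: two that use built-in NumPy functions, one that uses isnan() from the math module, and one that uses isnull() from pandas.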
Remove Nan Values Using logical_not() and isnan() Methods in NumPy
logical_not() applies a logical NOT to each element of an array. isnan() is a boolean function that checks whether each element is nan.
Using the isnan() function, we can create a boolean array that has False for all the non-nan values and True for all the nan values. Next, using the logical_not() function, we can invert that array, converting True to False and vice versa.
Lastly, using boolean indexing, we can select all the non-nan values from the original NumPy array. Only the positions where the boolean array is True are kept.
To learn more about these functions, refer to the official NumPy documentation for logical_not() and isnan().
Refer to the following code snippet for the solution.
import numpy as np
myArray = np.array([1, 2, 3, np.nan, np.nan, 4, 5, 6, np.nan, 7, 8, 9, np.nan])
output1 = myArray[np.logical_not(np.isnan(myArray))] # Line 1
output2 = myArray[~np.isnan(myArray)] # Line 2
print(myArray)
print(output1)
print(output2)
Output:
[ 1. 2. 3. nan nan 4. 5. 6. nan 7. 8. 9. nan]
[1. 2. 3. 4. 5. 6. 7. 8. 9.]
[1. 2. 3. 4. 5. 6. 7. 8. 9.]
Line 2 is a simplified version of Line 1: on a boolean array, the ~ operator performs the same element-wise negation as logical_not().
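To make the three steps visible, the intermediate boolean arrays can be printed one at a time. The following is a minimal sketch; the array and the names values, nan_mask, keep_mask, and cleaned are illustrative assumptions, not part of the snippet above.

```python
import numpy as np

# Illustrative data (an assumption for this sketch)
values = np.array([1.0, np.nan, 3.0, np.nan, 5.0])

# Step 1: True wherever the element is nan
nan_mask = np.isnan(values)

# Step 2: invert the mask so that True marks the values to keep
keep_mask = np.logical_not(nan_mask)

# Step 3: boolean indexing keeps only the positions marked True
cleaned = values[keep_mask]

print(nan_mask)
print(keep_mask)
print(cleaned)

# On a boolean array, ~ produces the same mask as logical_not()
print(np.array_equal(~nan_mask, keep_mask))
```

Note that boolean indexing returns a new array, so the original array still contains its nan values afterwards.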
Remove Nan Values Using the isfinite() Method in NumPy
As the name suggests, the isfinite() function is a boolean function that checks whether an element is finite or not. Applied to an array, it returns a boolean array of the same shape. That array stores True for all the finite values and False for everything else, which means nan values and also positive and negative infinity.
We will use this function to retrieve a boolean array for the target array. Using boolean indexing, we will then select all the finite values. As before, only the positions where the boolean array is True are kept.
Here’s the example code.
import numpy as np
myArray1 = np.array([1, 2, 3, np.nan, np.nan, 4, 5, 6, np.nan, 7, 8, 9, np.nan])
myArray2 = np.array([np.nan, np.nan, np.nan, np.nan, np.nan, np.nan])
myArray3 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
output1 = myArray1[np.isfinite(myArray1)]
output2 = myArray2[np.isfinite(myArray2)]
output3 = myArray3[np.isfinite(myArray3)]
print(myArray1)
print(myArray2)
print(myArray3)
print(output1)
print(output2)
print(output3)
Output:
[ 1. 2. 3. nan nan 4. 5. 6. nan 7. 8. 9. nan]
[nan nan nan nan nan nan]
[ 1 2 3 4 5 6 7 8 9 10]
[1. 2. 3. 4. 5. 6. 7. 8. 9.]
[]
[ 1 2 3 4 5 6 7 8 9 10]
To learn more about this function, refer to the official NumPy documentation for isfinite().
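Because isfinite() tests for finiteness rather than for nan specifically, it also filters out infinite values, which may or may not be what you want. The sketch below contrasts the two masks on an illustrative array; the names data, only_nan_removed, and finite_only are assumptions made for this example.

```python
import numpy as np

# Illustrative data containing nan as well as both infinities
data = np.array([1.0, np.nan, np.inf, -np.inf, 2.0])

# ~isnan() removes only the nan values; inf and -inf remain
only_nan_removed = data[~np.isnan(data)]

# isfinite() removes nan, inf, and -inf
finite_only = data[np.isfinite(data)]

print(only_nan_removed)
print(finite_only)
```

If infinite values are meaningful in your data, the isnan() approach is likely the safer choice.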
Remove Nan Values Using the math.isnan Method
Apart from these two NumPy solutions, there are two more ways to remove nan values: the isnan() function from the math library and the isnull() function from the pandas library. Both check whether an element is nan and return a boolean result. Since math.isnan() accepts a single number rather than an array, the boolean mask is built with a list comprehension.
Here is the solution using the math.isnan() method.
import numpy as np
import math
myArray1 = np.array([1, 2, 3, np.nan, np.nan, 4, 5, 6, np.nan, 7, 8, 9, np.nan])
myArray2 = np.array([np.nan, np.nan, np.nan, np.nan, np.nan, np.nan])
myArray3 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
booleanArray1 = [not math.isnan(number) for number in myArray1]
booleanArray2 = [not math.isnan(number) for number in myArray2]
booleanArray3 = [not math.isnan(number) for number in myArray3]
print(myArray1)
print(myArray2)
print(myArray3)
print(myArray1[booleanArray1])
print(myArray2[booleanArray2])
print(myArray3[booleanArray3])
Output:
[ 1. 2. 3. nan nan 4. 5. 6. nan 7. 8. 9. nan]
[nan nan nan nan nan nan]
[ 1 2 3 4 5 6 7 8 9 10]
[1. 2. 3. 4. 5. 6. 7. 8. 9.]
[]
[ 1 2 3 4 5 6 7 8 9 10]
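The list built with math.isnan() is an ordinary Python list of booleans, which NumPy accepts as a mask. As a quick check, the sketch below compares it with the vectorized NumPy mask; the names values, list_mask, vector_mask, and cleaned are illustrative assumptions.

```python
import math
import numpy as np

# Illustrative data (an assumption for this sketch)
values = np.array([1.0, np.nan, 3.0, np.nan])

# math.isnan() takes one number at a time, so the mask is built element by element
list_mask = [not math.isnan(x) for x in values]

# A list of Python bools works as a boolean mask when indexing
cleaned = values[list_mask]

# The vectorized NumPy mask marks the same positions
vector_mask = ~np.isnan(values)

print(cleaned)
print(np.array_equal(np.array(list_mask), vector_mask))
```

Both masks select the same elements; the list comprehension runs a Python-level loop, so it will generally be slower than the vectorized form on large arrays.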
Remove Nan Values Using the pandas.isnull Method
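Below is the solution using the isnull() method from pandas. It is shown first with a list comprehension, as in the previous section. Because isnull() also accepts a whole array and returns a boolean array, Lines 1 to 3 apply it directly and invert the result with the ~ operator.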
import numpy as np
import pandas as pd
myArray1 = np.array([1, 2, 3, np.nan, np.nan, 4, 5, 6, np.nan, 7, 8, 9, np.nan])
myArray2 = np.array([np.nan, np.nan, np.nan, np.nan, np.nan, np.nan])
myArray3 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
booleanArray1 = [not pd.isnull(number) for number in myArray1]
booleanArray2 = [not pd.isnull(number) for number in myArray2]
booleanArray3 = [not pd.isnull(number) for number in myArray3]
print(myArray1)
print(myArray2)
print(myArray3)
print(myArray1[booleanArray1])
print(myArray2[booleanArray2])
print(myArray3[booleanArray3])
print(myArray1[~pd.isnull(myArray1)]) # Line 1
print(myArray2[~pd.isnull(myArray2)]) # Line 2
print(myArray3[~pd.isnull(myArray3)]) # Line 3
Output:
[ 1. 2. 3. nan nan 4. 5. 6. nan 7. 8. 9. nan]
[nan nan nan nan nan nan]
[ 1 2 3 4 5 6 7 8 9 10]
[1. 2. 3. 4. 5. 6. 7. 8. 9.]
[]
[ 1 2 3 4 5 6 7 8 9 10]
[1. 2. 3. 4. 5. 6. 7. 8. 9.]
[]
[ 1 2 3 4 5 6 7 8 9 10]
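One practical difference is that pandas.isnull() also treats None as missing and works on arrays of dtype object, where np.isnan() raises a TypeError. The following is a minimal sketch under that assumption; the array mixed and the names missing and cleaned are illustrative.

```python
import numpy as np
import pandas as pd

# Illustrative object array mixing floats, None, and nan
mixed = np.array([1.0, None, np.nan, 2.5], dtype=object)

# isnull() flags both None and nan as missing
missing = pd.isnull(mixed)

# Invert the mask to keep only the present values
cleaned = mixed[~missing]

print(missing)
print(cleaned)
```

The result keeps the object dtype of the input, so convert it with astype(float) if numeric operations are needed afterwards.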
