How to Shuffle an Array in Python
- Shuffle an Array in Python Using the random.shuffle() Method
- Shuffle an Array in Python Using the shuffle() Method of the sklearn Module
In this tutorial, we look at several ways to shuffle an array in Python. Shuffling an array means randomly rearranging the positions of its elements. One common application is model training, where the dataset is shuffled to improve the quality of the trained model. Shuffling is also used in many statistical applications.
Shuffle an Array in Python Using the random.shuffle() Method
The random.shuffle() method takes a sequence as input and shuffles it. The important thing to note is that random.shuffle() does not return a new sequence: it shuffles the original sequence in place and returns None. The input must therefore be a mutable sequence, such as a list or an array.
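The in-place behaviour is easy to overlook, so here is a minimal sketch of it. The variable names are illustrative only. It also shows random.sample(), a standard-library function that is not covered elsewhere in this tutorial, as one way to get a shuffled copy when the original must stay untouched or is immutable.

```python
import random

fruits = ["apple", "banana", "cherry", "date"]
original = list(fruits)  # keep a copy so the contents can be compared later

# random.shuffle() reorders the list in place and returns None.
result = random.shuffle(fruits)
print(result)  # None
print(fruits)  # same elements, possibly in a different order

# random.sample() with k=len(seq) returns a new shuffled list and leaves
# its input untouched, so it also works for immutable sequences.
point = (1, 2, 3, 4)
shuffled_copy = random.sample(point, k=len(point))
print(point)
print(shuffled_copy)

# An immutable sequence cannot be shuffled in place.
in_place_error = None
try:
    random.shuffle(point)
except TypeError as exc:
    in_place_error = exc
    print("cannot shuffle a tuple in place:", exc)
```

Assigning the result of random.shuffle() to a variable is a common mistake, since that variable will always hold None.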
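The random.shuffle() method treats its input as a flat, one-dimensional sequence and only swaps top-level elements. On a nested list it reorders the rows but never the values inside them, and on a multi-dimensional NumPy array it can produce duplicated rows, so it should only be used on 1D arrays and lists. The example code below demonstrates how to use random.shuffle() to shuffle an array in Python.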
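import random

import numpy as np

mylist = ["apple", "banana", "cherry"]
x = np.array((2, 3, 21, 312, 31, 31, 3123, 131))

# Print both sequences before shuffling.
print(x)
print(mylist)

# Shuffle in place; neither call returns a new sequence.
random.shuffle(mylist)
random.shuffle(x)

# The original objects now hold the shuffled order.
print(x)
print(mylist)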
Output (the shuffled order varies between runs):
[   2    3   21  312   31   31 3123  131]
['apple', 'banana', 'cherry']
[3123   21  312    3    2  131   31   31]
['banana', 'apple', 'cherry']
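To make the one-dimensional behaviour concrete, the sketch below applies random.shuffle() to a nested list. It uses plain Python lists rather than NumPy arrays, and the flatten-and-rebuild step is just one possible workaround, assumed here for illustration, for shuffling every value.

```python
import random

rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Only the outer list is shuffled: each row may move to a new position,
# but the values inside every row keep their original order.
random.shuffle(rows)
print(rows)

# To shuffle every value, flatten to one dimension first, shuffle,
# then rebuild rows of the original width.
flat = [value for row in rows for value in row]
random.shuffle(flat)
width = 3
reshaped = [flat[i:i + width] for i in range(0, len(flat), width)]
print(reshaped)
```

For multi-dimensional NumPy arrays, NumPy's own numpy.random.shuffle() is generally a better fit than the standard-library function.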
Shuffle an Array in Python Using the shuffle() Method of the sklearn Module
The sklearn.utils.shuffle(*arrays, random_state=None, n_samples=None) method takes one or more indexable sequences, such as arrays, lists, or dataframes, that all share the same first dimension, and returns shuffled copies of them.
The sklearn.utils.shuffle() method does not change the original input; it returns a shuffled copy. When several sequences are passed, they are all shuffled with the same ordering, so corresponding elements stay aligned. The random_state parameter controls the random number generation: if it is set to an integer, the method returns the same shuffled sequence every time. The n_samples parameter is the number of samples to return. By default it equals the first dimension of the input, and it should not be greater than the length of the input array(s).
The sklearn.utils.shuffle() method only shuffles along the first dimension, that is, the rows. The example code below demonstrates how to use the sklearn.utils.shuffle() method to get shuffled arrays in Python.
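import numpy as np
from sklearn.utils import shuffle

# Three inputs that share the same first dimension (length 3).
x = np.array([[1, 2, 3], [6, 7, 8], [9, 10, 12]])
y = ["one", "two", "three"]
z = [4, 5, 6]

print(x)
print(y)
print(z)

# All three are shuffled with the same ordering; random_state=0 makes
# the result reproducible.
x, y, z = shuffle(x, y, z, random_state=0)

print(x)
print(y)
print(z)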
Output:
[[ 1  2  3]
 [ 6  7  8]
 [ 9 10 12]]
['one', 'two', 'three']
[4, 5, 6]
[[ 9 10 12]
 [ 6  7  8]
 [ 1  2  3]]
['three', 'two', 'one']
[6, 5, 4]
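The output above shows all three inputs reordered in the same way. The sketch below illustrates that idea using only the standard library. It is not scikit-learn's implementation: shuffle_together and its seed parameter are hypothetical names chosen for this example, loosely mirroring random_state and n_samples.

```python
import random


def shuffle_together(*sequences, seed=None, n_samples=None):
    """Hypothetical helper: return shuffled copies that share one ordering."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    if n_samples is None:
        n_samples = length
    if n_samples > length:
        raise ValueError("n_samples cannot exceed the sequence length")

    # One permutation of the indices is drawn and reused for every input,
    # so elements that started at the same position stay aligned.
    rng = random.Random(seed)
    order = list(range(length))
    rng.shuffle(order)
    order = order[:n_samples]

    # Build new lists; the inputs themselves are never modified.
    return [[seq[i] for i in order] for seq in sequences]


names = ["one", "two", "three", "four"]
values = [1, 2, 3, 4]

a, b = shuffle_together(names, values, seed=0)
c, d = shuffle_together(names, values, seed=0)
print(a, b)
print(a == c and b == d)  # True: the same seed gives the same order

e, f = shuffle_together(names, values, seed=0, n_samples=2)
print(e, f)           # two pairs, still aligned
print(names, values)  # the inputs are unchanged
```

Drawing a single index permutation and applying it to every input is what keeps features and labels matched after shuffling, which is the main reason to shuffle them in one call rather than separately.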