Selection Methods for a Random Sample From Matrix or Array With Dataset in MATLAB

Mehak Mubarik Jan 07, 2022
  1. Extract Random Samples Using the randsample Function in MATLAB
  2. Extract Random Samples Using the datasample Function in MATLAB
  3. Extract Random Sample Subsets of a Column From a Dataset Matrix Using datasample in MATLAB

We will look at different methods to select random samples from a dataset, array, or matrix using MATLAB functions.

We will cover the functions randn, randsample, and datasample, with code examples that extract random samples from a dataset both with and without replacement, along with the output each example produces.

Let us assume that we have a matrix containing our dataset with 50,000 rows, and we want to select a random sample of 50 entities from it. We can perform this task using more than one random sampling method. Before listing these methods, keep in mind that a random sample is a subset of data chosen at random from the dataset. We use random sampling to eliminate bias and other undesirable effects. It is less straightforward than it appears, though: it takes more than picking any 10 entities out of a dataset of 500, because we must also make sure that the sample is genuinely random.

Continuing with our assumption, we can use MATLAB to extract random samples from our dataset. MATLAB provides us with several functions to select random samples/data from a given dataset. For example, we can use the function randsample in MATLAB to choose samples at random out of any array or matrix containing data, both with and without replacement/substitution.

Extract Random Samples Using the randsample Function in MATLAB

To pick N_obs observations uniformly at random from the entries in the dataset, we use the function below. By default, randsample samples without replacement; passing true as a third argument samples with replacement.

O_put = randsample(ourdata,N_obs)

Here, N_obs is the number of observations. If ourdata is a vector, the output O_put is a vector of N_obs values drawn from it. If ourdata is a positive integer scalar, randsample draws from the integers 1 to ourdata instead.

Let us use this function to solve our assumed problem.

Code:

% Passing the scalar 50000 makes randsample draw from the integers 1:50000,
% which we can treat as the entry numbers of a dataset with 50,000 entries.

ourdata = 50000;

% We want to obtain 5 random samples from this dataset.

N_obs = 5;

% Draw the samples; omitting the semicolon displays the result.

O_put = randsample(ourdata, N_obs)

Output:

O_put =

       46700
       33937
       42457
       32788
        1786
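
The integers returned by randsample can also serve as row indices. As a minimal sketch of the scenario described earlier (a matrix with 50,000 rows from which we want 50), the snippet below indexes into a hypothetical matrix M. The matrix, its contents, and the variable names are assumptions made for illustration, and the replacement behavior is requested explicitly through the third argument. randsample is part of the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical dataset: 50,000 rows and 3 columns of made-up values.
M = rand(50000, 3);

% Draw 50 distinct row indices from 1:50000 (false = without replacement).
idx = randsample(size(M, 1), 50, false);

% Use the indices to pull the sampled rows out of the matrix.
sample_rows = M(idx, :);

% Passing true samples with replacement instead,
% so the same row index may appear more than once.
idx_rep = randsample(size(M, 1), 50, true);
sample_rep = M(idx_rep, :);
```

Sampling the indices rather than the values keeps each row of the matrix intact in the result.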

Extract Random Samples Using the datasample Function in MATLAB

If we want control over the dimension along which the samples are taken, we can use the datasample function instead. By default, it samples along the first non-singleton dimension of the data.

y = datasample(ourdata,N_obs,'Replace',false)

If Replace is true, we choose the sample with replacement; otherwise, we choose the sample without replacement. If Replace is set to false, N_obs cannot exceed the number of elements available along the sampled dimension of the dataset.

Replace is true by default.

true = sample with replacement.

false = sample without replacement.

We can accomplish this with a single line of code. Keeping the above assumptions in mind, we write our code as below.

% Assume we have 50,000 entries in a dataset, numbered 1:50000.
% We want to obtain 5 random samples from this dataset using datasample.
% Draw five unique values from the integers 1:50000 in one line;
% omitting the semicolon displays the result.

O_put = datasample(1:50000,5,'Replace',false)

Output:

O_put =

       24489       22279       32315       35467       37732
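
As a brief sketch of the effect of Replace, the snippet below draws from a small hypothetical vector with both settings. The values and variable names are made up for illustration. It also requests the second output of datasample, which holds the positions of the selected elements in the original data.

```matlab
% Hypothetical data vector with 8 made-up values.
vals = [12 7 33 5 18 41 26 9];

% Without replacement: each position is used at most once.
[y_unique, idx_unique] = datasample(vals, 5, 'Replace', false);

% With replacement: positions may repeat.
[y_rep, idx_rep] = datasample(vals, 5, 'Replace', true);

% The second output indexes back into the original data.
same_unique = isequal(y_unique, vals(idx_unique));
same_rep = isequal(y_rep, vals(idx_rep));
```

Keeping the index output is useful when the sampled values must later be traced back to their source entries.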

Extract Random Sample Subsets of a Column From a Dataset Matrix Using datasample in MATLAB

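For this purpose, we first need a data matrix, which we create with the randn function in MATLAB. It creates arrays of random values drawn from the standard normal distribution.

I_put=randn(A) produces an A-by-A matrix that contains randomly generated elements, and randn(m,n) produces an m-by-n matrix.

If A is a vector rather than a scalar, MATLAB treats it as a size vector, so randn([2 3]) returns a 2-by-3 matrix.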
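
A quick sketch of the scalar call forms of randn follows. The sizes and variable names are arbitrary and chosen only for illustration.

```matlab
% One scalar argument gives a square matrix.
S = randn(3);          % 3-by-3

% Two scalar arguments give a rows-by-columns matrix.
R = randn(10, 1000);   % 10-by-1000

size_S = size(S);
size_R = size(R);

% With many draws, the sample mean and standard deviation should be
% close to 0 and 1, the parameters of the standard normal distribution.
m = mean(R(:));
s = std(R(:));
```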

Now, to get our random samples, we use the datasample function with the dimension argument set to 2, which returns a random subset of the columns of our data matrix.

Code:

I_put = randn(10,100000);
O_put = datasample(I_put,5,2,'Replace',false)

Output:

O_put =

-0.5995   -0.7377   -1.1902   -0.6021   -1.0812
-0.0572   -0.7831    0.4746    0.7105   -0.8038
 0.8401    1.0824   -0.3507    0.4069   -2.0817
-1.1358   -0.9041   -0.1702    0.5950    0.3954
-1.0887   -0.7766   -1.6901   -0.5047    1.1286
-0.0187   -0.3354   -0.7458    1.8554    0.8492
 0.3251   -0.4219    0.2440   -0.4750    0.7628
 1.4713   -1.9788   -1.6672    0.0035   -0.4316
 0.6880    1.4387   -1.3525   -0.6950    0.6411
-0.2777   -0.4776   -0.9841    1.2752    0.2645
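
To check that the returned columns really come from the input matrix, one option is to request the second output of datasample, which lists the selected column indices. The sketch below uses a smaller hypothetical matrix than the example above to keep it quick; the names col_idx, cols_match, and row_sample are illustrative.

```matlab
% Hypothetical input: 10 rows, 1000 columns of standard normal values.
I_put = randn(10, 1000);

% Sample 5 distinct columns (dimension 2) and keep their indices.
[O_put, col_idx] = datasample(I_put, 5, 2, 'Replace', false);

% Each sampled column should match the corresponding input column.
cols_match = isequal(O_put, I_put(:, col_idx));

% Omitting the dimension argument samples rows instead (dimension 1).
row_sample = datasample(I_put, 5, 'Replace', false);
```

Here O_put is 10-by-5, while row_sample is 5-by-1000, which shows how the dimension argument decides whether columns or rows are drawn.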

Mehak is an electrical engineer, a technical content writer, a team collaborator and a digital marketing enthusiast. She loves sketching and playing table tennis. Nature is what attracts her the most.
