How to Perform T-Test in Pandas
This tutorial shows how to run an independent two-sample T-test on data stored in a Pandas DataFrame, using the ttest_ind() function from SciPy.
Steps to Perform T-Test in Pandas
The following are the steps to perform a T-test in Pandas.
Import Pertinent Libraries
We must import the Pandas library and the ttest_ind function from scipy.stats to get started.
import pandas as pd
from scipy.stats import ttest_ind
Create a Pandas DataFrame
Let us create a sample dataframe on which to perform the T-test.
data = {
    "Category": [
        "type2",
        "type1",
        "type2",
        "type1",
        "type2",
        "type1",
        "type2",
        "type1",
        "type1",
        "type1",
        "type2",
    ],
    "values": [1, 2, 3, 1, 2, 3, 1, 2, 3, 5, 1],
}
df = pd.DataFrame(data)
We created a dataframe with a Category column that holds two categories, type1 and type2, and a values column that holds one numeric value per row.
Let us view our dataframe below.
print(df)
Output:
    Category  values
0      type2       1
1      type1       2
2      type2       3
3      type1       1
4      type2       2
5      type1       3
6      type2       1
7      type1       2
8      type1       3
9      type1       5
10     type2       1
We will now create a separate data frame for each category by filtering df on the Category column. The T-test compares two samples, so each category needs its own set of values.
type1 = df[df["Category"] == "type1"]
type2 = df[df["Category"] == "type2"]
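The same split can also be sketched with groupby(), which avoids writing one filter per category. This is a minimal sketch that rebuilds the sample data so it runs on its own; the names grouped, type1_values and type2_values are illustrative and not part of the tutorial's code.

```python
import pandas as pd

# Same sample data as in the tutorial
df = pd.DataFrame({
    "Category": ["type2", "type1", "type2", "type1", "type2", "type1",
                 "type2", "type1", "type1", "type1", "type2"],
    "values": [1, 2, 3, 1, 2, 3, 1, 2, 3, 5, 1],
})

# Group the "values" column by category, then pull out each group as a Series
grouped = df.groupby("Category")["values"]
type1_values = grouped.get_group("type1")
type2_values = grouped.get_group("type2")

print(len(type1_values), len(type2_values))
```

Each group is a Series of values, so it can be passed to ttest_ind() directly.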
Obtain T-Test Values in Pandas
We will now run the T-test with the ttest_ind() function and store the result in a variable. We use this function in the following way.
res = ttest_ind(type1["values"], type2["values"])
In the above code, we passed the values column of each data frame to the function as the two samples. The function returns a result object that behaves like a tuple and holds the t-statistic and the p-value.
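Because the result behaves like a tuple, the two numbers can be unpacked or read by name. The following is a minimal, self-contained sketch; the variable names t_stat and p_value are illustrative choices.

```python
import pandas as pd
from scipy.stats import ttest_ind

# Same sample data as in the tutorial
df = pd.DataFrame({
    "Category": ["type2", "type1", "type2", "type1", "type2", "type1",
                 "type2", "type1", "type1", "type1", "type2"],
    "values": [1, 2, 3, 1, 2, 3, 1, 2, 3, 5, 1],
})
type1 = df[df["Category"] == "type1"]
type2 = df[df["Category"] == "type2"]

res = ttest_ind(type1["values"], type2["values"])

# Unpack the result like a two-element tuple
t_stat, p_value = res

# The same numbers are also available as named attributes
print(res.statistic, res.pvalue)
```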
Let us now print the res variable to see the results.
print(res)
Output:
Ttest_indResult(statistic=1.4927289925706944, pvalue=0.16970867501294376)
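The output shows the t-statistic (about 1.49) and the p-value (about 0.17). Because the p-value is above the commonly used 0.05 significance level, we cannot reject the null hypothesis that the two categories have equal means. Thus, we can run a T-test on Pandas data with the above method.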
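As a cross-check, the t-statistic can be reproduced by hand. By default, ttest_ind() assumes equal variances in both groups (equal_var=True) and uses a pooled variance. The sketch below applies that formula to the same values; the names pooled_var and t_manual are illustrative.

```python
import math

import pandas as pd
from scipy.stats import ttest_ind

# Values of each category, in the order they appear in the sample dataframe
type1_values = pd.Series([2, 1, 3, 2, 3, 5])
type2_values = pd.Series([1, 3, 2, 1, 1])

n1, n2 = len(type1_values), len(type2_values)

# Series.var() returns the sample variance (ddof=1), as the pooled formula needs
pooled_var = (
    (n1 - 1) * type1_values.var() + (n2 - 1) * type2_values.var()
) / (n1 + n2 - 2)

# t = difference of means divided by the standard error of that difference
t_manual = (type1_values.mean() - type2_values.mean()) / math.sqrt(
    pooled_var * (1 / n1 + 1 / n2)
)

res = ttest_ind(type1_values, type2_values)
print(t_manual, res.statistic)
```

The two printed numbers should agree up to floating-point rounding. If the equal-variance assumption is doubtful, passing equal_var=False to ttest_ind() runs Welch's T-test instead.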