How to Split Column Into Two Columns in R
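This article introduces three ways to split a column into two columns in R: the separate and extract functions from the tidyr package, and the str_split_fixed function from the stringr package.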
Use the separate Function to Split Column Into Two Columns in R
separate is part of the tidyr package. It splits a character column into multiple columns, either at matches of a regular expression or at numeric character positions. In this code example, we declare a data frame that contains comma-separated strings of name/surname pairs. The separate function takes the data frame as the first argument and the column name as the second argument. The third argument is a character vector with the names of the newly created columns. Note that we use the %>% pipe to pass the df object to separate as its first argument. Because the default separator matches any run of non-alphanumeric characters, the same call also works on a data frame where names and surnames are delimited with a dot.
library(dplyr)
library(tidyr)
# Name/surname pairs delimited by a comma and a space
df <- data.frame(x = c('John, Mae', 'Maude, Lebowski', 'Mia, Amy', 'Andy, James'))
# The same pairs delimited by a dot and a space
df1 <- data.frame(x = c('John. Mae', 'Maude. Lebowski', 'Mia. Amy', 'Andy. James'))
# Split x into Name and Surname using the default separator
df %>% separate(x, c('Name', 'Surname'))
df1 %>% separate(x, c('Name', 'Surname'))
Output:
> df %>% separate(x, c('Name', 'Surname'))
   Name  Surname
1  John      Mae
2 Maude Lebowski
3   Mia      Amy
4  Andy    James
> df1 %>% separate(x, c('Name', 'Surname'))
   Name  Surname
1  John      Mae
2 Maude Lebowski
3   Mia      Amy
4  Andy    James
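The default separator can split in more places than intended, for example when a name contains a hyphen. The following is a minimal sketch, using made-up data frames (people and codes are hypothetical names, not part of the example above), of passing the sep argument explicitly: first as a regular expression, then as a numeric character position.

```r
library(tidyr)

# Hypothetical data: one first name contains a hyphen, which the default
# separator (any run of non-alphanumeric characters) would also split on.
people <- data.frame(x = c("Mary-Jane, Smith", "Maude, Lebowski"))

# An explicit separator splits only at a comma followed by a space.
by_comma <- separate(people, x, c("Name", "Surname"), sep = ", ")
print(by_comma)

# Hypothetical fixed-width codes: a numeric sep splits by character position,
# here after the second character.
codes <- data.frame(code = c("AB123", "CD456"))
by_position <- separate(codes, code, c("Prefix", "Number"), sep = 2)
print(by_position)
```

Both new columns in by_position are character columns; separate only attempts type conversion when convert = TRUE is passed.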
Use the extract Function to Split Column Into Two Columns in R
Another useful function to split a column into two separate ones is extract, which is also part of the tidyr package. The extract function pulls the new columns out of the original one using regular expression capture groups. Each capture group maps, in order, to one name in the preceding argument, so the pattern should contain one group per new column. If the pattern does not match a value, or the value itself is NA, the output will have NA values in that row.
library(dplyr)
library(tidyr)
df <- data.frame(x = c('John, Mae', 'Maude, Lebowski', 'Mia, Amy', 'Andy, James'))
# Group 1 captures the text before the comma, group 2 the text after ", "
df %>% extract(x, c("Name", "Surname"), "([^,]+), ([^,]+)")
Output:
> df %>% extract(x, c("Name", "Surname"), "([^,]+), ([^,]+)")
   Name  Surname
1  John      Mae
2 Maude Lebowski
3   Mia      Amy
4  Andy    James
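To illustrate what happens when the pattern does not match, here is a short sketch with a made-up data frame (people is a hypothetical name) in which one entry has no comma. That row should come back with NA in both new columns, while the matching row is split as usual.

```r
library(tidyr)

# Hypothetical data: the second entry has no comma, so the pattern cannot match it.
people <- data.frame(x = c("John, Mae", "Cher"))

# Two capture groups, one per new column name.
result <- extract(people, x, c("Name", "Surname"), "([^,]+), ([^,]+)")
print(result)
```

Checking the result with is.na is one way to find rows that need a different pattern or manual cleanup.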
Use the str_split_fixed Function to Split Column Into Two Columns in R
Alternatively, we can utilize the str_split_fixed function from the stringr package. It splits each string of a character vector at matches of the given pattern and returns a character matrix with a fixed number of columns. The number of pieces to return is passed as the third argument; strings that yield fewer pieces are padded with empty strings. Note that the result is a matrix rather than a data frame.
library(stringr)
df <- data.frame(x = c('John, Mae', 'Maude, Lebowski', 'Mia, Amy', 'Andy, James'))
# Split each string at ", " into exactly 2 pieces
str_split_fixed(df$x, ", ", 2)
Output:
> str_split_fixed(df$x, ", ", 2)
     [,1]    [,2]
[1,] "John"  "Mae"
[2,] "Maude" "Lebowski"
[3,] "Mia"   "Amy"
[4,] "Andy"  "James"
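Since str_split_fixed returns a matrix, the pieces still have to be attached to the data frame. One possible way, sketched here with an intermediate variable named parts (an illustrative name), is to assign the matrix columns to new data frame columns.

```r
library(stringr)

df <- data.frame(x = c("John, Mae", "Maude, Lebowski", "Mia, Amy", "Andy, James"))

# str_split_fixed returns a character matrix with one column per piece.
parts <- str_split_fixed(df$x, ", ", 2)

# Copy the matrix columns into new data frame columns.
df$Name <- parts[, 1]
df$Surname <- parts[, 2]
print(df)
```

Unlike separate and extract, this approach keeps the original x column unless it is dropped explicitly.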
Founder of DelftStack.com. Jinku has worked in the robotics and automotive industries for over 8 years. He sharpened his coding skills when he needed to do the automatic testing, data collection from remote servers and report creation from the endurance test. He is from an electrical/electronics engineering background but has expanded his interest to embedded electronics, embedded programming and front-/back-end programming.