How to Combine Two Columns of Text in DataFrame in Pandas

Ahmed Waheed Feb 02, 2024

When working with datasets, you sometimes need to combine two or more columns into one. For example, a dataset may store first names and last names in separate columns when you need a single Full Name column. The following methods can achieve this.

  1. + operator
  2. map()
  3. df.apply()
  4. Series.str.cat()
  5. df.agg()

We will use the following DataFrame throughout the next sections:

import pandas as pd

data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])
print(df)

The following will be output.

     First      Last Age
0      Ali     Azmat  30
1  Sharukh      Khan  40
2    Linus  Torvalds  70

+ Operator Method

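Use the + operator when both columns already hold strings. It concatenates the values row by row. A column of another data type, such as integers, must be converted to strings first.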

import pandas as pd

data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])
df["Full Name"] = df["First"] + " " + df["Last"]
print(df)

The following will be output.

     First      Last Age       Full Name
0      Ali     Azmat  30       Ali Azmat
1  Sharukh      Khan  40    Sharukh Khan
2    Linus  Torvalds  70  Linus Torvalds
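
To illustrate the "same data type" requirement, here is a minimal sketch that assumes a variant of the DataFrame above in which Age is stored as integers rather than strings. The column name "Name and Age" is made up for this example. Adding a string column directly to an integer column would normally fail, so the integers are converted with astype(str) first.

```python
import pandas as pd

# Assumption: unlike the article's data, Age holds integers here.
data = [["Ali", "Azmat", 30], ["Sharukh", "Khan", 40], ["Linus", "Torvalds", 70]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])

# Convert the integer column to strings so that + concatenates text.
df["Name and Age"] = df["First"] + " " + df["Age"].astype(str)
print(df["Name and Age"].tolist())
```

This should print ['Ali 30', 'Sharukh 40', 'Linus 70'].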

Series.map() Method

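You can also call Series.map() on a column before concatenating. map() applies a function to every value in the Series, so map(str) converts each value to a string. Here the First column already holds strings, so the result matches the + operator method.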

import pandas as pd

data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])
df["Full Name"] = df["First"].map(str) + " " + df["Last"]
print(df)

The following will be output.

     First      Last Age       Full Name
0      Ali     Azmat  30       Ali Azmat
1  Sharukh      Khan  40    Sharukh Khan
2    Linus  Torvalds  70  Linus Torvalds
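
Because map() accepts any function of a single value, it can also transform a column while combining it. The sketch below is an illustration rather than part of the original example: it upper-cases the last name with str.upper before concatenating.

```python
import pandas as pd

data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])

# map() calls str.upper on each last name; + then joins the two columns.
df["Full Name"] = df["First"] + " " + df["Last"].map(str.upper)
print(df["Full Name"].tolist())
```

This should print ['Ali AZMAT', 'Sharukh KHAN', 'Linus TORVALDS'].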

df.apply() Method

Python's str.join() method also joins strings. We can apply it to each row of the DataFrame with df.apply(), which applies a function along a given axis. With axis=1, each row is passed to the function.

import pandas as pd

data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])
df["Full Name"] = df[["First", "Last"]].apply(" ".join, axis=1)
print(df)

The following will be output.

     First      Last Age       Full Name
0      Ali     Azmat  30       Ali Azmat
1  Sharukh      Khan  40    Sharukh Khan
2    Linus  Torvalds  70  Linus Torvalds
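
Since df.apply() with axis=1 passes each row to the function, a lambda can format the combined value freely. The following sketch assumes a "Last, First" layout is wanted. The column name "Sorted Name" is hypothetical.

```python
import pandas as pd

data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])

# Each row arrives as a Series, so its values can be looked up by column name.
df["Sorted Name"] = df.apply(lambda row: f"{row['Last']}, {row['First']}", axis=1)
print(df["Sorted Name"].tolist())
```

This should print ['Azmat, Ali', 'Khan, Sharukh', 'Torvalds, Linus']. Row-wise apply() runs a Python function for every row, so it tends to be slower than the + operator on large DataFrames.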

Series.str.cat() Method

We can also use the Series.str.cat() method, which concatenates the strings in a Series with those of another Series, placing the given separator between them.

import pandas as pd

data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])
df["Full Name"] = df["First"].str.cat(df["Last"], sep=" ")
print(df)

The following will be output.

     First      Last Age       Full Name
0      Ali     Azmat  30       Ali Azmat
1  Sharukh      Khan  40    Sharukh Khan
2    Linus  Torvalds  70  Linus Torvalds
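
Series.str.cat() is not limited to two columns: its first argument can also be a DataFrame, in which case every column of that DataFrame is appended in order. The sketch below joins all three string columns of the example data. The column name "Summary" is made up for illustration.

```python
import pandas as pd

data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])

# Passing a DataFrame appends each of its columns, separated by sep.
df["Summary"] = df["First"].str.cat(df[["Last", "Age"]], sep=" ")
print(df["Summary"].tolist())
```

This should print ['Ali Azmat 30', 'Sharukh Khan 40', 'Linus Torvalds 70'].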

df.agg() Method

Like df.apply(), df.agg() applies a function along the specified axis, so it can join the columns of each row in the same way.

import pandas as pd

data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])
df["Full Name"] = df[["First", "Last"]].agg(" ".join, axis=1)
print(df)

The following will be output.

     First      Last Age       Full Name
0      Ali     Azmat  30       Ali Azmat
1  Sharukh      Khan  40    Sharukh Khan
2    Linus  Torvalds  70  Linus Torvalds
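
As a quick sanity check, the five methods can be run side by side on the same DataFrame to confirm that they produce the same column. This is only a sketch: the results dictionary and its keys are names chosen for this example.

```python
import pandas as pd

data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])

# One entry per method described in this article.
results = {
    "+": df["First"] + " " + df["Last"],
    "map": df["First"].map(str) + " " + df["Last"],
    "apply": df[["First", "Last"]].apply(" ".join, axis=1),
    "str.cat": df["First"].str.cat(df["Last"], sep=" "),
    "agg": df[["First", "Last"]].agg(" ".join, axis=1),
}

expected = ["Ali Azmat", "Sharukh Khan", "Linus Torvalds"]
for name, series in results.items():
    # Each line reports whether the method produced the expected names.
    print(name, series.tolist() == expected)
```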
