Merge on Multiple Columns Using Pandas in Python

Introduction:

In this tutorial, we learn how to merge DataFrames on one or more columns using Pandas in Python. Pandas is a widely used open-source library for Python. It provides a fast and flexible way to work with structured data, including reading and writing data from different sources; cleaning, filtering, grouping, and manipulating data; and merging multiple DataFrames. Pandas is built on NumPy and provides easy-to-use data structures such as Series and DataFrame, which are well suited to data analysis.

Merging DataFrames is an important task in data analysis and data science. It involves combining data from two or more DataFrames on one or more key columns. This lets you combine data from different sources, compare and analyze it from different perspectives, and extract useful information. For example, you can combine customer data with sales data to analyze customer behavior and preferences, or weather data with crop data to determine the effects of weather on crops. In Pandas, DataFrames are combined with the merge function, and you control the result by specifying which columns are used as keys and which type of join is applied. This article walks you through the steps of merging Pandas DataFrames.

Syntax:

The syntax for merging on multiple columns using Pandas in Python is given below -
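As a minimal sketch of the call form, the example below merges two small DataFrames on two key columns at once by passing a list to "on". The frames "left" and "right" and the "Term" column are hypothetical and are used only for illustration; they are not part of this tutorial's data.

```python
import pandas as pd

# Hypothetical example frames (not from the tutorial) sharing two key columns.
left = pd.DataFrame({
    "Name": ["Rima", "Rima", "Hiya"],
    "Term": [1, 2, 1],
    "Marks": [67, 72, 90],
})
right = pd.DataFrame({
    "Name": ["Rima", "Hiya", "Hiya"],
    "Term": [1, 1, 2],
    "Grade": ["B", "A", "A"],
})

# General form: left.merge(right, how=<join type>, on=<column or list of columns>)
# A row is kept only when BOTH "Name" and "Term" match.
merged = left.merge(right, how="inner", on=["Name", "Term"])
print(merged)
```

The function form pd.merge(left, right, how="inner", on=["Name", "Term"]) is equivalent to the method form used here.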

Parameters:

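The main parameters for merging on multiple columns using Pandas in Python are given below -

  1. right: the DataFrame (or named Series) to merge with.
  2. how: the type of join: "left", "right", "outer", "inner" (the default), or "cross".
  3. on: the column name, or list of column names, to join on. These columns must be present in both DataFrames.
  4. left_on / right_on: the key columns to use from the left and right DataFrame when their names differ.
  5. suffixes: the suffixes added to overlapping non-key column names. The default is ("_x", "_y").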

Return value:

The return value of a merge on multiple columns using Pandas in Python is a new DataFrame containing the merged rows and columns.

Ways to merge on multiple columns using Pandas in Python:

There are several ways to merge two DataFrames on a common column. The methods used here are listed below -

  1. Inner Join Merge
  2. Outer Join Merge
  3. Left Join Merge
  4. Right Join Merge
  5. Column Subset Merge
  6. Merge Dataframe Concatenation

Creating a DataFrame by using Pandas in Python:

In this example, the code uses the pandas library to create two DataFrames ("d1" and "d2") in Python. "d1" contains the columns "Name" and "Marks", while "d2" contains "Name", "Grade", "Rank", and "Gender". Both DataFrames are then printed. The code is given below -
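The listing itself is not reproduced here, so the following is a minimal sketch reconstructed from the description and the printed output. The values are taken from the output table; the exact code the author ran is an assumption.

```python
import pandas as pd

# Left DataFrame: student names and marks (values taken from the printed output).
d1 = pd.DataFrame({
    "Name": ["Rima", "Priya", "Hiya", "Mita", "Diya"],
    "Marks": [67, 79, 90, 98, 89],
})

# Right DataFrame: names with grade, rank, and gender.
d2 = pd.DataFrame({
    "Name": ["Rima", "Rudra", "Hiya", "Mita"],
    "Grade": ["B", "A", "A", "A"],
    "Rank": [4, 3, 2, 1],
    "Gender": ["Female", "Male", "Female", "Female"],
})

print(d1)
print(d2)
```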

Output:

Running the above code gives the following result -

    Name  Marks
0   Rima     67
1  Priya     79
2   Hiya     90
3   Mita     98
4   Diya     89
    Name Grade  Rank  Gender
0   Rima     B     4  Female
1  Rudra     A     3    Male
2   Hiya     A     2  Female
3   Mita     A     1  Female

1. Merging two dataframes by using the Inner Join Merge in Python:

The "merge" method with an inner join combines two DataFrames by comparing their rows on the specified key column and keeping only the rows whose key appears in both. The result is a new DataFrame with the merged columns.

Program Code:

Here, we give the program code for merging two dataframes by using the inner join merge in Python. In this example, the code merges DataFrames "d1" and "d2" using the "Name" column as the key. The result is a new DataFrame containing the columns "Name", "Marks", "Grade" and "Rank", with only the rows whose "Name" is present in both DataFrames. The code is given below -
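A minimal sketch of such an inner join follows. Because the output has no "Gender" column, the sketch assumes that only "Name", "Grade" and "Rank" were selected from d2 before merging; that selection is an assumption, not the author's confirmed code.

```python
import pandas as pd

d1 = pd.DataFrame({
    "Name": ["Rima", "Priya", "Hiya", "Mita", "Diya"],
    "Marks": [67, 79, 90, 98, 89],
})
d2 = pd.DataFrame({
    "Name": ["Rima", "Rudra", "Hiya", "Mita"],
    "Grade": ["B", "A", "A", "A"],
    "Rank": [4, 3, 2, 1],
    "Gender": ["Female", "Male", "Female", "Female"],
})

# Inner join on "Name": keep only names present in both DataFrames.
# Selecting three columns from d2 (assumed) leaves "Gender" out of the result.
d_merged = d1.merge(d2[["Name", "Grade", "Rank"]], on="Name", how="inner")
print(d_merged)
```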

Output:

Running the above code gives the following result -

   Name  Marks Grade  Rank
0  Rima     67     B     4
1  Hiya     90     A     2
2  Mita     98     A     1

2. Merging two dataframes by using the Outer Join Merge in Python:

The outer join merge method includes all rows from both DataFrames. If a row in one DataFrame has no match in the other, the columns that come from the other DataFrame are filled with NaN values for that row.

Program Code:

Here, we give the program code for merging two dataframes by using the outer join merge in Python. In this example, the code performs an outer join between DataFrames "d1" and "d2" based on the "Name" column and creates a new DataFrame named "d_merged" containing the merged data, including all rows from both DataFrames. The code is given below -
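A minimal sketch of the outer join described above, with d1 and d2 rebuilt from the earlier output so the block runs on its own. The exact call the author used is an assumption.

```python
import pandas as pd

d1 = pd.DataFrame({
    "Name": ["Rima", "Priya", "Hiya", "Mita", "Diya"],
    "Marks": [67, 79, 90, 98, 89],
})
d2 = pd.DataFrame({
    "Name": ["Rima", "Rudra", "Hiya", "Mita"],
    "Grade": ["B", "A", "A", "A"],
    "Rank": [4, 3, 2, 1],
    "Gender": ["Female", "Male", "Female", "Female"],
})

# Outer join on "Name": keep every name from either DataFrame.
# Columns with no matching row are filled with NaN.
d_merged = pd.merge(d1, d2, on="Name", how="outer")
print(d_merged)
```

The row order of an outer join can differ between pandas versions, so the printed rows may appear in a different order than in the table shown. "Marks" and "Rank" are displayed as floats because integer columns that receive NaN values are converted to a floating-point type.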

Output:

Running the above code gives the following result -

    Name  Marks Grade  Rank  Gender
0   Rima   67.0     B   4.0  Female
1  Priya   79.0   NaN   NaN     NaN
2   Hiya   90.0     A   2.0  Female
3   Mita   98.0     A   1.0  Female
4   Diya   89.0   NaN   NaN     NaN
5  Rudra    NaN     A   3.0    Male

3. Merging two dataframes by using the Left Join Merge in Python:

The left join merge method joins two pandas DataFrames using a left join: it keeps all rows of the left DataFrame and adds the matching rows from the right DataFrame. Where a left row has no match, the columns from the right DataFrame are filled with NaN values.

Program Code:

Here, we give the program code for merging two dataframes by using the left join merge in Python. In this example, the "Rank" column of d2 is merged into d1 using "Name" as the key column. The merge mode is left, i.e., all rows of the left DataFrame (d1) are kept. The code is given below -
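A minimal sketch of such a left join is shown below. Since the output contains only "Name", "Marks" and "Rank", the sketch assumes that just "Name" and "Rank" were selected from d2; this selection is inferred from the output rather than confirmed.

```python
import pandas as pd

d1 = pd.DataFrame({
    "Name": ["Rima", "Priya", "Hiya", "Mita", "Diya"],
    "Marks": [67, 79, 90, 98, 89],
})
d2 = pd.DataFrame({
    "Name": ["Rima", "Rudra", "Hiya", "Mita"],
    "Grade": ["B", "A", "A", "A"],
    "Rank": [4, 3, 2, 1],
    "Gender": ["Female", "Male", "Female", "Female"],
})

# Left join on "Name": every row of d1 is kept.
# Names missing from d2 get NaN in the "Rank" column.
d_merged = d1.merge(d2[["Name", "Rank"]], on="Name", how="left")
print(d_merged)
```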

Output:

Running the above code gives the following result -

    Name  Marks  Rank
0   Rima     67   4.0
1  Priya     79   NaN
2   Hiya     90   2.0
3   Mita     98   1.0
4   Diya     89   NaN

4. Merging two dataframes by using the Right Join Merge in Python:

The right join merge method includes all rows from the right DataFrame and the matching rows from the left DataFrame. If a right row has no match, the columns that come from the left DataFrame are filled with NaN values.

Program Code:

Here, we give the program code for merging two dataframes by using the right join merge in Python. In this example, the code performs a right join on the two DataFrames "d1" and "d2" based on the "Name" column and assigns the result to the variable "d_merged". The code is given below -
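A minimal sketch of the right join described above, again rebuilding d1 and d2 from the earlier output. The exact call is an assumption.

```python
import pandas as pd

d1 = pd.DataFrame({
    "Name": ["Rima", "Priya", "Hiya", "Mita", "Diya"],
    "Marks": [67, 79, 90, 98, 89],
})
d2 = pd.DataFrame({
    "Name": ["Rima", "Rudra", "Hiya", "Mita"],
    "Grade": ["B", "A", "A", "A"],
    "Rank": [4, 3, 2, 1],
    "Gender": ["Female", "Male", "Female", "Female"],
})

# Right join on "Name": every row of d2 is kept.
# "Rudra" has no entry in d1, so "Marks" is NaN for that row.
d_merged = pd.merge(d1, d2, on="Name", how="right")
print(d_merged)
```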

Output:

Running the above code gives the following result -

    Name  Marks Grade  Rank  Gender
0   Rima   67.0     B     4  Female
1  Rudra    NaN     A     3    Male
2   Hiya   90.0     A     2  Female
3   Mita   98.0     A     1  Female

5. Merging two dataframes by using the Column Subset Merge in Python:

The column subset merge method combines two DataFrames in pandas by first selecting a specific subset of columns from a DataFrame and then merging that subset with the other DataFrame on the key column. The resulting DataFrame contains only the selected columns together with the columns of the other DataFrame.

Program Code:

Here, we give the program code for merging two dataframes by using the column subset merge in Python. In this example, we merge d1 with d2. The "Marks" column of d1 is merged with d2, and only the rows whose "Name" value appears in both DataFrames are displayed. The code is given below -
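A minimal sketch of a column subset merge follows. It assumes the subset is ["Name", "Marks"] from d1 and that the default inner join is used; both choices are inferred from the description and are not the author's confirmed code.

```python
import pandas as pd

d1 = pd.DataFrame({
    "Name": ["Rima", "Priya", "Hiya", "Mita", "Diya"],
    "Marks": [67, 79, 90, 98, 89],
})
d2 = pd.DataFrame({
    "Name": ["Rima", "Rudra", "Hiya", "Mita"],
    "Grade": ["B", "A", "A", "A"],
    "Rank": [4, 3, 2, 1],
    "Gender": ["Female", "Male", "Female", "Female"],
})

# Select the columns of interest from d1 first (assumed subset), then merge.
# merge() defaults to an inner join, so only names found in both frames remain.
d_merged = d1[["Name", "Marks"]].merge(d2, on="Name")
print(d_merged)
```

The same pattern works on the right-hand side, for example d1.merge(d2[["Name", "Grade"]], on="Name"), when only some columns of d2 are needed.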

Output:

Running the above code gives the following result -

    Name  Marks   Name Grade  Rank  Gender
0   Rima     67   Rima     B   4.0  Female
1  Priya     79  Rudra     A   3.0    Male
2   Hiya     90   Hiya     A   2.0  Female
3   Mita     98   Mita     A   1.0  Female
4   Diya     89    NaN   NaN   NaN     NaN
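Note that the table above places d1 and d2 side by side by row position: it has two "Name" columns, and "Priya" sits next to "Rudra". This is how the Merge Dataframe Concatenation method from the list of methods behaves, rather than a key-based merge on "Name". A minimal sketch that produces a table of this shape, assuming pd.concat with axis=1 was used:

```python
import pandas as pd

d1 = pd.DataFrame({
    "Name": ["Rima", "Priya", "Hiya", "Mita", "Diya"],
    "Marks": [67, 79, 90, 98, 89],
})
d2 = pd.DataFrame({
    "Name": ["Rima", "Rudra", "Hiya", "Mita"],
    "Grade": ["B", "A", "A", "A"],
    "Rank": [4, 3, 2, 1],
    "Gender": ["Female", "Male", "Female", "Female"],
})

# Column-wise concatenation: rows are aligned by index, not by "Name".
# d2 has no row with index 4, so its columns are NaN in the last row.
d_concat = pd.concat([d1, d2], axis=1)
print(d_concat)
```

Because concatenation aligns rows by index, it does not check that the names on each side agree, which is why a key-based merge is usually the safer choice for combining related records.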

Conclusion:

In this tutorial, we learned how to merge DataFrames on one or more columns using Pandas in Python. Merging DataFrames is an important task in data analysis and data science, and Pandas provides a powerful tool for merging DataFrames on one or more key columns. We took a step-by-step look at the process: we created two DataFrames, merged them on their common column using inner, outer, left, and right joins as well as a column subset, and examined the merged DataFrame to extract useful insights. By using Pandas' merging capabilities, you can combine data from different sources and get more out of it.