Drop Columns in pandas

When working with data in pandas, we often need to remove one or more columns or rows from a DataFrame, usually because they are irrelevant to the analysis or no longer needed for further study. Removing them lets us concentrate on the remaining data. There are a few ways to do this, but the most common is the DataFrame.drop() method.

Columns can be removed by specifying label names together with the corresponding axis, or by passing the index or column names directly. On a MultiIndex, labels at a particular level can be removed by specifying the level. This article discusses how to drop columns in pandas, with some examples.

The drop() function

The drop() function removes the specified labels from the rows or columns of a DataFrame. We may drop rows or columns by specifying label names and the matching axis, or by specifying index or column names directly. On a MultiIndex, labels at a particular level can be removed by specifying the level. We may drop one or more columns from a pandas DataFrame using the .drop() method.

Syntax:

The syntax of the drop() function is:
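For reference, recent pandas releases document the signature roughly as shown in the comment below. Older releases may differ slightly (for example, in which arguments are keyword-only), so this sketch also prints the signature of whichever version is installed. The variable name params is only illustrative.

```python
import inspect

import pandas as pd

# Signature as documented for recent pandas releases (an assumption about
# the installed version; check the printed signature below):
#
#   DataFrame.drop(labels=None, *, axis=0, index=None, columns=None,
#                  level=None, inplace=False, errors='raise')

# Print the signature of the pandas version that is actually installed
params = inspect.signature(pd.DataFrame.drop).parameters
print(inspect.signature(pd.DataFrame.drop))
```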

Parameters:

labels: A single label or a list of labels (column names or row index values) to drop.

index: An alternative to specifying axis=0. As input, it accepts a single row label or a list of row labels.

level: In the case of a MultiIndex DataFrame, it determines the level from which the labels should be removed. It accepts either a level position or a level name as input.

axis: It indicates whether rows or columns should be dropped. To remove columns, set axis to 1 or 'columns'. The default is 0 (or 'index'), which drops rows.

columns: An alternative to specifying axis='columns'. As input, it accepts a single column label or a list of column labels.

inplace: It specifies whether a new DataFrame should be returned or the existing one should be modified. It is a Boolean flag with a default value of False.

errors: Either 'raise' (the default) or 'ignore'. If set to 'ignore', labels that are not found are skipped silently and only the existing labels are dropped.

Returns

  • It returns a new DataFrame without the dropped labels, or None if inplace=True.
  • If any of the labels is not found, it raises a KeyError (unless errors='ignore').

Drop Single Column

Sometimes only a single column needs to be removed from a DataFrame.

Example: The example below uses df.drop(columns='col name') to remove the 'age' column from the DataFrame.
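A minimal sketch of this example follows. The data is taken from the output shown in this section; the variable names student and df are assumptions.

```python
import pandas as pd

# Sample data matching the output shown in this section
student = {"name": ["Joe", "Nat"], "age": [20, 21], "marks": [85.1, 77.8]}
df = pd.DataFrame(student)
print(df)

# Drop the 'age' column; drop() returns a new DataFrame by default
df = df.drop(columns="age")
print(df)
```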

Output: After executing this code, we will get the output as shown below:

  name  age  marks
0  Joe   20   85.1
1  Nat   21   77.8
  name  marks
0  Joe   85.1
1  Nat   77.8

Using the drop function with axis = 'columns' or axis = 1

To delete columns, we can use the axis parameter of the DataFrame.drop() method. The axis can refer to either rows or columns; the column axis is denoted by the number 1 or the string 'columns'. Set axis=1 or axis='columns' and pass the list of column names to be removed.

Example: Let's reuse the example above to understand how we may use the drop function with axis = 'columns' and axis = 1.
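One way this could look is sketched below, reusing the data from the previous example. The names by_name and by_number are assumptions introduced to show that both spellings of the axis give the same result.

```python
import pandas as pd

student = {"name": ["Joe", "Nat"], "age": [20, 21], "marks": [85.1, 77.8]}
df = pd.DataFrame(student)
print(df)

# Both calls look the labels up on the column axis and are equivalent
by_name = df.drop(["age", "marks"], axis="columns")
by_number = df.drop(["age", "marks"], axis=1)
print(by_name)
```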

Output: After executing this code, we will get the output as shown below:

  name  age  marks
0  Joe   20   85.1
1  Nat   21   77.8
  name
0  Joe
1  Nat

Drop multiple columns

The DataFrame.drop() function offers two ways to delete multiple columns of a DataFrame at once.

  1. Use the columns parameter to specify a list of column names to remove.
  2. Pass the list of column names as labels and set the axis to 1.

Example: Let's take an example to understand how we may drop multiple columns from the DataFrame.
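The two options listed above can be sketched as follows. The data comes from the output shown in this section; the names via_columns and via_axis are assumptions.

```python
import pandas as pd

df = pd.DataFrame({"name": ["John", "Alex"], "age": [24, 18], "marks": [77.29, 69.15]})
print(df)

# Option 1: list the column names in the columns parameter
via_columns = df.drop(columns=["age", "marks"])

# Option 2: pass the same list as labels and set axis=1
via_axis = df.drop(["age", "marks"], axis=1)

print(via_columns)
```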

Output: After executing this code, we will get the output as shown below:

   name  age  marks
0  John   24  77.29
1  Alex   18  69.15
   name
0  John
1  Alex

Drop the column in place

In the previous examples, each drop operation returned a new DataFrame because the modification was not made in place. The inplace parameter specifies whether to drop a column from the existing DataFrame or to return a modified copy of it.

  1. If inplace=True, it updates the current DataFrame without returning anything.
  2. If the inplace parameter is set to False, it generates a new DataFrame with the updated changes and returns it.

Example: Let's explain how we may use the drop function to drop the column in place.
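A short sketch, using the data from the output shown in this section. The variable result is an assumption, added only to show that an in-place drop returns None.

```python
import pandas as pd

df = pd.DataFrame({"name": ["John", "Alex"], "age": [24, 18], "marks": [79.18, 68.79]})
print(df)

# inplace=True modifies df itself; the call returns None
result = df.drop(columns=["age", "marks"], inplace=True)
print(df)
```

Because the call returns None, writing df = df.drop(..., inplace=True) would replace the DataFrame with None, so the return value should not be assigned back.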

Output: After executing the above code, we will get the output as shown below:

   name  age  marks
0  John   24  79.18
1  Alex   18  68.79
   name
0  John
1  Alex

Drop the columns by suppressing errors

If the column we are attempting to delete does not exist in the DataFrame, the DataFrame.drop() method raises a KeyError. If we want to drop the column only if it exists, we can use the errors parameter to suppress the error.

  1. Set errors='ignore' to suppress the error; only the labels that exist are dropped.
  2. Set errors='raise' (the default) to raise a KeyError for unknown columns.

Example: Let's take an example to understand how we may drop the columns by suppressing errors.
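The sketch below shows both settings, using the data from the output in this section and a 'salary' column that does not exist. The names unchanged and caught are assumptions, and the try/except is added only so the script keeps running; without it, the second call ends the program with the traceback shown in the output.

```python
import pandas as pd

df = pd.DataFrame({"name": ["John", "Alex"], "age": [24, 18], "marks": [79.49, 82.54]})
print(df)

# 'salary' does not exist; errors='ignore' returns the data unchanged
unchanged = df.drop(columns="salary", errors="ignore")

# With the default errors='raise', the same call raises a KeyError
caught = None
try:
    df.drop(columns="salary")
except KeyError as exc:
    caught = exc
    print("KeyError:", exc)
```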

Output: After executing the above code, we will get the output as shown below:

   name  age  marks
0  John   24  79.49
1  Alex   18  82.54
raise KeyError(f"{labels[mask]} not found in axis")
KeyError: "['salary'] not found in axis"

Drop the column by index position

If we want to remove columns from a DataFrame but do not know their names, we can do so by deleting the column using its index position. Column indexing begins with 0 (zero) and continues until the last column, whose index value is len(df.columns)-1.

Drop first n columns

To remove the first 'n' columns from a DataFrame, we can select them by position with DataFrame.iloc and the Python range() function, then pass the labels of the selected columns to the columns parameter of DataFrame.drop().

Example: Let's take an example to understand how we may drop the first n columns in the DataFrame.
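A minimal sketch with n = 2, using the data from the output in this section. The names n and first_n are assumptions, and taking .columns of the iloc selection is one of several ways to obtain the labels (df.columns[:n] would work as well).

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["John", "Alex"],
        "age": [24, 18],
        "marks": [84.45, 76.11],
        "class": ["A", "B"],
        "city": ["US", "UK"],
    }
)
print(df)

# Select the first n columns by position and collect their labels
n = 2
first_n = df.iloc[:, range(n)].columns

# Drop those labels
df = df.drop(columns=first_n)
print(df)
```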

Output: After executing this code, we will get the output as shown below:

   name  age  marks class city
0  John   24  84.45     A   US
1  Alex   18  76.11     B   UK
   marks class city
0  84.45     A   US
1  76.11     B   UK

Drop the last column

Assume that we want to drop the first or last column of a DataFrame without using the column name. In such situations, we can use the DataFrame.columns attribute to look up the column label by its index position: simply pass df.columns[index] to the columns parameter of DataFrame.drop().

Example: Let's take an example to understand how we may drop the last column from the DataFrame.
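A short sketch, using the data from the output in this section. Index -1 refers to the last column; index 0 would refer to the first.

```python
import pandas as pd

df = pd.DataFrame({"name": ["John", "Alex"], "age": [24, 18], "marks": [68.44, 85.67]})
print(df)

# df.columns[-1] is the label of the last column ('marks' here)
df = df.drop(columns=df.columns[-1])
print(df)
```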

Output: After executing the above code, we will get the output as shown below:

   name  age  marks
0  John   24  68.44
1  Alex   18  85.67
   name  age
0  John   24
1  Alex   18

Drop range of columns using iloc

We may need to drop the fourth column of a dataset, or a whole range of columns, by position. DataFrame.iloc selects one or several columns by index position, so we can select the columns to be dropped with DataFrame.iloc and pass their labels to the columns parameter of DataFrame.drop().

Example: Let's take an example to understand how we may drop the range of columns by using iloc function.
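The sketch below drops the columns at positions 1 and 2, using the data from the output in this section. The name to_drop is an assumption; the slice 1:3 excludes its end position, as usual in Python.

```python
import pandas as pd

df = pd.DataFrame({"name": ["John", "Alex"], "age": [24, 18], "marks": [79.64, 86.84]})
print(df)

# Positions 1 and 2 ('age' and 'marks'); position 3 is excluded
to_drop = df.iloc[:, 1:3].columns

df = df.drop(columns=to_drop)
print(df)
```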

Output: After executing the above code, we will get the output as shown below:

   name  age  marks
0  John   24  79.64
1  Alex   18  86.84
   name
0  John
1  Alex

Drop the Columns from multi-index DataFrames

A DataFrame with several levels of column headers is referred to as a multi-index DataFrame. The headers are divided into levels, with level 0 being the first, level 1 being the second, and so on. A column may be dropped from any level of a multi-index DataFrame. By default, labels are matched starting from the outermost level (level 0); the level parameter lets us match the labels against one specific level instead. It accepts either the level's position or its name, for example level=1.

Example: Let's take an example to understand how we may drop the columns from multi-index DataFrames.
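A minimal sketch follows. The header layout and values are reconstructed from the output in this section, and the marks are written as strings here, an assumption that mirrors the spacing of that output. The names col and df are also assumptions.

```python
import pandas as pd

# Two header levels: level 0 holds the class, level 1 holds Name/Marks
col = pd.MultiIndex.from_arrays(
    [
        ["Class X", "Class Y", "Class Z", "Class Y"],
        ["Name", "Marks", "Name", "Marks"],
    ]
)
df = pd.DataFrame(
    [["John", "87.22", "Nat", "68.79"], ["Peter", "73.45", "Alex", "82.76"]],
    columns=col,
)
print(df)

# Drop every column whose level-1 label is 'Marks'
df = df.drop(columns=["Marks"], level=1)
print(df)
```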

Output: After executing the above code, we will get the output as shown below:

  Class X Class Y Class Z Class Y
     Name   Marks    Name   Marks
0    John   87.22     Nat   68.79
1   Peter   73.45    Alex   82.76
  Class X Class Z
     Name    Name
0    John     Nat
1   Peter    Alex

Drop column using a function

We can also use functions to delete columns based on some logic or a condition. To drop columns, we can use both built-in and user-defined functions.

Drop the column using the pandas DataFrame.pop() function

If we just want to delete one column, we can use the DataFrame.pop(col label) function. We pass the label of the column to be deleted. It removes the column in place, updating the existing DataFrame, and returns the removed column as a Series. If the column is not found, it raises a KeyError.

Example: Let's take an example to understand how we may drop the column using the pandas DataFrame.pop() function.
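A short sketch, using the data from the output in this section. The variable removed is an assumption, added to show what pop() returns.

```python
import pandas as pd

df = pd.DataFrame({"name": ["John", "Alex"], "age": [24, 18], "marks": [62.46, 54.21]})
print(df)

# pop() removes 'age' from df in place and returns it as a Series
removed = df.pop("age")
print(df)
```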

Output: After executing this code, we will get the output as shown below:

   name  age  marks
0  John   24  62.46
1  Alex   18  54.21
   name  marks
0  John  62.46
1  Alex  54.21

Drop the columns using the loc function

If we want to drop all of the columns from a DataFrame, we may do it quickly with DataFrame.loc and the columns parameter of DataFrame.drop(). DataFrame.loc selects the columns that need to be deleted by label, and the labels of that selection are passed to drop(). If no column labels are specified, as in df.loc[:], the selection covers every column, so all columns are dropped and only the index remains.

Example: Let's take an example to understand how we may drop the columns using the loc function.
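A minimal sketch, using the data from the output in this section. Passing the .columns of the loc selection is an assumption about how to hand the labels to drop(); the result keeps the row index but has no columns.

```python
import pandas as pd

df = pd.DataFrame({"name": ["John", "Alex"], "age": [24, 18], "marks": [79.68, 84.45]})
print(df)

# df.loc[:] selects all rows and all columns, so every label is dropped
df = df.drop(columns=df.loc[:].columns)
print(df)
```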

Output: After executing the above code, we will get the output as shown below:

   name  age  marks
0  John   24  79.68
1  Alex   18  84.45

Drop a column using the del statement

To drop a single column from a DataFrame, we can also use Python's built-in del statement. It is a very simple way of dropping a column: we select the DataFrame column to be removed and write del df[col label]. The column is removed in place.

Example: Let's take an example to understand how we may drop a column using the del statement.
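A short sketch, using the data from the output in this section. Note that del works on one column at a time and modifies df directly.

```python
import pandas as pd

df = pd.DataFrame({"name": ["John", "Alex"], "age": [23, 22], "marks": [57.88, 78.84]})
print(df)

# Remove the 'age' column in place
del df["age"]
print(df)
```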

Output: After executing this code, we will get the output as shown below:

   name  age  marks
0  John   23  57.88
1  Alex   22  78.84
   name  marks
0  John  57.88
1  Alex  78.84