How to Drop a Column in Pandas


In Pandas, dropping columns from a DataFrame is a common operation during data manipulation and preprocessing.

This guide covers three ways to drop a single column (drop(), del, and pop()) and three ways to drop several columns at once.

    Table of Contents

  1. Column Drop Methods
    1. Using drop() method
    2. Using del keyword
    3. Using pop() method
  2. Drop Multiple Columns
    1. Method 1
    2. Method 2
    3. Method 3
  3. Conclusion

1. Column Drop Methods

There are multiple ways to drop columns from a DataFrame in Pandas. Here are some of the most common methods:

1.1 Using drop() method

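The drop() method removes columns from a DataFrame. Pass the column name as the first argument and set axis=1 so that Pandas looks for the label among the columns rather than in the row index. By default, drop() returns a new DataFrame and leaves the original unchanged; pass inplace=True to modify the original instead.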

For example, to drop a column named 'Age' you can use df.drop('Age', axis=1) where df is the DataFrame.

import pandas as pd

# Creating a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['NY', 'LA', 'SF']}

df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

# 👉 drop a column
df = df.drop('Age', axis=1)
# or
# df.drop('Age', axis=1, inplace=True)

print("\nAfter dropping 'Age' column:")
print(df)

Output:

Original DataFrame:
      Name  Age City
0    Alice   25   NY
1      Bob   30   LA
2  Charlie   35   SF

After dropping 'Age' column:
      Name City
0    Alice   NY
1      Bob   LA
2  Charlie   SF
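
As a variation, drop() also accepts a columns keyword, which removes the need for axis=1, and an errors parameter that controls what happens when a label is missing. The short sketch below reuses the sample DataFrame from above; the 'Salary' label is a made-up name chosen only to stand in for a column that does not exist.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 35],
                   'City': ['NY', 'LA', 'SF']})

# columns='Age' is equivalent to passing 'Age' with axis=1
without_age = df.drop(columns='Age')

# 'Salary' is a hypothetical label that is not in the DataFrame;
# errors='ignore' skips it instead of raising a KeyError
unchanged = df.drop(columns='Salary', errors='ignore')

print(list(without_age.columns))  # ['Name', 'City']
print(list(unchanged.columns))    # ['Name', 'Age', 'City']
print(list(df.columns))           # ['Name', 'Age', 'City']
```

Because neither call uses inplace=True, df itself still has all three columns.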

1.2 Using del keyword

Python's del statement is best known for deleting variables, but it can also delete a column from a DataFrame. The deletion happens in place, so there is nothing to reassign.

To delete a column named 'Age' you can write del df['Age'], and the column will be deleted from the DataFrame.

import pandas as pd

# Creating a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['NY', 'LA', 'SF']}

df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

# 👉 delete a column
del df['Age']

print("\nAfter deleting Age column:")
print(df)

Output:

Original DataFrame:
      Name  Age City
0    Alice   25   NY
1      Bob   30   LA
2  Charlie   35   SF

After deleting Age column:
      Name City
0    Alice   NY
1      Bob   LA
2  Charlie   SF
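
Since del modifies the DataFrame in place and raises a KeyError when the column does not exist, it can help to check for the column first. A minimal sketch, reusing the sample data from above (the second_delete_failed flag is only there for illustration):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 35],
                   'City': ['NY', 'LA', 'SF']})

# Guard the deletion so a missing column does not raise an error
if 'Age' in df.columns:
    del df['Age']

print(list(df.columns))  # ['Name', 'City']

# Deleting the same column again fails, because it is already gone
try:
    del df['Age']
    second_delete_failed = False
except KeyError:
    second_delete_failed = True

print(second_delete_failed)  # True
```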

1.3 Using pop() method

The pop() method removes a column from a DataFrame in place and returns it as a Series. It takes the column name as its argument.

For example, df.pop('Age') will remove the 'Age' column from the DataFrame and return it.

import pandas as pd

# Creating a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['NY', 'LA', 'SF']}

df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

# 👉 drop a column
df.pop('Age')

print("\nAfter dropping 'Age' column:")
print(df)

Output:

Original DataFrame:
      Name  Age City
0    Alice   25   NY
1      Bob   30   LA
2  Charlie   35   SF

After dropping 'Age' column:
      Name City
0    Alice   NY
1      Bob   LA
2  Charlie   SF
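
The example above discards the value that pop() returns. If you need the removed data, you can assign it to a variable. In the sketch below, the name ages is just an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 35],
                   'City': ['NY', 'LA', 'SF']})

# pop() removes 'Age' from df in place and returns it as a Series
ages = df.pop('Age')

print(ages.tolist())       # [25, 30, 35]
print(list(df.columns))    # ['Name', 'City']
```

This makes pop() a convenient choice when a column should leave the DataFrame but its values are still needed, for example as a separate target variable.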

2. Drop Multiple Columns

Real-world datasets often have hundreds of columns, so it is useful to know how to drop several of them at once.

2.1 Method 1

To drop multiple columns, use the drop() method and pass a list of the column names to be dropped.

For example, df.drop(['Age', 'City'], axis=1) will drop both 'Age' and 'City' columns from the DataFrame.

import pandas as pd

# Creating a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['NY', 'LA', 'SF']}

df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

# 👉 dropping multiple columns
df.drop(['Age', 'City'], axis=1, inplace=True)

print("\nAfter dropping 'Age' and 'City' columns:")
print(df)

Output:

Original DataFrame:
      Name  Age City
0    Alice   25   NY
1      Bob   30   LA
2  Charlie   35   SF

After dropping 'Age' and 'City' columns:
      Name
0    Alice
1      Bob
2  Charlie

2.2 Method 2

The df.columns attribute is an Index that holds the column labels. Indexing it with a list of integer positions, as in df.columns[[0, 2]], returns an Index containing only the labels at those positions.

You can pass that Index to the drop() method and set axis=1 to drop columns by position rather than by name.

import pandas as pd

# Creating a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['NY', 'LA', 'SF']}

df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

# 👉 dropping multiple columns by position
df.drop(df.columns[[0, 2]], axis=1, inplace=True)

print("\nAfter dropping columns at index 0 and 2:")
print(df)

Output:

Original DataFrame:
      Name  Age City
0    Alice   25   NY
1      Bob   30   LA
2  Charlie   35   SF

After dropping columns at index 0 and 2:
   Age
0   25
1   30
2   35
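
To see what is actually passed to drop() here, it can help to print the result of indexing df.columns on its own. A small sketch with the same sample data (the names labels and result are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 35],
                   'City': ['NY', 'LA', 'SF']})

# Positions 0 and 2 map to the labels 'Name' and 'City'
labels = df.columns[[0, 2]]
print(labels.tolist())        # ['Name', 'City']

# Passing those labels to drop() removes the matching columns
result = df.drop(labels, axis=1)
print(list(result.columns))   # ['Age']
```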

2.3 Method 3

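Another way to delete multiple columns uses the iloc[] indexer.

The following example removes the columns at positions 1 through 4 using iloc[] together with the drop() method. It works because iterating over the DataFrame returned by iloc[] yields its column labels, which drop() then treats as the labels to remove.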

import pandas as pd

# Creating a sample DataFrame
data = {'A': [1, 2, 3, 4, 5],
        'B': [10, 20, 30, 40, 50],
        'C': [11, 22, 33, 44, 55],
        'D': [12, 24, 36, 48, 60],
        'E': [13, 26, 39, 52, 65],
        'F': [14, 28, 42, 56, 70]}

df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

# 👉 dropping multiple columns
# to drop the columns at positions 1 through 4, slice with 1:5 (the end is exclusive)
df.drop(df.iloc[:, 1:5], axis=1, inplace=True)

print("\nAfter dropping columns from index 1 to 4:")
print(df)

Output:

Original DataFrame:
   A   B   C   D   E   F
0  1  10  11  12  13  14
1  2  20  22  24  26  28
2  3  30  33  36  39  42
3  4  40  44  48  52  56
4  5  50  55  60  65  70

After dropping columns from index 1 to 4:
   A   F
0  1  14
1  2  28
2  3  42
3  4  56
4  5  70
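
The iloc[] slice stands in for a list of labels here: iterating over df.iloc[:, 1:5] produces 'B' through 'E'. The sketch below shows that, alongside an alternative that slices df.columns directly. The alternative is a suggestion rather than part of the example above, and the data is shortened to three rows for brevity.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [10, 20, 30],
                   'C': [11, 22, 33],
                   'D': [12, 24, 36],
                   'E': [13, 26, 39],
                   'F': [14, 28, 42]})

# Iterating over a DataFrame yields its column labels
print(list(df.iloc[:, 1:5]))    # ['B', 'C', 'D', 'E']

# Slicing df.columns gives the same labels without selecting any data
trimmed = df.drop(df.columns[1:5], axis=1)
print(list(trimmed.columns))    # ['A', 'F']
```

Slicing df.columns states the intent more directly, since only the labels are needed and not the column values.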

3. Conclusion

Dropping columns is a routine step in data manipulation with Pandas. Use drop() when you want a new DataFrame or need to remove several columns at once, del to remove a single column in place, and pop() when you also need the values of the removed column.