Rename Columns by Index in Pandas
One of the most used Python libraries for data science and machine learning is Pandas.
It offers a number of data exploration, cleaning, and transformation operations that are critical in working with data in Python.
Renaming columns in a Pandas DataFrame is a common task when working with datasets.
Pandas offers a straightforward way to achieve this. In this tutorial, we'll dive into how to rename columns using their index positions.
Understanding the DataFrame
Let's start with a Pandas DataFrame df whose columns we want to rename:
import pandas as pd
# Creating a sample DataFrame
data = {'A': [1, 2, 3],
        'B': [4, 5, 6],
        'C': [7, 8, 9]}
df = pd.DataFrame(data)
print(df)
Output:
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
Renaming Columns by Index
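Pandas provides the df.columns attribute, which holds the column names of the DataFrame as an Index object.
Let's say we want to rename the column 'A' to 'column_A'. An Index does not allow assigning to a single position, and writing into df.columns.values bypasses Pandas' own bookkeeping, which can leave the DataFrame in an inconsistent state (for example, lookups by the new name may fail). Instead, we copy the names into a list, change the entry at the desired index position, and assign the list back to df.columns.
Here is how we can do this:
# Renaming column by index
columns = list(df.columns)
columns[0] = 'column_A'
df.columns = columns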
print(df)
Output:
   column_A  B  C
0         1  4  7
1         2  5  8
2         3  6  9
As you can see, the column 'A' has been renamed to 'column_A'.
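A note on in-place edits: a Pandas Index does not support item assignment, so a robust way to rename by position is to build a new list of names and assign it to df.columns. The following is a minimal sketch of that behavior, reusing the sample DataFrame from above; the variable names assignment_error and names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Direct item assignment on the Index is rejected by Pandas.
try:
    df.columns[0] = 'column_A'
    assignment_error = None
except TypeError as exc:
    assignment_error = exc

# Build a new list of names and assign it back instead.
names = list(df.columns)
names[0] = 'column_A'
df.columns = names

print(list(df.columns))
```

Going through a plain Python list keeps the change inside Pandas' public API, so the DataFrame's internal lookups stay in sync with the new names.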
Renaming Multiple Columns by Index
Suppose we want to rename the columns 'A' and 'B' to 'column_A' and 'column_B' respectively.
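You can copy the column names into a list, assign the new names to a slice of that list, and then assign the list back to df.columns. Adjust the slice to match the index positions you want to rename.
# Renaming multiple columns by index
columns = list(df.columns)
columns[0:2] = ['column_A', 'column_B']
df.columns = columns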
print(df)
Output:
   column_A  column_B  C
0         1         4  7
1         2         5  8
2         3         6  9
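When the positions to rename are not adjacent, a slice no longer fits. One possible approach, sketched here under the assumption that the positions are supplied as a dictionary, is a small helper. The function name rename_by_position and its parameter are hypothetical and not part of Pandas.

```python
import pandas as pd

def rename_by_position(df, new_names_by_position):
    """Return a copy of df with the columns at the given positions renamed."""
    names = list(df.columns)
    for position, new_name in new_names_by_position.items():
        # An invalid position raises IndexError here.
        names[position] = new_name
    renamed = df.copy()
    renamed.columns = names
    return renamed

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
renamed = rename_by_position(df, {0: 'column_A', 2: 'column_C'})

print(list(renamed.columns))
```

The helper works on a copy, so the original df keeps its column names.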
Renaming Columns Using a Loop
You can also rename columns using a loop. This is useful when you have a large number of columns to rename.
First, create a list of new column names. Then use a for loop to iterate over it, updating a copy of the existing names, and assign the result back to df.columns.
# Renaming columns using a loop
new_column_names = ['column_A', 'column_B', 'column_C']
columns = list(df.columns)
for i, new_name in enumerate(new_column_names):
    columns[i] = new_name
df.columns = columns
print(df)
Output:
   column_A  column_B  column_C
0         1         4         7
1         2         5         8
2         3         6         9
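When every column gets a new name, the loop can be replaced by assigning the whole list at once. The sketch below, which reuses the sample DataFrame, also shows that Pandas rejects a list whose length does not match the number of columns. The name length_error is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
new_column_names = ['column_A', 'column_B', 'column_C']

# Assigning the full list replaces every name in one step.
df.columns = new_column_names

# A list of the wrong length is rejected rather than partially applied.
try:
    df.columns = ['only_one', 'only_two']
    length_error = None
except ValueError as exc:
    length_error = exc

print(list(df.columns))
```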
Dynamically Renaming Columns by Index
For a large DataFrame, it is not feasible to rename columns one by one. You can use a list comprehension to build the new names dynamically from the existing ones.
Suppose we start again from the original DataFrame and want to rename the columns 'A', 'B', and 'C' to 'A_new', 'B_new', and 'C_new' respectively.
# Appending '_new' to all column names
df.columns = [f'{col}_new' for col in df.columns]
print(df)
Output:
   A_new  B_new  C_new
0      1      4      7
1      2      5      8
2      3      6      9
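Two variations on this idea may be useful. The sketch below, again using the sample DataFrame, shows the built-in add_suffix() method as an alternative for the suffix case, and uses enumerate() so that each new name can depend on the column's index position. The naming scheme with a numeric prefix is only an example.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# add_suffix returns a new DataFrame with the suffix appended to every column label.
suffixed = df.add_suffix('_new')

# enumerate exposes each column's position, so the new name can depend on it.
numbered = df.copy()
numbered.columns = [f'{i}_{col}' for i, col in enumerate(df.columns)]

print(list(suffixed.columns))
print(list(numbered.columns))
```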
Renaming Columns Using rename()
The rename() method renames column labels or row (index) labels. To rename columns, pass a dictionary to the columns parameter, where the keys are the old column names and the values are the corresponding new column names.
Syntax:
df.rename(columns = {'old_column_name':'new_column_name'}, inplace=True)
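Here, inplace=True modifies df directly instead of returning a new DataFrame. By default, keys that do not match an existing column are silently ignored. The example below assumes the original DataFrame with columns 'A', 'B', and 'C'.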
# Using rename() method to rename columns
df.rename(columns={'A': 'Alpha', 'B': 'Beta', 'C': 'Charlie'}, inplace=True)
print(df)
Output:
   Alpha  Beta  Charlie
0      1     4        7
1      2     5        8
2      3     6        9
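rename() works with names rather than positions, but the two can be combined by looking up the current name at a given position through df.columns. A minimal sketch, assuming the sample DataFrame and that the column names are unique (with duplicate names, rename() would change every column that shares the looked-up name):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Look up the current names at positions 0 and 2, then map them to new names.
mapping = {df.columns[0]: 'Alpha', df.columns[2]: 'Charlie'}

# Without inplace=True, rename() returns a new DataFrame and leaves df unchanged.
renamed = df.rename(columns=mapping)

print(list(renamed.columns))
```

Returning a new DataFrame instead of using inplace=True makes it easy to keep the original names around for comparison.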
Conclusion
In this tutorial, we covered renaming columns by index in Pandas. Pandas also provides other approaches to renaming columns, such as the rename() method, so you can choose whichever best fits your needs.