Pandas Select Row by Index


In data analysis with Python, Pandas is a powerful library used for manipulating and analyzing structured data.

Selecting specific rows from a DataFrame based on their index is a common task in data manipulation.

In this tutorial, we will explore different ways to select single or multiple rows from a Pandas DataFrame based on their index.

    Table of Contents

  1. Using loc[]
  2. Using iloc[]
  3. Select Multiple Rows
  4. Select Rows by Condition
  5. Select Rows in Range
  6. Conclusion

1. Using loc[]

The .loc[] method in Pandas allows for label-based indexing, enabling the selection of rows based on their index labels.

To select a single row from a DataFrame, pass the index label of the row to .loc[].

The following example selects the 2nd row (index label 1) from the DataFrame. Because the DataFrame uses the default integer index, each label matches the row's position.

import pandas as pd

# Creating a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
        'Age': [25, 30, 35, 40, 45],
        'City': ['NY', 'LA', 'SF', 'NY', 'LA']}

df = pd.DataFrame(data)

# Selecting a row using df.loc[]
print(df.loc[1])

Output:

Name    Bob
Age      30
City     LA
Name: 1, dtype: object
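
Label-based selection is easier to see when the index is not the default 0, 1, 2, and so on. The following is a minimal sketch that assumes a DataFrame with string labels ('a', 'b', 'c'); these labels are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical string labels, used only to illustrate label-based lookup
df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'],
     'Age': [25, 30, 35]},
    index=['a', 'b', 'c'],
)

# .loc[] looks up the row by its label, not by its position
row = df.loc['b']
print(row)
```

Here .loc['b'] returns the row labeled 'b' as a Series. A label that is not present in the index raises a KeyError.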

2. Using iloc[]

The .iloc[] method provides integer-based indexing, allowing the selection of rows based on their integer position in the DataFrame.

Pass the integer position of the row (counting from 0) to select a single row from a DataFrame.

The following example selects the 3rd row (position 2) from the DataFrame.

import pandas as pd

# Creating a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
        'Age': [25, 30, 35, 40, 45],
        'City': ['NY', 'LA', 'SF', 'NY', 'LA']}

df = pd.DataFrame(data)

# Selecting a row using df.iloc[]
print(df.iloc[2])

Output:

Name    Charlie
Age          35
City         SF
Name: 2, dtype: object
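
The difference between position and label only shows up when the index is not 0, 1, 2, and so on. The sketch below assumes a hypothetical index of 10, 20, 30 to contrast the two lookups.

```python
import pandas as pd

# Hypothetical non-default integer labels, used only for illustration
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie']}, index=[10, 20, 30])

by_position = df.iloc[1]   # second row, regardless of its label
by_label = df.loc[20]      # row whose index label is 20
last_row = df.iloc[-1]     # negative positions count from the end

print(by_position['Name'], by_label['Name'], last_row['Name'])
```

In this DataFrame, df.iloc[1] and df.loc[20] return the same row, while df.loc[1] would raise a KeyError because no row carries the label 1.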

3. Select Multiple Rows

Selecting multiple rows is useful when you want to work with a specific subset of the DataFrame for further analysis.

To select multiple rows from a DataFrame, pass a list of index labels to .loc[] or a list of integer positions to .iloc[].

For example, to select the 1st and 3rd rows from the DataFrame, pass [0, 2].

import pandas as pd

# Creating a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
        'Age': [25, 30, 35, 40, 45],
        'City': ['NY', 'LA', 'SF', 'NY', 'LA']}

df = pd.DataFrame(data)

# Selecting multiple rows by label using df.loc[]
print(df.loc[[0, 2]])
print()

# Selecting multiple rows by position using df.iloc[]
print(df.iloc[[0, 2]])

Output:

      Name  Age City
0    Alice   25   NY
2  Charlie   35   SF

      Name  Age City
0    Alice   25   NY
2  Charlie   35   SF
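
The list passed to .loc[] or .iloc[] also controls the order of the result. As a small sketch on the same sample data, passing [2, 0] returns the rows in that order rather than in their original order.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
        'Age': [25, 30, 35, 40, 45],
        'City': ['NY', 'LA', 'SF', 'NY', 'LA']}

df = pd.DataFrame(data)

# Rows come back in the order given in the list
reordered = df.iloc[[2, 0]]
print(reordered)
```

The resulting DataFrame keeps the original index labels (2, then 0), so the rows can still be traced back to the source DataFrame.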

4. Select Rows by Condition

Data analysis often requires the selection of rows based on a condition for further analysis.

To select rows where the age is greater than 30, pass the condition df['Age'] > 30 to .loc[]. It returns a DataFrame containing only the rows where the condition is True. The .iloc[] indexer does not accept a boolean Series directly, so convert the condition to a NumPy array with .values first.

import pandas as pd

# Creating a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
        'Age': [25, 30, 35, 40, 45],
        'City': ['NY', 'LA', 'SF', 'NY', 'LA']}

df = pd.DataFrame(data)

# Selecting rows with a boolean condition using df.loc[]
print(df.loc[df['Age'] > 30])
print()

# df.iloc[] needs a plain boolean array, hence .values
print(df.iloc[(df['Age'] > 30).values])

Output:

      Name  Age City
2  Charlie   35   SF
3    David   40   NY
4     Emma   45   LA

      Name  Age City
2  Charlie   35   SF
3    David   40   NY
4     Emma   45   LA
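
Conditions can also be combined. The sketch below, using the same sample data, selects rows where the age is greater than 30 and the city is 'NY'. Each condition is wrapped in parentheses and joined with & (use | for "or"), because Python's and/or keywords do not work element-wise on a Series.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
        'Age': [25, 30, 35, 40, 45],
        'City': ['NY', 'LA', 'SF', 'NY', 'LA']}

df = pd.DataFrame(data)

# Both conditions must hold; the parentheses around each one are required
mask = (df['Age'] > 30) & (df['City'] == 'NY')
result = df.loc[mask]
print(result)
```

Only the row for David satisfies both conditions in this sample.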

5. Select Rows in Range

Suppose a DataFrame has thousands of rows and you want to select rows 100 to 200.

Passing a list of every value from 100 to 200 is verbose and hard to read.

Instead, you can pass the slice 100:200 to .loc[], which selects the rows labeled 100 through 200, with both ends included.

Note: You can also use 100:200 with .iloc[], but it selects positions 100 to 199. The end position is excluded, as in regular Python slicing. The example below uses 0:2 on the small sample DataFrame to show the difference.

import pandas as pd

# Creating a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
        'Age': [25, 30, 35, 40, 45],
        'City': ['NY', 'LA', 'SF', 'NY', 'LA']}

df = pd.DataFrame(data)

# Selecting rows in range using df.loc[] (end label included)
print(df.loc[0:2])
print()

# Selecting rows in range using df.iloc[] (end position excluded)
print(df.iloc[0:2])

Output:

      Name  Age City
0    Alice   25   NY
1      Bob   30   LA
2  Charlie   35   SF

    Name  Age City
0  Alice   25   NY
1    Bob   30   LA
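
The inclusive behavior of .loc[] slices applies to any label type, not only integers. As a minimal sketch, assume a hypothetical index of letters 'a' to 'e' on the same kind of data:

```python
import pandas as pd

# Hypothetical letter labels, used only to illustrate slicing by label
df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma']},
    index=['a', 'b', 'c', 'd', 'e'],
)

by_label = df.loc['b':'d']    # labels 'b' through 'd', end included
by_position = df.iloc[1:3]    # positions 1 and 2, end excluded

print(by_label)
print()
print(by_position)
```

The label slice returns three rows (Bob, Charlie, David), while the position slice returns two (Bob, Charlie).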

Conclusion

Selecting rows by index in Pandas DataFrames can be accomplished in two ways: .loc[] for index labels and .iloc[] for integer positions.

These techniques offer flexibility in extracting specific rows based on their labels, integer positions, or conditional criteria.