Pandas loc vs iloc


Pandas, a popular Python library for data manipulation, provides two essential indexers, loc and iloc, for selecting specific rows and columns in DataFrames.

While they serve similar purposes, they differ in how they identify data: loc selects by label, and iloc selects by integer position. In this article, we will discuss the differences between loc and iloc and when to use each.

    Table of Contents

  1. Pandas loc
  2. Pandas iloc
  3. Pandas loc vs iloc
  4. Conclusion

Pandas loc

The loc[] indexer selects rows and columns from a DataFrame by their labels, that is, by row index labels and column names.

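import pandas as pd

# Example DataFrame with row labels 'A', 'B', 'C' (illustrative data)
df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Carol'], 'Age': [25, 32, 41], 'City': ['Oslo', 'Lima', 'Pune']},
    index=['A', 'B', 'C'],
)

# Selecting a single row by label
df.loc['A']

# Selecting a single column by name
df.loc[:, 'Name']

# Selecting a single element by row label and column name
df.loc['A', 'Name']

# Selecting multiple rows and columns by label
df.loc[['A', 'B'], ['Name', 'Age']]

# Selecting rows with a Boolean Series and a column by name
df.loc[df['Age'] > 30, 'Name']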
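
The selections above can be tried end to end on a small frame. The following is a minimal sketch on made-up sample data (the names, ages, and row labels 'A' to 'D' are assumptions for illustration). It also shows two loc behaviors worth knowing: a label slice includes both endpoints, and a label missing from the index raises KeyError.

```python
import pandas as pd

# Illustrative data; the values and labels are assumptions, not a real dataset.
df = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Carol", "Dave"], "Age": [25, 32, 41, 29]},
    index=["A", "B", "C", "D"],
)

# Single row by label: returns a Series indexed by column name.
row_a = df.loc["A"]

# Label slice: both endpoints ('A' and 'C') are included.
rows_a_to_c = df.loc["A":"C"]

# Boolean mask plus column label: names of people older than 30.
older_names = df.loc[df["Age"] > 30, "Name"]

# A label that is not in the index raises KeyError.
try:
    df.loc["Z"]
    missing_label_raised = False
except KeyError:
    missing_label_raised = True

print(rows_a_to_c)
print(older_names.tolist())
```

Because the slice is resolved by label rather than by position, 'A':'C' returns three rows here, whereas a Python-style positional slice with the same bounds would return two.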

Pandas iloc

The iloc[] indexer is position-based: it selects rows and columns by their zero-based integer position rather than by their labels.

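# Select the element at row 0, column 0
df.iloc[0, 0]

# Select the first column (position 0), e.g. 'Name' if it comes first
df.iloc[:, 0]

# Select the row at position 1 (the second row) with all columns
df.iloc[1, :]

# Select rows 0 to 2 (the stop position 3 is excluded)
df.iloc[0:3]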

# Select columns 1 and 2
df.iloc[:, 1:3]

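# Select all rows and the first two columns (positions 0 and 1)
df.iloc[:, 0:2]

# Select rows where 'Age' is greater than 30
# (iloc needs a Boolean array, not a Boolean Series, so convert the mask)
df.iloc[(df['Age'] > 30).values, :]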

# Select a specific element
df.iloc[1, 2]  # Selects the value in row 1, column 2

# Select rows 0 and 2, and columns 0 and 2
df.iloc[[0, 2], [0, 2]]
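
As a runnable illustration of the positional rules, here is a small sketch on made-up sample data (the column values and the labels 'A' to 'C' are assumptions). It covers the half-open slice, the mask conversion that iloc requires, and how out-of-range positions behave for scalars versus slices.

```python
import pandas as pd

# Illustrative data; values and labels are assumptions for demonstration.
df = pd.DataFrame(
    {
        "Name": ["Alice", "Bob", "Carol"],
        "Age": [25, 32, 41],
        "City": ["Oslo", "Lima", "Pune"],
    },
    index=["A", "B", "C"],
)

# Positional slice: the stop is excluded, so 0:2 yields positions 0 and 1.
first_two = df.iloc[0:2]

# iloc does not accept a Boolean Series as a mask, so convert it to a
# NumPy array first (.to_numpy() here; .values gives the same array).
mask = (df["Age"] > 30).to_numpy()
older = df.iloc[mask, :]

# A scalar position past the end raises IndexError...
try:
    df.iloc[10]
    out_of_bounds_raised = False
except IndexError:
    out_of_bounds_raised = True

# ...but an out-of-range slice is clipped, as with Python lists.
clipped = df.iloc[1:10]

print(first_two)
print(older)
```

When the filter is naturally expressed on a named column, loc with the Boolean Series is usually the simpler choice; the conversion step is only needed to stay within iloc.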

Pandas loc vs iloc

The following table summarizes the differences between loc and iloc.

| Feature | loc | iloc |
| --- | --- | --- |
| Indexing method | Label-based | Integer position-based |
| Index type | Labels, such as column names or row index labels | Integer positions (zero-based) |
| Error handling | Raises KeyError if a label doesn't exist | Raises IndexError if a position is out of bounds |
| Slicing behavior | Includes both endpoints of a slice | Excludes the stop position of a slice |
| Conditional selection | Accepts a Boolean Series or array | Accepts a Boolean array (not a Boolean Series); a list of integers selects by position |
| Filtering | Intuitive for filtering on conditions over named columns | Suited to selecting by known integer positions |
| Typical use cases | Selecting rows and columns by name, filtering based on conditions with column names | Selecting rows and columns by index position, slicing data using integer ranges |
| Selecting a column | df.loc[:, 'column_name'] | df.iloc[:, column_index] |
| Selecting multiple columns | df.loc[:, ['col1', 'col2']] | df.iloc[:, [index1, index2]] |
| Selecting a row | df.loc['index_label', :] | df.iloc[row_index, :] |
| Selecting multiple rows | df.loc[['label1', 'label2'], :] | df.iloc[[index1, index2], :] |
| Conditional selection | df.loc[df['column_name'] > threshold, :] | df.iloc[(df['column_name'] > threshold).values, :] |
| Accessing a specific element | df.loc['label', 'column_name'] | df.iloc[row_index, column_index] |
| Performance | Can be slightly slower because labels are resolved to positions | Can be slightly faster because positions are used directly |
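
The difference is easiest to see when the row labels are themselves integers that do not match the positions. The sketch below uses a hypothetical frame (the column name "value" and the labels 10 to 40 are assumptions) to contrast label lookup with positional lookup, slice endpoints, and the error type each indexer raises.

```python
import pandas as pd

# Illustrative frame whose integer row labels do not match the positions.
df = pd.DataFrame({"value": [100, 200, 300, 400]}, index=[10, 20, 30, 40])

# loc treats 10 as a label; iloc treats 0 as a position. Both reach the first row.
by_label = df.loc[10, "value"]
by_position = df.iloc[0, 0]

# Slicing: loc includes the end label, iloc excludes the stop position.
label_slice = df.loc[10:30]     # labels 10, 20, 30
position_slice = df.iloc[0:2]   # positions 0 and 1

# Error types differ: an unknown label versus a position out of range.
errors = []
for lookup in (lambda: df.loc[0], lambda: df.iloc[4]):
    try:
        lookup()
    except (KeyError, IndexError) as exc:
        errors.append(type(exc).__name__)

print(label_slice.index.tolist(), position_slice.index.tolist())
print(errors)
```

Note that df.loc[0] fails here even though the frame has a first row, because 0 is not one of the labels.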

Conclusion

The loc and iloc indexers in Pandas offer distinct approaches to selecting rows and columns in DataFrames. loc employs label-based indexing, while iloc uses integer positions for selection.

Understanding the differences between them, particularly in slice endpoints, Boolean masks, and the errors they raise, is crucial for efficiently accessing and manipulating data within Pandas DataFrames.