Python Loop Through Files in Directory


When working with file operations in Python, accessing the files in a directory is one of the most common tasks. Python provides several ways to accomplish it.

In this article, we will explore multiple approaches to loop through files in a directory using Python.

    Table of Contents

  1. Using os module
  2. Using glob module
  3. Using os.walk() method
  4. Using pathlib module
  5. Speed Comparison

For this article, we will be using a directory named test which contains 5 files.

test
  ├── file1.txt
  ├── file2.py
  ├── file3.js
  ├── file4.java
  └── file5.cpp

1. Using os module

Python's os module provides a simple but effective way to access files in a directory. The os.listdir() method returns a list containing the names of all the files and directories in the specified path.

From these entries, we can keep only the files by checking each one with the os.path.isfile() method and then loop through them.

Here is how you can loop through files in a directory using the os module.

import os

# path to the directory (relative to the current working directory)
path = "test"

# iterate over all the files in the directory
for filename in os.listdir(path):
    # keep only regular files and skip subdirectories
    if os.path.isfile(os.path.join(path, filename)):
        print(filename)

Output:

file1.txt
file2.py
file3.js
file4.java
file5.cpp

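Here, we listed every entry in the directory using the os.listdir() method and then kept only the files by passing the full path of each entry to the os.path.isfile() method. Note that os.listdir() returns entries in arbitrary order, so wrap the call in sorted() if you need the order shown above.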
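The same idea can be wrapped in a small reusable helper. The following is only a sketch: the function name list_files and the temporary sample directory (including the extra subdir folder) are assumptions made for this demo and are not part of the example above.

```python
import os
import tempfile


def list_files(path):
    """Return the sorted names of regular files directly inside path."""
    return sorted(
        name
        for name in os.listdir(path)
        if os.path.isfile(os.path.join(path, name))
    )


# Build a throwaway directory that mirrors the article's test folder,
# plus one subdirectory to show that directories are skipped.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["file1.txt", "file2.py", "file3.js", "file4.java", "file5.cpp"]:
        open(os.path.join(tmp, name), "w").close()
    os.mkdir(os.path.join(tmp, "subdir"))

    found = list_files(tmp)
    print(found)
```

Sorting inside the helper makes the result independent of the order in which the operating system happens to return the entries.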


2. Using glob module

The glob() function in Python's glob module returns a list of the paths that match a shell-style pattern.

To select every entry in a directory, append * to the path. Note that * matches directories as well as files, and it skips names that begin with a dot.

Here is an example.

import glob

# path to the directory (relative to the current working directory)
path = "test"

# iterate over every entry that matches the pattern
for filename in glob.glob(path + "/*"):
    print(filename)

Output:

test/file1.txt
test/file2.py
test/file3.js
test/file4.java
test/file5.cpp
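Because the pattern decides what is returned, glob is handy for picking out a single file type. The sketch below is illustrative only: the sample file names, the subdir folder, and the temporary directory are assumptions for the demo. It also shows one way to drop directories from a * match.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Sample content (hypothetical names): three files and one directory.
    for name in ["file1.txt", "file2.py", "notes.txt"]:
        open(os.path.join(tmp, name), "w").close()
    os.mkdir(os.path.join(tmp, "subdir"))

    # Escape the directory part so that only the wildcard is treated as a pattern.
    base = glob.escape(tmp)

    # "*" matches files and directories alike, so filter with isfile().
    all_files = sorted(
        os.path.basename(p)
        for p in glob.glob(os.path.join(base, "*"))
        if os.path.isfile(p)
    )

    # A narrower pattern selects a single extension.
    txt_files = sorted(
        os.path.basename(p) for p in glob.glob(os.path.join(base, "*.txt"))
    )

    print(all_files)
    print(txt_files)
```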

3. Using os.walk() method

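The os.walk() method from the os module returns a generator that walks a directory tree. For the starting directory and every subdirectory beneath it, it yields a tuple (root, dirs, files), where files is the list of file names found in root.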

The following example shows the use of os.walk() method.

import os

# path to the directory (relative to the current working directory)
path = "test"

# walk the directory tree; files holds the file names found in each root
for root, dirs, files in os.walk(path):
    for filename in files:
        print(filename)

Output:

file1.txt
file2.py
file3.js
file4.java
file5.cpp
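Since os.walk() also descends into subdirectories, it is often useful to build the full path of each file by joining root and the file name. The sketch below assumes a hypothetical layout (one top-level file and one file inside a nested folder) created in a temporary directory. It also shows one way to restrict the result to the top level only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: tmp/file1.txt and tmp/nested/file2.py
    os.mkdir(os.path.join(tmp, "nested"))
    for rel in ["file1.txt", os.path.join("nested", "file2.py")]:
        open(os.path.join(tmp, rel), "w").close()

    found = []
    for root, dirs, files in os.walk(tmp):
        for filename in files:
            # root changes as the walk descends, so join it with the name.
            full_path = os.path.join(root, filename)
            found.append(os.path.relpath(full_path, tmp))
    found.sort()

    # The first tuple yielded describes the starting directory itself.
    _, _, top_files = next(os.walk(tmp))

    print(found)
    print(top_files)
```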

4. Using pathlib module

The pathlib module, introduced in Python 3.4, provides an object-oriented approach to file system operations. The Path class can be used to iterate through files in a directory.

Let's take a look:

from pathlib import Path

directory = Path('test')

# iterate over all the files in the directory
for file_path in directory.iterdir():
    if file_path.is_file():
        print(file_path)

Output:

test/file1.txt
test/file2.py
test/file3.js
test/file4.java
test/file5.cpp

In this method, we create a Path object representing the directory path. We then use the iterdir() method to iterate over all items (files and directories) in the directory. By checking is_file(), we can filter out directories and focus on files for further processing.
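Path objects also expose the parts of a file name as attributes, which makes filtering straightforward. The following sketch is illustrative: the sample files and the subdir folder are assumptions for the demo, created in a temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)

    # Sample content (hypothetical names): three files and one directory.
    for name in ["file1.txt", "file2.py", "file3.js"]:
        (directory / name).touch()
    (directory / "subdir").mkdir()

    # .name is the final path component; is_file() skips the subdirectory.
    names = sorted(p.name for p in directory.iterdir() if p.is_file())

    # .suffix is the extension, including the leading dot.
    suffixes = sorted(p.suffix for p in directory.iterdir() if p.is_file())

    # Path.glob() applies a pattern relative to the directory.
    py_files = sorted(p.name for p in directory.glob("*.py"))

    print(names)
    print(suffixes)
    print(py_files)
```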


5. Speed Comparison

We ran each of the above methods 1000 times on a directory containing 5 files and plotted the results in the following graph.

[Figure: Speed Comparison]

From the graph, we can see that Method 3, the os.walk() method, was the fastest way to loop through files in a directory in this test.
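A comparison like this can be reproduced with the timeit module. The code below is a minimal sketch and not the benchmark used for the graph: the wrapper function names and the temporary directory are assumptions, and the timings will vary with the machine, operating system, and file system cache, so treat any ranking from such a small directory with caution.

```python
import glob
import os
import tempfile
import timeit
from pathlib import Path


def with_listdir(path):
    return [n for n in os.listdir(path) if os.path.isfile(os.path.join(path, n))]


def with_glob(path):
    return glob.glob(os.path.join(path, "*"))


def with_walk(path):
    return [f for _, _, files in os.walk(path) for f in files]


def with_pathlib(path):
    return [p for p in Path(path).iterdir() if p.is_file()]


with tempfile.TemporaryDirectory() as tmp:
    # Recreate the 5-file directory used in the article.
    for name in ["file1.txt", "file2.py", "file3.js", "file4.java", "file5.cpp"]:
        open(os.path.join(tmp, name), "w").close()

    # Time 1000 calls of each approach; values are total seconds per approach.
    timings = {}
    for func in (with_listdir, with_glob, with_walk, with_pathlib):
        timings[func.__name__] = timeit.timeit(lambda: func(tmp), number=1000)

    for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
        print(f"{name}: {seconds:.4f} s")
```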