Read TSV File in Python


TSV (Tab-Separated Values) is a file format that stores tabular data as plain text. It is similar to the CSV (Comma-Separated Values) format, but it uses a tab character instead of a comma as the delimiter.

In this tutorial, you will learn various ways to load and read TSV files in Python.

    Table of Contents

  1. Using csv module
  2. Using Pandas Library
  3. Using Built-in Functions
  4. Speed Comparison

For testing purposes, we will use the following data, saved as data.tsv with the columns separated by tab characters.

Name        Age     Occupation
John        32      Engineer
Emily       28      Teacher
Michael     42      Doctor
Sarah       35      Lawyer
David       39      Architect
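To follow along, the sample file can be generated from Python. The following is a minimal sketch; the helper name write_sample_tsv is hypothetical and introduced only for illustration, and the rows are copied from the table above.

```python
import csv

# Sample rows copied from the table above
SAMPLE_ROWS = [
    ["Name", "Age", "Occupation"],
    ["John", "32", "Engineer"],
    ["Emily", "28", "Teacher"],
    ["Michael", "42", "Doctor"],
    ["Sarah", "35", "Lawyer"],
    ["David", "39", "Architect"],
]


def write_sample_tsv(path):
    """Write the sample rows to `path` as a tab-separated file."""
    # newline='' lets the csv module control the line endings itself
    with open(path, "w", newline="", encoding="utf-8") as file:
        writer = csv.writer(file, delimiter="\t")
        writer.writerows(SAMPLE_ROWS)
```

Calling write_sample_tsv('data.tsv') creates the file in the current working directory, which is the path the examples in this tutorial read from.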

1. Using csv module

The csv module in Python's standard library provides functionality for working with CSV and TSV files.

To read a TSV file using the csv module, open the file with the open() function and pass the file object to csv.reader() with '\t' as the delimiter.

Here is the code to read the above TSV file using the csv module.

import csv

# Path to the TSV file
tsv_file = 'data.tsv'

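# Open the file and parse it with csv.reader, using tab as the delimiter.
# newline='' is recommended by the csv module so it can handle line endings itself.
with open(tsv_file, 'r', newline='') as file:
    tsv_reader = csv.reader(file, delimiter='\t')
    # Each row is returned as a list of strings
    for row in tsv_reader:
        print(row)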

Output:

['Name', 'Age', 'Occupation']
['John', '32', 'Engineer']
['Emily', '28', 'Teacher']
['Michael', '42', 'Doctor']
['Sarah', '35', 'Lawyer']
['David', '39', 'Architect']
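When the first line of the file is a header, csv.DictReader can be used in place of csv.reader so that each row is keyed by column name. The following is a minimal sketch: the helper name read_tsv_as_dicts is hypothetical, and an in-memory string stands in for data.tsv so that the example runs on its own.

```python
import csv
import io

# In-memory stand-in for the first rows of data.tsv (an assumption for this sketch)
sample = "Name\tAge\tOccupation\nJohn\t32\tEngineer\nEmily\t28\tTeacher\n"


def read_tsv_as_dicts(file):
    """Return every data row as a dict keyed by the header row."""
    reader = csv.DictReader(file, delimiter="\t")
    return list(reader)


records = read_tsv_as_dicts(io.StringIO(sample))
for record in records:
    # All values are strings; convert them explicitly if numbers are needed
    print(record["Name"], record["Age"])
```

With a real file, the same function can be given the file object returned by open('data.tsv', newline='').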

2. Using Pandas Library

Pandas is a Python library that provides high-performance, easy-to-use data structures and data analysis tools.

To read a TSV file using Pandas, use the read_csv() function, passing the file path and sep='\t' so that the tab character is used as the delimiter.

Here is the code to read the above TSV file using Pandas.

import pandas as pd

# Path to the TSV file
tsv_file = 'data.tsv'

# Read the TSV file into a DataFrame using Pandas
data = pd.read_csv(tsv_file, sep='\t')
print(data)

Output:

      Name  Age  Occupation
0     John   32    Engineer
1    Emily   28     Teacher
2  Michael   42      Doctor
3    Sarah   35      Lawyer
4    David   39   Architect

3. Using Built-in Functions

A TSV file can also be read without any module, using the built-in open() function together with the string methods rstrip() and split().

Open the file using open() and call rstrip('\n') on each line to remove the trailing newline character. Avoid a bare strip() here: it also removes trailing tabs, which silently drops empty fields at the end of a row.

Then split each line using split() with '\t' as the delimiter.

The following code shows how to do it.

# Path to the TSV file
tsv_file = 'data.tsv'

# Open the TSV file using the 'open' function
with open(tsv_file, 'r') as file:
    # Iterate over each line
    for line in file:
        # Remove only the trailing newline character
        line = line.rstrip('\n')

        # Split the line into fields on the tab character
        fields = line.split('\t')

        # Print the list of fields
        print(fields)

Output:

['Name', 'Age', 'Occupation']
['John', '32', 'Engineer']
['Emily', '28', 'Teacher']
['Michael', '42', 'Doctor']
['Sarah', '35', 'Lawyer']
['David', '39', 'Architect']
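The same approach can be wrapped in a small function that separates the header from the data rows. This is only a sketch: the name parse_tsv_lines is hypothetical, the sample lines stand in for the contents of data.tsv, and plain splitting does not handle quoted fields the way the csv module does.

```python
def parse_tsv_lines(lines):
    """Split an iterable of TSV lines into a header and a list of data rows."""
    rows = []
    for line in lines:
        # Remove only line-ending characters so trailing empty fields survive
        line = line.rstrip("\r\n")
        if line:  # skip completely empty lines
            rows.append(line.split("\t"))
    return rows[0], rows[1:]


# Stand-in for the first lines of data.tsv (an assumption for this sketch)
sample_lines = [
    "Name\tAge\tOccupation\n",
    "John\t32\tEngineer\n",
    "Emily\t28\tTeacher\n",
]

header, data = parse_tsv_lines(sample_lines)

# Every field is a string, so numeric columns need an explicit conversion
age_index = header.index("Age")
ages = [int(row[age_index]) for row in data]

print(header)  # ['Name', 'Age', 'Occupation']
print(ages)    # [32, 28]
```

Because an open file object is itself an iterable of lines, it can be passed directly to the function in place of the sample list.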

4. Speed Comparison

Reading TSV files in Python can be accomplished using various methods. Here we have compared the speed of each method and plotted a graph.

[Figure: Comparison of different methods to read TSV files in Python]

As the graph shows, the fastest of the three methods in this comparison is Method 1, i.e. using the csv module. Timings can vary with file size and environment.
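To check timings on your own machine, the standard-library timeit module can be used. The sketch below is not the benchmark behind the comparison in this section; it times only the csv module and the plain split() approach on a generated file, and the row count, the number of runs, and the function names are arbitrary assumptions.

```python
import csv
import os
import tempfile
import timeit


def read_with_csv(path):
    """Read all rows with csv.reader."""
    with open(path, "r", newline="", encoding="utf-8") as file:
        return list(csv.reader(file, delimiter="\t"))


def read_with_split(path):
    """Read all rows with rstrip() and split()."""
    with open(path, "r", encoding="utf-8") as file:
        return [line.rstrip("\n").split("\t") for line in file]


# Build a throwaway TSV file to measure against (10,000 rows is an arbitrary size)
with tempfile.NamedTemporaryFile(
    "w", suffix=".tsv", delete=False, newline="", encoding="utf-8"
) as tmp:
    writer = csv.writer(tmp, delimiter="\t", lineterminator="\n")
    writer.writerow(["Name", "Age", "Occupation"])
    for i in range(10_000):
        writer.writerow([f"Person{i}", 20 + i % 50, "Engineer"])
    path = tmp.name

try:
    results = {}
    for func in (read_with_csv, read_with_split):
        # Total time in seconds for 5 full reads of the file
        results[func.__name__] = timeit.timeit(lambda: func(path), number=5)
        print(f"{func.__name__}: {results[func.__name__]:.4f} s for 5 runs")

    # Sanity check: both methods should produce identical rows
    same_rows = read_with_csv(path) == read_with_split(path)
finally:
    os.remove(path)
```

A Pandas variant can be added to the loop in the same way if the library is installed.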

Happy coding! 😊