How to avoid the "extra column" in Pandas DataFrame.to_csv

Saving and Restoring CSV

We start by creating the DataFrame we want to save as CSV.

In [1]:
import pandas as pd

numbers = [1, 2, 3, 4, 5]
letters = list("abcde")  # ['a', 'b', 'c', 'd', 'e']
df = pd.DataFrame()
df["letters"] = letters
df["numbers"] = numbers
df["squares"] = df["numbers"]**2
df
Out[1]:
  letters  numbers  squares
0       a        1        1
1       b        2        4
2       c        3        9
3       d        4       16
4       e        5       25

One gotcha to be aware of: by default, to_csv writes the DataFrame's index as an extra, unlabeled first column. When such a file is read back with read_csv, the old index shows up as a spurious data column named "Unnamed: 0". To avoid this and write only the data columns, pass index=False to to_csv. The CSV then reads back with the same columns as the original.

In [2]:
# How to avoid extra column when saving CSV
# Use index=False
df.to_csv("out.csv", index=False)
df2 = pd.read_csv("out.csv")
df2
Out[2]:
  letters  numbers  squares
0       a        1        1
1       b        2        4
2       c        3        9
3       d        4       16
4       e        5       25
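
For contrast, here is a minimal sketch of what happens when index=False is left out, and one way to recover if you have already been handed a CSV that contains the index column. It rebuilds the same DataFrame as above. The use of io.StringIO (so that no file is written) and the variable names csv_with_index, df_extra and df_fixed are choices made for this illustration, not part of the original example.

```python
import io

import pandas as pd

# Same data as in the example above.
df = pd.DataFrame({"letters": list("abcde"), "numbers": [1, 2, 3, 4, 5]})
df["squares"] = df["numbers"] ** 2

# With no path argument, to_csv returns the CSV text as a string.
# The default index=True writes the index as an unlabeled first column.
csv_with_index = df.to_csv()

# Reading it back naively turns the old index into an extra data column.
df_extra = pd.read_csv(io.StringIO(csv_with_index))
print(list(df_extra.columns))  # ['Unnamed: 0', 'letters', 'numbers', 'squares']

# If the CSV already contains the index column, index_col=0 tells
# read_csv to use that first column as the index instead.
df_fixed = pd.read_csv(io.StringIO(csv_with_index), index_col=0)
print(list(df_fixed.columns))  # ['letters', 'numbers', 'squares']
```

Passing index=False when writing is the simpler fix when you control the export. Reading with index_col=0 is mainly useful when the file was produced by someone else.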