row_to_names: Elevates a row to be the column names of a DataFrame.
Background
This notebook shows a brief example of how to replace a dataframe's column names with the values from one of its rows.
from io import StringIO
import janitor
import pandas as pd
data = """shoe, 220, 100
shoe, 450, 40
item, retail_price, cost
shoe, 200, 38
bag, 305, 25
"""
temp = pd.read_csv(StringIO(data), header=None)
temp
Looking at the dataframe above, we would like to use row 2 as our column names. One way to achieve this takes four steps:
- Use loc/iloc to assign row 2 to columns.
- Strip off any whitespace.
- Drop row 2 from the dataframe using the drop method.
- Set the columns axis name to None (otherwise it keeps the label of the row we used, 2).
temp.columns = temp.iloc[2, :]
temp.columns = temp.columns.str.strip()
temp = temp.drop(2, axis=0)
temp = temp.rename_axis(None, axis="columns")
temp
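The whitespace step is needed because the sample data has a space after each comma, and read_csv keeps that space by default. A minimal sketch to confirm this, re-reading the same data under illustrative names (sample, raw, header_labels and clean_labels are not part of the original notebook):

```python
from io import StringIO

import pandas as pd

sample = """shoe, 220, 100
shoe, 450, 40
item, retail_price, cost
shoe, 200, 38
bag, 305, 25
"""

# Re-read the raw data so this check does not depend on earlier cells.
raw = pd.read_csv(StringIO(sample), header=None)

# Row 2 holds the intended header; the last two labels carry a leading space.
header_labels = raw.iloc[2].tolist()
print(header_labels)

# Stripping each label yields clean column names.
clean_labels = [label.strip() for label in header_labels]
print(clean_labels)
```

Without the strip, a lookup such as temp["retail_price"] would fail, because the actual label would start with a space.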
However, the first two steps assign to temp.columns directly, which prevents us from method chaining. This is easily resolved using the row_to_names function:
df = pd.read_csv(StringIO(data), header=None).row_to_names(
row_number=2, remove_row=True
)
df
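For comparison, here is a rough pandas-only sketch of the same idea, made chainable with pipe. The helper promote_row is hypothetical, written only for this illustration; it is not part of pyjanitor and is not claimed to match row_to_names in every detail (for example, in how whitespace in the labels is handled).

```python
from io import StringIO

import pandas as pd

csv_text = """shoe, 220, 100
shoe, 450, 40
item, retail_price, cost
shoe, 200, 38
bag, 305, 25
"""


def promote_row(frame: pd.DataFrame, row: int) -> pd.DataFrame:
    """Hypothetical helper: use the row labelled `row` as the header, then drop it."""
    # Build a plain list of stripped labels so no axis name is carried over.
    labels = frame.loc[row].astype(str).str.strip().tolist()
    # Drop the header row, then install the labels as column names.
    return frame.drop(index=row).set_axis(labels, axis="columns")


# The whole transformation now fits in a single method chain.
result = pd.read_csv(StringIO(csv_text), header=None).pipe(promote_row, row=2)
print(result)
```

Wrapping the steps in a function and calling it through pipe is the general way to keep custom logic inside a chain; row_to_names offers this particular operation ready-made as a dataframe method.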