How to rename columns in Pandas Dataframe

In this tutorial, we will cover various methods to rename columns in a pandas dataframe in Python. Renaming columns is one of the most common data wrangling tasks. If you do not come from a programming background and have worked only in Excel spreadsheets, this may not feel as easy in Python, since in MS Excel you can rename a column by simply typing the new name into the header cell. If you come from a database background, it is similar to a column alias in SQL. In Python, the popular data manipulation package pandas simplifies these kinds of data operations.
2 Methods to rename columns in Pandas
In pandas there are two simple methods to rename columns: the rename() method, and assigning a list of new names to df.columns.

The first step is to install the pandas package if it is not already installed. You can check whether it is installed on your machine by running !pip show pandas in the IPython console. If it is not installed, you can install it with the command !pip install pandas.

Import Dataset for practice

To import the dataset, we use the read_csv() function from the pandas package.

import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/JackyP/testing/master/datasets/nycflights.csv", usecols=range(1,17))
To see the names of the columns in a dataframe, run the command below:
df.columns
Index(['year', 'month', 'day', 'dep_time', 'dep_delay', 'arr_time',
       'arr_delay', 'carrier', 'tailnum', 'flight', 'origin', 'dest',
       'air_time', 'distance', 'hour', 'minute'],
      dtype='object')
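If the CSV above cannot be downloaded, the same inspection can be tried on a small hand-made dataframe. The sketch below is only an illustration: the name sample and its values are made up, and it uses just a few of the flight columns.

```python
import pandas as pd

# Hypothetical stand-in data with a subset of the flight columns.
sample = pd.DataFrame({
    "year": [2013, 2013, 2013],
    "month": [1, 1, 2],
    "dep_delay": [2.0, 4.0, -1.0],
    "arr_delay": [11.0, 20.0, -5.0],
})

# sample.columns is an Index object; list() turns it into a plain Python list.
column_names = list(sample.columns)
print(column_names)
```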
Method I : rename() function
Suppose you want to rename the column year to years. The code below creates a new dataframe named df2 with the new column name and the same values; df itself is left unchanged.
df2 = df.rename(columns={'year':'years'})
If you want to make the change in the same dataframe df, use the option inplace = True.
df.rename(columns={'year':'years'}, inplace = True)
By default inplace is False, so you need to set it to True explicitly. To rename multiple columns, add more old-name: new-name pairs to the dictionary, separated by commas.
df.rename(columns={'year':'years', 'month':'months' }, inplace = True)
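The behaviour of rename() described above can be checked on a small example. The sketch below uses a made-up three-column dataframe named flights (not the tutorial's dataset) and also shows the errors="raise" option, which may help catch misspelled column names.

```python
import pandas as pd

# Hypothetical small dataframe standing in for the flights data.
flights = pd.DataFrame({"year": [2013, 2013], "month": [1, 2], "day": [1, 1]})

# rename() returns a new dataframe by default; flights keeps its old names.
renamed = flights.rename(columns={"year": "years", "month": "months"})
print(list(flights.columns))
print(list(renamed.columns))

# A key that matches no column is silently ignored by default ...
unchanged = flights.rename(columns={"yaer": "years"})

# ... unless errors="raise" is passed, which turns the typo into a KeyError.
try:
    flights.rename(columns={"yaer": "years"}, errors="raise")
    typo_detected = False
except KeyError:
    typo_detected = True
```

Because a misspelled key is ignored without any warning, passing errors="raise" can be a useful safeguard while developing.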
Method II : dataframe.columns = [list]
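You can also assign a list of new column names to df.columns. The list must contain one name for every column, in the existing order, so columns you do not want to rename are listed with their old names. In the example below we rename the year and month columns.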
df.columns = ['years', 'months', 'day', 'dep_time', 'dep_delay', 'arr_time',
       'arr_delay', 'carrier', 'tailnum', 'flight', 'origin', 'dest',
       'air_time', 'distance', 'hour', 'minute']
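Assigning a list only works when its length matches the number of columns. The following sketch, again on a made-up three-column dataframe, shows what happens when a name is missing.

```python
import pandas as pd

# Hypothetical small dataframe standing in for the flights data.
flights = pd.DataFrame({"year": [2013], "month": [1], "day": [1]})

# Correct: one name per column, in the existing order.
flights.columns = ["years", "months", "day"]

# Too few names: pandas refuses the assignment with a ValueError.
try:
    flights.columns = ["years", "months"]
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

print(mismatch_error)
```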
Rename columns having pattern
Suppose you want to rename columns that have an underscore '_' in their names by removing the underscore.
df.columns = df.columns.str.replace('_', '')
The new column names are as follows (shown here for the original column names). You can see that no underscores remain.
Index(['year', 'month', 'day', 'deptime', 'depdelay', 'arrtime', 'arrdelay',
       'carrier', 'tailnum', 'flight', 'origin', 'dest', 'airtime', 'distance',
       'hour', 'minute'],
      dtype='object')
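Pattern-based renaming can also be done by passing a function to rename(). The sketch below compares both approaches on a made-up dataframe; the regex=False argument is an optional addition that makes pandas treat '_' as a literal string.

```python
import pandas as pd

# Hypothetical small dataframe standing in for the flights data.
flights = pd.DataFrame({"year": [2013], "dep_time": [517], "arr_delay": [11.0]})

# Option 1: vectorised string method on the column Index.
no_underscore = flights.columns.str.replace("_", "", regex=False)

# Option 2: pass a function to rename(); it is applied to every column name.
upper = flights.rename(columns=str.upper)

print(list(no_underscore))
print(list(upper.columns))
```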
Rename columns by Position
If you want to rename a column by position (for example, the first column), you can use the code below. df.columns[0] returns the current name of the first column, which is then used as the key in the rename dictionary.
df.rename(columns={ df.columns[0]: "Col1" }, inplace = True)
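An alternative sketch for renaming by position is to edit a copy of the names and assign it back. It uses a made-up dataframe and also shows that the column Index cannot be modified element by element. Since rename() matches by label, this list-based variant may be preferable when several columns share the same name.

```python
import pandas as pd

# Hypothetical small dataframe standing in for the flights data.
flights = pd.DataFrame({"year": [2013], "month": [1], "day": [1]})

# An Index is immutable, so assigning to a single position raises a TypeError.
try:
    flights.columns[0] = "Col1"
    index_is_mutable = True
except TypeError:
    index_is_mutable = False

# Copy the names to a list, change the wanted position, assign the list back.
names = flights.columns.tolist()
names[0] = "Col1"
flights.columns = names
print(list(flights.columns))
```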
Rename columns in sequence
If you want to rename the columns to a numbered sequence, you can build the names with a list comprehension.
df.columns = ["Col"+str(i) for i in range(1, 17)]
The code above hard-codes the number of columns. In the code below, df.shape[1] returns the number of columns in the dataframe. We need to add 1 because range excludes its end value: range(1, 17) returns 1, 2, 3 through 16 (excluding 17).
df.columns = ["Col"+str(i) for i in range(1, df.shape[1] + 1)]
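The same idea on a made-up three-column dataframe, written with an f-string and len(flights.columns), which gives the same number as flights.shape[1]. This is just one possible spelling of the approach above.

```python
import pandas as pd

# Hypothetical small dataframe standing in for the flights data.
flights = pd.DataFrame({"year": [2013], "month": [1], "day": [1]})

# Build Col1, Col2, ... with one name per existing column.
n_cols = len(flights.columns)
flights.columns = [f"Col{i}" for i in range(1, n_cols + 1)]
print(list(flights.columns))
```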
Add prefix / suffix in column names
If you want to add some text before or after the existing column names, you can use the add_prefix() and add_suffix() methods. Both return a new dataframe, so the result is assigned back to df.
df = df.add_prefix('V_')
df = df.add_suffix('_V')
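A short sketch on a made-up dataframe shows the effect of each method separately, and that the original dataframe keeps its names until the result is assigned back.

```python
import pandas as pd

# Hypothetical small dataframe standing in for the flights data.
flights = pd.DataFrame({"year": [2013], "month": [1]})

with_prefix = flights.add_prefix("V_")
with_suffix = flights.add_suffix("_V")

print(list(with_prefix.columns))
print(list(with_suffix.columns))
print(list(flights.columns))  # unchanged
```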
How to access columns having space in names
For demonstration purposes, we can put spaces in some column names with df.columns = df.columns.str.replace('_', ' '). You can then access such a column with the syntax df["column name"].
df["arr delay"]
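The sketch below tries this on a made-up dataframe. It also shows one further option that is not covered above: inside query() expressions, column names containing spaces can be wrapped in backticks.

```python
import pandas as pd

# Hypothetical small dataframe standing in for the flights data.
flights = pd.DataFrame({
    "arr_delay": [11.0, -5.0, 20.0],
    "dep_delay": [2.0, -1.0, 4.0],
})
flights.columns = flights.columns.str.replace("_", " ", regex=False)

# Bracket syntax works for any column name, including names with spaces.
arr_delay = flights["arr delay"]

# In query() expressions, such names are wrapped in backticks.
late = flights.query("`arr delay` > 10")
print(late)
```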
How to change row names
With the index option, you can rename rows (index labels). In the code below, we change the row labels 0 and 1 to 'First' and 'Second' in dataframe df by passing a dictionary with the old row labels as keys and the new row labels as values.
df.rename(index={0:'First',1:'Second'}, inplace=True)
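As a quick check on a made-up dataframe, the sketch below renames two row labels without inplace=True, so the result is returned as a new dataframe. Labels that are not in the dictionary are kept as they are.

```python
import pandas as pd

# Hypothetical small dataframe with the default 0, 1, 2 row labels.
flights = pd.DataFrame({"year": [2013, 2013, 2013], "month": [1, 1, 2]})

relabelled = flights.rename(index={0: "First", 1: "Second"})
print(relabelled.index.tolist())

# Rows can now be selected by their new labels.
first_row = relabelled.loc["First"]
```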
About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he has worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and Human Resource.
