samanv samanv - 14 days ago 5
Python Question

How to combine separate day, month and year columns into a datetime in a DataFrame

I have a DataFrame with three columns for the date (day, month, year).
I want to combine these three columns into a single date column for further use.
I want to use that date column for lookups and as the x-axis when plotting with matplotlib.

I tried this:

df.apply(lambda x: '%s %2s %2s' % (x['year'], x['month'], x['day']), axis=1)
to create a new column, but it produces strings, so I then tried
matplotlib.dates.datestr2num()
but that did not work either.

Question summary

How can I combine these three columns into a datetime that works in both the
DataFrame and matplotlib?

Answer

You can pass the day, month and year columns to to_datetime, which assembles them into a single datetime column:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

df = pd.DataFrame({'day':[1,2,3,4,3,4,5],
                   'month':[4,5,7,4,5,6,8],
                   'year':[2013,2013,2103,2013,2013,2103,2013],
                   'A':[1,3,5,5,6,7,9],
                   'B':[4,5,6,5,4,3,4]})

#print (df)

#convert to datetime
df['date'] = pd.to_datetime(df[['day','month','year']])
print (df)
   A  B  day  month  year       date
0  1  4    1      4  2013 2013-04-01
1  3  5    2      5  2013 2013-05-02
2  5  6    3      7  2103 2103-07-03
3  5  5    4      4  2013 2013-04-04
4  6  4    3      5  2013 2013-05-03
5  7  3    4      6  2103 2103-06-04
6  9  4    5      8  2013 2013-08-05
#remove the columns that are no longer needed
df.drop(['day','month','year'], axis=1, inplace=True)
#set the index from the date column - this gives a DatetimeIndex
df.set_index('date', inplace=True)
print (df)
            A  B
date            
2013-04-01  1  4
2013-05-02  3  5
2103-07-03  5  6
2013-04-04  5  5
2013-05-03  6  4
2103-06-04  7  3
2013-08-05  9  4
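
Since the question also asks about using the date for referring to rows, here is a minimal sketch of lookups on a DatetimeIndex. The small frame and the names value and spring are made up for illustration and are not part of the original data; the index is sorted first so that slicing by date range behaves predictably.

```python
import pandas as pd

# Small illustrative frame (not the data from the answer above)
df = pd.DataFrame({'day': [1, 2, 3],
                   'month': [4, 5, 7],
                   'year': [2013, 2013, 2013],
                   'A': [1, 3, 5]})

# Assemble the date, drop the parts, and use the date as a sorted index
df['date'] = pd.to_datetime(df[['year', 'month', 'day']])
df = df.drop(columns=['day', 'month', 'year']).set_index('date').sort_index()

# Exact lookup of one cell by timestamp
value = df.loc[pd.Timestamp('2013-05-02'), 'A']

# Partial-string slicing: every row from April through May 2013
spring = df.loc['2013-04':'2013-05']
print(value)
print(spring)
```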

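#plot and set format of axis x
#(DateFormatter labels each tick with its real date; FixedFormatter would label ticks by position only)
import matplotlib.dates as mdates

ax = df.sort_index().plot(x_compat=True)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
plt.show()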
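
One caveat worth sketching: to_datetime assembles a date from a DataFrame only when the columns are named year, month and day. If your columns are named differently, or some rows hold impossible dates, something along these lines may help. The column names yy, mm, dd and the sample values are hypothetical; errors='coerce' turns invalid dates into NaT instead of raising.

```python
import pandas as pd

# Hypothetical frame whose column names are not year/month/day
raw = pd.DataFrame({'yy': [2013, 2013, 2013],
                    'mm': [4, 2, 8],
                    'dd': [1, 30, 5]})  # 30 February does not exist

# Rename to the names to_datetime expects, then assemble the date
parts = raw.rename(columns={'yy': 'year', 'mm': 'month', 'dd': 'day'})
raw['date'] = pd.to_datetime(parts[['year', 'month', 'day']], errors='coerce')
print(raw)
```

Rows that end up as NaT can then be inspected or dropped (for example with dropna) before plotting.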