C.G. - 1 month ago
Python Question

Find the same date from two sets of data

I am new to Python. I have two sets of data, shown below.

Set 1:

Gmt time,Open,High,Low,Close,Volume,RSI,,Change,Gain,Loss,Avg Gain,Avg Loss,RS
15.06.2017 00:00:00.000,0.75892,0.76313,0.7568,0.75858,107799.5406,0,,,,,,,
16.06.2017 00:00:00.000,0.75857,0.76294,0.75759,0.76202,94367.4299,0,,0.00344,0.00344,0,,,
18.06.2017 00:00:00.000,0.76202,0.76236,0.76152,0.76188,5926.0998,0,,-0.00014,0,0.00014,,,
19.06.2017 00:00:00.000,0.76189,0.76289,0.75848,0.75902,87514.849,0,,-0.00286,0,0.00286,,,
...


Set 2:

Gmt time,Open,High,Low,Close,Volume
15.06.2017 00:00:00.000,0.75892,0.75933,0.75859,0.75883,4777.4702
15.06.2017 01:00:00.000,0.75885,0.76313,0.75833,0.76207,7452.5601
15.06.2017 02:00:00.000,0.76207,0.76214,0.76106,0.76143,4798.4102
15.06.2017 03:00:00.000,0.76147,0.76166,0.76015,0.76154,4961.4502
15.06.2017 04:00:00.000,0.76154,0.76162,0.76104,0.76121,2977.6399
15.06.2017 05:00:00.000,0.7612,0.76154,0.76101,0.76151,3105.4399
...


I want to find the lines in Set 2 that fall on the same date as a line in Set 1. I tried this:
print(daily['Gmt time'][0].date == hourly['Gmt time'][0].date)
but it printed False and I don't know why. Is there a way to compare just the date (not the time) between the two sets of data?

Answer

First read the data sets into dataframes:

import pandas as pd
# DataFrame.from_csv was removed in pandas 1.0; read_csv is its replacement
df_one = pd.read_csv('data_set_one.csv', index_col=False)
df_two = pd.read_csv('data_set_two.csv', index_col=False)

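Then convert the 'Gmt time' column to a plain date. The timestamps are day-first (15.06.2017), so pass the format explicitly rather than letting pandas guess between day and month:

df_one['Gmt date'] = pd.to_datetime(df_one['Gmt time'], format='%d.%m.%Y %H:%M:%S.%f').dt.date
df_two['Gmt date'] = pd.to_datetime(df_two['Gmt time'], format='%d.%m.%Y %H:%M:%S.%f').dt.date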
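
As a side note on the comparison in the question: on a single pandas Timestamp, `date` is a method, so `x.date == y.date` compares two method objects rather than two dates, which is most likely why it printed False. A minimal sketch, assuming the values are already Timestamps (the variable names here are made up for illustration):

```python
import pandas as pd

# Two hypothetical timestamps on the same day, at different hours.
daily_ts = pd.Timestamp('2017-06-15 00:00:00')
hourly_ts = pd.Timestamp('2017-06-15 05:00:00')

# Without parentheses, .date is the method itself, not a date value.
print(callable(daily_ts.date))

# Calling the method returns datetime.date objects, which compare by day only.
same_day = daily_ts.date() == hourly_ts.date()
print(same_day)

# On a whole column, the .dt accessor does the same thing element-wise.
times = pd.Series(pd.to_datetime(
    ['15.06.2017 00:00:00.000', '15.06.2017 05:00:00.000'],
    format='%d.%m.%Y %H:%M:%S.%f'))
dates = times.dt.date
print(dates.nunique())
```

So for single values use `.date()` with parentheses, and for a whole column use `.dt.date` as above.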

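Now compare the two dataframes:

# For each row of df_one, print the rows of df_two that fall on the same date
for i, row in df_one.iterrows():
    df_one_date = row['Gmt date']
    print('df_one_date', df_one_date)
    # Boolean mask: keep only the df_two rows whose date equals this one
    print(df_two[df_two['Gmt date'] == df_one_date])
    print('----')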
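
If you only need the matching rows and not the per-date printout, the loop can probably be replaced by a single `isin` filter. This is a sketch, not a drop-in: it uses a few rows from the question trimmed to two columns, plus one made-up hourly row (17.06.2017) to show a date being excluded, and the names `daily`, `hourly` and `matching` are just for illustration.

```python
import io
import pandas as pd

# Rows copied from the question, trimmed to the columns needed here.
daily_csv = """Gmt time,Open,Close
15.06.2017 00:00:00.000,0.75892,0.75858
16.06.2017 00:00:00.000,0.75857,0.76202
"""
# The last row is made up for illustration: a date absent from the daily set.
hourly_csv = """Gmt time,Open,Close
15.06.2017 00:00:00.000,0.75892,0.75883
15.06.2017 01:00:00.000,0.75885,0.76207
17.06.2017 00:00:00.000,0.76000,0.76100
"""

fmt = '%d.%m.%Y %H:%M:%S.%f'
daily = pd.read_csv(io.StringIO(daily_csv))
hourly = pd.read_csv(io.StringIO(hourly_csv))
daily['Gmt date'] = pd.to_datetime(daily['Gmt time'], format=fmt).dt.date
hourly['Gmt date'] = pd.to_datetime(hourly['Gmt time'], format=fmt).dt.date

# Keep the hourly rows whose date appears anywhere in the daily set.
matching = hourly[hourly['Gmt date'].isin(daily['Gmt date'])]
print(matching)
```

This avoids iterating row by row, which tends to matter once the hourly file gets large.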

It's still unclear how you want to handle dates in df_one that have no match in df_two. Hopefully this gives you enough of an idea to handle that case.
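
One possible way to see which dates have no match is a left merge with `indicator=True`, which labels each row by where it came from. This is only a sketch under assumptions: the two small frames below are built by hand to stand in for the 'Gmt date' columns, and `daily_dates`, `hourly_dates`, `merged` and `unmatched` are hypothetical names.

```python
import datetime as dt
import pandas as pd

# Hand-built stand-ins for the date columns of the two data sets.
daily_dates = pd.DataFrame({
    'Gmt date': [dt.date(2017, 6, 15), dt.date(2017, 6, 16)],
})
hourly_dates = pd.DataFrame({
    'Gmt date': [dt.date(2017, 6, 15)] * 3,
    'Close': [0.75883, 0.76207, 0.76143],
})

# Left merge keeps every daily date; the _merge column says whether
# a matching hourly row was found ('both') or not ('left_only').
merged = daily_dates.merge(hourly_dates, on='Gmt date', how='left', indicator=True)

# Daily dates with no hourly rows at all.
unmatched = merged.loc[merged['_merge'] == 'left_only', 'Gmt date'].tolist()
print(unmatched)
```

From there you can decide whether to drop those dates, report them, or fill them in some other way.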