ZacAttack ZacAttack - 2 months ago 9
Python Question

How do I get all the rows of data for a specific month,or days; over a range of many years using pandas DataFrame?

I am attempting to find seasonal trends in the stock market. I want to have the ability to see how an asset has preformed i.e. what is the average return for appl(apple computer) in the month of May, since 1990? Also I would like to see i.e how has aapl performed between September and December since 1990. Finally I would also like the ability to see what days were the most profitable i.e. what was the average return for appl on Monday, Tuesday, Wednesday, Thursday, and Friday since 1990?

I am using pandas dataframes and am loading my data from a csv file loaded from yahoo finance. No matter what try I can't get this to work right, any help or input would be greatly appreciated. Also I am not using apple's stock for my code but ticker CDE

In addition, when I run my code I get only the end and the beginning of my data, how do I get it to where it will display all 6000+ rows?

from pandas_datareader import data as dreader
import pandas as pd


df = pd.read_csv("cde_data.csv",index_col='Date')


print(df['1900-05':'2016-05'])


I am trying to get the return for the month of may but I get I range instead

Open High Low Close Volume Adj Close
Date
1990-04-12 26.875 26.875 26.625 26.625 6100 250.576036
1990-04-16 26.500 26.750 26.375 26.750 500 251.752449
1990-04-17 26.750 26.875 26.750 26.875 2300 252.928863
1990-04-18 26.875 26.875 26.500 26.625 3500 250.576036
1990-04-19 26.500 26.750 26.500 26.750 700 251.752449
1990-04-20 26.750 26.875 26.750 26.875 2100 252.928863
1990-04-23 26.875 26.875 26.750 26.875 700 252.928863
1990-04-24 27.000 27.000 26.000 26.000 2400 244.693970
1990-04-25 25.250 25.250 24.875 25.125 9300 236.459076
1990-04-26 25.000 25.250 24.750 25.000 1200 235.282663
1990-04-27 25.000 25.250 25.000 25.250 1100 237.635490
1990-04-30 25.125 25.250 25.000 25.125 3500 236.459076
1990-05-01 25.375 25.500 25.250 25.250 1100 237.635490
1990-05-02 25.125 25.125 24.000 24.250 1800 228.224183
1990-05-03 25.000 25.000 24.625 24.750 9100 232.929836
1990-05-04 24.625 24.875 24.375 24.750 500 232.929836
1990-05-07 25.000 25.000 24.625 24.625 900 231.753423
1990-05-08 24.875 25.250 24.875 25.125 400 236.459076
1990-05-09 25.375 25.875 25.250 25.875 6900 243.517556
1990-05-10 26.000 26.750 26.000 26.750 5500 251.752449
1990-05-11 27.000 27.000 26.875 27.000 1800 254.105276
1990-05-14 27.000 27.250 26.750 27.000 6800 254.105276
1990-05-15 27.000 27.125 26.625 26.750 3300 251.752449
1990-05-16 26.625 26.625 25.875 25.875 2600 243.517556
1990-05-17 26.125 26.500 26.000 26.375 500 248.223210
1990-05-18 26.250 26.875 26.250 26.875 1000 252.928863
1990-05-21 27.375 27.375 26.875 27.375 2700 257.634516
1990-05-22 27.625 28.250 27.500 27.875 2000 262.340169
1990-05-23 27.375 28.500 27.125 28.000 4000 263.516583
1990-05-24 28.250 28.375 27.625 27.875 1100 262.340169
... ... ... ... ... ... ...
2016-03-18 5.490 5.750 5.390 5.590 9415600 5.590000
2016-03-21 5.560 5.940 5.550 5.760 4018800 5.760000
2016-03-22 5.810 5.890 5.680 5.800 3429600 5.800000
2016-03-23 5.330 5.570 5.200 5.250 4445500 5.250000
2016-03-24 5.260 5.400 5.150 5.280 2668800 5.280000
2016-03-28 5.320 5.480 5.210 5.440 2093700 5.440000
2016-03-29 5.400 5.850 5.380 5.710 3709800 5.710000
2016-03-30 5.640 5.780 5.490 5.650 2444900 5.650000
2016-03-31 5.800 5.860 5.570 5.620 2319800 5.620000
2016-04-01 5.410 5.650 5.210 5.640 2922400 5.640000
2016-04-04 5.620 5.690 5.430 5.550 2561200 5.550000
2016-04-05 5.620 5.770 5.440 5.730 2294900 5.730000
2016-04-06 5.630 5.880 5.610 5.820 2108400 5.820000
2016-04-07 5.900 6.110 5.870 5.940 2963100 5.940000
2016-04-08 5.790 6.030 5.750 6.010 3583700 6.010000
2016-04-11 6.160 6.500 6.110 6.490 5140100 6.490000
2016-04-12 6.580 6.730 6.330 6.720 4015000 6.720000
2016-04-13 6.640 6.990 6.600 6.700 3972300 6.700000
2016-04-14 6.660 6.750 6.220 6.380 4125700 6.380000
2016-04-15 6.410 6.750 6.370 6.670 2907800 6.670000
2016-04-18 6.700 6.830 6.530 6.790 2452900 6.790000
2016-04-19 7.110 7.450 6.970 7.380 6057600 7.380000
2016-04-20 7.410 7.680 6.820 7.000 6494400 7.000000
2016-04-21 7.300 7.530 6.940 7.140 4394000 7.140000
2016-04-22 7.080 7.380 6.730 6.890 3838700 6.890000
2016-04-25 6.850 7.040 6.720 6.870 2905400 6.870000
2016-04-26 6.900 7.190 6.700 7.100 2743900 7.100000
2016-04-27 7.160 7.280 6.870 7.180 3558900 7.180000
2016-04-28 7.350 7.960 7.080 7.440 6516000 7.440000
2016-04-29 7.650 8.140 7.650 8.100 6457000 8.100000

[6564 rows x 6 columns]
Press any key to continue . . .

Answer

Use the standard libraries datetime and write helper functions to do the conversions you want. Then make a new column by applying the helper function to the Date column.

from datetime import datetime
import dateutil.parser

def hour(x):
    return(x.hour)
def dow(x):
    return(x.isoweekday())
def month(x):
    return(x.month)

df.reset_index()
df.Date     = df.Date.apply(dateutil.parser.parse)
df["hour"]  = df.Date.apply(hour)
df["dow"]   = df.Date.apply(dow)
df["month"] = df.Date.apply(month)

Now you can do group by on the column you have created or slice on the column you have created. To get all the Fridays in January slice like this:

df[(df.dow == 5) & (df.month == 1)]

To print more rows you can change the settings by including the following line below your imports:

pd.options.display.max_rows = 6000
Comments