John John - 4 months ago 13
Python Question

dataframe slicing and pivoting then into multiple dataframs

Ultimately I want to run a parametric var on stock data below

I have data in the form:

Date Symbol ClosingPrice Weight
0 7/22/2016 A 46.58 0.000002
1 7/25/2016 A 46.14 0.000002
2 7/26/2016 A 46.95 0.000002
3 7/27/2016 A 47.26 0.000002
4 7/28/2016 A 47.51 0.000002
5 7/22/2016 AA 10.57 0.000287
6 7/25/2016 AA 10.49 0.000287
7 7/26/2016 AA 10.67 0.000287
8 7/27/2016 AA 10.74 0.000287
9 7/28/2016 AA 10.68 0.000287
10 7/22/2016 AAAP 30.51 0.000003
11 7/25/2016 AAAP 31.02 0.000003
12 7/26/2016 AAAP 30.85 0.000003
13 7/27/2016 AAAP 30.97 0.000003
14 7/28/2016 AAAP 31.00 0.000003


I would like to create 2 separate dataframes as such:

date A AA AAAP
7/22/2016 46.58 10.57 30.51
7/25/2016 46.14 10.49 31.02
7/26/2016 46.95 10.67 30.85
7/27/2016 47.26 10.74 30.97
7/28/2016 47.51 10.68 31


with symbols as the column headers

and

Symbol Weight
A 0.00000166
AA 0.00028664
AAAP 0.00000326


The first dataframe will be used to calculate a variance/co-variance matrix and the second dataframe represents the weights of each security in the stock portfolio (a parameter in the parametric VAR calculation)

Answer

To get the first table, you can pivot your original data frame to transform it from long to wide format on the first three columns:

import pandas as pd
df.iloc[:,0:3].pivot('Date', 'Symbol', 'ClosingPrice')

#   Symbol      A      AA    AAAP
#     Date          
#7/22/2016  46.58   10.57   30.51
#7/25/2016  46.14   10.49   31.02
#7/26/2016  46.95   10.67   30.85
#7/27/2016  47.26   10.74   30.97
#7/28/2016  47.51   10.68   31.00

The second table is the unique values of the second and fourth columns, so select them and drop_duplicates should be OK:

df.iloc[:,[1,3]].drop_duplicates()

#      Symbol     Weight
#0          A   0.000002
#5         AA   0.000287
#10      AAAP   0.000003
Comments