Chaos - 1 year ago 114

# Python: creating a covariance matrix from lists

Is there a quick way to go from the following three lists (shown here as columns) to a covariance matrix in Python, as a NumPy array?

``````
Fac2  Fac1  VarCovar
a     a     1.4
a     b     0.7
a     c     0.3
b     a     0.7
b     b     1.8
b     c     6.3
c     a     0.3
c     b     6.3
c     c     2.4
``````

You can create the 3x3 matrix easily using pandas. Build a DataFrame `df` from the three columns above and reshape it with `pivot_table`, using `Fac1` and `Fac2` as the row and column labels and `VarCovar` as the cell values.

For example, if you have the following dictionary `d` of lists:

``````
d = {'Fac1': ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c'],
     'Fac2': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
     'VarCovar': [1.4, 0.7, 0.3, 0.7, 1.8, 6.3, 0.3, 6.3, 2.4]}
``````

Create the DataFrame like this:

``````
import pandas as pd

df = pd.DataFrame(d)
``````

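And then pivot it. Older pandas releases spelled these keyword arguments `rows` and `cols`; current releases use `index` and `columns`:

``````
>>> df.pivot_table(index='Fac1', columns='Fac2', values='VarCovar')
Fac2    a    b    c
Fac1
a     1.4  0.7  0.3
b     0.7  1.8  6.3
c     0.3  6.3  2.4
``````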

Appending the `values` attribute returns the table as a NumPy array:

``````
>>> df.pivot_table(index='Fac1', columns='Fac2', values='VarCovar').values
array([[ 1.4,  0.7,  0.3],
       [ 0.7,  1.8,  6.3],
       [ 0.3,  6.3,  2.4]])
``````
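The steps above can be put together into one self-contained sketch that starts from three plain lists. The variable names `fac1`, `fac2`, `var_covar`, `cov`, and `labels` are chosen here for illustration. It also swaps in `DataFrame.pivot` for `pivot_table`: `pivot` only reshapes and raises an error on a repeated pair, whereas `pivot_table` would average the duplicates, so this is an assumption that each pair appears exactly once.

```python
import numpy as np
import pandas as pd

# The three columns from the question, as plain Python lists.
fac1 = ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c']
fac2 = ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c']
var_covar = [1.4, 0.7, 0.3, 0.7, 1.8, 6.3, 0.3, 6.3, 2.4]

df = pd.DataFrame({'Fac1': fac1, 'Fac2': fac2, 'VarCovar': var_covar})

# Reshape long -> wide without aggregating; a repeated (Fac1, Fac2) pair raises.
table = df.pivot(index='Fac1', columns='Fac2', values='VarCovar')

cov = table.to_numpy()        # plain 2-D NumPy array
labels = list(table.index)    # row/column order of the matrix

# A covariance matrix should be symmetric; this is a cheap sanity check.
assert np.allclose(cov, cov.T)
```

Keeping `labels` alongside `cov` records which factor each row and column refers to once the pandas index is gone.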

If you don't have all pairs, you can proceed in the same way and fill each missing value from the transposed table, since a covariance matrix is symmetric:

``````
>>> d = {'Fac1': ['a', 'b', 'c', 'b', 'c', 'c'],
...      'Fac2': ['a', 'a', 'a', 'b', 'b', 'c'],
...      'VarCovar': [1.4, 0.7, 0.3, 1.8, 6.3, 2.4]}
>>> df = pd.DataFrame(d)
>>> table = df.pivot_table(index='Fac1', columns='Fac2', values='VarCovar')
>>> table.combine_first(table.T)
Fac2    a    b    c
Fac1
a     1.4  0.7  0.3
b     0.7  1.8  6.3
c     0.3  6.3  2.4
``````

(I took the idea of using `combine_first` from DSM's answer here)
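If pandas is not available, the same symmetric fill can be sketched with NumPy alone. This is an alternative technique, not the `combine_first` approach above. The names `labels`, `pos`, and `cov_sym` are hypothetical, and the sketch assumes every label pair is listed at least once in one of the two orders.

```python
import numpy as np

# Only one triangle of the matrix is given (each unordered pair appears once).
fac1 = ['a', 'b', 'c', 'b', 'c', 'c']
fac2 = ['a', 'a', 'a', 'b', 'b', 'c']
var_covar = [1.4, 0.7, 0.3, 1.8, 6.3, 2.4]

# Map each label to a row/column position, in sorted order.
labels = sorted(set(fac1) | set(fac2))
pos = {label: i for i, label in enumerate(labels)}

# Start from NaN so any pair that was never supplied stays visible.
cov_sym = np.full((len(labels), len(labels)), np.nan)
for r, c, v in zip(fac1, fac2, var_covar):
    i, j = pos[r], pos[c]
    cov_sym[i, j] = v
    cov_sym[j, i] = v  # mirror the value across the diagonal
```

Checking `np.isnan(cov_sym).any()` afterwards shows whether any pair was missing from the input.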
