user3426358 - 1 year ago 108

Python Question

I have two square matrices as two csv files. Each matrix has a row and column label. Each row, column label detects relation score between two companies.

Both the marices have a few labels (companies) which are missing in the other. Also, the dimensions of both matrices are different. ( One is 427 x 427, other is 467 x 467 )

Ex: In the matrix below, the numbers in bold are labels (company serial numbers).

`0 is the score between (10010, 10010) .`

0.43967 is the score between (1013, 10010) and so on.

I need to add these two matrices in such a way that:

In the resulting matrix ( sum of the two matrix), no label (row or column labels) is lost.

`Example:`

Note that {a, b ... e} are labels. X & Y are matrices.

a b c d a e

a 1.2 1.3 1.4 1.5 a 9.1 9.2

X= b 2.1 2.2 2.3 2.4 Y= e 8.1 8.2

c 3.3 3.4 3.5 3.6

d 4.2 4.3 4.4 4.5

a b c d e

a 1.2+9.1 1.3 1.4 1.5 9.2

X+Y= b 2.1 2.2 2.3 2.4 0

c 3.3 3.4 3.5 3.6 0

d 4.2 4.3 4.4 4.5 0

e 8.1 0 0 0 8.2

I had posted a similar question but the answer given did not take into account the row labels and made the resulting matrix rectangular.

I am lost at how this should be done in python/ pandas or any other python libraries. Can someone help me with this?

The matrix files are here.

Recommended for you: Get network issues from **WhatsUp Gold**. **Not end users.**

Answer Source

Assuming that the columns and indices labels are the same values, if you take the union of the columns and reindex both dfs and fill the `NaN`

with `0`

prior to adding then you get the result you desire.

if the column and index labels are not the same you can just make 2 arrays of the combined labels:

```
all_cols = x.columns | y.columns
all_indices = x.index | y.index
```

and `reindex`

.

With your example data:

```
In [11]:
all_cols = x.columns | y.columns
x = x.reindex(index=all_cols, columns=all_cols).fillna(0)
y = y.reindex(index=all_cols, columns=all_cols).fillna(0)
x+y
Out[11]:
a b c d e
a 10.3 1.3 1.4 1.5 9.2
b 2.1 2.2 2.3 2.4 0.0
c 3.3 3.4 3.5 3.6 0.0
d 4.2 4.3 4.4 4.5 0.0
e 8.1 0.0 0.0 0.0 8.2
```

Actually you can just `reindex`

the result of `add`

:

```
In [13]:
x.add(y, fill_value=0).reindex(index=all_cols,columns=all_cols).fillna(0)
Out[13]:
a b c d e
a 10.3 1.3 1.4 1.5 9.2
b 2.1 2.2 2.3 2.4 0.0
c 3.3 3.4 3.5 3.6 0.0
d 4.2 4.3 4.4 4.5 0.0
e 8.1 0.0 0.0 0.0 8.2
```

**EDIT**

regarding your real data, you need to specify that the first unnamed column is the index:

```
X= pd.read_csv('30203_Transpose.csv', index_col=0)
Y= pd.read_csv('62599_1999psCSV.csv', index_col=0)
```

Additionally your index is not the same as your columns so these need to be aggregated separately

```
all_cols = X.columns | Y.columns
all_indices = X.index | Y.index
```

then you can `add`

:

```
X.add(Y, fill_value=0).reindex(index=all_cols,columns=all_cols).fillna(0)
```

Recommended from our users: **Dynamic Network Monitoring from WhatsUp Gold from IPSwitch**. ** Free Download**