a.powell - 5 months ago 17

R Question

I cannot find a complete answer to this question. I am attempting to solve a similar system of equations to this:

`r_Aus <- 8.7 + r_Fra + r_Ser + r_USA`

r_Fra <- 2.7 + r_Aus + r_Chi + r_Ser

r_USA <- 37 + r_Chi + r_Ven + r_Aus

r_Chi <- -29.7 + r_USA + r_Fra + r_Ven

r_Ser <- 2.7 + r_Ven + r_Aus + r_Fra

r_Ven <- -21.3 + r_Ser + r_USA + r_Chi

How could I solve for each country variable??

Answer

It is super sophisticated to understand how a person could accept an answer but vote it down at the same time...

**Preparation**

We first express your linear system in matrix form `A * x = b`

. In case you are unclear of how to do this, have a read on General forms. For your example, you can express it as:

```
## x = r_Aus, r_Chi, r_Fra, r_Ser, r_USA, r_Ven
r_Aus - r_Fra - r_Ser - r_USA = 8.7
- r_Aus - r_Chi + r_Fra - r_Ser = 2.7
- r_Aus - r_Chi + r_USA - r_Ven = 37
+ r_Chi - r_Fra - r_USA - r_Ven = -29.7
- r_Aus - r_Fra + r_Ser - r_Ven = 2.7
- r_Chi - r_Ser - r_USA + r_Ven = -21.3
```

then define coefficient matrix `A`

and RHS vector `b`

:

```
A <- matrix(c( 1, 0, -1, -1, -1, 0,
-1, -1, 1, -1, 0, 0,
-1, -1, 0, 0, 1, -1,
0, 1, -1, 0, -1, -1,
-1, 0, -1, 1, 0, -1,
0, -1, 0, -1, -1, 1),
nrow = 6, ncol = 6, byrow = TRUE)
b <- as.matrix(c(8.7, 2.7, 37, -29.7, 2.7, -21.3))
```

**Trying solve()**

Almost always, we think about `solve`

first. But `solve()`

is based on LU factorization, and requires a full rank coefficient matrix `A`

; when `A`

is found rank-deficient, LU factorization meets a 0 diagonal element and fails. Let's try your `A`

and `b`

:

```
solve(A, b)
#Error in solve.default(A, b) :
# Lapack routine dgesv: system is exactly singular: U[6,6] = 0
```

`U[0,0] = 0`

tells you, your `A`

has only a rank of 5.

**A stable method: QR factorization**

QR factorization is known to be a very stable method. We can make use of `.lm.fit()`

to do this:

```
x <- .lm.fit(A, b)
x$coef
# [1] 4.783333 -5.600000 -21.450000 -18.650000 40.866667 0.000000
x$rank
# [1] 5
```

Your system is of rank-5, so least square fitting is performed. The 6th value is `r_Ven`

is constrained at 0, and none of your equations are exactly satisfied. `x$resi`

gives you residuals, i.e., `b - A %*% x$beta`

.

**Gaussian elimination**

To complete the picture, I have to mention Gaussian elimination. **In theory** this is the best approach, as you can determine whether:

- there is a unique solution;
- there is no solution;
- there are infinite number of solutions

as well as solving the linear system.

There is a small R package `optR`

around, but as I found out, it is not doing a perfect job.

```
#install.packages("optR")
library(optR)
```

`?optR`

gives a full rank linear system as an example which works certainly fine (where simply using `solve(A, b)`

would also work!). But for your system with rank-5, it gives:

```
optR(A, b, method="gauss")
call:
optR.default(x = A, y = b, method = "gauss")
Coefficients:
[,1]
[1,] 9.466667
[2,] -24.333333
[3,] -16.766667
[4,] -4.600000
[5,] 22.133333
[6,] 0.000000
Warning messages:
1: In opt.matrix.reorder(A, tol) : Singular Matrix
2: In opt.matrix.reorder(A, tol) : Singular Matrix
```

Note the warning message that your linear system is rank-deficient. To understand what `optR`

does in such case, compare `b`

with

```
A %*% x$beta
# [,1]
#[1,] 8.7
#[2,] 2.7
#[3,] 37.0
#[4,] -29.7
#[5,] 2.7
#[6,] 6.8
```

The first 5 equations are satisfied, except the 6th. So, `optR`

abandoned your last equation to address rank-deficiency, instead of doing least square fitting.