user98235 - 1 year ago 66
R Question

# Calculating the transition probabilities in R

Let's assume that we have the following 4 states: (A, B, C, D)

The table I have has the following format

```
old   new
A      B
A      A
B      C
D      B
C      D
.      .
.      .
.      .
.      .
```

I would like to calculate the following probabilities based on the data given in the table:

```
P(new=A | old=A)
P(new=B | old=A)
P(new=C | old=A)
P(new=D | old=A)
P(new=A | old=B)
.
.
.
.
P(new=C | old=D)
P(new=D | old=D)
```

I can do it manually, by counting how often each transition happens and dividing by the number of rows that share the same old state, but I was wondering if there's a built-in function in R that calculates those probabilities, or at least helps speed up the calculation.

Any help/input would be greatly appreciated. If there's no such function, oh well.

In base R, you could use `prop.table` on a table object:

```
transMat <- prop.table(with(df, table(old, new)), 2)
transMat
   new
old          A          B          C          D
  A 0.26315789 0.27272727 0.18181818 0.22222222
  B 0.31578947 0.36363636 0.09090909 0.22222222
  C 0.21052632 0.27272727 0.45454545 0.33333333
  D 0.21052632 0.09090909 0.27272727 0.22222222
```

Here, the columns sum to 1:

```
colSums(transMat)
A B C D
1 1 1 1
```

Edit: On further reflection, I think `margin=1` gives the desired outcome, since `old` (the conditioning variable) is in the rows and P(new=A | old=A) + P(new=B | old=A) + P(new=C | old=A) + P(new=D | old=A) should equal 1. In this scenario,

```
transMat <- prop.table(with(df, table(old, new)), 1)
transMat
   new
old          A          B          C          D
  A 0.41666667 0.25000000 0.16666667 0.16666667
  B 0.46153846 0.30769231 0.07692308 0.15384615
  C 0.26666667 0.20000000 0.33333333 0.20000000
  D 0.40000000 0.10000000 0.30000000 0.20000000
```

will work. Alternatively, build the transposed table and normalize over its columns: `prop.table(with(df, table(new, old)), 2)`.
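To make the row-wise normalization concrete, here is a minimal sketch that uses only the five rows shown in the question. The names `states`, `toy`, and `trans_mat` are made up for illustration, and converting to factors with explicit levels is an assumption on my part so that all four states appear in the matrix even when some transition never occurs.

```r
# The five example rows from the question; `states` and `toy` are illustrative names.
states <- c("A", "B", "C", "D")
toy <- data.frame(
  old = factor(c("A", "A", "B", "D", "C"), levels = states),
  new = factor(c("B", "A", "C", "B", "D"), levels = states)
)

# Count each (old, new) transition, then divide every row by its total
# (margin = 1), so row i holds P(new = . | old = i).
trans_mat <- prop.table(with(toy, table(old, new)), margin = 1)

trans_mat["A", "B"]  # P(new = B | old = A), 0.5 for these five rows
rowSums(trans_mat)   # each row sums to 1
```

Note that a state which never appears in `old` would have a row total of zero, and its row would come out as `NaN`; that does not happen with these five rows.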

Data

```
set.seed(1234)
df <- data.frame(old=sample(LETTERS[1:4], 50, replace=TRUE),
                 new=sample(LETTERS[1:4], 50, replace=TRUE))
```
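If you would rather have one row per P(new | old) pair, as in the list in the question, the table can be flattened with `as.data.frame`. This is a sketch rather than part of the original answer: the column name `prob` and the object name `probs` are my own choices, and the exact values depend on your R version, because `sample()` changed its default behaviour in R 3.6.0.

```r
# Recreate the simulated data and the row-normalized transition matrix.
set.seed(1234)
df <- data.frame(old = sample(LETTERS[1:4], 50, replace = TRUE),
                 new = sample(LETTERS[1:4], 50, replace = TRUE))
transMat <- prop.table(with(df, table(old, new)), 1)

# as.data.frame() on a table gives one row per (old, new) cell;
# responseName renames the default "Freq" column.
probs <- as.data.frame(transMat, responseName = "prob")
probs <- probs[order(probs$old, probs$new), ]

head(probs, 4)  # the P(new = . | old = A) values
```

Each group of rows sharing the same `old` value should sum to 1 in the `prob` column.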