TheCurlyManLives - 6 days ago 5
R Question

# How to model multiple (>2) color polya urn processes?

I am trying to build a Polya Urn model. Two colours went fine, with three however I ran into some trouble.

``````ndraws<-1000; nexps<-2000; Distribution.yellow<-matrix(0,ndraws,1); for (k in       1:nexps){
red<- 1;
yellow<- 1;
blue<-1 ;
for (n in 1:ndraws){
drawn<-sample(0:2,size=1,prob=c(red,yellow,blue)/(red+yellow +blue))
red<-?? ;
blue<-?? ;
yellow<-?? ;
}
Distribution.yellow[k]<-yellow/(red+yellow+blue) }
``````

My problem lies in translating this line of code:

`````` drawn<-sample(0:2,size=1,prob=c(red,yellow,blue)/(red+yellow +blue))
``````

Into the respective extra balls added to the urn. (hence the question marks).

With two colours I did it as follows:

``````drawn<-sample(0:1,size=1,prob=c(red,blue)/(red+blue))
red<-red+(1-drawn);
blue<-blue+(drawn);
``````

But this obviously does not work when there's more than two colours. How should I approach with three or more colours?

According to Wikipedia, the rules for a PĆ³lya urn process are:

one ball is drawn randomly from the urn and its color observed; it is then returned in the urn, and an additional ball of the same color is added to the urn, and the selection process is repeated.

In other words, drawing a ball increments the number of balls of that colo(u)r by one.

So we can set up an `if` statement that adds one red ball if `drawn==0`, one yellow ball if `drawn==1`, and one blue ball otherwise ...

``````ndraws <- 1000
nexps <- 500
set.seed(101)
yellow_final <- numeric(nexps)
for (k in 1:nexps) {
red <- 1; yellow <- 1; blue<-1
for (n in 1:ndraws) {
drawn <- sample(0:2,size=1,prob=c(red,yellow,blue)/(red+yellow+blue))
if (drawn==0) {
red <- red+1
} else if (drawn==1) {
yellow <- yellow+1
} else blue <- blue+1
}
yellow_final[k]<-yellow/(red+yellow+blue)
}
``````

A picture:

``````par(las=1,bty="l")
hist(yellow_final,col="gray",freq=FALSE,
xlab="Prop. yellow after 1000 draws")
``````