TheCurlyManLives TheCurlyManLives - 6 days ago 5
R Question

How to model multiple (>2) color polya urn processes?

I am trying to build a Polya Urn model. Two colours went fine, with three however I ran into some trouble.

ndraws<-1000; nexps<-2000; Distribution.yellow<-matrix(0,ndraws,1); for (k in 1:nexps){
red<- 1;
yellow<- 1;
blue<-1 ;
for (n in 1:ndraws){
drawn<-sample(0:2,size=1,prob=c(red,yellow,blue)/(red+yellow +blue))
red<-?? ;
blue<-?? ;
yellow<-?? ;
}
Distribution.yellow[k]<-yellow/(red+yellow+blue) }


My problem lies in translating this line of code:

drawn<-sample(0:2,size=1,prob=c(red,yellow,blue)/(red+yellow +blue))


Into the respective extra balls added to the urn. (hence the question marks).

With two colours I did it as follows:

drawn<-sample(0:1,size=1,prob=c(red,blue)/(red+blue))
red<-red+(1-drawn);
blue<-blue+(drawn);


But this obviously does not work when there's more than two colours. How should I approach with three or more colours?

Answer

According to Wikipedia, the rules for a PĆ³lya urn process are:

one ball is drawn randomly from the urn and its color observed; it is then returned in the urn, and an additional ball of the same color is added to the urn, and the selection process is repeated.

In other words, drawing a ball increments the number of balls of that colo(u)r by one.

So we can set up an if statement that adds one red ball if drawn==0, one yellow ball if drawn==1, and one blue ball otherwise ...

ndraws <- 1000
nexps <- 500
set.seed(101)
yellow_final <- numeric(nexps)
for (k in 1:nexps) {
    red <- 1; yellow <- 1; blue<-1
    for (n in 1:ndraws) {
        drawn <- sample(0:2,size=1,prob=c(red,yellow,blue)/(red+yellow+blue))
        if (drawn==0) {
            red <- red+1
        } else if (drawn==1) {
            yellow <- yellow+1
        } else blue <- blue+1
    }
    yellow_final[k]<-yellow/(red+yellow+blue)
}

A picture:

par(las=1,bty="l")
hist(yellow_final,col="gray",freq=FALSE,
     xlab="Prop. yellow after 1000 draws")
Comments