Amit Kohli - 5 days ago 4
R Question

# Create a vector of dates that follow a probability distribution

I am trying to create a fake dataset for training purposes and would like a function that creates a vector of dates matching a given probability distribution, i.e. more dates should be drawn from some ranges than from others.

I know that to select a range of dates, I can do this:

`seq(as.Date("1940-12-30"), as.Date("2005-01-04"), by="days")`

And to get values that follow a distribution across a population, I can do this:

`dchisq(x=1:500,df = 100)`
or
`rlnorm(500,1,.6)`

But I'm drawing a blank on how to make `seq()` draw from one of the probability distributions mentioned above. So how do I draw dates according to the pattern?

Answer

If you can describe the probability you want for each date, you can do this with `sample()` and its `prob` argument. Here is an example that samples ten dates from 2005 using Gaussian weights centered at mid-year.

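``````
# All 365 days of 2005
Y05 = seq(as.Date("2005-01-01"), as.Date("2005-12-31"), by="days")
# Map day numbers onto roughly (-2, 2] so the normal density peaks mid-year
Prob = dnorm((1:365)*4/365 - 2)
# Draw 10 dates with replacement, weighted by Prob
sample(Y05, 10, replace=TRUE, prob=Prob)
``````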
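The same idea can be applied to the date range and the distributions from the question. The following is only a sketch under some assumptions: the variable names (`dates`, `pos`, `w_chisq`, `idx`), the sample size of 1000, and the choice to stretch the date range over 1..500 are all illustrative and not part of the original answer. It shows two possible routes: weighting each date by a density and passing the weights to `sample()`, or drawing random values with `rlnorm()` and rescaling them to positions in the date vector.

```r
set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible

dates <- seq(as.Date("1940-12-30"), as.Date("2005-01-04"), by = "days")
n <- length(dates)

# Route 1: density as weights.
# Stretch the date positions over 1..500 so that dchisq(df = 100),
# whose mode is at 98, peaks roughly a fifth of the way into the range.
pos <- seq(1, 500, length.out = n)
w_chisq <- dchisq(pos, df = 100)
# sample() rescales the weights internally, so they need not sum to 1
fake_dates_1 <- sample(dates, 1000, replace = TRUE, prob = w_chisq)

# Route 2: random draws rescaled to positions.
# Draw log-normal values, then map them onto indices 1..n.
draws <- rlnorm(1000, 1, 0.6)
idx <- floor(draws / max(draws) * (n - 1)) + 1
fake_dates_2 <- dates[idx]

# Quick look at where the sampled dates fall
summary(fake_dates_1)
summary(fake_dates_2)
```

Route 1 gives direct control over how the distribution lines up with the calendar, since the weights are computed per date. Route 2 is shorter, but the rescaling depends on the largest value drawn, so the shape shifts somewhat from run to run.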