brainst - 17 days ago
Python Question

How do we simulate randomness?

I came across the function randint() in Python, which gives you a random integer within a given range. The thing that I am not able to digest is how we can really simulate randomness. How can I tell that a random function, in any programming language, doesn't give a biased result? Bell curve?

How can we simulate something as natural as randomness? We can calculate the probability of a result appearing, but we can never tell how this actually works.

To simulate something, don't we need complete insight into the topic?

Answer

How can we simulate something that is so natural as randomness?

TL;DR:

  • By knowing what makes that something act "randomly",
  • By being smart and making just the right simplifying assumptions so as to not make the problem too hard,
  • By having good previously-collected statistics so as to know that a statistical model is correct,
  • By having a PRNG that is good enough, from which one can simulate that random process, and
  • By having an algorithm that maps outputs from that PRNG to the underlying statistical distribution.

By knowing what makes that something act "randomly"
Radioactive decay acts almost perfectly as a Poisson process. Not quite so perfectly, the goals scored in a World Cup game can also be modeled as a Poisson process (but it's close enough for Las Vegas to make money). On the other hand, the outcome of a coin toss is an example of a Bernoulli process. There are lots of different kinds of random processes, and these different random processes lead to different kinds of random distributions. Knowing what is going on underneath the hood is important.
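To make this concrete, here is a minimal Python sketch of both kinds of process (the rates and counts are made up for illustration): a Bernoulli process as a sequence of coin tosses, and a Poisson process built from exponentially distributed inter-arrival times.

```python
import random

rng = random.Random(42)  # seeded so the example is repeatable

# Bernoulli process: a sequence of independent yes/no trials (coin tosses).
coin_tosses = [rng.random() < 0.5 for _ in range(10)]

# Poisson process: inter-arrival times are exponentially distributed, so
# counting arrivals in a fixed window yields Poisson-distributed counts.
def count_events(rate, window, rng):
    """Count events occurring in [0, window) for a process with the given rate."""
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(rate)  # time until the next event
        if t >= window:
            return count
        count += 1

decays_per_second = [count_events(3.0, 1.0, rng) for _ in range(10)]
print(coin_tosses)
print(decays_per_second)
```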

By being smart and making just the right simplifying assumptions
One of the most helpful tools in a modeler's bag of tricks is the central limit theorem. Add lots and lots and lots of random influences together and the end result very often looks Gaussian (the "Bell Curve" alluded to in the question). A Gaussian distribution is a nice simplifying assumption, but it can get one into trouble. One has to be smart enough to avoid overly simplistic assumptions.
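A quick way to see the central limit theorem at work is to add up many independent uniform variables: none of them is Gaussian, but their sum is approximately Gaussian. A minimal sketch (the sample sizes are arbitrary):

```python
import random
import statistics

rng = random.Random(1)

# Each sample is the sum of 1000 small, independent random influences.
# By the central limit theorem, the sums look approximately Gaussian.
samples = [sum(rng.uniform(-1, 1) for _ in range(1000)) for _ in range(10_000)]

print(statistics.mean(samples))   # close to 0
print(statistics.stdev(samples))  # close to sqrt(1000 / 3), about 18.3
```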

By having good previously-collected statistics
It took a while before people determined that radioactive decay was indeed a Poisson process. They determined this by having a nice history of previously taken measurements. Without previously collected statistics, all one has is a guess. Guesses are exceptionally good at biting the person who made the guess in the rear.
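As an illustration of that kind of check (the counts below are invented, not real measurements): for a Poisson model the mean and variance of the counts per interval should be roughly equal, so previously collected data lets you test whether the model is even plausible.

```python
import statistics

# Hypothetical previously collected data: decay counts per one-second interval.
observed_counts = [3, 5, 2, 4, 3, 6, 2, 3, 4, 5, 3, 2, 4, 3, 5]

# For a Poisson distribution the mean equals the variance, so comparing the
# sample mean and sample variance is a quick first check of the model.
mean = statistics.mean(observed_counts)
var = statistics.variance(observed_counts)
print(f"mean={mean:.2f}, variance={var:.2f}")
```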

By having a PRNG that is good enough
There are lots of reasons for using a deterministic pseudo-random number generator. That a PRNG isn't quite "random", in the sense that run #12345 of a Monte Carlo simulation can be exactly repeated, can be a good thing. If a simulated vehicle blew up or a simulated patient died in that run of a Monte Carlo simulation, any sane person would want to investigate that case in detail.

Fortunately, there are a number of very good PRNGs out there. Python's built-in random module uses the Mersenne Twister. While not the best, it is very, very good.
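For example, seeding the PRNG with the run number makes every run of a Monte Carlo simulation exactly repeatable, which is precisely what you want when run #12345 produces something that needs investigating. A minimal sketch (the simulation body is a stand-in):

```python
import random

def run_simulation(run_number):
    # Seed the PRNG with the run number so any given run can be replayed
    # exactly. CPython's random module is a Mersenne Twister underneath.
    rng = random.Random(run_number)
    return [rng.gauss(0.0, 1.0) for _ in range(5)]  # stand-in for real work

assert run_simulation(12345) == run_simulation(12345)  # deterministic replay
```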

By having an algorithm that maps outputs from the PRNG to the underlying statistical distribution
You're toast if there's no way to translate the output of the Mersenne Twister (or whatever PRNG you are using) to the distribution at hand. Fortunately, people before us have spent a good deal of time developing algorithms that approximate a wide range of random distributions.
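One classic such algorithm is inverse transform sampling: push a uniform PRNG output through the inverse CDF of the target distribution. A minimal sketch for an exponential distribution (the rate of 3.0 is arbitrary):

```python
import math
import random

rng = random.Random(0)

def sample_exponential(lam, rng):
    # Inverse transform sampling: if U is uniform on [0, 1), then
    # -ln(1 - U) / lam follows an exponential distribution with rate lam.
    u = rng.random()
    return -math.log(1.0 - u) / lam

waiting_times = [sample_exponential(3.0, rng) for _ in range(5)]
print(waiting_times)
```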

The question was tagged with python, so I'd be remiss not to mention Python's random package and NumPy's random package. The latter is even better than the built-in capabilities one gets for free with a standard Python installation. It provides a good number of algorithms that convert the integer output of the Mersenne Twister (for example) into a wide range of frequently encountered probability distributions (and, in some cases, probability distributions that are only encountered infrequently).
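As a rough sketch of what that looks like in practice (using NumPy's Generator API; the MT19937 bit generator is chosen here just to match the Mersenne Twister mentioned above, and the parameters are arbitrary):

```python
import numpy as np

# A bit generator supplies raw random words; the Generator maps them onto
# many common probability distributions.
rng = np.random.Generator(np.random.MT19937(seed=42))

print(rng.poisson(lam=3.0, size=5))            # Poisson counts, e.g. decays per second
print(rng.normal(loc=0.0, scale=1.0, size=5))  # the Gaussian "Bell Curve"
print(rng.binomial(n=10, p=0.5, size=5))       # repeated coin tosses
```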