akki akki - 1 month ago 10
C++ Question

uniform_real_distribution not giving uniform distribution

I am trying to generate high-quality random numbers in the range (0,1) in my project, and I tested uniform_real_distribution using sample code from here. When I ran the sample code as-is it worked fine, but when I modified it to seed the generator like this:

#include <random>
#include <iostream>
#include <chrono>
#include <string>

using namespace std;
// obtain a seed from the system clock:
unsigned seed = static_cast<unsigned>(chrono::system_clock::now().time_since_epoch().count());

// globally defining the generator and making it static for safety because in the
// actual project this might affect the flow.

static default_random_engine gen(seed);
uniform_real_distribution<double> distribution(0.0,1.0);

int main(){
    const int nrolls=10000; // number of experiments
    const int nstars=95; // maximum number of stars to distribute
    const int nintervals=10; // number of intervals

    int p[nintervals]={};

    for (int i=0; i<nrolls; ++i) {
        double number = distribution(gen);
        ++p[int(nintervals*number)];
    }

    std::cout << "uniform_real_distribution (0.0,1.0):" << std::endl;
    std::cout << std::fixed; std::cout.precision(1);

    for (int i=0; i<nintervals; ++i) {
        std::cout << float(i)/nintervals << "-" << float(i+1)/nintervals << ": ";
        std::cout << std::string(p[i]*nstars/nrolls,'*') << std::endl;
    }

    return 0;
}


The random numbers were not uniformly distributed. The output from repeated runs of the program is:


F:\path>randtest
uniform_real_distribution (0.0,1.0):
0.0-0.1: *********
0.1-0.2: **********
0.2-0.3: ********
0.3-0.4: *********
0.4-0.5: *********
0.5-0.6: *********
0.6-0.7: *********
0.7-0.8: *********
0.8-0.9: *********
0.9-1.0: **********

F:\path>randtest
uniform_real_distribution (0.0,1.0):
0.0-0.1: *********
0.1-0.2: *********
0.2-0.3: *********
0.3-0.4: *********
0.4-0.5: *********
0.5-0.6: *********
0.6-0.7: *********
0.7-0.8: *********
0.8-0.9: *********
0.9-1.0: *********

F:\path>randtest
uniform_real_distribution (0.0,1.0):
0.0-0.1: *********
0.1-0.2: *********
0.2-0.3: *********
0.3-0.4: *********
0.4-0.5: **********
0.5-0.6: *********
0.6-0.7: *********
0.7-0.8: *********
0.8-0.9: *********
0.9-1.0: *********

Is it because of the seeding? Or is it better to use a different generator?

I am using the G++ 5.1.0 compiler with the C++11 standard.

Answer

If you flipped a coin once and it landed heads, would it always land on tails the next time you flipped it?

A coin produces a uniform distribution on the set {heads, tails}. That doesn't mean for any set of flips, the number of heads and tails is equal. In fact, the chance of that happening exactly goes down as you flip more coins.
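To put a number on that last claim, here is a minimal sketch that computes the chance of an exact tie after 2n fair flips, which is C(2n, n) / 4^n. The function name prob_exact_tie is made up for this illustration; it is not part of any library.

```cpp
#include <cassert>

// Probability that 2*n fair coin flips give exactly n heads and n tails.
// This is C(2n, n) / 4^n, built up as a running product so that no huge
// factorials are ever formed.
// (prob_exact_tie is a hypothetical helper name used only for this sketch.)
double prob_exact_tie(int n) {
    assert(n >= 0);
    double p = 1.0;
    for (int k = 1; k <= n; ++k) {
        // C(2k, k) / 4^k  ==  C(2k-2, k-1) / 4^(k-1)  *  (2k-1) / (2k)
        p *= (2.0 * k - 1.0) / (2.0 * k);
    }
    return p;
}
```

With 2 flips the chance of a tie is 0.5, with 4 flips it is 0.375, and with 10000 flips (n = 5000) it has dropped to roughly 0.008.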

In your case, each of those intervals has a 10% chance of being selected.

The variance of such a selection is (0.1)(1-0.1), or 0.09.

The expected value is 0.1.

After 10000 attempts, the expected count in an interval is going to be 1000.

The variance of that count is going to be 900.

900 variance corresponds to a standard deviation of 30.

The 95-ish% confidence interval is 2 standard deviations (actually 1.96, but who cares).

So you should expect the values to typically be between 940 and 1060.

With 95 stars, each star corresponds to 10000/95, or about 105, elements.

940/105 is approximately 8.95, and 1060/105 is approximately 10.10.

So you'll usually see between 8 and 10 stars for each interval. Assuming rounding down, hitting 7 or 11 stars should be very rare (as that is more than 3 SD away) even on 10 anti-correlated samples.
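The arithmetic above can be reproduced with a short sketch. The struct BinEstimate and the function estimate_bin are names invented for this example; the star count uses the same integer division as the program in the question.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type for this sketch: summary of one histogram interval.
struct BinEstimate {
    double mean;     // expected number of samples in the interval
    double sd;       // standard deviation of that number
    int stars_low;   // stars printed for a count of mean - 2*sd
    int stars_high;  // stars printed for a count of mean + 2*sd
};

// Each sample lands in a given interval with probability 1/nintervals,
// so the count in that interval is binomial(nrolls, 1/nintervals).
BinEstimate estimate_bin(int nrolls, int nintervals, int nstars) {
    assert(nrolls > 0 && nintervals > 0 && nstars > 0);
    BinEstimate e;
    e.mean = static_cast<double>(nrolls) / nintervals;
    // variance = nrolls * p * (1 - p) with p = 1/nintervals
    const double variance = static_cast<double>(nrolls) * (nintervals - 1)
                            / (static_cast<double>(nintervals) * nintervals);
    e.sd = std::sqrt(variance);
    const long long low = static_cast<long long>(e.mean - 2.0 * e.sd);
    const long long high = static_cast<long long>(e.mean + 2.0 * e.sd);
    // Same rounding-down integer division as p[i]*nstars/nrolls in the question.
    e.stars_low = static_cast<int>(low * nstars / nrolls);
    e.stars_high = static_cast<int>(high * nstars / nrolls);
    return e;
}
```

Calling estimate_bin(10000, 10, 95) gives a mean of 1000, a standard deviation of 30, and a range of 8 to 10 stars, which matches the output shown in the question.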

This all assumes a perfect uniform random distribution. As this models your observed behavior, your problem is with mathematics and the definition of uniform random distribution, not the C++ language.
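One way to check this on your own machine is to look at the raw counts instead of the stars and express each one as a distance from 1000 in standard deviations. The following is only a sketch: it swaps in std::mt19937 with a fixed seed (not the default_random_engine and clock seed from the question) so a run is repeatable on a given standard library, and the function names are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Fill a histogram the same way the question does, but with a fixed seed.
// (roll_histogram is a hypothetical name for this sketch.)
std::vector<int> roll_histogram(unsigned seed, int nrolls, int nintervals) {
    assert(nrolls > 0 && nintervals > 0);
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    std::vector<int> counts(nintervals, 0);
    for (int i = 0; i < nrolls; ++i) {
        int bin = static_cast<int>(nintervals * dist(gen));
        if (bin >= nintervals) bin = nintervals - 1;  // guard in case a value of exactly 1.0 appears
        ++counts[bin];
    }
    return counts;
}

// Largest deviation of any interval from its expected count,
// measured in standard deviations of a binomial count.
double max_deviation_in_sd(const std::vector<int>& counts) {
    assert(counts.size() > 1);
    long long total = 0;
    for (int c : counts) total += c;
    const double n = static_cast<double>(counts.size());
    const double mean = total / n;
    const double sd = std::sqrt(total * (1.0 / n) * (1.0 - 1.0 / n));
    double worst = 0.0;
    for (int c : counts) worst = std::max(worst, std::fabs(c - mean) / sd);
    return worst;
}
```

For a generator that behaves uniformly, max_deviation_in_sd(roll_histogram(seed, 10000, 10)) should usually come out at around 2 or 3 at most; values far above that would be a reason to suspect the generator or the binning.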

If you want a perfect histogram, don't use a uniform random distribution. For example, you could simply start with 0, then add 0.0001 each time. After 10000 calls you'll have stepped through the whole range from 0 up to (but not including) 1, and every interval will hold exactly 1000 values.
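A minimal sketch of that evenly spaced alternative follows. The name even_histogram is invented for the example, and it assumes the values are i/nrolls for i = 0 .. nrolls-1; the interval index is computed from the integer i rather than by repeatedly adding 0.0001, because accumulated floating-point error could push a value across an interval boundary.

```cpp
#include <cassert>
#include <vector>

// Histogram of the evenly spaced values 0, 1/nrolls, 2/nrolls, ...,
// (nrolls-1)/nrolls. Nothing here is random.
// (even_histogram is a hypothetical name for this sketch.)
std::vector<int> even_histogram(int nrolls, int nintervals) {
    assert(nrolls > 0 && nintervals > 0);
    std::vector<int> counts(nintervals, 0);
    for (int i = 0; i < nrolls; ++i) {
        // The value i/nrolls falls in interval floor(nintervals * i / nrolls);
        // integer arithmetic keeps the boundaries exact.
        const long long bin = static_cast<long long>(i) * nintervals / nrolls;
        ++counts[static_cast<int>(bin)];
    }
    return counts;
}
```

With even_histogram(10000, 10) every interval receives exactly 1000 values, so the star histogram from the question would be perfectly flat, but the sequence is completely predictable.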

Uniform random simply means the chance of each region is the same.
