javadba - 5 months ago 15

Scala Question

Consider the `generateLinearInput` method in `LinearDataGenerator`.

Here is the signature of the method:

```
def generateLinearInput(
    intercept: Double,
    weights: Array[Double],
    xMean: Array[Double],
    xVariance: Array[Double],
    nPoints: Int,
    seed: Int,
    eps: Double): Seq[LabeledPoint] = {
```

and here is the core logic for generating the raw data points:

```
val rnd = new Random(seed)
val x = Array.fill[Array[Double]](nPoints)(
  Array.fill[Double](weights.length)(rnd.nextDouble()))

x.foreach { v =>
  var i = 0
  val len = v.length
  while (i < len) {
    v(i) = (v(i) - 0.5) * math.sqrt(12.0 * xVariance(i)) + xMean(i)
    i += 1
  }
}
```

Notice in particular the constant `12.0`.

For completeness, here is the remainder of that method, in which the input linear function is applied to the x (domain) values to generate the output y (range) values:

```
val y = x.map { xi =>
  blas.ddot(weights.length, xi, 1, weights, 1) + intercept + eps * rnd.nextGaussian()
}
y.zip(x).map(p => LabeledPoint(p._1, Vectors.dense(p._2)))
```

Answer

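If you have a random variable *X* uniformly distributed on $[a, b]$, then its mean and variance are

$$EX = \frac{a + b}{2}, \qquad \operatorname{Var} X = \frac{(b - a)^2}{12}.$$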
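As a quick check of where the constant comes from, here is a short worked derivation for the special case used in the code. The symbols $U$ (a raw value uniform on $[0, 1)$), $c$ (a scale factor), $m$ (a shift) and $\sigma^2$ (the target variance) are notation introduced for this sketch only, not names from the original code:

$$
\begin{aligned}
E[U] &= \int_0^1 u \, du = \tfrac{1}{2}, \qquad E[U^2] = \int_0^1 u^2 \, du = \tfrac{1}{3}, \\
\operatorname{Var} U &= E[U^2] - (E[U])^2 = \tfrac{1}{3} - \tfrac{1}{4} = \tfrac{1}{12}, \\
\operatorname{Var}\left(c \left(U - \tfrac{1}{2}\right) + m\right) &= c^2 \operatorname{Var} U = \frac{c^2}{12}.
\end{aligned}
$$

Choosing $c = \sqrt{12 \sigma^2}$ therefore gives a variable with variance $\sigma^2$ and mean $m$, which matches the roles of `xVariance(i)` and `xMean(i)`.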

So this piece of code

```
v(i) = (v(i) - 0.5) * math.sqrt(12.0 * xVariance(i)) + xMean(i)
```

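should be equivalent to:

$$X' = \left(X - \tfrac{1}{2}\right) \sqrt{12 \cdot \frac{(b' - a')^2}{12}} + EX' = \left(X - \tfrac{1}{2}\right)(b' - a') + EX'$$

where *X* is the raw sample produced by `rnd.nextDouble()`, *a'* and *b'* are the parameters of the desired uniform distribution (so that `xVariance(i)` equals $(b' - a')^2 / 12$), and *EX'* is the mean of the desired distribution (`xMean(i)`). If you set `xMean` to 0, the rest of the expression simply centers the input data around 0 and adjusts its spread.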
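The claim can also be checked empirically. The following is a minimal sketch, not Spark code: the `UniformRescale` object and its `sample`, `sampleMean` and `sampleVariance` helpers are hypothetical names made up for this illustration. It applies the same rescaling to a single feature and compares the sample statistics with the requested ones.

```scala
import scala.util.Random

// Hypothetical helper object (not part of Spark): applies the same
// rescaling as generateLinearInput to a single feature.
object UniformRescale {

  /** Draws n values from Uniform[0, 1) and rescales them to the target mean and variance. */
  def sample(n: Int, targetMean: Double, targetVariance: Double, seed: Int): Array[Double] = {
    val rnd = new Random(seed)
    Array.fill(n)((rnd.nextDouble() - 0.5) * math.sqrt(12.0 * targetVariance) + targetMean)
  }

  def sampleMean(xs: Array[Double]): Double = xs.sum / xs.length

  /** Population variance of the sample. */
  def sampleVariance(xs: Array[Double]): Double = {
    val m = sampleMean(xs)
    xs.map(x => (x - m) * (x - m)).sum / xs.length
  }

  def main(args: Array[String]): Unit = {
    val xs = sample(n = 1000000, targetMean = 3.0, targetVariance = 4.0, seed = 42)
    println(f"mean     = ${sampleMean(xs)}%.3f")     // should be close to 3.0
    println(f"variance = ${sampleVariance(xs)}%.3f") // should be close to 4.0
    // All values lie in [3 - sqrt(12), 3 + sqrt(12)), i.e. roughly [-0.464, 6.464)
    println(f"min/max  = ${xs.min}%.3f / ${xs.max}%.3f")
  }
}
```

With a target variance of 4.0 the scale factor is `math.sqrt(48.0)`, which is the width *b' - a'* of the resulting interval, so the samples should stay within `sqrt(12)` of the target mean.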