Saad - 2 months ago

Java Question

I have a program that reads pairs of integers from a file and stores each pair in a `Point` class that I created. On each line of the file, the first integer is the x-coordinate and the second is the y-coordinate. All valid points have **x-coordinates in the range [0, 40]** and **y-coordinates in the range [1, 20]**.

The input file contains data like this:

I then have to plot a regression line on top of those points.

The formula used for the regression line is this:

Below is the code snippet that I got from @sprinter:

```
initializeArray(charArray);

int xySum = 0;
int xSqSum = 0;
int xSum = 0;
int ySum = 0;
for (Point points: point) {
    xySum += points.getX() * points.getY();
    xSqSum += points.getX() * points.getX();
    xSum += points.getX();
    ySum += points.getY();
}
int xMean = xSum / count;
int yMean = ySum / count;
int n = point.size();
int slope = (xySum - n* xMean * yMean) / (xSqSum - n * xMean * xMean);

for (Point points: point) {
    charArray[points.getX()][points.getY()] = 'X';
}

// plot the regression line
for (int x = 0; x <charArray.length; x++) {
    int y = yMean + slope * (x - xMean); // calculate regression value
    charArray[x][y] = charArray[x][y] == 'X' ? '*' : '-';
}
```

This is how I am initializing `charArray`:

```
public static void initializeArray(char[][] charArray) {
    for(int k =0; k< charArray.length; k++) {
        for(int d = 0; d<charArray[k].length;d++) {
            charArray[k][d] = ' ';
        }
    }
}
```

Answer

I'm finding it hard to understand what the `fillArray` function is supposed to do. You could have multiple y values for each x value in your list of points, so I assume you are calling it once for each point. But the regression line has lots of x values that aren't in the list of points, which means you would also have to call it once for each point on the regression line. You also don't need to return the array after filling in the value: the method receives a reference to the same array, so the caller already sees the change.

Your slope calculation doesn't seem to match the formula at all. This would make more sense to me:

```
float n = points.size();
float xySum = 0;
float xSqSum = 0;
float xSum = 0;
float ySum = 0;
for (Point point : points) {
    xySum += point.x * point.y;
    xSqSum += point.x * point.x;
    xSum += point.x;
    ySum += point.y;
}
// use n for the means too, so everything is based on the same point count
float xMean = xSum / n;
float yMean = ySum / n;
// floating-point division keeps the fractional part of the slope
float slope = (xySum - n * xMean * yMean) / (xSqSum - n * xMean * xMean);
```
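To show why the types matter here, this is a minimal, self-contained sketch that runs the same formula once in `double` and once in `int`. The `SlopeSketch` class, the `Pt` stand-in for your `Point` class, and the sample points are all made up for illustration; they are not from your program.

```java
import java.util.Arrays;
import java.util.List;

final class SlopeSketch {
    /** Hypothetical stand-in for the asker's Point class. */
    static final class Pt {
        final int x;
        final int y;
        Pt(int x, int y) { this.x = x; this.y = y; }
    }

    /** Slope = (sum(xy) - n*xMean*yMean) / (sum(x^2) - n*xMean^2), in double. */
    static double slope(List<Pt> points) {
        if (points.isEmpty()) {
            throw new IllegalArgumentException("need at least one point");
        }
        double n = points.size();
        double xySum = 0, xSqSum = 0, xSum = 0, ySum = 0;
        for (Pt p : points) {
            xySum += (double) p.x * p.y;
            xSqSum += (double) p.x * p.x;
            xSum += p.x;
            ySum += p.y;
        }
        double xMean = xSum / n;
        double yMean = ySum / n;
        double denominator = xSqSum - n * xMean * xMean;
        if (denominator == 0) {
            // happens when every point has the same x (a vertical line)
            throw new IllegalArgumentException("all x values are equal, slope is undefined");
        }
        return (xySum - n * xMean * yMean) / denominator;
    }

    /** The same formula in int arithmetic, to show where truncation creeps in. */
    static int slopeWithInts(List<Pt> points) {
        int n = points.size();
        int xySum = 0, xSqSum = 0, xSum = 0, ySum = 0;
        for (Pt p : points) {
            xySum += p.x * p.y;
            xSqSum += p.x * p.x;
            xSum += p.x;
            ySum += p.y;
        }
        int xMean = xSum / n; // integer division drops the fraction
        int yMean = ySum / n;
        return (xySum - n * xMean * yMean) / (xSqSum - n * xMean * xMean);
    }

    public static void main(String[] args) {
        List<Pt> points = Arrays.asList(
                new Pt(0, 1), new Pt(1, 2), new Pt(2, 2), new Pt(3, 3));
        System.out.println("double slope: " + slope(points));
        System.out.println("int slope:    " + slopeWithInts(points));
    }
}
```

For these four sample points the `double` version gives a slope of 0.6, while the `int` version truncates the means and ends up with 0, which would plot as a flat line.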

I suspect you would be much better off plotting all the points first and then plotting the regression line on top of them.

```
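List<Point> points = ...;
// first plot the points
for (Point point : points) {
    array[point.x][point.y] = 'X';
}
// now plot the regression line across the full width of the array
for (int x = 0; x < array.length; x++) {
    int y = Math.round(yMean + slope * (x - xMean));
    // skip line positions that fall outside the plot area
    if (y < 0 || y >= array[x].length) {
        continue;
    }
    array[x][y] = array[x][y] == 'X' ? '*' : '-';
}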
```
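If it helps, the two plotting steps can be pulled into one method. This is only a sketch under my own assumptions: the `PlotSketch` class and `plot` method are hypothetical names, points are passed as `{x, y}` pairs instead of your `Point` class, and I'm guessing a 41 x 21 grid so that x in [0, 40] and y in [1, 20] both fit.

```java
import java.util.Arrays;

final class PlotSketch {
    /**
     * Marks each point with 'X', then overlays the line
     * y = yMean + slope * (x - xMean), using '*' where the line hits a point
     * and '-' elsewhere. Points are assumed to lie inside the grid.
     */
    static char[][] plot(int[][] points, int width, int height,
                         double slope, double xMean, double yMean) {
        char[][] grid = new char[width][height];
        for (char[] column : grid) {
            Arrays.fill(column, ' ');
        }
        for (int[] p : points) {
            grid[p[0]][p[1]] = 'X';
        }
        for (int x = 0; x < width; x++) {
            long y = Math.round(yMean + slope * (x - xMean));
            if (y < 0 || y >= height) {
                continue; // the line leaves the plot area at this x
            }
            int row = (int) y;
            grid[x][row] = grid[x][row] == 'X' ? '*' : '-';
        }
        return grid;
    }

    public static void main(String[] args) {
        int[][] points = { {0, 1}, {1, 2}, {2, 2}, {3, 3} };
        // slope and means for these sample points, worked out by hand
        char[][] grid = plot(points, 41, 21, 0.6, 1.5, 2.0);
        System.out.println("cell (0, 1) is '" + grid[0][1] + "'");
    }
}
```

The bounds check matters because the line can run above or below the grid for x values far from the data, and writing to `grid[x][y]` there would throw an `ArrayIndexOutOfBoundsException`.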

By the way, if you are familiar with Java 8 streams, you could compute the sums and means like this:

```
double n = points.size();
double xySum = points.stream().mapToDouble(p -> p.x * p.y).sum();
double xSqSum = points.stream().mapToDouble(p -> p.x * p.x).sum();
double xMean = points.stream().mapToDouble(p -> p.x).sum() / n;
double yMean = points.stream().mapToDouble(p -> p.y).sum() / n;
```

Finally, your x dimension is the first and y dimension second. So to print you need to iterate through y first, not x:

```
// outer loop over rows (y), highest y first so the plot is not upside down
for (int y = array[0].length - 1; y >= 0; y--) {
    for (int x = 0; x < array.length; x++) {
        System.out.print(array[x][y]);
    }
    System.out.println();
}
```
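The same row-by-row idea can also build a `String` instead of printing directly, which makes the output easy to check. This is a small sketch with hypothetical names (`RenderSketch`, `render`); it assumes a rectangular `char[x][y]` grid like the one above.

```java
final class RenderSketch {
    /** Renders a grid indexed as grid[x][y], top row (highest y) first. */
    static String render(char[][] grid) {
        int width = grid.length;
        int height = width == 0 ? 0 : grid[0].length;
        StringBuilder out = new StringBuilder();
        for (int y = height - 1; y >= 0; y--) {   // rows, top to bottom
            for (int x = 0; x < width; x++) {     // columns, left to right
                out.append(grid[x][y]);
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        char[][] grid = {
            {'a', '.'},   // x = 0: y = 0 is 'a', y = 1 is '.'
            {'.', '.'},   // x = 1
            {'.', 'b'}    // x = 2: y = 1 is 'b'
        };
        System.out.print(render(grid));
    }
}
```

With that 3 x 2 sample grid, `b` (at x = 2, y = 1) comes out on the first printed line and `a` (at x = 0, y = 0) on the second.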