Naveen Balasubramanian - 1 year ago 117
C++ Question

Increasing Performance in C++

I'm trying to create threads in C++. I suspect that simply creating threads inside a for loop doesn't by itself give me parallelism, but I want to parallelize the piece of logic below.

``````
for (int i = 0; i < 100000; i++) // for each instance in the dataset
{
    for (int j = 0; j < 100000; j++) // target each other instance
    {
        if (i == j) continue;

        float distance = 0;

        for (int k = 0; k < 2000; k++)
        {
            float a = dataset->get_instance(i)->get(k)->operator float();
            float b = dataset->get_instance(j)->get(k)->operator float();
            float diff = a - b;
            distance += diff * diff;
        }

        distance = distance + 10;
    }
}
``````

Is there any way to parallelize the code above? Or can anyone provide a code example that shows a similar parallelization with threads?

If none of the functions shown have side effects, you have three options: run one thread per iteration of the `i` loop, create N threads and divide the iterations of the outer `i` loop among them, or use `std::async`, as the `inner_loop` and `outer_loop` functions below do.
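Before the `std::async` version, here is a rough sketch of the middle option: N threads, each taking a contiguous slice of the outer `i` loop. It does not use the question's `Dataset` class, whose interface is not shown in full. Instead it assumes a hypothetical stand-in, a flat row-major `std::vector<float>` holding `n` rows of `d` values. The names `nearest_for_rows` and `nearest_sq_distances` are made up for illustration, and the sketch keeps squared distances rather than taking the square root.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <thread>
#include <vector>

// Hypothetical stand-in for the question's Dataset: n rows of d floats,
// stored row-major in one flat vector.
// For each row i in [begin, end), store the squared distance to the
// nearest other row in out[i].
void nearest_for_rows(const std::vector<float>& data, std::size_t n, std::size_t d,
                      std::size_t begin, std::size_t end, std::vector<float>& out)
{
    for (std::size_t i = begin; i < end; ++i) {
        float best = std::numeric_limits<float>::max();
        for (std::size_t j = 0; j < n; ++j) {
            if (i == j) continue;
            float dist = 0.0f;
            for (std::size_t k = 0; k < d; ++k) {
                const float diff = data[i * d + k] - data[j * d + k];
                dist += diff * diff;
            }
            best = std::min(best, dist);
        }
        out[i] = best;  // each thread writes only its own rows, so no lock is needed
    }
}

// Splits the outer loop over rows into at most num_threads contiguous chunks
// and runs each chunk on its own std::thread.
std::vector<float> nearest_sq_distances(const std::vector<float>& data,
                                        std::size_t n, std::size_t d,
                                        unsigned num_threads)
{
    assert(data.size() == n * d);
    if (num_threads == 0) num_threads = 1;  // hardware_concurrency() may return 0
    std::vector<float> out(n, std::numeric_limits<float>::max());
    const std::size_t chunk = (n + num_threads - 1) / num_threads;  // ceil(n / num_threads)

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // more threads than rows
        workers.emplace_back(nearest_for_rows, std::cref(data), n, d,
                             begin, end, std::ref(out));
    }
    for (auto& w : workers) w.join();  // wait for every slice to finish
    return out;
}
```

A call such as `nearest_sq_distances(data, n, d, std::thread::hardware_concurrency())` uses roughly one thread per hardware core. Because the threads only read `data` and write disjoint elements of `out`, the "no side effects" condition above is what makes this safe without a mutex. The code that follows takes the `std::async` route instead.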

``````
#include <cfloat>   // FLT_MAX
#include <cmath>    // std::sqrt
#include <future>   // std::async, std::future
#include <vector>

struct ShortestDistance {
    float distance;
    int distClass;
};

// Finds the instance nearest to instance i and returns its distance and
// its last attribute (read as the class).
ShortestDistance inner_loop(const Dataset* dataset, int i)
{
    ShortestDistance dist { FLT_MAX, 0 };

    for (int j = 0; j < dataset->num_instances(); j++) // target each other instance
    {
        if (i == j) continue;

        float distance = 0;

        for (int k = 0; k < dataset->num_attributes() - 1; k++) // compute the distance between the two instances
        {
            float a = dataset->get_instance(i)->get(k)->operator float();
            float b = dataset->get_instance(j)->get(k)->operator float();
            float diff = a - b;
            distance += diff * diff;
        }

        distance = std::sqrt(distance);
        if (distance < dist.distance) {
            dist.distance = distance;
            dist.distClass = dataset->get_instance(j)->get(dataset->num_attributes() - 1)->operator int32();
        }
    }

    return dist;
}

void outer_loop(const Dataset* dataset)
{
    std::vector<std::future<ShortestDistance>> vec;
    vec.reserve(dataset->num_instances());
    for (int i = 0; i < dataset->num_instances(); i++) // for each instance in the dataset
    {
        // push_back rather than vec[i]: the vector starts out empty
        vec.push_back(std::async(inner_loop, dataset, i));
    }

    ShortestDistance overallResult { FLT_MAX, 0 };
    for (auto&& fut : vec)
    {
        ShortestDistance threadResult = fut.get();
        if (threadResult.distance < overallResult.distance)