Naveen Balasubramanian - 2 months ago
C++ Question

Threads in C++ for High Performance

I'm trying to create parallel threads in C++. I realize that creating threads inside a for loop doesn't by itself mean parallelism, but I want to parallelize the piece of logic below.

for(int i = 0; i < 100000; i++) // for each instance in the dataset
{
    for(int j = 0; j < 100000; j++) // target each other instance
    {
        if(i == j) continue;

        float distance = 0;

        for(int k = 0; k < 2000; k++) // accumulate the squared difference per attribute
        {
            float a = dataset->get_instance(i)->get(k)->operator float();
            float b = dataset->get_instance(j)->get(k)->operator float();
            float diff = a - b;
            distance += diff * diff;
        }

        distance = distance + 10;
    }
}


Is there any opportunity for parallelism in the code above? Or can anyone provide a code example that shows how to parallelize something similar with threads?

Answer

If none of the functions shown have side effects, you have several options: run one thread per iteration of the i loop; create N threads and divide the iterations of the outer i loop among them; or use std::async:

#include <cfloat>   // FLT_MAX
#include <cmath>    // std::sqrt
#include <future>   // std::async, std::future
#include <vector>

struct ShortestDistance {
    float distance;
    int distClass;
};

// Finds the nearest neighbour of instance i and returns its distance and class.
ShortestDistance inner_loop(const Dataset* dataset, int i)
{
    ShortestDistance dist { FLT_MAX, 0 };

    for(int j = 0; j < dataset->num_instances(); j++) // target each other instance
    {
        if(i == j) continue;

        float distance = 0;

        for(int k = 0; k < dataset->num_attributes() - 1; k++) // compute the distance between the two instances
        {
            float a = dataset->get_instance(i)->get(k)->operator float();
            float b = dataset->get_instance(j)->get(k)->operator float();
            float diff = a - b;
            distance += diff * diff;
        }

        distance = std::sqrt(distance);
        if (distance < dist.distance) {
            dist.distance = distance;
            // the last attribute holds the class label
            dist.distClass = dataset->get_instance(j)->get(dataset->num_attributes() - 1)->operator int32();
        }
    }

    return dist;
}

void outer_loop(const Dataset* dataset)
{
    std::vector<std::future<ShortestDistance>> vec;
    vec.reserve(dataset->num_instances());
    for(int i = 0; i < dataset->num_instances(); i++) // for each instance in the dataset
    {
        // push_back rather than vec[i]: the vector starts out empty
        vec.push_back(std::async(inner_loop, dataset, i));
    }

    // Collect the per-instance results and keep the smallest distance overall.
    ShortestDistance overallResult { FLT_MAX, 0 };
    for (auto&& fut : vec)
    {
        ShortestDistance threadResult = fut.get();
        if (threadResult.distance < overallResult.distance)
            overallResult = threadResult;
    }
}
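
The second option mentioned above, N threads that each take a contiguous slice of the outer i loop, could look roughly like the sketch below. It is only an illustration under some assumptions: the Dataset class from the question is not shown, so a plain vector of rows (the hypothetical alias Rows) stands in for it, and the names Nearest, nearest_for_range and nearest_neighbours are made up for this example. It also compares squared distances and skips the sqrt, which gives the same ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <cfloat>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the question's Dataset: one row of floats per instance.
using Rows = std::vector<std::vector<float>>;

struct Nearest {
    float squared_distance = FLT_MAX; // squared Euclidean distance to the nearest other row
    std::size_t index = 0;            // index of that nearest row
};

// Computes the nearest neighbour of every row i in [begin, end) and stores it in out[i].
// Each thread writes only to its own slice of `out`, so no locking is needed.
void nearest_for_range(const Rows& data, std::size_t begin, std::size_t end,
                       std::vector<Nearest>& out)
{
    assert(begin <= end && end <= data.size() && out.size() == data.size());
    for (std::size_t i = begin; i < end; ++i) {
        for (std::size_t j = 0; j < data.size(); ++j) {
            if (i == j) continue;
            float sum = 0.0f;
            for (std::size_t k = 0; k < data[i].size(); ++k) {
                const float diff = data[i][k] - data[j][k];
                sum += diff * diff;
            }
            if (sum < out[i].squared_distance) {
                out[i].squared_distance = sum;
                out[i].index = j;
            }
        }
    }
}

// Splits the outer loop into roughly equal chunks, one chunk per thread.
std::vector<Nearest> nearest_neighbours(const Rows& data, unsigned num_threads)
{
    std::vector<Nearest> out(data.size());
    if (data.empty()) return out;

    // Use at least one worker and never more workers than rows.
    const std::size_t workers =
        std::min<std::size_t>(std::max(1u, num_threads), data.size());
    const std::size_t chunk = (data.size() + workers - 1) / workers; // ceiling division

    std::vector<std::thread> threads;
    threads.reserve(workers);
    for (std::size_t begin = 0; begin < data.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, data.size());
        threads.emplace_back(nearest_for_range, std::cref(data), begin, end,
                             std::ref(out));
    }
    for (std::thread& t : threads) t.join(); // wait for every slice to finish
    return out;
}
```

A call such as nearest_neighbours(rows, std::thread::hardware_concurrency()) would use one thread per hardware thread; hardware_concurrency() is allowed to return 0, which this sketch treats as a single worker. Compared with one std::async call per instance, this keeps the number of threads fixed no matter how many instances the dataset has.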