NateGreco NateGreco - 1 month ago 9
C++ Question

Fastest way to calculate 'percentage matched' of two trapezoids

I asked the following question here and got a good solution, but have found it to be too slow as far as performance (Takes 2-300 ms with 640x480 image). Now I would like to consider how it can be optimized.


Given two polygons (ALWAYS trapezoids that are parallel to the X axis), I would like to calculate by some means how much they match up. By this, I mean overlapping area is not sufficient, because if one polygon has excess area, somehow that needs to count against it. Optimally, I would like to know what percent of the area created by both polygons is common. See image for example as what is desired.


A working (but slow) solution:

- Draw polygon one on an empty image (cv::fillConvexPoly)

- Draw polygon two on an empty image (cv::fillConvexPoly)

- Perform a bitwise and to create an image of all the pixels that overlapped

- Count all nonzero pixels --> overlapped Pixels

- Invert the first image and repeat with non-inverted second --> excessive pixels

- Invert the second image and repeat with non-inverted first --> more excessive pixels

- Take the 'overlapped pixels' over the sum of the 'excessive pixels'

As you can see the current solution is computationally intensive because it is evaluating/ operating on every single pixel of an image ~12 times or so. I would rather a solution that calculates this area that goes through the tedious construction and evaluation of several images.

Existing code:

typedef std::array<cv::Point, 4> Polygon;

float PercentMatch( const Polygon& polygon,
const cv::Mat optimalmat )
//Create blank mats
cv::Mat polygonmat{ cv::Mat(optimalmat.rows, optimalmat.cols, CV_8UC1, cv::Scalar(0)) };
cv::Mat resultmat{ cv::Mat(optimalmat.rows, optimalmat.cols, CV_8UC1, cv::Scalar(0)) };

//Draw polygon
cv::Point cvpointarray[4];
for (int i =0; i < 4; i++ ) {
cvpointarray[i] = cv::Point(POLYGONSCALING * polygon[i].x, POLYGONSCALING *
cv::fillConvexPoly( polygonmat, cvpointarray, 4, cv::Scalar(255) );

//Find overlapped pixels
cv::bitwise_and(polygonmat, optimalmat, resultmat);
int overlappedpixels { countNonZero(resultmat) };

//Find excessive pixels
cv::bitwise_not(optimalmat, resultmat);
cv::bitwise_and(polygonmat, resultmat, resultmat);
int excessivepixels { countNonZero(resultmat) };
cv::bitwise_not(polygonmat, resultmat);
cv::bitwise_and(optimalmat, resultmat, resultmat);
excessivepixels += countNonZero(resultmat);

return (100.0 * overlappedpixels) / (overlappedpixels + excessivepixels);

Currently the only performance improvements i've devised is drawing the 'optimalmat' outside of the function so it isn't redrawn (it gets compared to many other polygons), and also I added a POLYGONSCALING to size down the polygons and lose some resolution but gain some performance. Still too slow.


I may have misunderstood what you want but I think you should be able to do it faster like this...

  • Fill your first trapezoid with 1's on a background of zeroes.
  • Fill your second trapezoid with 2's on a background of zeroes.
  • Add the two Mats together.

Now each pixel must be either 0, 1, 2 or 3. Create an array with 4 elements and in a single pass over all elements, with no if statements, simply increment the corresponding array element according to the value of each pixel.

Then total of pixels in the first index of the array are where neither trapezoid is present, the elements with indices 1 and 2 are where trapezoid 1 or 2 was present, and the elements with index 3 are overlap.

Also, try benchmarking the flood-fills of the two trapezoids, if that is a significant proportion if your time, maybe have a second thread filling the second trapezoid.

enter image description here


I wrote some code to try out the above theory, and with a 640x480 image it takes:

  • 181 microseconds to draw the first polygon
  • 84 microseconds to draw the second polygon
  • 481 microseconds to calculate the overlap

So the total time is 740 microseconds on my iMac.

You could draw the second polygon in parallel with the first, but the thread creation and joining time is around 20 microseconds, so you would only save 60 microseconds there which is 8% or so - probably not worth it.

Most of the code is timing and debug:

#include "opencv2/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <chrono>

using namespace cv;
using namespace std;

const int COLS=640;
const int ROWS=480;

typedef std::chrono::high_resolution_clock hrclock;
hrclock::time_point t1,t2;
std::chrono::nanoseconds elapsed;
int e;

main(int argc,char*argv[]){

   Mat canvas1(ROWS,COLS,CV_8UC1,Scalar(0));
   Mat canvas2(ROWS,COLS,CV_8UC1,Scalar(0));
   Mat sum(ROWS,COLS,CV_8UC1,Scalar(0));

   //Draw polygons on canvases
   Point vertices1[4]={Point(10,10),Point(400,10),Point(400,460),Point(10,460)};
   Point vertices2[4]={Point(300,50),Point(600,50),Point(600,400),Point(300,400)};

   t1 = hrclock::now();
   // FilConvexPoly takes around 150 microseconds here
   t2 = hrclock::now();
   elapsed = t2-t1;
   cout << "fillConvexPoly: " << e << "ns" << std::endl;


   t1 = hrclock::now();
   // FilConvexPoly takes around 80 microseconds here
   t2 = hrclock::now();
   elapsed = t2-t1;
   cout << "fillConvexPoly: " << e << "ns" << std::endl;

   long totals[4]={0,0,0,0};
   t1 = hrclock::now();
   for(int j=0;j<ROWS;j++){
      uchar* data= sum.ptr<uchar>(j);
      for(int i=0;i<COLS;i++) {
   t2 = hrclock::now();
   elapsed = t2-t1;
   cout << "Count overlap: " << e << "ns" << std::endl;
   for(int i=0;i<4;i++){
      cout << "totals[" << i << "]=" << totals[i] << std::endl;

Sample run

fillConvexPoly: 181338ns
fillConvexPoly: 84759ns
Count overlap: 481830ns

Verified using ImageMagick as follows:

identify -verbose sum.png | grep -A4 Histogram:

 60659: (  0,  0,  0) #000000 gray(0)
140890: (  1,  1,  1) #010101 gray(1)
 70200: (  2,  2,  2) #020202 gray(2)
 35451: (  3,  3,  3) #030303 gray(3)