Tamara Bernad - 9 months ago
iOS Question

CraftAR Image recognition - Translating matchBoundingBox to points in screen

I am using on-device image recognition from Catchoom CraftAR and working with the example available on GitHub: https://github.com/Catchoom/craftar-example-ios-on-device-image-recognition

The image recognition works, and I would like to use the matchBoundingBox to draw a small square on each of its 4 corners. The calculations I am doing are not working, though. I have based them on this article:


The square views are added to the scanning overlay, and this is how I calculate the points at which to place the 4 views:

CraftARSearchResult *bestResult = [results objectAtIndex:0];
BoundingBox *box = bestResult.matchBoundingBox;

float w = self._preview.frame.size.width;
float h = self._preview.frame.size.height;

CGPoint tr = CGPointMake(w * box.topRightX , h * box.topRightY);
CGPoint tl = CGPointMake(w * box.topLeftX, h * box.topLeftY);
CGPoint br = CGPointMake(w * box.bottomRightX, h * box.bottomRightY);
CGPoint bl = CGPointMake(w * box.bottomLeftX, h * box.bottomLeftY);

The x position looks pretty close, but the y position is completely off and looks mirrored.

I am testing on iOS 10 iPhone 6s

Am I missing something?

Answer

The issue was that I was using the preview frame to translate the bounding box into points on screen. But the points in the bounding box are not relative to the preview view; they are relative to the VideoFrame (as the Catchoom support team pointed out). The VideoFrame size is set by the capturePreset, which only accepts two values: AVCaptureSessionPreset1280x720 and AVCaptureSessionPreset640x480. The default is AVCaptureSessionPreset1280x720.

So in my case I had to do the calculations with a size of 1280x720 and then convert those coordinates to the coordinates of my preview view.

So it ended up looking like this:

let box = bestResult.matchBoundingBox

// Video frame size for the default AVCaptureSessionPreset1280x720 preset
let wVideoFrame:CGFloat = 1280.0;
let hVideoFrame:CGFloat = 720.0;

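// The video frame is landscape and the preview is portrait, so the frame's
// width is compared with the preview's height and its height with the width
let wRelativePreview = wVideoFrame/CGFloat(preview.frame.size.height)
let hRelativePreview = hVideoFrame/CGFloat(preview.frame.size.width)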

var tl = CGPoint(x: wVideoFrame * CGFloat(box.topLeftX),y: hVideoFrame * CGFloat(box.topLeftY));
var tr = CGPoint(x: wVideoFrame * CGFloat(box.topRightX) ,y: hVideoFrame * CGFloat(box.topRightY));
var br = CGPoint(x: wVideoFrame * CGFloat(box.bottomRightX),y: hVideoFrame * CGFloat(box.bottomRightY));
var bl = CGPoint(x: wVideoFrame * CGFloat(box.bottomLeftX),y: hVideoFrame * CGFloat(box.bottomLeftY));

tl = CGPoint(x: tl.x/wRelativePreview, y: tl.y/hRelativePreview)
tr = CGPoint(x: tr.x/wRelativePreview, y: tr.y/hRelativePreview)
br = CGPoint(x: br.x/wRelativePreview, y: br.y/hRelativePreview)
bl = CGPoint(x: bl.x/wRelativePreview, y: bl.y/hRelativePreview) 

// 4 squares to visualize the top-left, top-right, bottom-right and bottom-left points
var fr = vTL.frame;
fr.origin = tl;
vTL.frame = fr;

fr.origin = tr;
vTR.frame = fr;

fr.origin = br;
vBR.frame = fr;

fr.origin = bl;
vBL.frame = fr;
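
The two-step conversion above (normalized corner to video-frame pixels, then video-frame pixels to preview points) can also be sketched as one small helper. This is only an illustration under assumptions: NormalizedPoint and previewPoint(for:videoFrame:previewSize:) are hypothetical names and not part of the CraftAR SDK, the 375x667 preview size is an assumed example, and each axis is scaled by its own video-frame dimension.

```swift
import Foundation

// Hypothetical stand-in for one corner of CraftAR's BoundingBox,
// assumed to hold normalized values in the range 0...1.
struct NormalizedPoint {
    var x: CGFloat
    var y: CGFloat
}

/// Maps a normalized corner to preview coordinates by way of the video frame.
/// The result is in the overlay's coordinate space before any rotation.
func previewPoint(for corner: NormalizedPoint,
                  videoFrame: CGSize,
                  previewSize: CGSize) -> CGPoint {
    // Step 1: normalized coordinates -> video-frame pixels.
    let inVideo = CGPoint(x: corner.x * videoFrame.width,
                          y: corner.y * videoFrame.height)
    // Step 2: video-frame pixels -> preview points. The frame is landscape
    // and the preview portrait, so width is paired with height.
    let scaleX = videoFrame.width / previewSize.height
    let scaleY = videoFrame.height / previewSize.width
    return CGPoint(x: inVideo.x / scaleX, y: inVideo.y / scaleY)
}

let video = CGSize(width: 1280, height: 720)    // AVCaptureSessionPreset1280x720
let preview = CGSize(width: 375, height: 667)   // assumed portrait preview size
let center = previewPoint(for: NormalizedPoint(x: 0.5, y: 0.5),
                          videoFrame: video,
                          previewSize: preview)
// center is approximately (333.5, 187.5)
```

Because the video-frame size cancels out in each axis, the result only depends on the normalized corner and the preview size, as long as the preview has the same aspect ratio as the video frame.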

Now the points looked quite OK on screen, but they looked somehow rotated. So I rotated the view 90 degrees:

// overlay is the container of the 4 squares that visualize the points on screen
overlay.transform = CGAffineTransform(rotationAngle: CGFloat.pi / 2.0)
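
As an alternative to rotating the container view, the quarter turn could be applied to the points themselves. The sketch below is not what I used and is not from Catchoom; it assumes the bounding box is normalized against a landscape video frame, that the preview is portrait and shows the whole frame without cropping, and that a clockwise quarter turn is the right direction for the device orientation. portraitPoint(normalizedX:normalizedY:previewSize:) is a hypothetical helper name.

```swift
import Foundation

/// Rotates a normalized landscape point a quarter turn clockwise and
/// scales it to a portrait preview of the given size.
func portraitPoint(normalizedX x: CGFloat,
                   normalizedY y: CGFloat,
                   previewSize: CGSize) -> CGPoint {
    // A clockwise quarter turn moves the landscape frame's top-left corner
    // to the portrait top-right corner, so x' = 1 - y and y' = x.
    return CGPoint(x: (1 - y) * previewSize.width,
                   y: x * previewSize.height)
}

let portraitPreview = CGSize(width: 375, height: 667)   // assumed preview size
let topLeft = portraitPoint(normalizedX: 0, normalizedY: 0,
                            previewSize: portraitPreview)      // (375.0, 0.0)
let bottomRight = portraitPoint(normalizedX: 1, normalizedY: 1,
                                previewSize: portraitPreview)  // (0.0, 667.0)
```

With this approach the square views can stay in an unrotated overlay, which keeps their frames in the same coordinate space as the rest of the screen.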

Note that this is not an official response from Catchoom support and might not be 100% correct, but it worked quite well for me.