dst.convertTo(dst, -1, 2, 0);
cv::addWeighted(dst, 1.5, result, -0.5, 0, result);
cvtColor(dst, img_hist_equalized, CV_BGR2YCrCb);
cvtColor(img_hist_equalized, img_hist_equalized, CV_YCrCb2BGR);
The camscanner application may be using some complex algorithm to handle various lightning cases, etc. But I will try to cover a basic approach to such problem, The basic idea here is Binarization of the given input image, Or more precisely we can say Theresholding a given image, If you look at the OpenCV documentation, there are a lot of references to thresholding a given image, So let's start with the documentation.
Global Thresholding: In this approach we assume that the intensity value of the foreground is always below a certain value, In context of printed sheets, we assume that the ink color is always black and paper color is uniform and intensity is greater than the intensity of ink color, so we safely assume some threshold (say 40), (max. is 255) and threshold the input image as :
ret, thresh1 = cv2.threshold(img, 100, 255, cv2.THRESH_BINARY)
ret, thresh1 = cv2.threshold(img, 130, 255, cv2.THRESH_BINARY)
There are many disadvantages to this method, First of all it is NOT independent of intensity variance, So there is a very less chance that you can accurately estimate a threshold value which segments text from the given image, It has very limited applications, can be only applied in case where the background paper is exactly white with minimum variation in intensity, so this process cannot be used for Real world images.
Adaptive Thresholding: This method covers the intensity variation problem in the given image, here the thresholding is done on the values of neighbouring pixels, So transitions from lower intensity to higher and vice versa are successfully captured with this method as:
thresh = cv2.adaptiveThreshold(original_img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
Further Work: You may work on various techniques of denoising the binary image, to remove the dots, Or have a look at removing the salt and pepper noise from the image.
Otu's Binarization: This is yet another nice approach which intelligently calculates the threshold value, between the maximas, It may work very nice in some of the cases, but it seems to fail in your case.
ret2,thresh = cv2.threshold(img,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
It basically does the same global thresholding But now the threshold is calculated automatically such that the threshold lies between 2 peaks and hence segmenting the ink from the paper.
Recommended Method: I guess the best approach to start with is Adaptive Thresholding, You may try some other preprocessing techniques such as sharpening image, Histogram Equalisation, etc. and analyse How it creates a more realistic output, You may also try to do some post processing such as denoising the image, Morphological operations
I tried the denoising of image and found it more effective to other approaches,
denoised = cv2.fastNlMeansDenoising(thresh, 11, 31, 9) # you may experiment with the constants here
But I welcome you to try various combinations of the above approaches to see which one works for all cases.