Nailgun Nailgun - 7 months ago 170
Java Question

DPI of image extracted from PDF with pdfBox

I'm using the Java PDFBox library to validate single-page PDF files with embedded images.

I know that a PDF file itself doesn't contain DPI information.

However, images that have equal dimensions in the document have different sizes in pixels after extraction, and they carry no DPI metadata.

So is it possible to calculate the image sizes relative to the PDF page, or to extract the images together with their DPI information (for PNG or JPEG files), using PDFBox?

Thanks!

Answer

Get the PrintImageLocations.java file from the PDFBox source download. Here's an excerpt of the source; only the last line was added by me, and it outputs the DPI:

            float imageXScale = ctmNew.getXScale();
            float imageYScale = ctmNew.getYScale();
            System.out.println("position = " + ctmNew.getXPosition() + ", " + ctmNew.getYPosition());
            // size in pixel
            System.out.println("size = " + imageWidth + "px, " + imageHeight + "px");
            // size in page units
            System.out.println("size = " + imageXScale + "pu, " + imageYScale + "pu");
            // size in inches 
            imageXScale /= 72;
            imageYScale /= 72;
            System.out.println("size = " + imageXScale + "in, " + imageYScale + "in");
            // size in millimeter
            imageXScale *= 25.4;
            imageYScale *= 25.4;
            System.out.println("size = " + imageXScale + "mm, " + imageYScale + "mm");

            // resolution: pixels per inch = size in pixels * 72 / size in page units
            System.out.printf("dpi  = %.0f dpi (X), %.0f dpi (Y) %n", image.getWidth() * 72 / ctmNew.getXScale(), image.getHeight() * 72 / ctmNew.getYScale());

And here's a sample output:

    Found image [X0]
    position = 0.0, 0.0
    size = 2544px, 3523px          <---- pixels
    size = 610.56pu, 845.52pu      <---- "page units", 1pu = 1/72 inch
    size = 8.48in, 11.743334in
    size = 215.39198mm, 298.28067mm
    dpi = 300 dpi (X), 300 dpi (Y)
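
If you only want the arithmetic without the rest of PrintImageLocations, here is a minimal standalone sketch of the same calculation. It assumes you already have the image size in pixels and its drawn size in page units (1/72 inch) from somewhere, for example from the excerpt above. The class name `ImageDpi` and its helper methods are made up for illustration and are not part of PDFBox.

```java
// Hypothetical helper, not part of PDFBox: effective resolution of an image
// from its pixel size and the size at which it is drawn on the page.
final class ImageDpi {

    /** Page units per inch (1pu = 1/72 inch, as in the sample output). */
    static final double PAGE_UNITS_PER_INCH = 72.0;

    /** Converts a length in page units to inches. */
    static double inches(double pageUnits) {
        return pageUnits / PAGE_UNITS_PER_INCH;
    }

    /** Pixels per inch along one axis: pixels * 72 / size in page units. */
    static double dpi(int pixels, double pageUnits) {
        if (pixels <= 0 || pageUnits <= 0) {
            throw new IllegalArgumentException("pixels and pageUnits must be positive");
        }
        return pixels * PAGE_UNITS_PER_INCH / pageUnits;
    }

    public static void main(String[] args) {
        // Values taken from the sample output: 2544 x 3523 px drawn at 610.56 x 845.52 pu.
        int widthPx = 2544;
        int heightPx = 3523;
        double widthPu = 610.56;
        double heightPu = 845.52;

        System.out.printf("size = %.2fin, %.2fin%n", inches(widthPu), inches(heightPu));
        System.out.printf("dpi = %.0f dpi (X), %.0f dpi (Y)%n",
                dpi(widthPx, widthPu), dpi(heightPx, heightPu));
    }
}
```

Note that the DPI is a property of how the image is placed on the page, not of the image file, so the same image drawn twice at different sizes will give two different values.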
