I have some PVRTC 4bpp image data that needs to be flipped vertically in-place without decompression. The code I have written is mostly working but the flip currently introduces small artefacts and I'm unsure exactly why.
The PVRTC flip code first moves the 8-byte 4x4 compression blocks to their flipped position as calculated by the TwiddleUV() function from PVRTDecompress.cpp in the PowerVR SDK. This part appears to be correct.
Second, the code iterates through all the 8-byte compression blocks, reversing the order of the second 4 bytes, which contain the 4x4 modulation data stored at 2bpp. The first 4 bytes of each block contain color data, which is left unchanged.
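For reference, the two steps described above can be sketched roughly as follows. This is only a minimal illustration, not the actual code from the question: `mortonIndex` is a hypothetical stand-in for TwiddleUV() that assumes a square, power-of-two grid of blocks, and `naiveFlipVertical` and its parameters are made-up names. The byte offset of the modulation word is left as a parameter, since it is worth checking against the layout the decoder you use actually expects.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for TwiddleUV(): interleaves the bits of the block
// coordinates (y bits in even positions, x bits in odd positions).
// Assumption: the texture is a square, power-of-two grid of blocks.
std::uint32_t mortonIndex(std::uint32_t x, std::uint32_t y, std::uint32_t blocksPerSide)
{
    std::uint32_t index = 0;
    for (std::uint32_t bit = 0; (1u << bit) < blocksPerSide; ++bit) {
        index |= ((y >> bit) & 1u) << (2 * bit);
        index |= ((x >> bit) & 1u) << (2 * bit + 1);
    }
    return index;
}

// Naive in-place vertical flip: swap whole block rows, then reverse the four
// modulation bytes (one byte per row of four 2-bit texels) inside every block.
// The colour bytes are moved with their block but otherwise left untouched.
void naiveFlipVertical(std::vector<std::uint8_t>& data,
                       std::uint32_t blocksPerSide,
                       std::size_t modulationOffset)
{
    const std::size_t kBlockBytes = 8;
    const std::size_t blockCount =
        static_cast<std::size_t>(blocksPerSide) * blocksPerSide;
    assert(data.size() == kBlockBytes * blockCount);
    assert(modulationOffset == 0 || modulationOffset == 4);

    // Step 1: move each block to its vertically mirrored position.
    for (std::uint32_t y = 0; y < blocksPerSide / 2; ++y) {
        for (std::uint32_t x = 0; x < blocksPerSide; ++x) {
            std::uint8_t* top =
                data.data() + kBlockBytes * mortonIndex(x, y, blocksPerSide);
            std::uint8_t* bottom =
                data.data() + kBlockBytes * mortonIndex(x, blocksPerSide - 1 - y, blocksPerSide);
            std::swap_ranges(top, top + kBlockBytes, bottom);
        }
    }

    // Step 2: reverse the row order of the modulation data in every block.
    for (std::size_t i = 0; i < blockCount; ++i) {
        std::uint8_t* modulation = data.data() + kBlockBytes * i + modulationOffset;
        std::reverse(modulation, modulation + 4);
    }
}
```

Running this twice restores the original bytes, which matches the observation that the artefacts disappear after a second flip: the shuffle itself is reversible, so the artefacts must come from how the shuffled data decodes.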
This seems to be very close to correct, but it leaves small artefacts in the flipped image that weren't there in the original and which manifest mostly as small greyish horizontal lines. If the flipping code is run twice then the artefacts go away and the image is unchanged from the original.
Can anyone with some PVRTC experience explain what else needs to be done to flip the compressed image data? I think the problem may be to do with the flipping of modulation data, but my forays into the PVRTC documentation haven't yielded the answer at this stage.
To make this a bit easier to understand, you need to know the way PVRTC decodes the data as it's a bit different to block-based schemes like ETC1 or S3TC/DXTC. For simplicity I'll only describe the 4bpp variant.
PVRTC consists of two low-resolution 15/16bpp images, A and B, each 1/4 x 1/4 the resolution of the final texture, along with a full-resolution 2bpp modulation image. To 'logically' decode a texel at XY, the A and B images are bilinearly upscaled to the target resolution, and the resulting colours, Axy and Bxy, are then blended according to the corresponding pixel in the modulation image. To make decoding simpler in hardware, the A and B images are interleaved with the modulation data into 64-bit words, each holding one pixel from A, one pixel from B and 16 pixels of modulation data.
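As a rough sketch of that last blending step, assuming the upscaled colours Axy and Bxy have already been computed: the names `Rgba`, `modulationAt` and `blendTexel` are made up for illustration. The weights 0, 3/8, 5/8 and 1 are the ones the standard 4bpp mode uses as far as I'm aware. The punch-through alpha mode uses a different table and is ignored here.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative colour type; channels are assumed to be expanded to 0..255.
struct Rgba { int r, g, b, a; };

// Weight of colour B, in eighths, for each 2-bit modulation value.
// Assumption: standard mode only, punch-through alpha is not handled.
const int kWeightB[4] = {0, 3, 5, 8};

// Extract the 2-bit modulation value for texel (x, y) of a 4x4 block.
// Texels are packed row by row, starting from the least significant bits.
unsigned modulationAt(std::uint32_t modulationWord, unsigned x, unsigned y)
{
    assert(x < 4 && y < 4);
    return (modulationWord >> (2 * (y * 4 + x))) & 3u;
}

// Blend the upscaled colours Axy and Bxy according to the modulation value.
Rgba blendTexel(const Rgba& a, const Rgba& b, unsigned modulation)
{
    assert(modulation < 4);
    const int wb = kWeightB[modulation];
    const int wa = 8 - wb;
    return Rgba{ (a.r * wa + b.r * wb) / 8,
                 (a.g * wa + b.g * wb) / 8,
                 (a.b * wa + b.b * wb) / 8,
                 (a.a * wa + b.a * wb) / 8 };
}
```

Note that one byte of the modulation word covers exactly one row of four texels, which is why reversing those four bytes reverses the rows of a block.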
Now, the reason it doesn't flip exactly with bit shuffling alone is that the 4x4 bilinear upscale is slightly offset from the centres of the 4x4 texel blocks (I think the reasons are described in the Graphics Hardware 2003 paper linked from Wikipedia). After the flip, the upscaled Axy and Bxy at a given pixel are therefore not exactly the flipped versions of the originals, so the old modulation value no longer necessarily picks the closest colour. I expect the only thing you could do is actually evaluate each pixel and then, after flipping the bilinear colours, work out which of the four modulation possibilities is closest. In a way, it'd be a recompression, but it should be relatively quick.
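The "work out which of the four possibilities is closest" step might look something like the sketch below. It assumes you have already decoded the original pixel (`target`) and computed the upscaled colours `a` and `b` at the flipped position from the flipped A and B images. All names are hypothetical, the weights are those of the standard (non punch-through) mode, and plain squared RGBA distance is just one possible error metric.

```cpp
#include <cassert>

// Illustrative colour type; channels are assumed to be expanded to 0..255.
struct Colour { int r, g, b, a; };

// Blend a and b with the standard-mode weight of b given in eighths.
Colour blend(const Colour& a, const Colour& b, int weightB)
{
    const int weightA = 8 - weightB;
    return Colour{ (a.r * weightA + b.r * weightB) / 8,
                   (a.g * weightA + b.g * weightB) / 8,
                   (a.b * weightA + b.b * weightB) / 8,
                   (a.a * weightA + b.a * weightB) / 8 };
}

// Sum of squared channel differences between two colours.
long squaredError(const Colour& p, const Colour& q)
{
    const long dr = p.r - q.r, dg = p.g - q.g, db = p.b - q.b, da = p.a - q.a;
    return dr * dr + dg * dg + db * db + da * da;
}

// Return the 2-bit modulation value whose blend of a and b is closest to
// the target colour. Ties go to the lowest modulation value.
unsigned closestModulation(const Colour& target, const Colour& a, const Colour& b)
{
    const int kWeightB[4] = {0, 3, 5, 8};  // assumption: standard mode only
    unsigned best = 0;
    long bestError = squaredError(target, blend(a, b, kWeightB[0]));
    for (unsigned m = 1; m < 4; ++m) {
        const long error = squaredError(target, blend(a, b, kWeightB[m]));
        if (error < bestError) {
            bestError = error;
            best = m;
        }
    }
    return best;
}
```

You would call this once per texel and write the chosen values back into the modulation word of the flipped block, so the cost is four blends per pixel rather than a full compression search.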