0x6B6F77616C74 - 5 months ago
C++ Question

Converting file in UTF-8 to UTF-16

A C++ program needs to read a file that is encoded in UTF-8. Unfortunately, reading it through char* does not yield usable extended characters (☺☻♥♦•◘ and so on), and reading it through wchar_t* interprets them incorrectly. My plan for handling this is:

1) Make a new file

2) Name it [original name]Utf-16

3) Copy the original file into the new one, converting it as it is copied

4) Extract data.

5) Delete this temporary file when it's no longer needed.

I'm stuck at 3). Is there a function somewhere like "FileUTF8toUTF16"?


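This is what I use (Windows API; str is presumably an ATL/MFC CStringW declared earlier):

// First call: ask how many UTF-16 code units the converted text needs.
int nLenWide = MultiByteToWideChar(CP_UTF8, 0, (LPCSTR)(pData + nOffset),
        (int)(nDataLen - nOffset), NULL, 0);
if (nLenWide <= 0)
    return str;  // nothing to convert, or the conversion failed
// Second call: convert directly into the string's buffer.
int nWritten = MultiByteToWideChar(CP_UTF8, 0, (LPCSTR)(pData + nOffset),
        (int)(nDataLen - nOffset), str.GetBuffer(nLenWide), nLenWide);
// ReleaseBuffer fixes the final length; passing 0 on failure leaves str empty.
str.ReleaseBuffer(nWritten == nLenWide ? nWritten : 0);
return str;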

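Here pData is a BYTE pointer to the raw UTF-8 data, nDataLen is its length in bytes, and nOffset is the number of leading bytes to skip: usually 3, the size of the UTF-8 BOM (EF BB BF). The conversion happens in memory, so no temporary UTF-16 file is needed.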
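
If the code also has to run outside Windows, the same idea (skip the BOM, then convert in memory) can be sketched in standard C++. This is only a rough illustration, not part of the approach above: Utf8BomLength, Utf8ToUtf16 and ReadUtf8FileAsUtf16 are hypothetical helper names, the output type std::u16string is my choice, and replacing malformed input with U+FFFD is an assumption about the error handling you want.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: 3 if the buffer starts with the UTF-8 BOM (EF BB BF), else 0.
std::size_t Utf8BomLength(const unsigned char* data, std::size_t len) {
    return (len >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF) ? 3 : 0;
}

// Hypothetical helper: decode UTF-8 bytes into UTF-16 code units.
// Assumption: malformed sequences are replaced with U+FFFD instead of failing.
std::u16string Utf8ToUtf16(const unsigned char* data, std::size_t len) {
    static const char32_t kMinForExtra[] = {0, 0x80, 0x800, 0x10000};
    std::u16string out;
    std::size_t i = 0;
    while (i < len) {
        unsigned char lead = data[i++];
        char32_t cp = 0;
        int extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); continue; }  // stray continuation or invalid lead

        bool ok = true;
        for (int k = 0; k < extra; ++k) {
            if (i >= len || (data[i] & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (data[i++] & 0x3F);
        }
        // Reject truncated, overlong, surrogate and out-of-range encodings.
        if (!ok || cp < kMinForExtra[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF)) {
            out.push_back(0xFFFD);
            continue;
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;  // encode as a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// Hypothetical helper: read a whole file and return its text as UTF-16.
// Returns an empty string if the file cannot be opened.
std::u16string ReadUtf8FileAsUtf16(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::vector<unsigned char> bytes((std::istreambuf_iterator<char>(in)),
                                     std::istreambuf_iterator<char>());
    std::size_t offset = Utf8BomLength(bytes.data(), bytes.size());
    return Utf8ToUtf16(bytes.data() + offset, bytes.size() - offset);
}
```

Calling ReadUtf8FileAsUtf16 with the path of the original file gives you the text directly, so steps 1), 2) and 5) of the plan in the question drop out. On Windows, char16_t and wchar_t are both 16 bits wide, so the result can be copied into a std::wstring if the rest of the program expects one.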