Bastl - 1 month ago
C++ Question

How to uppercase a u32string (char32_t) with a specific locale?

On Windows with Visual Studio 2017 I can use the following code to uppercase a u32string (which is based on char32_t):

#include <locale>
#include <iostream>
#include <string>

void toUpper(std::u32string& u32str, std::string localeStr)
{
    std::locale locale(localeStr);

    for (unsigned i = 0; i < u32str.size(); ++i)
        u32str[i] = std::toupper(u32str[i], locale);
}


The same code does not compile on macOS with Xcode.
I get errors like this:

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/__locale:795:44: error: implicit instantiation of undefined template 'std::__1::ctype<char32_t>'
return use_facet<ctype<_CharT> >(__loc).toupper(__c);


Is there a portable way of doing this?

Answer Source

I have found a solution:

Instead of std::u32string I now use std::string with UTF-8 encoding. Converting a std::u32string to a UTF-8 std::string can be done with utf8-cpp: http://utfcpp.sourceforge.net/
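If pulling in a library for this one step is not an option, the conversion can also be written by hand. The following is only a minimal sketch and not the utf8-cpp implementation: utf32ToUtf8 is a hypothetical name, and replacing invalid code points with U+FFFD is an assumption made here.

```cpp
#include <string>

// Hypothetical helper: encode UTF-32 code points as UTF-8 bytes.
// Assumption: surrogates and values above U+10FFFF become U+FFFD.
std::string utf32ToUtf8(const std::u32string& in)
{
    std::string out;
    for (char32_t c : in)
    {
        if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
            c = 0xFFFD;

        if (c < 0x80)
        {
            // 1 byte: plain ASCII
            out += static_cast<char>(c);
        }
        else if (c < 0x800)
        {
            // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
        else if (c < 0x10000)
        {
            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (c >> 12));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
        else
        {
            // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xF0 | (c >> 18));
            out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

With utf8-cpp itself, the same step should be a single call along the lines of utf8::utf32to8(in.begin(), in.end(), std::back_inserter(out)).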

The UTF-8 string then has to be converted to std::wstring, because std::toupper is not implemented for char32_t on all platforms (a standard library only has to provide std::ctype for char and wchar_t).

#include <codecvt>
#include <iostream>
#include <locale>
#include <string>

void toUpper(std::string& str, const std::string& localeStr)
{
    // UTF-8 <-> wide string converter
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;

    // convert to wstring (because std::toupper is not implemented on all platforms for u32string)
    std::wstring wide = converter.from_bytes(str);

    // a default-constructed locale is a copy of the global locale; it stays in use
    // if localeStr cannot be resolved
    std::locale locale;

    try
    {
        locale = std::locale(localeStr);
    }
    catch (const std::exception&)
    {
        std::cerr << "locale not supported by system: " << localeStr << std::endl;
    }

    const auto& f = std::use_facet<std::ctype<wchar_t>>(locale);

    // uppercase the whole buffer in place
    f.toupper(&wide[0], &wide[0] + wide.size());

    // convert back to UTF-8
    str = converter.to_bytes(wide);
}
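If the std::u32string interface from the question should be kept, the same std::ctype<wchar_t> facet can also be applied one code point at a time, without the round trip through UTF-8. This is only a sketch under stated assumptions: toUpperU32 is a hypothetical name, only one-to-one case mappings are handled, and where wchar_t has 16 bits (as on Windows) code points above U+FFFF are left untouched.

```cpp
#include <cstdint>
#include <limits>
#include <locale>
#include <string>

// Hypothetical variant of toUpper that works directly on std::u32string.
void toUpperU32(std::u32string& str, const std::locale& locale)
{
    // std::ctype<wchar_t> is one of the facets a standard library has to provide
    const auto& facet = std::use_facet<std::ctype<wchar_t>>(locale);

    // largest value that fits into a single wchar_t
    // (0xFFFF where wchar_t has 16 bits)
    const auto maxWide =
        static_cast<std::uint32_t>(std::numeric_limits<wchar_t>::max());

    for (char32_t& c : str)
    {
        // code points that do not fit into one wchar_t are left unchanged
        if (static_cast<std::uint32_t>(c) <= maxWide)
            c = static_cast<char32_t>(facet.toupper(static_cast<wchar_t>(c)));
    }
}
```

It can be called with any locale object, for example toUpperU32(text, std::locale(localeStr)), where localeStr follows the platform's naming rules.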

Note:

  • On Windows, localeStr has to be something like en, de, fr, ...
  • On other systems, localeStr must be something like de_DE, fr_FR, en_US, ...
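Because the accepted names differ between platforms, one way to cope is to try several spellings in turn. The helper below is a sketch only: firstAvailableLocale is a hypothetical name, and falling back to the classic "C" locale when nothing matches is an assumption, not part of the solution above.

```cpp
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: return the first locale from `candidates` that the
// system can construct, or the classic "C" locale if none of them works.
std::locale firstAvailableLocale(const std::vector<std::string>& candidates)
{
    for (const std::string& name : candidates)
    {
        try
        {
            return std::locale(name);
        }
        catch (const std::runtime_error&)
        {
            // name not known on this platform, try the next one
        }
    }
    return std::locale::classic();
}
```

A call such as firstAvailableLocale({"de_DE.UTF-8", "de_DE", "de"}) would then pick whichever spelling the current system knows. Which of these names actually exist depends on the installed locales.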