Daniel B. Chapman - 25 days ago
C++ Question

Is it safe to convert a std::wstring to cstring?

To clarify, I'm not overly concerned about data loss, as this is for logging actions inside my application and I'm using wstring as the primary data type. The framework I'm currently using (OpenFrameworks logging) is std::string by default, and I'm fine with that.

Here's an example of my current conversion:

//ofLog.h--patch | `message` is a `std::ostringstream`
ofLog& operator<<(const std::wstring& value){
message << value.c_str() << padding;
return *this;
}


With this overload I can save myself a lot of annoyance in the verbose logs and not worry too much about mixing third-party string types (char-based OSC libraries versus wchar-based JSON libraries, for example).

I'm relatively new to C++, having lived in a Java/JavaScript world, and I'm wondering whether anything other than potential data loss is at risk here. Are there any platform-independent solutions to this problem? I've been Googling for several hours and I want a "safe" solution that won't bite me down the road.

Basically my solution appears to work, but I want to know if there are potential issues down the road by doing this.

Thanks!
(the openframeworks tag is just to help people down the road if we solve it)

Answer

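The code shown is not going to work correctly. std::wstring's c_str() method returns a const wchar_t *. Passing it to std::ostringstream's operator<< selects the overload that takes a const void * parameter, which prints the pointer's address rather than the string's contents. (Since C++20 the code is rejected outright, because the narrow-stream overloads for wide-character pointers are deleted.)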
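To see what that overload produces, here is a minimal sketch. The helper name stream_wide_pointer is hypothetical, and the explicit cast to const void * stands in for the implicit conversion that older compilers perform, so the sketch also builds under C++20 and later.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: streams the pointer returned by c_str() the way the
// const void* overload would, and returns whatever ends up in the buffer.
std::string stream_wide_pointer(const std::wstring& value) {
    std::ostringstream message;
    // The explicit cast mirrors the implicit wchar_t* -> void* conversion.
    message << static_cast<const void*>(value.c_str());
    // The result is an implementation-defined address, not the text.
    return message.str();
}
```

Calling stream_wide_pointer(L"hello") yields something that looks like a memory address on typical implementations, never the word "hello".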

You stated that you expect your std::wstring to consist mostly of US-ASCII characters. If so, the hackiest approach is to rudely convert the std::wstring to a std::string, in the following manner, replacing all non-ASCII characters with a question mark (or pick your favorite punctuation symbol):

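// Requires <algorithm>, <iterator>, and <string>.
std::string cvalue;

// Copy each wide character, substituting '?' for anything above US-ASCII.
std::transform(value.begin(), value.end(),
               std::back_inserter(cvalue),
               [](wchar_t wchar)
               {
                   return static_cast<char>(wchar > 127 ? '?' : wchar);
               });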

Then stream the resulting std::string into your message with <<.
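Put together, the patched overload might look like the sketch below. DemoLog is a hypothetical stand-in for ofLog with only the two members the question mentions (message and padding, the latter assumed here to be a single space), and to_ascii_lossy is an illustrative helper name. The comparison goes through an unsigned type so that a negative value on platforms where wchar_t is signed is also replaced.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>

// Lossy conversion: keeps US-ASCII, replaces everything else with '?'.
std::string to_ascii_lossy(const std::wstring& value) {
    std::string cvalue;
    cvalue.reserve(value.size());
    std::transform(value.begin(), value.end(), std::back_inserter(cvalue),
                   [](wchar_t wchar) {
                       // Unsigned comparison also rejects negative values.
                       const bool ascii =
                           static_cast<unsigned long>(wchar) <= 127UL;
                       return ascii ? static_cast<char>(wchar) : '?';
                   });
    return cvalue;
}

// Hypothetical stand-in for ofLog: only what the overload needs.
class DemoLog {
public:
    DemoLog& operator<<(const std::wstring& value) {
        message << to_ascii_lossy(value) << padding;
        return *this;
    }
    std::string str() const { return message.str(); }

private:
    std::ostringstream message;
    std::string padding = " ";  // assumption: the real padding may differ
};
```

Keeping the conversion in its own function means the same helper can be reused wherever a wide string has to reach a narrow-only API.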

This is only a quick hack for wide strings that are mostly US-ASCII. Otherwise, you would need to use the localization library to convert the wide string properly to a narrow-character string using the current system locale, which is quite a bit more work.
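As a rough idea of what a locale-aware conversion involves, the sketch below uses the C library's std::wcsrtombs. That is one possible route chosen for illustration, not necessarily the one the answer has in mind, and the function name narrow_with_locale is hypothetical. It converts according to the current C locale, so it assumes the program has already selected the user's locale (for example with std::setlocale(LC_ALL, "")); in the default "C" locale only basic characters are guaranteed to convert.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Converts using the current C locale's multibyte encoding.
// Returns false if a character has no representation in that locale.
// Note: conversion stops at the first embedded null character, if any.
bool narrow_with_locale(const std::wstring& value, std::string& out) {
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = value.c_str();

    // First pass: a null destination only reports the required byte count.
    const std::size_t needed = std::wcsrtombs(nullptr, &src, 0, &state);
    if (needed == static_cast<std::size_t>(-1)) {
        return false;  // unrepresentable character (errno is EILSEQ)
    }

    // Second pass: convert into a buffer with room for the terminator.
    std::vector<char> buffer(needed + 1);
    src = value.c_str();
    state = std::mbstate_t();
    std::wcsrtombs(buffer.data(), &src, buffer.size(), &state);

    out.assign(buffer.data(), needed);
    return true;
}
```

A logger could try this first and fall back to the question-mark replacement when it returns false.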
