PSkocik - 2 months ago
C Question

Unicode code point to utf8 and wctomb

I was looking for ways to convert Unicode code points to UTF-8.
So far, I've learned I can do it manually or use iconv.
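
For reference, the manual route can be sketched as below. This is only an illustration of the standard UTF-8 bit layout; the name utf8_encode and its return convention are made up for this example and do not come from any library.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode one Unicode code point as UTF-8.
 * Writes 1 to 4 bytes into out (no terminating NUL) and returns the
 * byte count, or 0 if cp is a surrogate or lies above U+10FFFF.
 */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                       /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* outside the Unicode range */
}
```

Calling utf8_encode(0x1D306, out) fills out with the bytes f0 9d 8c 86, the same sequence as the string literal used in the program below, and it does not depend on the locale at all.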

I also thought wctomb would work, but it doesn't:

#include <stdio.h>
#include <stdlib.h>
#include <arpa/inet.h>

#define CENTER_UTF8 "\xf0\x9d\x8c\x86"
#define CENTER_UNICODE 0x1D306

int main(int argc, char** argv)
{
    puts(CENTER_UTF8); //OK
    static char buf[10];
    int r;

#define WCTOMB(What) \
    wctomb(NULL,0); \
    r=wctomb(buf,What); \
    puts(buf); \
    printf("r=%d\n", r);

    //Either one fails with -1
    WCTOMB(CENTER_UNICODE);
    WCTOMB(htonl(CENTER_UNICODE));
}


Could someone please explain why wctomb won't convert a Unicode code point to UTF-8? I'm on Linux with a UTF-8 locale.

Answer

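You need to set the program's locale before calling wctomb(). Every C program starts in the "C" locale, whatever the environment says, and wctomb() converts according to the program's current locale, not the environment's: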

#include <locale.h>
/* ... */
setlocale(LC_ALL, "");

This sets the program's locale according to your environment. From man setlocale:

If locale is an empty string, "", each part of the locale that should be modified is set according to the environment variables.

P.S. Setting only the LC_CTYPE category is enough for wctomb().
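
To tie this together, here is a minimal sketch of how the conversion might be wrapped. The names use_environment_locale and codepoint_to_mb are hypothetical helpers written for this example. The sketch assumes that wchar_t values are Unicode code points, which holds on glibc but not on every platform. Byte order plays no part: wctomb() takes the code point as a plain wchar_t value, so the htonl() variant from the question is not needed.

```c
#include <limits.h>   /* MB_LEN_MAX */
#include <locale.h>   /* setlocale, LC_CTYPE */
#include <stdlib.h>   /* wctomb */
#include <wchar.h>    /* wchar_t */

/*
 * Hypothetical helper: adopt the character encoding named by the
 * environment (LC_ALL, LC_CTYPE, LANG). Returns 1 on success, 0 on failure.
 */
int use_environment_locale(void)
{
    return setlocale(LC_CTYPE, "") != NULL;
}

/*
 * Hypothetical helper: convert one code point to the multibyte encoding
 * of the current LC_CTYPE locale. out needs room for MB_LEN_MAX + 1 bytes.
 * Returns the number of bytes written (not counting the terminating NUL),
 * or -1 if the current locale cannot represent the character.
 * Assumption: wchar_t holds Unicode code points, as it does on glibc.
 */
int codepoint_to_mb(unsigned long cp, char *out)
{
    int n;

    wctomb(NULL, 0);                 /* reset any shift state */
    n = wctomb(out, (wchar_t)cp);
    if (n < 0) {
        out[0] = '\0';               /* leave an empty string on failure */
        return -1;
    }
    out[n] = '\0';                   /* wctomb does not terminate the buffer */
    return n;
}
```

Call use_environment_locale() once at the start of main(), then codepoint_to_mb(0x1D306, buf) with a buffer of MB_LEN_MAX + 1 bytes. Under a UTF-8 locale this should return 4 and leave the bytes f0 9d 8c 86 in buf; without the setlocale() call the program stays in the "C" locale and the same conversion returns -1, which is the failure described in the question.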