What are the "serious" libraries?
Joe Nelson
joe at begriffs.com
Sun Apr 12 18:11:38 UTC 2020
I wrote:
> So all I can conclude is that if __STDC_ISO_10646__ is not defined,
> then there is a Unicode character which is stored in wchar_t using a
> different numerical value than its codepoint. Why would that be, does
> anyone have insight into this?
I'm trying to get to the bottom of the mystery. I thought perhaps the Mac
stores UTF-16 rather than UTF-32 in wchar_t, but the following program
disproves that hypothesis. I'm still wondering why the Mac doesn't conform
to ISO 10646.
-------------- next part --------------
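#include <locale.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* U+1F41A, outside the BMP */
    const char *seashell_utf8 = "\xf0\x9f\x90\x9a";
    /* zero-filled, so the result is always null-terminated */
    wchar_t seashell_wide[5] = {0};
    /* element count, not byte count */
    size_t n = sizeof seashell_wide / sizeof *seashell_wide;
    const wchar_t *p;

    if (!setlocale(LC_CTYPE, "en_US.UTF-8"))
    {
        fputs("Cannot set locale\n", stderr);
        return EXIT_FAILURE;
    }
    /* third argument is the maximum number of wide characters to store */
    if ((size_t)(-1) == mbstowcs(seashell_wide, seashell_utf8, n - 1))
    {
        fputs("Invalid multibyte character\n", stderr);
        return EXIT_FAILURE;
    }
    /* one value printed per wchar_t; UTF-16 would print a surrogate pair */
    for (p = seashell_wide; *p; p++)
        printf("%lx ", (unsigned long)*p);
    puts("");
    return EXIT_SUCCESS;
}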