This is the mail archive of the
mailing list for the Cygwin project.
default charset for implicit locale specification
- From: Corinna Vinschen <corinna-cygwin at cygwin dot com>
- To: cygwin-developers at cygwin dot com
- Date: Tue, 19 Jan 2010 19:15:35 +0100
- Subject: default charset for implicit locale specification
- Reply-to: cygwin-developers at cygwin dot com
Right now, if a user only specifies a language but not a charset, for
instance LANG="es_MX", then Cygwin defaults to the current ANSI codepage
via the GetACP() function call.
While that matches the current system settings, it doesn't necessarily
result in using a codepage which matches the current language.
For instance, on a US system, this results in using CP1252, even
for LANG="zh_TW". CP1252 certainly does not have the right characters
for the Chinese language. The right codepage would be 950 (Big5)
in this case.
There *is* a way in Windows, which isn't even complicated, to fetch
the default ANSI codepage for a given ISO compatible language code.
Locally I'm running such a Cygwin version right now, which asks the
system for the matching ANSI codepage and uses that, rather than the
system default ANSI codepage. It works quite nicely.
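To make the two behaviours concrete, here is a minimal sketch in portable C. It is not the code from the Cygwin build mentioned above. The names `has_explicit_charset`, `implicit_codepage` and `lang_table` are hypothetical, and the small table only stands in for the query to Windows. The mail does not say which call is used; one Windows API that returns such a value is GetLocaleInfo() with LOCALE_IDEFAULTANSICODEPAGE, but that is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative table only: a stand-in for asking Windows for the
   default ANSI codepage of a language. Not taken from Cygwin. */
struct lang_cp {
    const char *lang;
    unsigned cp;
};

const struct lang_cp lang_table[] = {
    { "en_US", 1252 },
    { "es_MX", 1252 },
    { "ru_RU", 1251 },
    { "ja_JP",  932 },
    { "zh_TW",  950 },  /* Big5 */
};

/* Returns nonzero if the locale string names a charset itself,
   e.g. "zh_TW.UTF-8". Such locales are not affected by the default. */
int has_explicit_charset(const char *locale)
{
    assert(locale != NULL);
    return strchr(locale, '.') != NULL;
}

/* Hypothetical helper: pick the codepage for a locale without an
   explicit charset. system_acp is the value GetACP() would return.
   match_language == 0: stick to the system ANSI codepage.
   match_language != 0: use the codepage matching the language,
   falling back to system_acp when the language is unknown. */
unsigned implicit_codepage(const char *locale, unsigned system_acp,
                           int match_language)
{
    size_t i;

    assert(locale != NULL);
    if (!match_language)
        return system_acp;
    for (i = 0; i < sizeof lang_table / sizeof lang_table[0]; i++)
        if (strcmp(locale, lang_table[i].lang) == 0)
            return lang_table[i].cp;
    return system_acp;
}
```

With a system codepage of 1252, implicit_codepage("zh_TW", 1252, 0) gives the current behaviour, while implicit_codepage("zh_TW", 1252, 1) gives the language-matching behaviour.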
So, here's the question:
What do you think is the better default for an arbitrary locale
without explicit charset?
[ ] Stick to the default system ANSI codepage?
[ ] Default to the matching ANSI codepage for the given language?
I think that the second option is the better one since that's
what the user expects when setting the locale to some language.
But I'd like to hear arguments.
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com