
ASA Database Administration Guide
  International Languages and Character Sets
    Understanding locales

Understanding the locale character set


Both application and server locale definitions have a character set. The application uses its character set when requesting character strings from the server. If character set translation is enabled (the default), the database server compares its character set with that of the application to determine whether character set translation is needed.

For more information about how to find locale settings, see Determining locale information.

The client library determines the character set as follows:

  1. If the connection string specifies a character set, it is used.

    For more information, see CharSet connection parameter [CS].

  2. Open Client applications check the locales.dat file in the Sybase locales directory.

  3. Character set information from the operating system is used to determine the locale.
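
The client-side precedence above can be sketched as a small function. This is only an illustration of the documented order, not the client library's actual code: the function name `resolve_client_charset` and its parameters are hypothetical, and the locales.dat and operating system lookups are represented by values passed in by the caller.

```python
from typing import Optional


def resolve_client_charset(
    connection_charset: Optional[str],
    locales_dat_charset: Optional[str],
    os_charset: str,
) -> str:
    """Return the first available character set, in the documented order.

    All names here are hypothetical; the real client library performs
    these lookups internally.
    """
    # 1. A character set given in the connection string takes precedence.
    if connection_charset:
        return connection_charset
    # 2. Open Client applications consult locales.dat
    #    (pass None when this step does not apply).
    if locales_dat_charset:
        return locales_dat_charset
    # 3. Otherwise fall back to the operating system's character set.
    return os_charset
```

For example, `resolve_client_charset(None, None, "cp1252")` falls through to the operating system value, because neither the connection string nor locales.dat supplied a character set.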

The database server determines the character set for a connection as follows:

  1. The character set specified by the client is used if it is supported.

    For more information, see CharSet connection parameter [CS].

  2. The database's character set is used if the client specifies a character set that is not supported.
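
The server-side choice can be sketched in the same way. The function names and the idea of passing the supported character sets as a collection are assumptions made for illustration. The `translation_needed` helper also assumes that translation is required exactly when the two labels differ, which simplifies the comparison the server performs.

```python
from typing import Iterable


def resolve_connection_charset(
    client_charset: str,
    database_charset: str,
    supported_charsets: Iterable[str],
) -> str:
    """Pick the character set the server uses for one connection.

    Hypothetical helper: the real server makes this decision internally.
    """
    # 1. Honor the client's character set when the server supports it.
    if client_charset in set(supported_charsets):
        return client_charset
    # 2. Otherwise fall back to the database's own character set.
    return database_charset


def translation_needed(connection_charset: str, database_charset: str) -> bool:
    """Assume translation is needed only when the two character sets differ."""
    return connection_charset != database_charset
```

Under these assumptions, a client that requests an unsupported character set ends up on the database's character set, so no translation takes place for that connection.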

When a new database is created, the database server determines the character set for the new database as follows:

  1. If a collation is specified in the CREATE DATABASE statement or with the dbinit utility, it is used.

  2. The ASCHARSET environment variable is used if it exists.

  3. Character set information from the operating system is used to determine the locale.

The locale character set and language are used to determine which collation to use when creating a database if none is explicitly specified.
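
The order of precedence for a new database can be sketched as follows. The function name, its parameters, and the returned (source, value) pair are hypothetical; only the ASCHARSET environment variable name comes from the text above. The sketch reports which setting decides the outcome and does not attempt to map a collation to its character set.

```python
from typing import Mapping, Optional, Tuple


def charset_source_for_new_database(
    collation: Optional[str],
    environ: Mapping[str, str],
    os_charset: str,
) -> Tuple[str, str]:
    """Return (source, value) for the setting that decides the character set.

    Hypothetical helper that mirrors the documented order of precedence.
    """
    # 1. An explicitly specified collation (CREATE DATABASE or dbinit) wins.
    if collation:
        return ("collation", collation)
    # 2. Otherwise the ASCHARSET environment variable is used if it exists.
    if "ASCHARSET" in environ:
        return ("ASCHARSET", environ["ASCHARSET"])
    # 3. Otherwise the operating system's character set information is used.
    return ("operating system", os_charset)
```

Passing `os.environ` as the `environ` argument would consult the real process environment; a plain dictionary is enough for experimenting.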

Character set labels 

The following table displays the valid character set label values, together with the equivalent IANA labels and a description:

Character set label  IANA label       Description
big5                 <N/A>            Traditional Chinese (cf. CP950)
cp437                <N/A>            IBM CP437 - U.S. code set
cp850                <N/A>            IBM CP850 - European code set
cp852                <N/A>            PC Eastern Europe
cp855                <N/A>            IBM PC Cyrillic
cp856                <N/A>            Alternate Hebrew
cp857                <N/A>            IBM PC Turkish
cp860                <N/A>            PC Portuguese
cp861                <N/A>            PC Icelandic
cp862                <N/A>            PC Hebrew
cp863                <N/A>            IBM PC Canadian French code page
cp864                <N/A>            PC Arabic
cp865                <N/A>            PC Nordic
cp866                <N/A>            PC Russian
cp869                <N/A>            IBM PC Greek
cp874                <N/A>            Microsoft Thai SB code page
cp932                windows-31j      Microsoft CP932 = Win31J-DBCS
cp936                <N/A>            Simplified Chinese
cp949                <N/A>            Korean
cp950                <N/A>            PC (MS) Traditional Chinese
cp1250               <N/A>            MS Windows Eastern European
cp1251               <N/A>            MS Windows Cyrillic
cp1252               <N/A>            MS Windows US (ANSI)
cp1253               <N/A>            MS Windows Greek
cp1254               <N/A>            MS Windows Turkish
cp1255               <N/A>            MS Windows Hebrew
cp1256               <N/A>            MS Windows Arabic
cp1257               <N/A>            MS Windows Baltic
cp1258               <N/A>            MS Windows Vietnamese
deckanji             <N/A>            DEC UNIX JIS encoding
euccns               <N/A>            EUC CNS encoding: Traditional Chinese with extensions
eucgb                <N/A>            EUC GB encoding = Simplified Chinese
eucjis               euc-jp           Sun EUC JIS encoding
eucksc               <N/A>            EUC KSC Korean encoding (cf. CP949)
greek8               <N/A>            HP Greek-8
iso_1                iso_8859-1:1987  ISO 8859-1 Latin-1
iso15                <N/A>            ISO 8859-15 Latin1 with Euro, etc.
iso88592             iso_8859-2:1987  ISO 8859-2 Latin-2 Eastern Europe
iso88595             iso_8859-5:1988  ISO 8859-5 Latin/Cyrillic
iso88596             iso_8859-6:1987  ISO 8859-6 Latin/Arabic
iso88597             iso_8859-7:1987  ISO 8859-7 Latin/Greek
iso88598             iso_8859-8:1988  ISO 8859-8 Latin/Hebrew
iso88599             iso_8859-9:1989  ISO 8859-9 Latin-5 Turkish
koi8                 <N/A>            KOI-8 Cyrillic
mac                  macintosh        Standard Mac coding
mac_cyr              <N/A>            Macintosh Cyrillic
mac_ee               <N/A>            Macintosh Eastern European
macgrk2              <N/A>            Macintosh Greek
macturk              <N/A>            Macintosh Turkish
roman8               hp-roman8        HP Roman-8
sjis                 shift_jis        Shift JIS (no extensions)
tis620               <N/A>            TIS-620 Thai standard
turkish8             <N/A>            HP Turkish-8
utf8                 utf-8            UTF-8 treated as a character set
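
For scripts that need to translate between the two naming schemes, the rows of the table that have an IANA equivalent can be captured in a lookup. This is a minimal sketch: the names `IANA_LABELS` and `iana_label` are hypothetical, and labels listed as <N/A> in the table are simply absent from the mapping.

```python
from typing import Optional

# Character set labels from the table that have an IANA equivalent.
IANA_LABELS = {
    "cp932": "windows-31j",
    "eucjis": "euc-jp",
    "iso_1": "iso_8859-1:1987",
    "iso88592": "iso_8859-2:1987",
    "iso88595": "iso_8859-5:1988",
    "iso88596": "iso_8859-6:1987",
    "iso88597": "iso_8859-7:1987",
    "iso88598": "iso_8859-8:1988",
    "iso88599": "iso_8859-9:1989",
    "mac": "macintosh",
    "roman8": "hp-roman8",  # hp-roman8 is the IANA-registered name
    "sjis": "shift_jis",
    "utf8": "utf-8",
}


def iana_label(charset_label: str) -> Optional[str]:
    """Return the IANA label for a character set label, or None if <N/A>."""
    return IANA_LABELS.get(charset_label)
```

The lookup is an exact match on the label as written in the table. Whether labels should be compared case-insensitively is not stated here, so the sketch does not normalize its input.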
