ASA Database Administration Guide
  International Languages and Character Sets
    Understanding character sets in software

Code pages


Many languages have few enough characters to be represented in a single-byte character set. In such a character set, each character is represented by a single byte, which can be written as a two-digit hexadecimal number.

At most, 256 characters can be represented in a single byte. No single-byte character set can hold all of the characters used internationally, including accented characters. This problem was addressed by the development of a set of code pages, each of which describes a set of characters appropriate for one or more national languages. For example, code page 869 contains the Greek character set, and code page 850 contains an international character set suitable for representing many characters in a variety of languages.
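As a minimal sketch of this idea, the following Python snippet uses Python's built-in codecs `cp869` and `cp850`, assumed here to correspond to the code pages named above. It encodes one Greek letter under each. The variable names are illustrative only.

```python
# Encode one Greek letter under two single-byte code pages.
letter = "λ"  # Greek small letter lambda

# Code page 869 (Greek) has a slot for this letter, so it fits in one byte.
greek_bytes = letter.encode("cp869")

# Code page 850 has no slot for this letter, so encoding fails.
try:
    letter.encode("cp850")
    fits_in_850 = True
except UnicodeEncodeError:
    fits_in_850 = False

print(len(greek_bytes), fits_in_850)
```

The letter occupies a single byte in the Greek code page but cannot be represented at all in code page 850, which is why a code page must be chosen to suit the languages in use.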

Upper and lower pages 

With few exceptions, characters 0 to 127 are the same for all the single-byte code pages. The mapping for this range of characters is called the ASCII character set. It includes the English language alphabet in upper and lower case, as well as common punctuation symbols and the digits. This range is often called the seven-bit range (because only seven bits are needed to represent the numbers up to 127) or the lower page. The characters from 128 to 255 are called extended characters, or upper code-page characters, and vary from code page to code page.

Problems with code page compatibility are rare if the only characters used are from the English alphabet, as these are represented in the ASCII portion of each code page (0 to 127). However, if other characters are used, as is generally the case in any non-English environment, there can be problems if the database and the application use different code pages.
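The split between the lower and upper pages can be checked with a short Python sketch. It compares how Python's built-in `cp437` and `cp850` codecs (assumed to match the code pages of the same numbers) decode every byte value. The helper `decode_byte` is a hypothetical name introduced for this illustration.

```python
def decode_byte(value, code_page):
    """Decode a single byte value (0-255) using the named code page."""
    return bytes([value]).decode(code_page)

# Lower page (0 to 127): both code pages should agree with ASCII.
lower_same = all(
    decode_byte(b, "cp437") == decode_byte(b, "cp850") == chr(b)
    for b in range(128)
)

# Upper page (128 to 255): collect the byte values on which they disagree.
upper_diff = [
    b for b in range(128, 256)
    if decode_byte(b, "cp437") != decode_byte(b, "cp850")
]

print(lower_same, len(upper_diff) > 0)
```

For this pair of code pages, the lower page is identical and the disagreements all fall in the upper page, which is where compatibility problems arise.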

Example 

Suppose a database holding French-language strings uses code page 850, and the client operating system uses code page 437. The character À (upper case A grave) is held in the database as character \xB7 (decimal value 183). In code page 437, character \xB7 is a graphical character. The client application receives this byte and the operating system displays it on the screen. Because the client interprets the byte using code page 437, the user sees a graphical (box-drawing) character instead of an A grave.
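The mismatch in this example can be reproduced with a small Python sketch. It stands in for the database and client by using Python's built-in `cp850` and `cp437` codecs, so it illustrates the byte-level effect only, not the behavior of any particular database interface. The variable names are illustrative.

```python
# The database stores the character using code page 850.
stored = "À".encode("cp850")

# The client interprets the same byte using code page 437.
displayed = stored.decode("cp437")

print(hex(stored[0]), stored[0])   # byte value in hex and decimal
print(displayed == "À")            # False: the client shows a different character
```

The byte itself is unchanged in transit. Only its interpretation differs, so the fix is for both sides to agree on a code page, or for the data to be translated between them.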


