Chapter 2 Programming Information
This section discusses internationalization and localization issues relevant to jConnect.
With the release of Adaptive Server Enterprise 12.5, database clients can take advantage of two new server datatypes, unichar and univarchar, which allow for the efficient storage and retrieval of Unicode data.
Quoting from the Unicode Standard, version 2.0: "The Unicode Standard is a fixed-width, uniform encoding scheme for encoding characters and text. The repertoire of this international character code for information processing includes characters for the major scripts of the world, as well as technical symbols in common use. The Unicode character encoding treats alphabetic characters, ideographic characters, and symbols identically, which means they can be used in any mixture and with equal facility. The Unicode Standard is modeled on the ASCII character set, but uses a 16-bit encoding to support full multilingual text."
This means that the user can designate database table columns to store Unicode data, and clients, such as jConnect, can efficiently store Unicode data directly, without the overhead of conversion.
Two things must happen for jConnect to take advantage of this feature:
When these two conditions are met, jConnect can properly store and retrieve Unicode data from the database. With this feature enabled, your jConnect application continues to behave as expected: JDBC calls to methods such as PreparedStatement.setString(int column, String value) do not need to be modified just because you have set JCONNECT_VERSION to 6 and turned on the server capability.
The difference lies in the "under the covers" work that jConnect does when communicating character data to the server. When you store data in a database column designed to hold Unicode data, or select Unicode data from such a column, jConnect performs all the necessary conversions.
A side effect is that when the two conditions above are met and the unichar and univarchar datatypes are enabled, jConnect ignores the CHARSET and CHARSET_CONVERTER_CLASS connection property settings. With unichar enabled, all character data is passed to the server as Unicode data, so the CHARSET setting is irrelevant and jConnect handles all conversion internally.
For more information on support for unichar and univarchar datatypes, see the Adaptive Server Enterprise version 12.5 manuals.
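The point that application code stays unchanged can be sketched as follows. This is a minimal illustration, not a definitive implementation: the JCONNECT_VERSION property name and the value 6 come from the text above, while the class name, the table unicode_demo, and its columns are hypothetical, and passing the version as the string "6" is an assumption to verify against your jConnect release.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical helper class; none of these names are part of jConnect.
class UnicodeConnectionSketch {

    // Builds connection properties that request the version 6 behavior.
    static Properties buildProperties(String user, String password) {
        Properties props = new Properties();
        props.put("user", user);
        props.put("password", password);
        // Assumption: the version is supplied as the string "6".
        props.put("JCONNECT_VERSION", "6");
        return props;
    }

    // Stores a Java String with an ordinary setString call. The table and
    // column names are hypothetical; txt stands for a univarchar column.
    static int insertText(Connection conn, int id, String text) throws SQLException {
        String sql = "insert into unicode_demo (id, txt) values (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setInt(1, id);
            ps.setString(2, text); // same call as for a char or varchar column
            return ps.executeUpdate();
        }
    }
}
```

The properties would typically be passed to DriverManager.getConnection(url, props), and the resulting connection handed to insertText.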
jConnect uses special classes for all character-set conversions. By selecting a character-set converter class, you specify how jConnect should handle single-byte and multibyte character-set conversions, and the performance impact the conversions will have on your applications.
There are two character-set conversion classes. The conversion class that jConnect uses is based on the version setting (for example, VERSION_4), and the CHARSET and CHARSET_CONVERTER_CLASS connection properties.
jConnect uses the version setting from SybDriver.setVersion() to determine the default character-set converter class to use. For VERSION_2, the default is TruncationConverter. For VERSION_4 and later, the default is PureConverter.
You can also set the CHARSET_CONVERTER_CLASS connection property to specify which character-set converter you want jConnect to use. This is useful if you want to use a character-set converter other than the default for your jConnect version.
For example, if you set jConnect to VERSION_4 or later, but want to use the TruncationConverter class rather than the multibyte PureConverter class, you can set CHARSET_CONVERTER_CLASS:
For jConnect 4.x:
... props.put("CHARSET_CONVERTER_CLASS", "com.sybase.utils.TruncationConverter");
For jConnect 5.x:
... props.put("CHARSET_CONVERTER_CLASS", "com.sybase.jdbc2.utils.TruncationConverter");
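For a fuller picture, the fragments above could be wrapped in a small helper like the one below. It is only a sketch: the two fully qualified class names are the ones given above, while the helper class, its method names, and the idea of selecting by major version number are assumptions made for illustration.

```java
import java.util.Properties;

// Hypothetical helper; not part of jConnect.
class ConverterPropsSketch {

    // Returns the TruncationConverter class name quoted in the text
    // for the given jConnect major version (4 or 5).
    static String truncationConverterFor(int jconnectMajor) {
        if (jconnectMajor == 4) {
            return "com.sybase.utils.TruncationConverter";
        }
        if (jconnectMajor == 5) {
            return "com.sybase.jdbc2.utils.TruncationConverter";
        }
        throw new IllegalArgumentException(
                "No class name given in the text for jConnect " + jconnectMajor);
    }

    // Copies the caller's properties and overrides the default converter.
    static Properties withTruncationConverter(Properties base, int jconnectMajor) {
        Properties props = new Properties();
        props.putAll(base);
        props.put("CHARSET_CONVERTER_CLASS", truncationConverterFor(jconnectMajor));
        return props;
    }
}
```

Copying the properties rather than modifying them in place keeps the caller's original settings available if the default converter is needed for another connection.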
You can specify the character set to use in your application by setting the CHARSET driver property. If you do not set the CHARSET property:
You can also use the -J charset command line option for the IsqlApp application to specify a character set.
To determine which character sets are installed on your Adaptive Server, issue the following SQL query on your server:
select name from syscharsets
go
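The same query can also be issued from a Java client. The following sketch assumes an already open connection to Adaptive Server; the class and method names are hypothetical. Note that go is a batch terminator for isql and is not sent as part of the statement through JDBC.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of jConnect.
class InstalledCharsetsSketch {

    // The query from the text, without the isql "go" terminator.
    static final String QUERY = "select name from syscharsets";

    // Returns the character-set names reported by the server.
    static List<String> listCharsets(Connection conn) throws SQLException {
        List<String> names = new ArrayList<>();
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(QUERY)) {
            while (rs.next()) {
                names.add(rs.getString(1));
            }
        }
        return names;
    }
}
```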
For the PureConverter class, if the designated CHARSET does not work with the client's Java Virtual Machine (VM), the connection fails with a SQLException, indicating that you must set CHARSET to a character set that is supported by both Adaptive Server and the client.
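One way to anticipate this failure is to ask the client VM whether it recognizes a character set before connecting. The sketch below uses the standard java.nio.charset.Charset lookup as a stand-in; it is an assumption that this approximates the check jConnect performs internally, and the older converter names listed in Table 2-4 may or may not be accepted as aliases by a given VM. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical pre-flight check; not part of jConnect.
class CharsetPreflightSketch {

    // Returns true if the client VM has a charset registered under this name.
    // Null or syntactically illegal names are reported as unsupported
    // instead of raising an exception.
    static boolean jvmSupports(String javaCharsetName) {
        try {
            return Charset.isSupported(javaCharsetName);
        } catch (IllegalArgumentException e) {
            // Also covers IllegalCharsetNameException, which is a subclass.
            return false;
        }
    }
}
```

A check like this covers only the client side; the character set must still be installed on Adaptive Server.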
When the TruncationConverter class is used, character truncation is applied regardless of whether the designated CHARSET is 7-bit ASCII or not.
If you use multibyte character sets and need to improve driver performance, you can use the SunIoConverter class provided with the jConnect samples. See "Character-set conversion" for details.
Table 2-4 lists the Sybase character sets that are supported by jConnect. The table also lists the corresponding JDK byte converter for each supported character set.
Although jConnect supports UCS-2, currently no Sybase databases or open servers support UCS-2.
Adaptive Server Enterprise version 12.5 supports a version of Unicode known as the UTF-16 encoding.
You can still send Unicode data to Adaptive Server version 12.5 and later by setting the JCONNECT_VERSION property to VERSION_6 and by making UTF-8 the server's default character set.
The Sybase sjis character set does not include the IBM or Microsoft extensions to JIS, whereas the JDK SJIS byte converter includes these extensions. As a result, conversions from Java strings to a Sybase database using sjis may result in character values that are not supported by the Sybase database. However, conversions from sjis on a Sybase database to Java strings should not have this problem.
Table 2-4 lists the character sets currently supported by Sybase.
SybCharset name | JDK byte converter |
---|---|
ascii_7 | 8859_1 |
big5 | Big5 |
cp037 | Cp037 |
cp437 | Cp437 |
cp500 | Cp500 |
cp850 | Cp850 |
cp852 | Cp852 |
cp855 | Cp855 |
cp857 | Cp857 |
cp860 | Cp860 |
cp863 | Cp863 |
cp864 | Cp864 |
cp866 | Cp866 |
cp869 | Cp869 |
cp874 | Cp874 |
cp932 | Cp932 |
cp936 | Cp936 |
cp950 | Cp950 |
cp1250 | Cp1250 |
cp1251 | Cp1251 |
cp1252 | Cp1252 |
cp1253 | Cp1253 |
cp1254 | Cp1254 |
cp1255 | Cp1255 |
cp1256 | Cp1256 |
cp1257 | Cp1257 |
cp1258 | Cp1258 |
deckanji | EUCJIS |
eucgb | GB2312 |
eucjis | EUCJIS |
eucksc | Cp949 |
ibm420 | Cp420 |
ibm918 | Cp918 |
iso_1 | 8859_1 |
iso88592 | 8859_2 |
iso88595 | 8859_5 |
iso88596 | 8859_6 |
iso88597 | 8859_7 |
iso88598 | 8859_8 |
iso88599 | 8859_9 |
iso885915 | 8859_15 |
koi8 | KOI8_R |
mac | Macroman |
mac_cyr | MacCyrillic |
mac_ee | MacCentralEurope |
macgreek | MacGreek |
macturk | MacTurkish |
sjis (see note) | SJIS |
tis620 | MS874 |
utf8 | UTF8 |
jConnect versions 4.1 and later support the use of the new European currency symbol, or "euro," and its conversion to and from UCS-2 Unicode.
The euro has been added to the following Sybase character sets: cp1250, cp1251, cp1252, cp1253, cp1254, cp1255, cp1256, cp1257, cp1258, cp874, iso885915, and utf8.
Character sets cp1257, cp1258, and iso885915 are new.
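To illustrate why the choice of character set matters for the euro, the sketch below uses only standard JDK classes. It shows that the euro (U+20AC) has a UTF-8 encoding but cannot be encoded in ISO 8859-1, which corresponds to iso_1 in Table 2-4 and is not among the character sets listed above. The class and method names are hypothetical, and the code does not involve jConnect itself.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration class; uses only the JDK.
class EuroSymbolSketch {

    // The euro sign as a Java (UTF-16) character.
    static final char EURO = '\u20AC';

    // Encodes the euro sign as UTF-8 bytes.
    static byte[] euroAsUtf8() {
        return String.valueOf(EURO).getBytes(StandardCharsets.UTF_8);
    }

    // Reports whether ISO 8859-1 has a code point for the euro sign.
    static boolean latin1CanEncodeEuro() {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(EURO);
    }
}
```

Here euroAsUtf8() yields the three bytes E2 82 AC, and latin1CanEncodeEuro() returns false.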
To use the euro symbol:
The following Sybase character sets are not supported in jConnect 5.x because no JDK byte converters are analogous to the Sybase character sets:
You can use these character sets with the TruncationConverter class as long as the application uses only the 7-bit ASCII subsets of these character sets.
Copyright © 2001 Sybase, Inc. All rights reserved.