HSQLDB Advanced Guide

ADVANCED TOPICS

Many questions repeatedly asked in Forums and mailing lists are answered in this guide. If you want to use HSQLDB with your application, you should read this guide. This document covers system related issues. For issues related to SQL see hsqlSQLGuide.html

CONNECTIONS

The normal method of accessing an HSQLDB database is via the JDBC Connection interface. An introduction to different methods of providing database services and accessing them can be found in hsqlGuide.html. Details and examples of how to connect via JDBC are provided in our JavaDoc for jdbcConnection.html.

Version 1.7.2 introduces a uniform method of distinguishing between different types of connection, alongside new capabilities to provide access to multiple databases. The common driver identifier is jdbc:hsqldb: followed by a protocol identifier (mem: file: res: hsql: http: hsqls: https:) then followed by host and port identifiers in the case of servers, then followed by database identifier.

DRIVER AND PROTOCOL	HOST AND PORT	DATABASE
`jdbc:hsqldb:mem:`	not available	`mydb accounts`
Lowercase, single-word identifier creates the in-memory database when the first connection is made. Subsequent use of the same Connection URL connects to the existing DB. The old form for the URL, `jdbc:hsqldb:.` creates or connects to the same database as the new form for the URL, `jdbc:hsqldb:mem:.`
`jdbc:hsqldb:file:`	not available	`mydb /opt/db/accounts C:/data/mydb`
The file path specifies the database file. In the above examples the first one refers to a set of mydb.* files in the directory where the `java` command for running the application was issued. The second and third examples refer to absolute paths on the host machine.
`jdbc:hsqldb:res:`	not available	`/adirectory/dbname`
Database files can be loaded from one of the jars specified as part of the `Java` command the same way as resource files are accessed in Java programs. The `/adirectory` above stands for a directory in one of the jars.
`jdbc:hsqldb:hsql: jdbc:hsqldb:hsqls: jdbc:hsqldb:http: jdbc:hsqldb:https:`	`//localhost //192.0.0.10:9500 //dbserver.somedomain.com`	`/an_alias /enrolments /quickdb`
The host and port specify the IP address or host name of the server and an optional port number. The database to connect to is specified by an alias. This alias is defined in the `server.properties` to refer to an actual database on the file system of the server or a transient, in-memory database on the server. The following example lines in `server.properties` or `webserver.properties` define the database aliases listed above and accessible to clients to refer to different file and in-memory databases. `database.0=file:/opt/db/accounts dbname.0=an_alias` `database.1=file:/opt/db/mydb dbname.1=enrolments` `database.2=mem:adatabase dbname.2=quickdb` The old form for the server URL, e.g., `jdbc:hsqldb:hsql//localhost` connects to the same database as the new form for the URL, `jdbc:hsqldb:hsql//localhost/` where the alias is a zero length string. In the example below, the database files `lists.*` in the `/home/dbmaster/` directory are associated with the empty alias: `database.3=/home/dbmaster/lists dbname.3=`

Connection properties

Each new JDBC Connection to a database can specify connection properties. The properties user and password are always required. In 1.7.2 the following optional properties can also be used.

Connection properties are specified either by establishing the connection via the:

DriverManager.getConnection (String url, Properties info);

method call. Or the property can be appended to the full Connection URL.

get_column_name	true	column name in ResultSet
When true, `ResultSet.getColumnName(int c)` returns the underlying column name When false, the above method returns the same value as `ResultSet.getColumnLabel(int column)` Example below: `jdbc:hsqldb:hsql://localhost/enrollments;get_column_name=false`
ifexists	false	connect only if database already exists
Has an effect only with `mem:` and `file:` database. When true, will not create a new database if one does not already exist for the URL When false (the default), a new `mem:` or `file:` database will be created if it does not exist. Setting the property to true is useful when troubleshooting as no database is created if the URL is malformed. Example below: `jdbc:hsqldb:file:enrollments;ifexists=true`

PROPERTIES FILES

HSQLDB relies on a set of properties files for different settings. Version 1.7.0 streamlines property naming and introduces a number of new properties (in this document, all references to versions 1.7.0 also apply to versions 1.7.1 and 1.7.2 unless stated otherwise) . This process will continue with future versions and the properties will be used in a hierarchical manner.

In all properties files, values are case-sensitive. All values apart from names of files or pages are required in lowercase (e.g. server.silent=FALSE will have no effect, but server.silent=false will work).

The properties files and the settings stored in them are as follows:

FILE NAME	LOCATION	FUNCTION
`server.properties`	the directory where the command to run the `Server` class is issued	settings for running HSQLDB as a database server communicating with the HSQL protocol
`webserver.properties`	the directory where the command to run the `WebServer` class is issued	settings for running HSQLDB as a database server communicating with the HTTP protocol
`<dbname>.properties`	the directory where all the files for a database are located	settings for each particular database

Properties files for running the servers are not created automatically. You should create your own files that contain server.property=value pairs for each property.

The properties file for each database is generated by the database engine. This file can be edited after closing the database. In 1.7.2, some property values can be changed via SQL commands.

Server and Web Server Properties

In both server.properties and webserver.properties files, supported values and their defaults are as follows:

VALUE	DEFAULT	DESCRIPTION
server.database.0	test	the path and file name of the first database file to use
server.dbname.0	""	server alias for the first database file
server.silent	true	no extensive messages displayed on console
server.trace	false	JDBC trace messages displayed on console

In 1.7.2, each server can serve up to 10 different databases simultaneously. The server.database.0 property defines the filename / path whereas the server.dbname.0 defines the alias used by clients to connect to that database. The digit 0 is incremented for the second database and so on. Values for the server.database.{0-9} property can use the mem:, file: or res: prefixes as discussed above under CONNECTIONS.

Values specific to server.properties are:

VALUE	DEFAULT	DESCRIPTION
server.port	9001	TCP/IP port used for talking to clients. All databases are served on the same port.
server.no_system_exit	true	no `System.exit()` call when the database is closed

Values specific to webserver.properties are:

VALUE	DEFAULT	DESCRIPTION
server.port	80	TCP/IP port used for talking to clients
server.default_page	index.html	the default web page for server
server.root	./	the location of served pages
.<extension>	?	multiple entries such as `.html=text/html` define the mime types of the static files served by the web server. See the source for `WebServer.java` for a list.

All the above values can be specified on the command line to start the server by omitting the server. prefix.

Starting a Server from your application

If you want to start the server from within your application, as opposed to the command line or batch files, you should create an instance of Server or Web Server, then assign the properties in the form of a String and start the Server. An example of this can be find in the org.hsqldb.test.TestBase source.

Upgrading: If you have existing custom properties files, change the values to the new naming convention. Note the use of digits at the end of server.database.n and server.dbname.n properties.

Individual Database Properties

Each database has its own <dbname>.properties file as part of a small group of files which also includes <dbname>.script and <dbname>.data . The properties files contain key/value pairs for some important settings.

In version 1.7.2 a new SQL command allows some user-accessible database properties to be modified as follows:

SET PROPERTY "property_name" property_value

Properties that can be modified via SET PROPERTY are indicated in the table below. The *.properties file can also be edited after closing the database. Only the user-defined values listed below should ever be modified. Changing any other value will result in unexpected malfunction in database operations. Most of these values have been introduced for the new features since 1.7.0 and are listed below with their default values in different contexts:

VALUE	DEFAULT	DESCRIPTION
`readonly`	`no`	whole database is read-only
When true, the database cannot be modified in use. This setting can be changed to `yes` if the database is to be opened from a CD. Prior to changing this setting, the database should be closed with the `SHUTDOWN COMPACT` command to ensure consistency and compactness of the data.
`hsqldb.files_readonly`	`false`	database files will not be written to
When true, data in MEMORY tables can be modified and new MEMORY tables can be added. However, these changes are not saved when the database is shutdown. CACHED and TEXT tables are always readonly when this setting is true.
`hsqldb.first_identity`	`0`	first identity value for a new table
The first value assigned automatically to the identity column of new tables.`(SET PROPERTY)`
sql.enforce_size	false	trimming and padding string columns
When true, all CHARACTER and VARCHAR values that are in a row affected by an INSERT INTO or UPDATE statement are trimmed to the size specified in the SQL table definition. Also all char strings that are shorter than the specified size are padded with spaces. When false (default), stores the exact string that is inserted. `(SET PROPERTY)`
sql.enforce_strict_size	false	size enforcement and padding string columns
Conforms to SQL standards. When true, all CHARACTER and VARCHAR values that are in a row affected by an INSERT INTO or UPDATE statement are checked against the size specified in the SQL table definition. An exception is thrown if the value is too long. Also all char strings that are shorter than the specified size are padded with spaces. When false (default), stores the exact string that is inserted. `(SET PROPERTY)`
sql.compare_in_locale	false	locale used for sorting
CHARACTER and VARCHAR columns are by default sorted according to POSIX standards. Setting the value to true will result in sorting in the character set of the current JRE locale. Changing this value for an existing database that contains cached tables will break the indexing and result in inconsistent operation. To avoid this, first change the value in the properties file, then open the database and issue the SHUTDOWN COMPACT command to recreate all the indexes. If set `true`, this setting affects all the database in the JVM.
sql.strict_fk	true	require unique index for foreign keys
Deprecated property. 1.7.2 always requires a pre-existing unique constraint or primary key for the column(s) referenced by a foreign key constraint and returns an error if the unique constraint does not already exist.
sql.strong_fk	true	make unique index for foreign keys
Deprecated property. In version 1.7.2 no automatic index will be created on referenced columns and this property will have no effect.
hsqldb.cache_scale	14	memory cache exponent
Indicates the size of memory cache used with cached tables, calculated as 3 (2value) (three multiplied by (two to the power value)). The default results in thus up to 316384 rows from all cached tables being held in memory at any time. The value can range between 8-16. `(SET PROPERTY)` If the value is set via SET PROPERTY then it becomes effective after the next database SHUTDOWN.
hsqldb.log_size	200	size of log when checkpoint is performed
The value is the size in megabytes that the `.script` file can reach before an automatic checkpoint occurs to rewrite the file. The value can be changed via the `SET LOGSIZE nnn` SQL command.
hsqldb.gc_interval	0	forced garbage collection
This setting forces garbage collection each time a set number of result set row or cache row objects are created. The default, "0" means no garbage collection is forced by the program. This should not be set when the database engine is acting as a server inside an exclusive JVM. The setting can be useful when the database is used in-process with the application with some Java Runtime Environments (JRE�s). Some JRE�s increase the size of the memory heap before doing any automatic garbage collection. This setting would prevent any unnecessary enlargement of the heap. Typical values for this setting would probably be between 10,000 to 100,000. `(SET PROPERTY)`
`hsqldb.nio_data_file`	true	use of nio access methods for the .data file
When HSQLDB is compiled and run in Java 1.4, setting this property to `false` will avoid the use of nio access methods, resulting in reduced speed. `(SET PROPERTY)` If used before defining any CACHED table, it applies to the current session, otherwise it comes to effect after a SHUTDOWN and restart.
textdb.*	0	default properties for new text tables
Properties that override the database engine defaults for newly created text tables. Settings in the text table `SET <tablename> SOURCE <source string>` command override both the engine defaults and the database properties defaults. Individual `textdb.*` properties are listed in hsqlTextTables.html . `(SET PROPERTY)`

Upgrading: From 1.7.0, the location of the database files can no longer be overridden by paths defined in the properties file. All files belonging to a database should reside in the same directory.

JDBC STREAM BASED METHODS

The ResultSet interface methods, getAsciiStream(), getUnicodeStream() and getCharacterStream() are now supported to return byte or char values from CHARACTER columns and its variants. Complementary methods in PreparedStatement, setAsciiStream(), setUnicodeStream() and setCharacterStream() are also supported. Unlike most other databases, the getString() methods can be used to retrieve very long character strings and is faster than the stream based methods.

Upgrading: Several JDBC methods that previously returned incorrect values or were not supported now return correct values. All changes have been documented in the Javadoc for the jdbcXXX classes.

MANAGING DATABASE CONNECTIONS

In all running modes (server or in-process) multiple connections to the database engine are supported. In-process (standalone) mode supports connections from the client in the same Java Virtual Machine, while server modes support connections over the network from several different clients.

Connection pooling software can be used to connect to the database but it is not generally necessary. With other database engines, connection pools are used for reasons that may not apply to HSQLDB.

(a) To allow new queries to be performed while a time-consuming query is being performed in the background. This is not possible with HSQLDB as it blocks while performing the first query and deals with the next query once it has finished it.

(b) To limit the maximum number of simultaneous connections to the database for performance reasons. With HSQLDB this can be useful only if your application is designed in a way that opens and closes connections for each small task.

(c) To control transactions in a multi-threaded application. This can be useful with HSQLDB as well. For example, in a web application, a transaction may involve some processing between the queries or user action across web pages. A separate connection should be used for each session so that the work can be committed when completed or rolled back otherwise.

An application that is not both multi-threaded and transactional, such as an application for recording user login and logout actions, does not need more than one connection. The connection can stay open indefinitely and reopened only when it is dropped due to network problems.

When using an in-process database with versions prior to 1.7.2 the application program had to keep at least one connection to the database open, otherwise the database would have been closed and further attempts to create connections could fail. This is not necessary in 1.7.2, which does not automatically close an in-process database that is opened by establishing a connection. An explicit SHTDOWN command, with or without an argument, is required to close the database.

When using a server database (and to some extent, an in-process database), care must be taken to avoid creating and dropping JDBC Connections too frequently. Failure to observe this will result in unsuccessful connection attempts when the application is under heavy load.

MEMORY AND DISK USE

Memory used by the program can be thought of as two distinct pools: memory used for table data, and memory used for building result sets. In addition, when transactions are used, memory is utilised for storing the information needed for a rollback.

Since version 1.7.1, memory use has been significantly reduced compared to previous versions. The memory used for a MEMORY table is the sum of memory used by each row. Each MEMORY table row is a Java object that has 2 slots for int or reference variables (10 slots in previous versions). It contains an array of objects for the fields in the row. Each field is an object such as Integer, Long, String, etc. In addition each index on the table adds a node object to the row. Each node object has 6 slots for int or reference variables (12 slots in previous versions). As a result, a table with just one column of type INTEGER will have four objects per row, with a total of 10 slots of 4 bytes each - currently taking up 80 bytes per row. Beyond this, each extra column in the table adds at least a few bytes to the size of each row.

With CACHED tables, the data is stored on disk and only up to a maximum number of rows are held in memory at any time. The default is up to 3*16384. The hsqldb.cache_scale database property can be set to alter this amount. As any random subset of the rows in any of the CACHED tables can be held in the cache, the amount of memory needed by cached rows can reach the sum of the rows containing the largest field data. For example if a table with 100,000 rows contains 40,000 rows with 1,000 bytes of data in each row and 60,000 rows with 100 bytes in each, the cache can grow to contain over 49,000 rows, including all the 40,000 larger rows. If memory is limited, the hsqldb.cache_scale database property should be reduced to 13 (corresponding to 3*8192 rows) or less.

The memory used for a result set row has fewer overheads (fewer slots and no index nodes) but still uses a lot of memory. All the rows in the result set are built in memory, so very large result sets may not be possible. In server mode databases, the result set memory is released from the server once the database server has returned the result set. In-process databases release the memory when the application program releases the java.sql.ResultSet object. Server modes require additional memory for returning result sets, as they convert the full result set into an array of bytes which is then transmitted to the client.

When transactions support is enabled with SET AUTOCOMMIT OFF, lists of all insert, delete or update operations are stored in memory so that they can be undone when ROLLBACK is issued. Transactions that span hundreds of modification to data will take up a lot of memory until the next COMMIT or ROLLBACK clears the list.

Most JVM implementations allocate up to a maximum amount of memory (usually 64 MB by default). This amount is generally not adequate when large memory tables are used, or when the average size of rows in cached tables is larger than a few hundred bytes. The maximum amount of allocated memory can be set on the JAVA ... command line that is used for running HSQLDB. For example, with Sun JVM version 1.3.0 the parameter -Xmx256m increases the amount to 256 MB.

1.7.2 uses a fast cache for immutable objects such as Integer or String that are stored in the database. In most circumstances, this reduces the memory footprint still further as fewer copies of the most frequently-used objects are kept in memory.

UPGRADING AND MANAGING DATABASES

Any database not produced with the release version of HSQLDB 1.7.2 must be upgraded to this version. This includes databases created with the ALPHA versions of 1.7.2. The instructions under UPGRADING USING THE SCRIPT COMMAND should be followed when possible. The DIRECT UPGRADE instructions may or may not work depending on the structure of the database.

DIRECT UPGRADE

Databases created with Hypersonic version 1.43 and HSQLDB 1.6x can sometimes be seamlessly upgraded to the new version. However, in many cases the upgrade is not so simple. The following procedure MAY work. If it doesn't, try the alternative upgrade procedure using the SCRIPT command.

Make a backup of your database.
Open the database using the old version of the Database Manager software. Use the SHUTDOWN COMPACT command.
Open the database with the new version 1.7.2 of Database Manager. Check the data is consistent.
Issue the SHUTDOWN COMPACT command.
Start using the database.

Once a database is upgraded to 1.7.2, it can no longer be used with Hypersonic or HSQLDB 1.6.x, 1.7.0 or 1.7.1.

Some Potential Problems

Version 1.7.2 does not accept duplicate names for indexes.
Version 1.7.2 does not accept duplicate names for table columns.
Version 1.7.2 does not create the same type of index for foreign keys as previous versions.

UPGRADING USING THE SCRIPT COMMAND

In all versions of HSQLDB and Hypersonic 1.43, the SCRIPT 'filename' command (used as an SQL query) allows you to save a full record of your database, including database object definitions and data, to a file of your choice. If you encounter any problems upgrading your database with the direct method, you can export a script file using the old version of the database engine and open the script as a database with 1.7.2.

Open the original database in the old version of DatabaseManager
Issue the SCRIPT command, for example SCRIPT 'newversion.script' to create a script file containing a copy of the database.
You can now edit the newversion.script file to change any table and index definition so long as it is consistent with the data. Use the guidelines in the next section (Manual Changes to the .script File). Use a programming editor that is capable of handling very large files and does not wrap long lines of text.
Use the 1.7.2 version of DatabaseManager to create a new database, in this example 'newversion' in a different directory.
SHUTDOWN this database.
Copy the newversion.script file from step 2 over the file of the same name for the new database created in 4.
Try to open the new database using DatabaseManager.
If there is any inconsistency in the data, the script line number is reported on the console and the opening process is aborted. Edit and correct any problems in the newversion.script before attempting to open again.

Manual Changes to the .script File

In 1.7.2 several ALTER TABLE commands are available to change the data structures and their names. However, if an old database cannot be opened due data inconsistencies, or when there is no ALTER TABLE command for the particular change, manual editing of the SCRIPT file can sometimes be used.

Index and row data for CACHED tables is stored in the *.data file. Because of this, in 1.7.2, a new command, SHUTDOWN SCRIPT, has been introduced to save all the CACHED table data in the .script file and delete the .data and *.backup files. After issuing this command, you can make changes to the *.script file as listed below. This procedure can also be used for adding and removing indexes or constraints to CACHED tables when the size of the *.data file is over 1GB and the normal SQL commands do not work due to the *.data file growing beyond 2GB.

The following changes can be applied so long as they do not affect the integrity of existing data.

Names of tables, columns and indexes can be changed.

CREATE UNIQUE INDEX ... to CREATE INDEX ... and vice versa

A unique index can always be converted into a normal index. A non-unique index can only be converted into a unique index if the table data for the column(s) is unique in each row.

NOT NULL

A not-null constraint can always be removed. It can only be added if the table data for the column has no null values.

PRIMARY KEY

A primary key constraint can be removed or added. It cannot be removed if there is a foreign key referencing the column(s).

COLUMN TYPES

Some changes to column types are possible. For example an INTEGER column can be changed to BIGINT, or DATE, TIME and TIMESTAMP columns can be changed to VARCHAR.

Any other changes to data structures should be made only through the supported ALTER commands.

After completing the changes and saving the modified *.script file, you can open the database as normal.

Backing Up Databases

The data for each database consists of up to 5 files in the same directory. The endings are *.properties, *.script, *.data, *.backup and *.log (a file with the *.lck ending is used for controlling access to the database and should not be backed up). These should be backed up together. The files can be backed up while the engine is running but care should be taken that a CHECKPOINT or SHUTDOWN operation does not take place during the backup. It is more efficient to perform the backup immediately after a CHECKPOINT. The *.data file can be excluded from the backup (the *.backup file will be used to replace this file if the full backup is restored). If a backup immediately follows a checkpoint, then the *.log file can also be excluded, limiting the significant files to *.properties, *.script, *.data and *.backup. Normal backup methods, such as archiving the files in a compressed bundle can be used.

Author: Fred Toussi - 14 July 2002 - updated 22 March 2004

Copyright 2002-2003 Fred Toussi. Permission is granted to distribute this document without any alteration under the terms of the HSQLDB license. Additional permission is granted to the HSQLDB Development Group to distribute this document with or without alterations under the terms of the HSQLDB license.

http://hsqldb.sourceforge.net