4.4. TBL Format


Purpose

TBL is a text-based format for storing tabular data.


Overview

The basic rules are:

Comment lines begin with # as the first non-whitespace character.


Record Formats

For a given table, all records have the same format. The record format is either delimited or fixed-width.

An example of delimited format is given below.

# A simple TBL example
Name:Age:Phone:Addr
Bill:42:397 1234:14 Smith St, New Farm
Sarah:37:892 4321:105 Brown St, Chelmer
Joe:44:365 7890:6 Royal Av, Buranda

An example of fixed-width format is given below.

# A simple TBL example
Name    Age     Phone           Addr
Bill    42      397 1234        14 Smith St, New Farm
Sarah   37      892 4321        105 Brown St, Chelmer
Joe     44      365 7890        6 Royal Av, Buranda


Note: Tabs in fixed-width records are assumed to be 8 spaces wide.
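
As a rough illustration of how fixed-width records might be read, the Python sketch below expands tabs and slices each record at the columns where the field names start. The function name parse_fixed_width is hypothetical. The sketch also assumes that tabs advance to the next multiple of 8 columns (tab stops), that blank lines are skipped, that field names contain no whitespace, and that there are no multi-line fields; none of these details is confirmed by the text above.

```python
import re


def parse_fixed_width(lines, tab_width=8):
    """Parse fixed-width TBL lines into a list of dicts (illustrative sketch)."""
    # Skip blank lines (assumption) and comment lines, then expand tabs.
    data = [
        ln.rstrip("\n").expandtabs(tab_width)
        for ln in lines
        if ln.strip() and not ln.lstrip().startswith("#")
    ]
    header, rows = data[0], data[1:]

    # Assumption: each field starts at the column where its name starts.
    cols = [(m.group(), m.start()) for m in re.finditer(r"\S+", header)]
    starts = [start for _, start in cols]
    ends = starts[1:] + [None]  # the last field runs to the end of the line

    records = []
    for row in rows:
        record = {}
        for (name, start), end in zip(cols, ends):
            # Trailing whitespace is removed; leading whitespace is kept.
            record[name] = row[start:end].rstrip()
        records.append(record)
    return records
```

Calling parse_fixed_width on the lines of the fixed-width example above would give one dictionary per record, keyed by Name, Age, Phone and Addr.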


Input Format Specification

The first data line is called the input format specification. It specifies the field names and, through the character that follows the first field name, the record format.

Field names can contain:

If the first character after the first field name is a space or tab, the format is assumed to be fixed-width. Otherwise, the format is delimited, and that character is the field delimiter.
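
The detection rule above could be sketched in Python as follows. The function name detect_format is hypothetical, and the sketch assumes that field names consist only of letters, digits and underscores; the actual set of characters permitted in field names is not reproduced here. How a specification with a single field name should be classified is also an assumption (it is treated as fixed-width below).

```python
import re


def detect_format(spec_line):
    """Return ("fixed", None) or ("delimited", delimiter) for a spec line."""
    spec_line = spec_line.rstrip("\n")
    # Assumption: a field name is a run of letters, digits and underscores.
    match = re.match(r"\w+", spec_line)
    if match is None:
        raise ValueError("no field name found at start of line")
    rest = spec_line[match.end():]
    # Assumption: a lone field name with nothing after it counts as fixed-width.
    if rest == "" or rest[0] in " \t":
        return ("fixed", None)
    # Any other character is taken to be the field delimiter.
    return ("delimited", rest[0])
```

For the two examples earlier in this section, detect_format("Name:Age:Phone:Addr") returns ("delimited", ":") and a specification line whose first field name is followed by spaces returns ("fixed", None).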


Multi-line Fields

Generally, each data line specifies a single record. However, if the last field begins with the sequence '<<', then it is a multi-line field terminated by the first line beginning with '>>'. Multi-line fields are supported by both record formats.

Within a multi-line field, blank lines and lines starting with # are treated as part of that field; that is, it is not possible to embed a comment line within a multi-line field.

An example of a table containing multi-line fields is given below.

# A simple TBL example with multi-line fields
Name    Age     Phone           Addr
Bill    42      397 1234        <<
14 Smith St
New Farm QLD 4005
>>
Sarah   37      892 4321        <<
105 Brown St
Chelmer QLD 4068
>>
Joe     44      365 7890        <<
6 Royal Av
Buranda QLD 4102
>>
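
One possible way to group physical lines into records, including multi-line fields, is sketched below in Python. The generator name read_records is hypothetical. As a simplification, the sketch treats any line that ends with << as opening a multi-line field, which assumes nothing follows << on that line and that no ordinary last field ends with those characters; a more careful reader would split the line into fields first. Skipping blank lines outside multi-line fields is likewise an assumption.

```python
def read_records(lines):
    """Yield (record_line, multiline_text_or_None) pairs (illustrative sketch)."""
    it = iter(lines)
    for line in it:
        line = line.rstrip("\n")
        # Outside a multi-line field, skip blank lines (assumption) and comments.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        if line.rstrip().endswith("<<"):
            body = []
            # Consume lines from the same iterator until one begins with '>>'.
            for inner in it:
                inner = inner.rstrip("\n")
                if inner.startswith(">>"):
                    break
                # Blank lines and '#' lines are kept as part of the field.
                body.append(inner)
            yield line.rstrip()[:-2], "\n".join(body)
        else:
            yield line, None
```

Each yielded pair holds the single-line part of the record and, where present, the text of the multi-line field with its lines joined by newline characters. The first pair yielded is the input format specification line itself.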

Special Characters in Fields

For delimited fields:

  1. If a field contains the delimiter character or the double quote character, it must be enclosed in double quotes.
  2. A double quote character within a field is represented by two double quotes.
  3. Leading whitespace is kept.
  4. Trailing whitespace is kept.
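
The four rules for delimited fields can be illustrated with a small Python sketch. The names quote_field and split_delimited are hypothetical, and the treatment of input that breaks rule 1 (for example, a double quote in the middle of an unquoted field, which is kept literally here) is an assumption.

```python
def quote_field(value, delim):
    """Encode one field value for a delimited record."""
    # Rules 1 and 2: quote when needed and double any embedded quotes.
    if delim in value or '"' in value:
        return '"' + value.replace('"', '""') + '"'
    return value


def split_delimited(line, delim):
    """Split one delimited record into its field values."""
    fields, buf = [], []
    in_quotes = False
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if in_quotes:
            if ch == '"' and i + 1 < n and line[i + 1] == '"':
                buf.append('"')  # two double quotes stand for one
                i += 1
            elif ch == '"':
                in_quotes = False  # closing quote
            else:
                buf.append(ch)  # delimiters inside quotes are literal
        elif ch == '"' and not buf:
            in_quotes = True  # opening quote at the start of a field
        elif ch == delim:
            fields.append("".join(buf))
            buf = []
        else:
            buf.append(ch)  # rules 3 and 4: whitespace is kept as-is
        i += 1
    fields.append("".join(buf))
    return fields
```

With ":" as the delimiter, the value 14 Smith St: "Home" would be written as "14 Smith St: ""Home""" by quote_field, and split_delimited recovers the original value when reading the record back.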

For fixed-width fields:

  1. Leading whitespace is kept.
  2. Trailing whitespace is removed.


Note: Multi-line fields are enclosed by the << and >> symbols, so none of the rules above applies to them.