
ASA SQL User's Guide
  Query Optimization and Execution
    Query execution algorithms
      Duplicate elimination

Hash distinct

The hash distinct algorithm reads its input and builds an in-memory hash table. If an input row is already in the hash table, it is ignored; otherwise it is added to the hash table and written to a work table. If the input does not fit completely into the in-memory hash table, it is partitioned into smaller work tables, which are processed recursively.
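The behavior described above can be sketched in a few lines of Python. This is only an illustration, not the server's implementation: the function name `hash_distinct`, the row limit `max_table_rows` (standing in for available cache memory), and the partition count `num_partitions` are all hypothetical, and Python lists stand in for work tables.

```python
def hash_distinct(rows, max_table_rows=1000, num_partitions=8, depth=0):
    """Return the distinct rows of `rows` (a list of hashable rows).

    Illustrative sketch only; names and limits are assumptions.
    """
    table = set()    # stands in for the in-memory hash table
    output = []      # stands in for the output work table
    for row in rows:
        if row in table:
            continue                  # duplicate: ignore it
        if len(table) == max_table_rows:
            break                     # hash table is full: give up on this pass
        table.add(row)
        output.append(row)
    else:
        return output                 # every distinct row fit in memory

    # Overflow: split the input into smaller work tables by hash value.
    # Equal rows always land in the same partition, so each partition
    # can be de-duplicated independently of the others.
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # Mixing in `depth` makes each recursion level split differently.
        partitions[hash((depth, row)) % num_partitions].append(row)

    result = []
    for part in partitions:
        result.extend(hash_distinct(part, max_table_rows, num_partitions, depth + 1))
    return result
```

When everything fits in the hash table, the sketch returns rows in first-seen order. After partitioning, the output order depends on the partitioning and should not be relied on.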

The hash distinct algorithm works very well if the distinct rows fit into an in-memory table, irrespective of the total number of rows in the input.

Hash distinct uses a work table, and so it can provide insensitive or value-sensitive cursor semantics.

If the hash distinct algorithm executes in an environment where very little cache memory is available, it cannot complete. In this case, the hash distinct method discards its interim results, and the indexed distinct algorithm is used instead. The optimizer avoids generating access plans that use the hash distinct algorithm if it detects that a low-memory situation may occur during query execution.
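The fallback can be pictured with the following sketch. It is an assumption-laden illustration: `LowMemory`, `max_table_rows`, and all function names are hypothetical, recursive partitioning is left out for brevity, and a sorted Python list searched with `bisect` is only a stand-in for the indexed distinct algorithm, which this section does not describe.

```python
import bisect


class LowMemory(Exception):
    """Hypothetical signal that the hash table outgrew its memory budget."""


def hash_distinct_in_memory(rows, max_table_rows):
    table, output = set(), []
    for row in rows:
        if row not in table:
            if len(table) >= max_table_rows:
                raise LowMemory       # cannot complete within the budget
            table.add(row)
            output.append(row)
    return output


def indexed_distinct_stand_in(rows):
    # Stand-in only: a sorted list plays the role of a unique index.
    index, output = [], []
    for row in rows:
        pos = bisect.bisect_left(index, row)
        if pos == len(index) or index[pos] != row:
            index.insert(pos, row)
            output.append(row)
    return output


def distinct_with_fallback(rows, max_table_rows):
    """Return (distinct_rows, method_used)."""
    try:
        return hash_distinct_in_memory(rows, max_table_rows), "hash"
    except LowMemory:
        # The interim hash results are discarded and the input is
        # processed again from the start by the alternative method.
        return indexed_distinct_stand_in(rows), "indexed"
```

Restarting from the beginning is what makes the fallback expensive, which is consistent with the optimizer avoiding hash distinct when it expects memory to be short.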

The hash distinct returns a row as soon as it finds one that has not previously been returned. However, the results of a hash distinct must be fully materialized before returning from the query. If necessary, the optimizer adds a work table to the execution plan to ensure this.
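The difference between returning rows incrementally and fully materializing the result can be illustrated with a Python generator. The names `hash_distinct_stream` and `materialize` are hypothetical, and the list built by `materialize` only stands in for the work table that the optimizer may add to the plan.

```python
def hash_distinct_stream(rows):
    """Yield each row the first time it is seen, without waiting for the
    rest of the input (illustrative sketch only)."""
    seen = set()
    for row in rows:
        if row not in seen:
            seen.add(row)
            yield row


def materialize(row_iter):
    """Stand-in for a work table placed above the distinct operator:
    drain the operator completely before handing anything back."""
    return list(row_iter)
```

Calling `next()` on the generator reads only as much input as is needed to find the first new row, whereas `materialize` reads the input to the end before it returns.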

Hash distinct locks the rows of its input.

