ASA SQL User's Guide
Summarizing, Grouping and Sorting Query Results
Performing set operations on query results with UNION, INTERSECT, and EXCEPT
NULL is treated differently by the set operators UNION, EXCEPT, and INTERSECT than it is in search conditions. This difference is one of the main reasons to use set operators.
When comparing rows, set operators treat NULL values as equal to each other. In contrast, when NULL is compared to NULL in a search condition the result is unknown (not true).
One particularly useful consequence of this difference is that the number of rows in the result set for query-1 EXCEPT ALL query-2
is always the difference in the number of rows in the result sets of the individual queries.
For example, consider two tables T1 and T2, each with the following columns:
col1 INT, col2 CHAR(1)
The data in the tables is as follows:
Table T1.
col1 | col2 |
---|---|
1 | a |
2 | b |
3 | (NULL) |
3 | (NULL) |
4 | (NULL) |
4 | (NULL) |
Table T2
col1 | col2 |
---|---|
1 | a |
2 | x |
3 | (NULL) |
One query that asks for rows in T1 that also appear in T2 is as follows:
SELECT T1.col1, T1.col2 FROM T1 JOIN T2 ON T1.col1 = T2.col2 AND T1.col2 = T2.col2
T1.col1 | T1.col2 |
---|---|
1 | a |
The row ( 3, NULL ) does not appear in the result set, as the comparison between NULL and NULL is not true. In contrast, approaching the problem using the INTERSECT operator includes a row with NULL:
SELECT col1, col2 FROM T1 INTERSECT SELECT col1, col2 FROM T2
col1 | col2 |
---|---|
1 | a |
3 | (NULL) |
The following query uses search conditions to list rows in T1 that do not appear in T2:
SELECT col1, col2 FROM T1 WHERE col1 NOT IN ( SELECT col1 FROM t2 WHERE t1.col2 = t2.col2 ) OR col2 NOT IN ( SELECT col2 FROM t2 WHERE t1.col1 = t2.col1 )
col1 | col2 |
---|---|
2 | b |
3 | (NULL) |
4 | (NULL) |
3 | (NULL) |
4 | (NULL) |
The NULL-containing rows from T1 are not excluded by the comparison. In contrast, approaching the problem using EXCEPT ALL excludes NULL-containing rows that appear in both tables. In this case, the (3, NULL) row in T2 is identified as the same as the (3, NULL) row in T1.
SELECT col1, col2 FROM T1 EXCEPT ALL SELECT col1, col2 FROM T2
col1 | col2 |
---|---|
2 | b |
3 | (NULL) |
4 | (NULL) |
4 | (NULL) |
The EXCEPT operator is more restrictive still. It eliminates both (3, NULL) rows from T1 and excludes one of the (4, NULL) rows as a duplicate.
SELECT col1, col2 FROM T1 EXCEPT SELECT col1, col2 FROM T2
col1 | col2 |
---|---|
2 | b |
4 | (NULL) |