By default, MySQL searches are not case sensitive (although there are some character sets that are never case insensitive, such as czech). This means that if you search with col_name LIKE 'a%', you will get all column values that start with A or a. If you want to make this search case sensitive, make sure that one of the operands is a binary string. You can do this with the BINARY operator. Write the condition as either BINARY col_name LIKE 'a%' or col_name LIKE BINARY 'a%'.
If you want a column always to be treated in case-sensitive fashion, declare it as BINARY. See CREATE TABLE.
Simple comparison operations (>=, >, =, <, <=, sorting, and grouping) are based on each character's “sort value.” Characters with the same sort value (such as ‘E’, ‘e’, and ‘é’) are treated as the same character.
If you are using Chinese data in the so-called big5 encoding, you want to make all character columns BINARY. This works because the sorting order of big5 encoding characters is based on the order of ASCII codes. As of MySQL 4.1, you can explicitly declare that a column should use the big5 character set:
CREATE TABLE t (name CHAR(40) CHARACTER SET big5);
The format of a DATE value is 'YYYY-MM-DD'. According to standard SQL, no other format is allowed. You should use this format in UPDATE expressions and in the WHERE clause of SELECT statements. For example:
mysql> SELECT * FROM tbl_name WHERE date >= '2003-05-05';
As a convenience, MySQL automatically converts a date to a number if the date is used in a numeric context (and vice versa). It is also smart enough to allow a “relaxed” string form when updating and in a WHERE clause that compares a date to a TIMESTAMP, DATE, or DATETIME column. (“Relaxed form” means that any punctuation character may be used as the separator between parts. For example, '2004-08-15' and '2004#08#15' are equivalent.) MySQL can also convert a string containing no separators (such as '20040815'), provided it makes sense as a date.
The special date '0000-00-00' can be stored and retrieved as '0000-00-00'. When using a '0000-00-00' date through MyODBC, it is automatically converted to NULL in MyODBC 2.50.12 and above, because ODBC can't handle this kind of date.
Because MySQL performs the conversions described above, the following statements work:
mysql> INSERT INTO tbl_name (idate) VALUES (19970505); mysql> INSERT INTO tbl_name (idate) VALUES ('19970505'); mysql> INSERT INTO tbl_name (idate) VALUES ('97-05-05'); mysql> INSERT INTO tbl_name (idate) VALUES ('1997.05.05'); mysql> INSERT INTO tbl_name (idate) VALUES ('1997 05 05'); mysql> INSERT INTO tbl_name (idate) VALUES ('0000-00-00'); mysql> SELECT idate FROM tbl_name WHERE idate >= '1997-05-05'; mysql> SELECT idate FROM tbl_name WHERE idate >= 19970505; mysql> SELECT MOD(idate,100) FROM tbl_name WHERE idate >= 19970505; mysql> SELECT idate FROM tbl_name WHERE idate >= '19970505';
However, the following will not work:
mysql> SELECT idate FROM tbl_name WHERE STRCMP(idate,'20030505')=0;
STRCMP() is a string function, so it converts idate to a string in 'YYYY-MM-DD' format and performs a string comparison. It does not convert '20030505' to the date '2003-05-05' and perform a date comparison.
If you are using the ALLOW_INVALID_DATES SQL mode, MySQL allows you to store dates that are given only limited checking: MySQL ensures only that the day is in the range from 1 to 31 and the month is in the range from 1 to 12.
This makes MySQL very convenient for Web applications where you obtain year, month, and day in three different fields and you want to store exactly what the user inserted (without date validation).
If you are not using the NO_ZERO_IN_DATE SQL mode, the day or month part can be zero. This is convenient if you want to store a birthdate in a DATE column and you know only part of the date.
If you are not using the NO_ZERO_DATE SQL mode, MySQL also allows you to store '0000-00-00' as a “dummy date.” This is in some cases more convenient than using NULL values.
If the date cannot be converted to any reasonable value, a 0 is stored in the DATE column, which will be retrieved as '0000-00-00'. This is both a speed and a convenience issue. We believe that the database server's responsibility is to retrieve the same date you stored (even if the data was not logically correct in all cases). We think it is up to the application and not the server to check the dates.
If you want MySQL to check all dates and accept only legal dates (unless overriden by IGNORE), you should set sql_mode to "NO_ZERO_IN_DATE,NO_ZERO_DATE".
Date handling in MySQL 5.0.1 and earlier works like MySQL 5.0.2 with the ALLOW_INVALID_DATES SQL mode enabled.
The concept of the NULL value is a common source of confusion for newcomers to SQL, who often think that NULL is the same thing as an empty string ''. This is not the case. For example, the following statements are completely different:
mysql> INSERT INTO my_table (phone) VALUES (NULL); mysql> INSERT INTO my_table (phone) VALUES ('');
Both statements insert a value into the phone column, but the first inserts a NULL value and the second inserts an empty string. The meaning of the first can be regarded as “phone number is not known” and the meaning of the second can be regarded as “the person is known to have no phone, and thus no phone number.”
To help with NULL handling, you can use the IS NULL and IS NOT NULL operators and the IFNULL() function.
In SQL, the NULL value is never true in comparison to any other value, even NULL. An expression that contains NULL always produces a NULL value unless otherwise indicated in the documentation for the operators and functions involved in the expression. All columns in the following example return NULL:
mysql> SELECT NULL, 1+NULL, CONCAT('Invisible',NULL);
If you want to search for column values that are NULL, you cannot use an expr = NULL test. The following statement returns no rows, because expr = NULL is never true for any expression:
mysql> SELECT * FROM my_table WHERE phone = NULL;
To look for NULL values, you must use the IS NULL test. The following statements show how to find the NULL phone number and the empty phone number:
mysql> SELECT * FROM my_table WHERE phone IS NULL; mysql> SELECT * FROM my_table WHERE phone = '';
You can add an index on a column that can have NULL values if you are using MySQL 3.23.2 or newer and are using the MyISAM, InnoDB, or BDB storage engine. As of MySQL 4.0.2, the MEMORY storage engine also supports NULL values in indexes. Otherwise, you must declare an indexed column NOT NULL and you cannot insert NULL into the column.
When reading data with LOAD DATA INFILE, empty or missing columns are updated with ''. If you want a NULL value in a column, you should use \N in the data file. The literal word “NULL” may also be used under some circumstances. See LOAD DATA.
When using DISTINCT, GROUP BY, or ORDER BY, all NULL values are regarded as equal.
When using ORDER BY, NULL values are presented first, or last if you specify DESC to sort in descending order. Exception: In MySQL 4.0.2 through 4.0.10, NULL values sort first regardless of sort order.
Aggregate (summary) functions such as COUNT(), MIN(), and SUM() ignore NULL values. The exception to this is COUNT(*), which counts rows and not individual column values. For example, the following statement produces two counts. The first is a count of the number of rows in the table, and the second is a count of the number of non-NULL values in the age column:
mysql> SELECT COUNT(*), COUNT(age) FROM person;
For some column types, MySQL handles NULL values specially. If you insert NULL into a TIMESTAMP column, the current date and time is inserted. If you insert NULL into an integer column that has the AUTO_INCREMENT attribute, the next number in the sequence is inserted.
You can use an alias to refer to a column in GROUP BY, ORDER BY, or HAVING clauses. Aliases can also be used to give columns better names:
SELECT SQRT(a*b) AS root FROM tbl_name GROUP BY root HAVING root > 0; SELECT id, COUNT(*) AS cnt FROM tbl_name GROUP BY id HAVING cnt > 0; SELECT id AS 'Customer identity' FROM tbl_name;
Standard SQL doesn't allow you to refer to a column alias in a WHERE clause. This is because when the WHERE code is executed, the column value may not yet be determined. For example, the following query is illegal:
SELECT id, COUNT(*) AS cnt FROM tbl_name WHERE cnt > 0 GROUP BY id;
The WHERE statement is executed to determine which rows should be included in the GROUP BY part, whereas HAVING is used to decide which rows from the result set should be used.
If you receive the following message when trying to perform a ROLLBACK, it means that one or more of the tables you used in the transaction do not support transactions:
Warning: Some non-transactional changed tables couldn't be rolled back
These non-transactional tables will not be affected by the ROLLBACK statement.
If you were not deliberately mixing transactional and non-transactional tables within the transaction, the most likely cause for this message is that a table you thought was transactional actually is not. This can happen if you try to create a table using a transactional storage engine that is not supported by your mysqld server (or that was disabled with a startup option). If mysqld doesn't support a storage engine, it will instead create the table as a MyISAM table, which is non-transactional.
You can check the table type for a table by using either of these statements:
SHOW TABLE STATUS LIKE 'tbl_name'; SHOW CREATE TABLE tbl_name;
See SHOW TABLE STATUS and SHOW CREATE TABLE.
You can check which storage engines your mysqld server supports by using this statement:
SHOW ENGINES;
Before MySQL 4.1.2, SHOW ENGINES is unavailable. Use the following statement instead and check the value of the variable that is associated with the storage engine in which you are interested:
SHOW VARIABLES LIKE 'have_%';
For example, to determine whether the InnoDB storage engine is available, check the value of the have_innodb variable.
See SHOW ENGINES and SHOW VARIABLES.
MySQL does not support subqueries prior to Version 4.1, or the use of more than one table in the DELETE statement prior to Version 4.0. If your version of MySQL does not support subqueries or multiple-table DELETE statements, you can use the following approach to delete rows from two related tables:
SELECT the rows based on some WHERE condition in the main table.
DELETE the rows in the main table based on the same condition.
DELETE FROM related_table WHERE related_column IN (selected_rows).
If the total length of the DELETE statement for related_table is more than 1MB (the default value of the max_allowed_packet system variable), you should split it into smaller parts and execute multiple DELETE statements. You will probably get the fastest DELETE by specifying only 100 to 1,000 related_column values per statement if the related_column is indexed. If the related_column isn't indexed, the speed is independent of the number of arguments in the IN clause.
If you have a complicated query that uses many tables but that doesn't return any rows, you should use the following procedure to find out what is wrong:
Test the query with EXPLAIN to check whether you can find something that is obviously wrong. See EXPLAIN.
Select only those columns that are used in the WHERE clause.
Remove one table at a time from the query until it returns some rows. If the tables are large, it's a good idea to use LIMIT 10 with the query.
Issue a SELECT for the column that should have matched a row against the table that was last removed from the query.
If you are comparing FLOAT or DOUBLE columns with numbers that have decimals, you can't use equality (=) comparisons. This problem is common in most computer languages because not all floating-point values can be stored with exact precision. In some cases, changing the FLOAT to a DOUBLE will fix this. See the section called “Problems with Floating-Point Comparisons”.
If you still can't figure out what's wrong, create a minimal test that can be run with mysql test < query.sql that shows your problems. You can create a test file by dumping the tables with mysqldump --quick db_name tbl_name_1 ... tbl_name_n > query.sql. Open the file in an editor, remove some insert lines (if there are more than needed to demonstrate the problem), and add your SELECT statement at the end of the file.
Verify that the test file demonstrates the problem by executing these commands:
shell> mysqladmin create test2 shell> mysql test2 < query.sql
Post the test file using mysqlbug to the general MySQL mailing list. See the section called “The MySQL Mailing Lists”.
Floating-point numbers sometimes cause confusion because they are not stored as exact values inside computer architecture. What you can see on the screen usually is not the exact value of the number. The column types FLOAT, DOUBLE, and DECIMAL are such. DECIMAL columns store values with exact precision because they are represented as strings, but calculations on DECIMAL values may be done using floating-point operations.
The following example demonstrate the problem. It shows that even for the DECIMAL column type, calculations that are done using floating-point operations are subject to floating-point error.
mysql> CREATE TABLE t1 (i INT, d1 DECIMAL(9,2), d2 DECIMAL(9,2)); mysql> INSERT INTO t1 VALUES (1, 101.40, 21.40), (1, -80.00, 0.00), -> (2, 0.00, 0.00), (2, -13.20, 0.00), (2, 59.60, 46.40), -> (2, 30.40, 30.40), (3, 37.00, 7.40), (3, -29.60, 0.00), -> (4, 60.00, 15.40), (4, -10.60, 0.00), (4, -34.00, 0.00), -> (5, 33.00, 0.00), (5, -25.80, 0.00), (5, 0.00, 7.20), -> (6, 0.00, 0.00), (6, -51.40, 0.00); mysql> SELECT i, SUM(d1) AS a, SUM(d2) AS b -> FROM t1 GROUP BY i HAVING a <> b; +------+--------+-------+ | i | a | b | +------+--------+-------+ | 1 | 21.40 | 21.40 | | 2 | 76.80 | 76.80 | | 3 | 7.40 | 7.40 | | 4 | 15.40 | 15.40 | | 5 | 7.20 | 7.20 | | 6 | -51.40 | 0.00 | +------+--------+-------+
The result is correct. Although the first five records look like they shouldn't pass the comparison test (the values of a and b do not appear to be different), they may do so because the difference between the numbers shows up around the tenth decimal or so, depending on computer architecture.
The problem cannot be solved by using ROUND() or similar functions, because the result is still a floating-point number:
mysql> SELECT i, ROUND(SUM(d1), 2) AS a, ROUND(SUM(d2), 2) AS b -> FROM t1 GROUP BY i HAVING a <> b; +------+--------+-------+ | i | a | b | +------+--------+-------+ | 1 | 21.40 | 21.40 | | 2 | 76.80 | 76.80 | | 3 | 7.40 | 7.40 | | 4 | 15.40 | 15.40 | | 5 | 7.20 | 7.20 | | 6 | -51.40 | 0.00 | +------+--------+-------+
This is what the numbers in column a look like when displayed with more decimal places:
mysql> SELECT i, ROUND(SUM(d1), 2)*1.0000000000000000 AS a, -> ROUND(SUM(d2), 2) AS b FROM t1 GROUP BY i HAVING a <> b; +------+----------------------+-------+ | i | a | b | +------+----------------------+-------+ | 1 | 21.3999999999999986 | 21.40 | | 2 | 76.7999999999999972 | 76.80 | | 3 | 7.4000000000000004 | 7.40 | | 4 | 15.4000000000000004 | 15.40 | | 5 | 7.2000000000000002 | 7.20 | | 6 | -51.3999999999999986 | 0.00 | +------+----------------------+-------+
Depending on your computer architecture, you may or may not see similar results. Different CPUs may evaluate floating-point numbers differently. For example, on some machines you may get the “correct” results by multiplying both arguments by 1, as the following example shows.
Warning: Never use this method in your applications. It is not an example of a trustworthy method!
mysql> SELECT i, ROUND(SUM(d1), 2)*1 AS a, ROUND(SUM(d2), 2)*1 AS b -> FROM t1 GROUP BY i HAVING a <> b; +------+--------+------+ | i | a | b | +------+--------+------+ | 6 | -51.40 | 0.00 | +------+--------+------+
The reason that the preceding example seems to work is that on the particular machine where the test was done, CPU floating-point arithmetic happens to round the numbers to the same value. However, there is no rule that any CPU should do so, so this method cannot be trusted.
The correct way to do floating-point number comparison is to first decide on an acceptable tolerance for differences between the numbers and then do the comparison against the tolerance value. For example, if we agree that floating-point numbers should be regarded the same if they are same within a precision of one in ten thousand (0.0001), the comparison should be written to find differences larger than the tolerance value:
mysql> SELECT i, SUM(d1) AS a, SUM(d2) AS b FROM t1 -> GROUP BY i HAVING ABS(a - b) > 0.0001; +------+--------+------+ | i | a | b | +------+--------+------+ | 6 | -51.40 | 0.00 | +------+--------+------+ 1 row in set (0.00 sec)
Conversely, to get rows where the numbers are the same, the test should find differences within the tolerance value:
mysql> SELECT i, SUM(d1) AS a, SUM(d2) AS b FROM t1 -> GROUP BY i HAVING ABS(a - b) <= 0.0001; +------+-------+-------+ | i | a | b | +------+-------+-------+ | 1 | 21.40 | 21.40 | | 2 | 76.80 | 76.80 | | 3 | 7.40 | 7.40 | | 4 | 15.40 | 15.40 | | 5 | 7.20 | 7.20 | +------+-------+-------+