MySQL Standards Compliance

This section describes how MySQL relates to the ANSI/ISO SQL standards. MySQL Server has many extensions to the SQL standard, and here you will find out what they are and how to use them. You will also find information about functionality missing from MySQL Server, and how to work around some differences.

The SQL standard has been evolving since 1986 and several versions exist. In this manual, “SQL-92” refers to the standard released in 1992, “SQL:1999” refers to the standard released in 1999, and “SQL:2003” refers to the current version of the standard. We use the phrase “the SQL standard” to mean the current version of the SQL Standard at any time.

Our goal is to not restrict MySQL Server usability for any usage without a very good reason for doing so. Even if we don't have the resources to perform development for every possible use, we are always willing to help and offer suggestions to people who are trying to use MySQL Server in new territories.

One of our main goals with the product is to continue to work toward compliance with the SQL standard, but without sacrificing speed or reliability. We are not afraid to add extensions to SQL or support for non-SQL features if this greatly increases the usability of MySQL Server for a large segment of our user base. The HANDLER interface in MySQL Server 4.0 is an example of this strategy. See HANDLER.

We will continue to support transactional and non-transactional databases to satisfy both mission-critical 24/7 usage and heavy Web or logging usage.

MySQL Server was originally designed to work with medium size databases (10-100 million rows, or about 100MB per table) on small computer systems. Today MySQL Server handles terabyte-size databases, but the code can also be compiled in a reduced version suitable for hand-held and embedded devices. The compact design of the MySQL server makes development in both directions possible without any conflicts in the source tree.

Currently, we are not targeting realtime support, although MySQL replication capabilities already offer significant functionality.

Database cluster support now exists through third-party clustering solutions as well as the integration of our acquired NDB Cluster technology into a new storage engine, available from version 4.1.2. See Chapter 17, MySQL Cluster.

We are also looking at providing XML support in the database server.

What Standards MySQL Follows

We are aiming toward supporting the full ANSI/ISO SQL standard, but without making concessions to speed and quality of the code.

ODBC levels 0−3.51.

Selecting SQL Modes

The MySQL server can operate in different SQL modes, and can apply these modes differentially for different clients. This allows an application to tailor server operation to its own requirements.

Modes define what SQL syntax MySQL should support and what kind of validation checks it should perform on the data. This makes it easier to use MySQL in a lot of different environments and to use MySQL together with other database servers.

You can set the default SQL mode by starting mysqld with the --sql-mode="modes" option. Beginning with MySQL 4.1, you can also change the mode after startup time by setting the sql_mode variable with a SET [SESSION|GLOBAL] sql_mode='modes' statement.

For more information on setting the server mode, see the section called “The Server SQL Mode”.

Running MySQL in ANSI Mode

You can tell mysqld to use the ANSI mode with the --ansi startup option. See the section called “mysqld Command-Line Options”.

Running the server in ANSI mode is the same as starting it with these options (specify the --sql_mode value on a single line):

--transaction-isolation=SERIALIZABLE
--sql-mode=REAL_AS_FLOAT,PIPES_AS_CONCAT,ANSI_QUOTES,
IGNORE_SPACE,ONLY_FULL_GROUP_BY

In MySQL 4.1, you can achieve the same effect with these two statements (specify the sql_mode value on a single line):

SET GLOBAL TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SET GLOBAL sql_mode = 'REAL_AS_FLOAT,PIPES_AS_CONCAT,ANSI_QUOTES,
IGNORE_SPACE,ONLY_FULL_GROUP_BY';

See the section called “Selecting SQL Modes”.

In MySQL 4.1.1, the sql_mode options shown can be also be set with this statement:

SET GLOBAL sql_mode='ansi';

In this case, the value of the sql_mode variable will be set to all options that are relevant for ANSI mode. You can check the result like this:

mysql> SET GLOBAL sql_mode='ansi';
mysql> SELECT @@global.sql_mode;
        -> 'REAL_AS_FLOAT,PIPES_AS_CONCAT,ANSI_QUOTES,
            IGNORE_SPACE,ONLY_FULL_GROUP_BY,ANSI';

MySQL Extensions to Standard SQL

MySQL Server includes some extensions that you probably will not find in other SQL databases. Be warned that if you use them, your code will not be portable to other SQL servers. In some cases, you can write code that includes MySQL extensions, but is still portable, by using comments of the form /*! ... */. In this case, MySQL Server will parse and execute the code within the comment as it would any other MySQL statement, but other SQL servers will ignore the extensions. For example:

SELECT /*! STRAIGHT_JOIN */ col_name FROM table1,table2 WHERE ...

If you add a version number after the ‘!’ character, the syntax within the comment will be executed only if the MySQL version is equal to or newer than the specified version number:

CREATE /*!32302 TEMPORARY */ TABLE t (a INT);

This means that if you have Version 3.23.02 or newer, MySQL Server will use the TEMPORARY keyword.

The following descriptions list MySQL extensions, organized by category.

Organization of data on disk

MySQL Server maps each database to a directory under the MySQL data directory, and tables within a database to filenames in the database directory. This has a few implications:

  • Database names and table names are case sensitive in MySQL Server on operating systems that have case-sensitive filenames (such as most Unix systems). See the section called “Identifier Case Sensitivity”.

  • You can use standard system commands to back up, rename, move, delete, and copy tables that are managed by the MyISAM or ISAM storage engines. For example, to rename a MyISAM table, rename the .MYD, .MYI, and .frm files to which the table corresponds.

Database, table, index, column, or alias names may begin with a digit (but may not consist solely of digits).

General language syntax
  • Strings may be enclosed by either ‘"’ or ‘'’, not just by ‘'’.

  • Use of ‘\’ as an escape character in strings.

  • In SQL statements, you can access tables from different databases with the db_name.tbl_name syntax. Some SQL servers provide the same functionality but call this User space. MySQL Server doesn't support tablespaces such as used in statements like this: CREATE TABLE ralph.my_table...IN my_tablespace.

SQL statement syntax
  • The ANALYZE TABLE, CHECK TABLE, OPTIMIZE TABLE, and REPAIR TABLE statements.

  • The CREATE DATABASE and DROP DATABASE statements. See CREATE DATABASE.

  • The DO statement.

  • EXPLAIN SELECT to get a description of how tables are joined.

  • The FLUSH and RESET statements.

  • The SET statement. See SET.

  • The SHOW statement. See SHOW.

  • Use of LOAD DATA INFILE. In many cases, this syntax is compatible with Oracle's LOAD DATA INFILE. See LOAD DATA.

  • Use of RENAME TABLE. See RENAME TABLE.

  • Use of REPLACE instead of DELETE + INSERT. See REPLACE.

  • Use of CHANGE col_name, DROP col_name, or DROP INDEX, IGNORE or RENAME in an ALTER TABLE statement. Use of multiple ADD, ALTER, DROP, or CHANGE clauses in an ALTER TABLE statement. See ALTER TABLE.

  • Use of index names, indexes on a prefix of a field, and use of INDEX or KEY in a CREATE TABLE statement. See CREATE TABLE.

  • Use of TEMPORARY or IF NOT EXISTS with CREATE TABLE.

  • Use of IF EXISTS with DROP TABLE.

  • You can drop multiple tables with a single DROP TABLE statement.

  • The ORDER BY and LIMIT clauses of the UPDATE and DELETE statements.

  • INSERT INTO ... SET col_name = ... syntax.

  • The DELAYED clause of the INSERT and REPLACE statements.

  • The LOW_PRIORITY clause of the INSERT, REPLACE, DELETE, and UPDATE statements.

  • Use of INTO OUTFILE and STRAIGHT_JOIN in a SELECT statement. See SELECT.

  • The SQL_SMALL_RESULT option in a SELECT statement.

  • You don't need to name all selected columns in the GROUP BY part. This gives better performance for some very specific, but quite normal queries. See the section called “Functions and Modifiers for Use with GROUP BY Clauses”.

  • You can specify ASC and DESC with GROUP BY.

  • The ability to set variables in a statement with the := assignment operator:

    mysql> SELECT @a:=SUM(total),@b=COUNT(*),@a/@b AS avg
        -> FROM test_table;
    mysql> SELECT @t1:=(@t2:=1)+@t3:=4,@t1,@t2,@t3;
    
Column types
  • The column types MEDIUMINT, SET, ENUM, and the different BLOB and TEXT types.

  • The column attributes AUTO_INCREMENT, BINARY, NULL, UNSIGNED, and ZEROFILL.

Functions and operators
  • To make it easier for users who come from other SQL environments, MySQL Server supports aliases for many functions. For example, all string functions support both standard SQL syntax and ODBC syntax.

  • MySQL Server understands the || and && operators to mean logical OR and AND, as in the C programming language. In MySQL Server, || and OR are synonyms, as are && and AND. Because of this nice syntax, MySQL Server doesn't support the standard SQL || operator for string concatenation; use CONCAT() instead. Because CONCAT() takes any number of arguments, it's easy to convert use of the || operator to MySQL Server.

  • Use of COUNT(DISTINCT list) where list has more than one element.

  • All string comparisons are case-insensitive by default, with sort ordering determined by the current character set (ISO-8859-1 Latin1 by default). If you don't like this, you should declare your columns with the BINARY attribute or use the BINARY cast, which causes comparisons to be done using the underlying character code values rather then a lexical ordering.

  • The % operator is a synonym for MOD(). That is, N % M is equivalent to MOD(N,M). % is supported for C programmers and for compatibility with PostgreSQL.

  • The =, <>, <= ,<, >=,>, <<, >>, <=>, AND, OR, or LIKE operators may be used in column comparisons to the left of the FROM in SELECT statements. For example:

    mysql> SELECT col1=1 AND col2=2 FROM tbl_name;
    
  • The LAST_INSERT_ID() function that returns the most recent AUTO_INCREMENT value. See the section called “Information Functions”.

  • LIKE is allowed on numeric columns.

  • The REGEXP and NOT REGEXP extended regular expression operators.

  • CONCAT() or CHAR() with one argument or more than two arguments. (In MySQL Server, these functions can take any number of arguments.)

  • The BIT_COUNT(), CASE, ELT(), FROM_DAYS(), FORMAT(), IF(), PASSWORD(), ENCRYPT(), MD5(), ENCODE(), DECODE(), PERIOD_ADD(), PERIOD_DIFF(), TO_DAYS(), and WEEKDAY() functions.

  • Use of TRIM() to trim substrings. Standard SQL supports removal of single characters only.

  • The GROUP BY functions STD(), BIT_OR(), BIT_AND(), BIT_XOR(), and GROUP_CONCAT(). See the section called “Functions and Modifiers for Use with GROUP BY Clauses”.

For a prioritized list indicating when new extensions will be added to MySQL Server, you should consult the online MySQL TODO list at http://dev.mysql.com/doc/mysql/en/TODO.html. That is the latest version of the TODO list in this manual. See the section called “MySQL and the Future (the TODO)”.

MySQL Differences from Standard SQL

We try to make MySQL Server follow the ANSI SQL standard and the ODBC SQL standard, but MySQL Server performs operations differently in some cases:

Subqueries

MySQL 4.1 supports subqueries and derived tables. A “subquery” is a SELECT statement nested within another statement. A “derived table” (an unnamed view) is a subquery in the FROM clause of another statement. See the section called “Subquery Syntax”.

For MySQL versions older than 4.1, most subqueries can be rewritten using joins or other methods. See the section called “Rewriting Subqueries as Joins for Earlier MySQL Versions” for examples that show how to do this.

SELECT INTO TABLE

MySQL Server doesn't support the Sybase SQL extension: SELECT ... INTO TABLE .... Instead, MySQL Server supports the standard SQL syntax INSERT INTO ... SELECT ..., which is basically the same thing. See INSERT SELECT.

INSERT INTO tbl_temp2 (fld_id)
    SELECT tbl_temp1.fld_order_id
    FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;

Alternatively, you can use SELECT INTO OUTFILE ... or CREATE TABLE ... SELECT.

From version 5.0, MySQL supports SELECT ... INTO with user variables. The same syntax may also be used inside stored procedures using cursors and local variables. See the section called “SELECT ... INTO Statement”.

Transactions and Atomic Operations

MySQL Server (version 3.23-max and all versions 4.0 and above) supports transactions with the InnoDB and BDB transactional storage engines. InnoDB provides full ACID compliance. See Chapter 15, MySQL Storage Engines and Table Types.

The other non-transactional storage engines in MySQL Server (such as MyISAM) follow a different paradigm for data integrity called “atomic operations.” In transactional terms, MyISAM tables effectively always operate in AUTOCOMMIT=1 mode. Atomic operations often offer comparable integrity with higher performance.

With MySQL Server supporting both paradigms, you can decide whether your applications are best served by the speed of atomic operations or the use of transactional features. This choice can be made on a per-table basis.

As noted, the trade-off for transactional versus non-transactional table types lies mostly in performance. Transactional tables have significantly higher memory and diskspace requirements, and more CPU overhead. On the other hand, transactional table types such as InnoDB also offer many significant features. MySQL Server's modular design allows the concurrent use of different storage engines to suit different requirements and deliver optimum performance in all situations.

But how do you use the features of MySQL Server to maintain rigorous integrity even with the non-transactional MyISAM tables, and how do these features compare with the transactional table types?

  1. If your applications are written in a way that is dependent on being able to call ROLLBACK rather than COMMIT in critical situations, transactions are more convenient. Transactions also ensure that unfinished updates or corrupting activities are not committed to the database; the server is given the opportunity to do an automatic rollback and your database is saved.

    If you use non-transactional tables, MySQL Server in almost all cases allows you to resolve potential problems by including simple checks before updates and by running simple scripts that check the databases for inconsistencies and automatically repair or warn if such an inconsistency occurs. Note that just by using the MySQL log or even adding one extra log, you can normally fix tables perfectly with no data integrity loss.

  2. More often than not, critical transactional updates can be rewritten to be atomic. Generally speaking, all integrity problems that transactions solve can be done with LOCK TABLES or atomic updates, ensuring that you never will get an automatic abort from the server, which is a common problem with transactional database systems.

  3. Even a transactional system can lose data if the server goes down. The difference between different systems lies in just how small the time-lag is where they could lose data. No system is 100% secure, only “secure enough.” Even Oracle, reputed to be the safest of transactional database systems, is reported to sometimes lose data in such situations.

    To be safe with MySQL Server, whether or not using transactional tables, you only need to have backups and have binary logging turned on. With this you can recover from any situation that you could with any other transactional database system. It is always good to have backups, regardless of which database system you use.

The transactional paradigm has its benefits and its drawbacks. Many users and application developers depend on the ease with which they can code around problems where an abort appears to be, or is necessary. However, even if you are new to the atomic operations paradigm, or more familiar with transactions, do consider the speed benefit that non-transactional tables can offer on the order of three to five times the speed of the fastest and most optimally tuned transactional tables.

In situations where integrity is of highest importance, MySQL Server offers transaction-level reliability and integrity even for non-transactional tables. If you lock tables with LOCK TABLES, all updates will stall until any integrity checks are made. If you obtain a READ LOCAL lock (as opposed to a write lock) for a table that allows concurrent inserts at the end of the table, reads are allowed, as are inserts by other clients. The new inserted records will not be seen by the client that has the read lock until it releases the lock. With INSERT DELAYED, you can queue inserts into a local queue, until the locks are released, without having the client wait for the insert to complete. See the section called “INSERT DELAYED Syntax”.

“Atomic,” in the sense that we mean it, is nothing magical. It only means that you can be sure that while each specific update is running, no other user can interfere with it, and there will never be an automatic rollback (which can happen with transactional tables if you are not very careful). MySQL Server also guarantees that there will not be any dirty reads.

Following are some techniques for working with non-transactional tables:

  • Loops that need transactions normally can be coded with the help of LOCK TABLES, and you don't need cursors to update records on the fly.

  • To avoid using ROLLBACK, you can use the following strategy:

    1. Use LOCK TABLES to lock all the tables you want to access.

    2. Test the conditions that must be true before performing the update.

    3. Update if everything is okay.

    4. Use UNLOCK TABLES to release your locks.

    This is usually a much faster method than using transactions with possible rollbacks, although not always. The only situation this solution doesn't handle is when someone kills the threads in the middle of an update. In this case, all locks will be released but some of the updates may not have been executed.

  • You can also use functions to update records in a single operation. You can get a very efficient application by using the following techniques:

    • Modify columns relative to their current value.

    • Update only those columns that actually have changed.

    For example, when we are doing updates to some customer information, we update only the customer data that has changed and test only that none of the changed data, or data that depends on the changed data, has changed compared to the original row. The test for changed data is done with the WHERE clause in the UPDATE statement. If the record wasn't updated, we give the client a message: “Some of the data you have changed has been changed by another user.” Then we show the old row versus the new row in a window so that the user can decide which version of the customer record to use.

    This gives us something that is similar to column locking but is actually even better because we only update some of the columns, using values that are relative to their current values. This means that typical UPDATE statements look something like these:

    UPDATE tablename SET pay_back=pay_back+125;
    
    UPDATE customer
      SET
        customer_date='current_date',
        address='new address',
        phone='new phone',
        money_owed_to_us=money_owed_to_us-125
      WHERE
        customer_id=id AND address='old address' AND phone='old phone';
    

    This is very efficient and works even if another client has changed the values in the pay_back or money_owed_to_us columns.

  • In many cases, users have wanted LOCK TABLES and/or ROLLBACK for the purpose of managing unique identifiers. This can be handled much more efficiently without locking or rolling back by using an AUTO_INCREMENT column and either the LAST_INSERT_ID() SQL function or the mysql_insert_id() C API function. See the section called “Information Functions”. See mysql_insert_id().

    You can generally code around the need for row-level locking. Some situations really do need it, and InnoDB tables support row-level locking. With MyISAM tables, you can use a flag column in the table and do something like the following:

    UPDATE tbl_name SET row_flag=1 WHERE id=ID;
    

    MySQL returns 1 for the number of affected rows if the row was found and row_flag wasn't already 1 in the original row.

    You can think of it as though MySQL Server changed the preceding query to:

    UPDATE tbl_name SET row_flag=1 WHERE id=ID AND row_flag <> 1;
    

Stored Procedures and Triggers

Stored procedures are implemented in MySQL version 5.0. See Chapter 20, Stored Procedures and Functions.

Triggers are currently being implemented, with basic functionality in MySQL version 5.0, with further development planned for version 5.1. A “trigger” is effectively a type of stored procedure, one that is invoked when a particular event occurs. For example, you could set up a stored procedure that is triggered each time a record is deleted from a transactional table, and that stored procedure automatically deletes the corresponding customer from a customer table when all their transactions are deleted.

Foreign Keys

In MySQL Server 3.23.44 and up, the InnoDB storage engine supports checking of foreign key constraints, including CASCADE, ON DELETE, and ON UPDATE. See the section called “FOREIGN KEY Constraints”.

For storage engines other than InnoDB, MySQL Server parses the FOREIGN KEY syntax in CREATE TABLE statements, but does not use or store it. In the future, the implementation will be extended to store this information in the table specification file so that it may be retrieved by mysqldump and ODBC. At a later stage, foreign key constraints will be implemented for MyISAM tables as well.

Foreign key enforcement offers several benefits to database developers:

  • Assuming proper design of the relationships, foreign key constraints make it more difficult for a programmer to introduce an inconsistency into the database.

  • Centralized checking of constraints by the database server makes it unnecessary to perform these checks on the application side. This eliminates the possibility that different applications may not all check the constraints in the same way.

  • Using cascading updates and deletes can simplify the application code.

  • Properly designed foreign key rules aid in documenting relationships between tables.

Do keep in mind that these benefits come at the cost of additional overhead for the database server to perform the necessary checks. Additional checking by the server affects performance, which for some applications may be sufficiently undesirable as to be avoided if possible. (Some major commercial applications have coded the foreign-key logic at the application level for this reason.)

MySQL gives database developers the choice of which approach to use. If you don't need foreign keys and want to avoid the overhead associated with enforcing referential integrity, you can choose another table type instead, such as MyISAM. (For example, the MyISAM storage engine offers very fast performance for applications that perform only INSERT and SELECT operations, because the inserts can be performed concurrently with retrievals. See the section called “Table Locking Issues”.)

If you choose not to take advantage of referential integrity checks, keep the following considerations in mind:

  • In the absence of server-side foreign key relationship checking, the application itself must handle relationship issues. For example, it must take care to insert rows into tables in the proper order, and to avoid creating orphaned child records. It must also be able to recover from errors that occur in the middle of multiple-record insert operations.

  • If ON DELETE is the only referential integrity capability an application needs, note that as of MySQL Server 4.0, you can use multiple-table DELETE statements to delete rows from many tables with a single statement. See DELETE.

  • A workaround for the lack of ON DELETE is to add the appropriate DELETE statement to your application when you delete records from a table that has a foreign key. In practice, this is often as quick as using foreign keys, and is more portable.

Be aware that the use of foreign keys can in some instances lead to problems:

  • Foreign key support addresses many referential integrity issues, but it is still necessary to design key relationships carefully to avoid circular rules or incorrect combinations of cascading deletes.

  • It is not uncommon for a DBA to create a topology of relationships that makes it difficult to restore individual tables from a backup. (MySQL alleviates this difficulty by allowing you to temporarily disable foreign key checks when reloading a table that depends on other tables. See the section called “FOREIGN KEY Constraints”. As of MySQL 4.1.1, mysqldump generates dump files that take advantage of this capability automatically when reloaded.)

Note that foreign keys in SQL are used to check and enforce referential integrity, not to join tables. If you want to get results from multiple tables from a SELECT statement, you do this by performing a join between them:

SELECT * FROM t1, t2 WHERE t1.id = t2.id;

See JOIN. See the section called “Using Foreign Keys”.

The FOREIGN KEY syntax without ON DELETE ... is often used by ODBC applications to produce automatic WHERE clauses.

Views

Views (updatable) are being implemented in the 5.0 version of MySQL Server. They are already available in binary releases from 5.0.1 and up. See the section called “CREATE VIEW Syntax”.

Views are useful for allowing users to access a set of relations (tables) as if it were a single table, and limiting their access to just that. Views can also be used to restrict access to rows (a subset of a particular table). For access control to columns, you can also use the sophisticated privilege system in MySQL Server. See the section called “The MySQL Access Privilege System”.

In designing an implementation of views, our ambitious goal, as much as is possible within the confines of SQL, has been full compliance with “Codd's Rule #6” for relational database systems: “All views that are theoretically updatable, should in practice also be updatable.”

--’ as the Start of a Comment

Some other SQL databases use ‘--’ to start comments. MySQL Server uses ‘#’ as the start comment character. You can also use the C comment style /* this is a comment */ with MySQL Server. See the section called “Comment Syntax”.

MySQL Server 3.23.3 and above support the ‘--’ comment style, provided the comment is followed by a space (or by a control character such as a newline). The requirement for a space is to prevent problems with automatically generated SQL queries that have used something like the following code, where we automatically insert the value of the payment for !payment!:

UPDATE account SET credit=credit-!payment!

Think about what happens if the value of payment is a negative value such as -1:

UPDATE account SET credit=credit--1

credit--1 is a legal expression in SQL, but if -- is interpreted as the start of a comment, part of the expression is discarded. The result is a statement that has a completely different meaning than intended:

UPDATE account SET credit=credit

The statement produces no change in value at all! This illustrates that allowing comments to start with ‘--’ can have serious consequences.

Using our implementation of this method of commenting in MySQL Server 3.23.3 and up, credit--1 is actually safe.

Another safe feature is that the mysql command-line client removes all lines that start with ‘--’.

The following information is relevant only if you are running a MySQL version earlier than 3.23.3:

If you have an SQL program in a text file that contains ‘--’ comments, you should use the replace utility as follows to convert the comments to use ‘#’ characters:

shell> replace " --" " #" < text-file-with-funny-comments.sql \
         | mysql db_name

instead of the usual:

shell> mysql db_name < text-file-with-funny-comments.sql

You can also edit the command file “in place” to change the ‘--’ comments to ‘#’ comments:

shell> replace " --" " #" -- text-file-with-funny-comments.sql

Change them back with this command:

shell> replace " #" " --" -- text-file-with-funny-comments.sql

How MySQL Deals with Constraints

MySQL allows you to work with both transactional tables that allow rollback and non-transactional tables that do not, so constraint handling is a bit different in MySQL than in other databases.

We have to handle the case when you have updated a lot of rows in a non-transactional table that cannot roll back when an error occurs.

The basic philosophy is to try to give an error for anything that we can detect at compile time but try to recover from any errors we get at runtime. We do this in most cases, but not yet for all. See the section called “New Features Planned for the Near Future”.

The options MySQL has when an error occurs are to stop the statement in the middle or to recover as well as possible from the problem and continue.

The following sections describe what happens for the different types of constraints.

Constraint PRIMARY KEY / UNIQUE

Normally you will get an error when you try to INSERT or UPDATE a row that causes a primary key, unique key, or foreign key violation. If you are using a transactional storage engine such as InnoDB, MySQL will automatically roll back the transaction. If you are using a non-transactional storage engine, MySQL will stop at the incorrect row and leave any remaining rows unprocessed.

To make life easier, MySQL supports an IGNORE keyword for most commands that can cause a key violation (such as INSERT IGNORE and UPDATE IGNORE). In this case, MySQL will ignore any key violation and continue with processing the next row. You can get information about what MySQL did with the mysql_info() C API function. See mysql_info(). In MySQL 4.1 and up, you also can use the SHOW WARNINGS statement. See SHOW WARNINGS.

Note that, for the moment, only InnoDB tables support foreign keys. See the section called “FOREIGN KEY Constraints”. Foreign key support in MyISAM tables is scheduled for implementation in MySQL 5.1.

Constraint NOT NULL and DEFAULT Values

To be able to support easy handling of non-transactional tables, all columns in MySQL have implicit/explicit default values.

The following holds true when you are not using the STRICT_ALL_TABLES or if you are using STRICT_TRANS_TABLES and the first update to a non-transactional table didn't produce an error. See the section called “The Server SQL Mode”. It's also true if you are using INSERT IGNORE or UPDATE IGNORE.

If you insert an “incorrect” value into a column, such as a NULL into a NOT NULL column or a too-large numerical value into a numerical column, MySQL sets the column to the “best possible value” instead of producing an error:

  • If you try to store a value outside the range in a numerical column, MySQL Server instead stores zero, the smallest possible value, or the largest possible value in the column.

  • For strings, MySQL stores either the empty string or the longest possible string that can be in the column.

  • If you try to store a string that doesn't start with a number into a numerical column, MySQL Server stores 0.

  • If you try to store NULL into a column that doesn't take NULL values, MySQL Server stores 0 or '' (the empty string) instead. This last behavior can, for single-row inserts, be changed when MySQL is built by using the -DDONT_USE_DEFAULT_FIELDS compile option.) See configure options. This causes INSERT statements to generate an error unless you explicitly specify values for all columns that require a non-NULL value.

  • MySQL allows you to store some incorrect date values into DATE and DATETIME columns (like '2000-02-31' or '2000-02-00'). The idea is that it's not the job of the SQL server to validate dates. If MySQL can store a date value and retrieve exactly the same value, MySQL stores it as given. If the date is totally wrong (outside the server's ability to store it), the special date value '0000-00-00' is stored in the column instead.

  • If you don't specify a column that doesn't have a default value in an INSERT statement, MySQL will insert an empty string for string columns and 0 for numerical columns.

The reason for the preceding rules is that we can't check these conditions until the query has begun executing. We can't just roll back if we encounter a problem after updating a few rows, because the storage engine may not support rollback. The option of terminating the statement is not that good; in this case, the update would be “half done,” which is probably the worst possible scenario. In this case, it's better to “do the best you can” and then continue as if nothing happened.

In other words, if you are using STRICT_ALL_TABLES, you risk getting half updates. A safer setting is to use STRICT_TRANS_TABLES. In this case, a wrong value causes MySQL to roll back, if it can, all updates done so far (that is, if we have updated only transactional tables).

If you are using non-transactional tables, you should not use MySQL to check column content. In general, the safest (and often fastest) way is to let the application ensure that it passes only legal values to the database.

Constraint ENUM and SET

In MySQL 4.x and earlier, ENUM is not a real constraint, but is a more efficient way to define columns that can contain only a given set of values. This is for the same reasons that NOT NULL is not honored. See constraint NOT NULL.

ENUM columns always have a default value. If you don't specify a default value, then it will be NULL for columns that can have NULL, otherwise the first enumeration value is used as the default value.

If you insert an incorrect value into an ENUM column or if you force a value into an ENUM column with IGNORE, it is set to the reserved enumeration value of 0, which is displayed as an empty string in string context. See ENUM.

If you insert an incorrect value into a SET column, the incorrect value is ignored. For example, if the column can contain the values 'a', 'b', and 'c', an attempt to assign 'a,x,b,y' results in a value of 'a,b'. See SET.

As of MySQL 5.0.2, strict SQL mode is implemented. See the section called “The Server SQL Mode”. If strict mode is not enabled, values entered into ENUM and SET columns are handled as just described for MySQL 4.x. If strict mode is enabled, the definition of a ENUM or SET column acts as a constraint on values entered into the column. An error occurs for values that do not satisify these conditions:

  • An ENUM value must be one of those listed in the column definition, or the internal numeric equivalent thereof. The value cannot be the error value (that is, 0 or the empty string). For a column defined as ENUM('a','b','c'), values such as '', 'd', and 'ax' are illegal.

  • A SET value must be the empty string or a value consisting of one or more of the values listed in the column definition separated by commas. For a column defined as SET('a','b','c'), values such as 'd', and 'a,b,c,d' are illegal.

In strict mode, errors for invalid values can be suppressed if you use INSERT IGNORE or UPDATE IGNORE. In this case, a warning is generated rather than an error. For ENUM, the value is inserted as the error member (0). For SET, the value is inserted as given except that that any invalid substrings are deleted. For example, 'a,x,b,y' results in a value of 'a,b', as described earlier.

Known Errors and Design Deficiencies in MySQL

Errors in 3.23 Fixed in a Later MySQL Version

The following known errors or bugs are not fixed in MySQL 3.23 because fixing them would involve changing a lot of code that could introduce other even worse bugs. The bugs are also classified as “not fatal” or “bearable.”

  • You should avoid using spaces at the end of column names because this can cause weird behavior. (Fixed in MySQL 4.0.) (Bug #4196)

  • You can get a deadlock (hung thread) if you use LOCK TABLE to lock multiple tables and then in the same connection use DROP TABLE to drop one of them while another thread is trying to lock it. (To break the deadlock, you can use KILL to terminate any of the threads involved.) This issue is resolved as of MySQL 4.0.12.

  • SELECT MAX(key_column) FROM t1,t2,t3... where one of the tables are empty doesn't return NULL but instead returns the maximum value for the column. This issue is resolved as of MySQL 4.0.11.

  • DELETE FROM heap_table without a WHERE clause doesn't work on a locked HEAP table.

Errors in 4.0 Fixed in a Later MySQL Version

The following known errors or bugs are not fixed in MySQL 4.0 because fixing them would involve changing a lot of code that could introduce other even worse bugs. The bugs are also classified as “not fatal” or “bearable.”

  • In a UNION, the first SELECT determines the type, max_length, and NULL properties for the resulting columns. This issue is resolved as of MySQL 4.1.1; the property values are based on the rows from all UNION parts.

  • In DELETE with many tables, you can't refer to tables to be deleted through an alias. This is fixed as of MySQL 4.1.

  • You cannot mix UNION ALL and UNION DISTINCT in the same query. If you use ALL for one UNION, it is used for all of them. This is fixed as of MySQL 4.1.2. The rules for mixed UNION types are given in UNION.

  • FLUSH TABLES WITH READ LOCK does not block CREATE TABLE, which may cause a problem with the binary log position when doing a full backup of tables and the binary log.

Open Bugs and Design Deficiencies in MySQL

The following problems are known and fixing them is a high priority:

  • Even if you are using lower_case_table_names=2 (which enables MySQL to remember the used case for databases and table names) MySQL will not on case insensitive systems remember the used case for database names for the function DATABASE() or in various logs.

  • Dropping a FOREIGN KEY constraint doesn't work in replication because the constraint may have another name on the slave.

  • REPLACE (and LOAD DATA with the REPLACE option) does not trigger ON DELETE CASCADE.

  • DISTINCT with ORDER BY doesn't work inside GROUP_CONCAT() if you don't use all and only those columns that are in the DISTINCT list.

  • If one user has a long-running transaction and another user drops a table that is updated in the transaction, there is small chance that the binary log may contain the DROP TABLE command before the table is used in the transaction itself. We plan to fix this in 5.0 by having the DROP TABLE wait until the table is not used in any transaction.

  • When inserting a big integer value (between 2^63 and 2^64−1) into a decimal/string column, it is inserted as a negative value because the number is evaluated in a signed integer context. We plan to fix this in MySQL 4.1.

    FLUSH TABLES WITH READ LOCK does not block COMMIT if the server is running without binary logging, which may cause a problem (of consistency between tables) when doing a full backup.

  • ANALYZE TABLE on a BDB table may in some cases make the table unusable until you restart mysqld. If this happens, you will see errors of the following form in the MySQL error file:

    001207 22:07:56  bdb:  log_flush: LSN past current end-of-log
    
  • MySQL accepts parentheses in the FROM clause of a SELECT statement, but silently ignores them. The reason for not giving an error is that many clients that automatically generate queries add parentheses in the FROM clause even where they are not needed.

  • Concatenating many RIGHT JOINS or combining LEFT and RIGHT join in the same query may not give a correct answer because MySQL only generates NULL rows for the table preceding a LEFT or before a RIGHT join. This will be fixed in 5.0 at the same time we add support for parentheses in the FROM clause.

  • Don't execute ALTER TABLE on a BDB table on which you are running multiple-statement transactions until all those transactions complete. (The transaction will probably be ignored.)

  • ANALYZE TABLE, OPTIMIZE TABLE, and REPAIR TABLE may cause problems on tables for which you are using INSERT DELAYED.

  • Doing a LOCK TABLE ... and FLUSH TABLES ... doesn't guarantee that there isn't a half-finished transaction in progress on the table.

  • BDB tables are a bit slow to open. If you have many BDB tables in a database, it will take a long time to use the mysql client on the database if you are not using the -A option or if you are using rehash. This is especially notable when you have a large table cache.

  • Replication uses query-level logging: The master writes the executed queries to the binary log. This is a very fast, compact, and efficient logging method that works perfectly in most cases. Although we have never heard of it actually occurring, it is theoretically possible for the data on the master and slave to become different if a query is designed in such a way that the data modification is non-deterministic; that is, left to the will of the query optimizer. (That generally is not a good practice anyway, even outside of replication!) For example:

    • CREATE ... SELECT or INSERT ... SELECT statements that insert zero or NULL values into an AUTO_INCREMENT column.

    • DELETE if you are deleting rows from a table that has foreign keys with ON DELETE CASCADE properties.

    • REPLACE ... SELECT, INSERT IGNORE ... SELECT if you have duplicate key values in the inserted data.

    If and only if all these queries have no ORDER BY clause guaranteeing a deterministic order.

    For example, for INSERT ... SELECT with no ORDER BY, the SELECT may return rows in a different order (which will result in a row having different ranks, hence getting a different number in the AUTO_INCREMENT column), depending on the choices made by the optimizers on the master and slave. A query will be optimized differently on the master and slave only if:

    • The files used by the two queries are not exactly the same; for example, OPTIMIZE TABLE was run on the master tables and not on the slave tables. (To fix this, OPTIMIZE TABLE, ANALYZE TABLE, and REPAIR TABLE are written to the binary log as of MySQL 4.1.1).

    • The table is stored using a different storage engine on the master than on the slave. (It is possible to use different storage engines on the master and slave. For example, you can use InnoDB on the master, but MyISAM on the slave if the slave has less available disk space.)

    • MySQL buffer sizes (key_buffer_size, and so on) are different on the master and slave.

    • The master and slave run different MySQL versions, and the optimizer code differs between these versions.

    This problem may also affect database restoration using mysqlbinlog|mysql.

    The easiest way to avoid this problem in all cases is to add an ORDER BY clause to such non-deterministic queries to ensure that the rows are always stored or modified in the same order. In future MySQL versions, we will automatically add an ORDER BY clause when needed.

The following problems are known and will be fixed in due time:

  • Log filenames are based on the server hostname (if you don't specify a filename with the startup option). For now you have to use options like --log-bin=old_host_name-bin if you change your hostname to something else. Another option is to just rename the old files to reflect your hostname change. See the section called “mysqld Command-Line Options”.

  • mysqlbinlog will not delete temporary files left after a LOAD DATA INFILE command. See mysqlbinlog.

  • RENAME doesn't work with TEMPORARY tables or tables used in a MERGE table.

  • When using the RPAD() function in a query that has to be resolved by using a temporary table, all resulting strings will have rightmost spaces removed. This is an example of such a query:

    SELECT RPAD(t1.column1, 50, ' ') AS f2, RPAD(t2.column2, 50, ' ') AS f1
    FROM table1 as t1 LEFT JOIN table2 AS t2 ON t1.record=t2.joinID
    ORDER BY t2.record;
    

    The final result of this bug is that you will not be able to get spaces on the right side of the resulting values. The problem also occurs for any other string function that adds spaces to the right.

    The reason for this is due to the fact that HEAP tables, which are used first for temporary tables, are not capable of handling VARCHAR columns.

    This behavior exists in all versions of MySQL. It will be fixed in one of the 4.1 series releases.

  • Due to the way table definition files are stored, you cannot use character 255 (CHAR(255)) in table names, column names, or enumerations. This is scheduled to be fixed in version 5.1 when we have new table definition format files.

  • When using SET CHARACTER SET, you can't use translated characters in database, table, and column names.

  • You can't use ‘_’ or ‘%’ with ESCAPE in LIKE ... ESCAPE.

  • If you have a DECIMAL column in which the same number is stored in different formats (for example, +01.00, 1.00, 01.00), GROUP BY may regard each value as a different value.

  • You cannot build the server in another directory when using MIT-pthreads. Because this requires changes to MIT-pthreads, we are not likely to fix this. See the section called “MIT-pthreads Notes”.

  • BLOB and TEXTvalues can't “reliably” be used in GROUP BY or ORDER BY or DISTINCT. Only the first max_sort_length bytes are used when comparing BLOB values in these cases. The default value of max_sort_length value is 1024. It can be changed at server startup time. As of MySQL 4.0.3, it can also be changed at runtime. For older versions, a workaround for most cases is to use a substring. For example:

    SELECT DISTINCT LEFT(blob_col,2048) FROM tbl_name;
    
  • Numeric calculations are done with BIGINT or DOUBLE (both are normally 64 bits long). Which precision you get depends on the function. The general rule is that bit functions are done with BIGINT precision, IF and ELT() with BIGINT or DOUBLE precision, and the rest with DOUBLE precision. You should try to avoid using unsigned long long values if they resolve to be bigger than 63 bits (9223372036854775807) for anything other than bit fields. MySQL Server 4.0 has better BIGINT handling than 3.23.

  • All string columns, except BLOB and TEXT columns, automatically have all trailing spaces removed when retrieved. For CHAR types, this is okay. The bug is that in MySQL Server, VARCHAR columns are treated the same way.

  • You can have only up to 255 ENUM and SET columns in one table.

  • In MIN(), MAX(), and other aggregate functions, MySQL currently compares ENUM and SET columns by their string value rather than by the string's relative position in the set.

  • mysqld_safe redirects all messages from mysqld to the mysqld log. One problem with this is that if you execute mysqladmin refresh to close and reopen the log, stdout and stderr are still redirected to the old log. If you use --log extensively, you should edit mysqld_safe to log to host_name.err instead of host_name.log so that you can easily reclaim the space for the old log by deleting the old one and executing mysqladmin refresh.

  • In the UPDATE statement, columns are updated from left to right. If you refer to an updated column, you get the updated value instead of the original value. For example, the following statement increments KEY by 2, not 1:

    mysql> UPDATE tbl_name SET KEY=KEY+1,KEY=KEY+1;
    
  • You can refer to multiple temporary tables in the same query, but you cannot refer to any given temporary table more than once. For example, the following doesn't work:

    mysql> SELECT * FROM temp_table, temp_table AS t2;
    ERROR 1137: Can't reopen table: 'temp_table'
    
  • The optimizer may handle DISTINCT differently when you are using “hidden” columns in a join than when you are not. In a join, hidden columns are counted as part of the result (even if they are not shown), whereas in normal queries, hidden columns don't participate in the DISTINCT comparison. We will probably change this in the future to never compare the hidden columns when executing DISTINCT.

    An example of this is:

    SELECT DISTINCT mp3id FROM band_downloads
           WHERE userid = 9 ORDER BY id DESC;
    

    and

    SELECT DISTINCT band_downloads.mp3id
           FROM band_downloads,band_mp3
           WHERE band_downloads.userid = 9
           AND band_mp3.id = band_downloads.mp3id
           ORDER BY band_downloads.id DESC;
    

    In the second case, you might in MySQL Server 3.23.x get two identical rows in the result set (because the values in the hidden id column may differ).

    Note that this happens only for queries where you don't have the ORDER BY columns in the result.

  • Because MySQL Server allows you to work with table types that don't support transactions, and thus can't roll back data, some things behave a little differently in MySQL Server than in other SQL servers. This is just to ensure that MySQL Server never needs to do a rollback for an SQL statement. This may be a little awkward at times because column values must be checked in the application, but this will actually give you a nice speed increase because it allows MySQL Server to do some optimizations that otherwise would be very hard to do.

    If you set a column to an incorrect value, MySQL Server will, instead of doing a rollback, store the “best possible value” in the column. For information about how this occurs, see the section called “How MySQL Deals with Constraints”.

  • If you execute a PROCEDURE on a query that returns an empty set, in some cases the PROCEDURE will not transform the columns.

  • Creation of a table of type MERGE doesn't check whether the underlying tables are of compatible types.

  • If you use ALTER TABLE first to add a UNIQUE index to a table used in a MERGE table and then to add a normal index on the MERGE table, the key order will be different for the tables if there was an old key that was not unique in the table. This is because ALTER TABLE puts UNIQUE indexes before normal indexes to be able to detect duplicate keys as early as possible.

The following are known bugs in earlier versions of MySQL:

  • In the following case you can get a core dump:

    • Delayed insert handler has pending inserts to a table.

    • LOCK TABLE with WRITE.

    • FLUSH TABLES.

  • Before MySQL Server 3.23.2, an UPDATE that updated a key with a WHERE on the same key may have failed because the key was used to search for records and the same row may have been found multiple times:

    UPDATE tbl_name SET KEY=KEY+1 WHERE KEY > 100;
    

    A workaround is to use:

    UPDATE tbl_name SET KEY=KEY+1 WHERE KEY+0 > 100;
    

    This will work because MySQL Server will not use an index on expressions in the WHERE clause.

  • Before MySQL Server 3.23, all numeric types were treated as fixed-point fields. That means that you had to specify how many decimals a floating-point field should have. All results were returned with the correct number of decimals.

For information about platform-specific bugs, see the installation and porting instructions in the section called “Operating System-Specific Notes” and Appendix D, Porting to Other Systems.