We start with system-level factors, because some of these decisions must be made very early to achieve large performance gains. In other cases, a quick look at this section may suffice. However, it is always nice to have a sense of how much can be gained by changing things at this level.
The operating system you use is very important! To get the best use of multiple-CPU machines, you should use Solaris (because its threads implementation works really well) or Linux (because the 2.4 kernel has really good SMP support). Note that older Linux kernels have a 2GB file size limit by default. If you have such a kernel and a desperate need for files larger than 2GB, you should get the Large File Support (LFS) patch for the ext2 filesystem. Other filesystems such as ReiserFS and XFS do not have this 2GB limitation.
Before using MySQL in production, we advise you to test it on your intended platform.
If you have enough RAM, you could remove all swap devices. Some operating systems will use a swap device in some contexts even if you have free memory.
Use the --skip-external-locking MySQL option to avoid external locking. This option is on by default as of MySQL 4.0. Before that, it is on by default when compiling with MIT-pthreads, because flock() isn't fully supported by MIT-pthreads on all platforms. It's also on by default for Linux because Linux file locking is not yet safe.
Note that the --skip-external-locking option does not affect MySQL's functionality as long as you run only one server. Just remember to take down the server (or lock and flush the relevant tables) before you run myisamchk. On some systems this option is mandatory, because external locking does not work there in any case.
The only cases in which you cannot use --skip-external-locking are when you run multiple MySQL servers (not clients) on the same data, or when you run myisamchk to check (not repair) a table without first telling the server to flush and lock the tables.
You can still use LOCK TABLES and UNLOCK TABLES even if you are using --skip-external-locking.
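For example, if the server is running with external locking disabled and you want to check (not repair) a MyISAM table with myisamchk without taking the server down, you might flush the table first so that its in-memory state is written out, and make sure that nothing modifies it while the check runs. This is only a sketch; the data directory path, database name, and table name are placeholders:

shell> mysqladmin flush-tables
shell> myisamchk /path/to/datadir/db_name/tbl_name.MYI

If clients might write to the table during the check, lock it with LOCK TABLES from a client connection before flushing, and release the lock with UNLOCK TABLES afterward.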
You can determine the default buffer sizes used by the mysqld server with this command (prior to MySQL 4.1, omit --verbose):
shell> mysqld --verbose --help
This command produces a list of all mysqld options and configurable system variables. The output includes the default variable values and looks something like this:
back_log                  current value: 5
bdb_cache_size            current value: 1048540
binlog_cache_size         current value: 32768
connect_timeout           current value: 5
delayed_insert_limit      current value: 100
delayed_insert_timeout    current value: 300
delayed_queue_size        current value: 1000
flush_time                current value: 0
interactive_timeout       current value: 28800
join_buffer_size          current value: 131072
key_buffer_size           current value: 1048540
long_query_time           current value: 10
lower_case_table_names    current value: 0
max_allowed_packet        current value: 1048576
max_binlog_cache_size     current value: 4294967295
max_connect_errors        current value: 10
max_connections           current value: 100
max_delayed_threads       current value: 20
max_heap_table_size       current value: 16777216
max_join_size             current value: 4294967295
max_sort_length           current value: 1024
max_tmp_tables            current value: 32
max_write_lock_count      current value: 4294967295
myisam_sort_buffer_size   current value: 8388608
net_buffer_length         current value: 16384
net_read_timeout          current value: 30
net_retry_count           current value: 10
net_write_timeout         current value: 60
read_buffer_size          current value: 131072
read_rnd_buffer_size      current value: 262144
slow_launch_time          current value: 2
sort_buffer               current value: 2097116
table_cache               current value: 64
thread_concurrency        current value: 10
thread_stack              current value: 131072
tmp_table_size            current value: 1048576
wait_timeout              current value: 28800
If there is a mysqld server currently running, you can see what values it actually is using for the system variables by connecting to it and issuing this statement:
mysql> SHOW VARIABLES;
You can also see some statistical and status indicators for a running server by issuing this statement:
mysql> SHOW STATUS;
System variable and status information also can be obtained using mysqladmin:
shell> mysqladmin variables
shell> mysqladmin extended-status
You can find a full description for all system and status variables in the section called “Server System Variables” and the section called “Server Status Variables”.
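If you are interested in only a few values, you can filter the output of these statements with a LIKE pattern rather than scanning the full list. For example, to look at the key buffer size and the key cache read statistics (the pattern strings here are only illustrative):

mysql> SHOW VARIABLES LIKE 'key_buffer_size';
mysql> SHOW STATUS LIKE 'Key_read%';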
MySQL uses algorithms that are very scalable, so you can usually run with very little memory. However, normally you will get better performance by giving MySQL more memory.
When tuning a MySQL server, the two most important variables to configure are key_buffer_size and table_cache. You should first feel confident that you have these set appropriately before trying to change any other variables.
The following examples indicate some typical variable values for different runtime configurations. The examples use the mysqld_safe script and --var_name=value syntax to set the variable var_name to the value value. This syntax is available as of MySQL 4.0. For older versions of MySQL, take the following differences into account (a comparison of the old and new invocation styles follows this list):
Use safe_mysqld rather than mysqld_safe.
Set variables using --set-variable=var_name=value or -O var_name=value syntax.
For variable names that end in _size, you may need to specify them without _size. For example, the old name for sort_buffer_size is sort_buffer. The old name for read_buffer_size is record_buffer. To see which variables your version of the server recognizes, use mysqld --help.
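As a rough comparison of the two invocation styles, both of the following command lines set the sort buffer and the read (record) buffer to 1MB; the values themselves are arbitrary:

shell> safe_mysqld --set-variable=sort_buffer=1M --set-variable=record_buffer=1M &
shell> mysqld_safe --sort_buffer_size=1M --read_buffer_size=1M &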
If you have at least 256MB of memory and many tables and want maximum performance with a moderate number of clients, you should use something like this:
shell> mysqld_safe --key_buffer_size=64M --table_cache=256 \
           --sort_buffer_size=4M --read_buffer_size=1M &
If you have only 128MB of memory and only a few tables, but you still do a lot of sorting, you can use something like this:
shell> mysqld_safe --key_buffer_size=16M --sort_buffer_size=1M
If there are very many simultaneous connections, swapping problems may occur unless mysqld has been configured to use very little memory for each connection. mysqld performs better if you have enough memory for all connections.
With little memory and lots of connections, use something like this:
shell> mysqld_safe --key_buffer_size=512K --sort_buffer_size=100K \
           --read_buffer_size=100K &
Or even this:
shell> mysqld_safe --key_buffer_size=512K --sort_buffer_size=16K \
           --table_cache=32 --read_buffer_size=8K \
           --net_buffer_length=1K &
If you are doing GROUP BY or ORDER BY operations on tables that are much larger than your available memory, you should increase the value of read_rnd_buffer_size to speed up the reading of rows after sorting operations.
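If only a few sessions run such queries, it may be preferable to raise the value for those sessions rather than globally. On servers recent enough to support per-connection variable assignment (MySQL 4.0.3 and up), a sketch might look like this; big_table and its columns are hypothetical:

mysql> SET SESSION read_rnd_buffer_size = 8388608;
mysql> SELECT * FROM big_table ORDER BY col1, col2;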
When you have installed MySQL, the support-files directory contains several sample my.cnf option files: my-huge.cnf, my-large.cnf, my-medium.cnf, and my-small.cnf. You can use these as a basis for optimizing your system.
Note that if you specify an option on the command line for mysqld or mysqld_safe, it remains in effect only for that invocation of the server. To use the option every time the server runs, put it in an option file.
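For example, to make the settings from the 256MB configuration shown earlier permanent, you might put something like this in your option file (typically /etc/my.cnf or the my.cnf file in the data directory; adjust the values to suit your system):

[mysqld]
key_buffer_size=64M
table_cache=256
sort_buffer_size=4M
read_buffer_size=1M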
To see the effects of a parameter change, do something like this (prior to MySQL 4.1, omit --verbose):
shell> mysqld --key_buffer_size=32M --verbose --help
The variable values are listed near the end of the output. Make sure that the --verbose and --help options are last. Otherwise, the effect of any options listed after them on the command line will not be reflected in the output.
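Because the full output is long, it can be convenient to filter it for the variable you changed, for example:

shell> mysqld --key_buffer_size=32M --verbose --help | grep key_buffer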
For information on tuning the InnoDB storage engine, see InnoDB tuning.
The task of the query optimizer is to find an optimal plan for executing an SQL query. Because the difference in performance between “good” and “bad” plans can be orders of magnitude (that is, seconds versus hours or even days), most query optimizers, including that of MySQL, perform a more or less exhaustive search for an optimal plan among all possible query evaluation plans. For join queries, the number of possible plans investigated by the MySQL optimizer grows exponentially with the number of tables referenced in the query. For small numbers of tables (typically fewer than 7 to 10), this is not a problem. However, when larger queries are submitted, the time spent in query optimization may easily become the major bottleneck in server performance.
MySQL 5.0.1 introduces a new more flexible method for query optimization that allows the user to control how exhaustive the optimizer is in its search for an optimal query evaluation plan. The general idea is that the fewer plans that are investigated by the optimizer, the less time it will spend in compiling a query. On the other hand, because the optimizer will skip some plans, it may miss finding an optimal plan.
The behavior of the optimizer with respect to the number of plans it evaluates can be controlled via two system variables:
The optimizer_prune_level variable tells the optimizer to skip certain plans based on estimates of the number of rows accessed for each table. Our experience shows that this kind of “educated guess” rarely misses optimal plans, while it may dramatically reduce query compilation times. That is why this option is on (optimizer_prune_level=1) by default. However, if you believe that the optimizer missed a better query plan, this option can be switched off (optimizer_prune_level=0), with the risk that query compilation may take much longer. Note that even with this heuristic, the optimizer still explores a roughly exponential number of plans.
The optimizer_search_depth variable tells the optimizer how far into the “future” of each incomplete plan it should look to evaluate whether the plan should be expanded further. Smaller values of optimizer_search_depth may result in orders of magnitude smaller query compilation times. For example, queries with 12 to 13 or more tables may easily require hours or even days to compile if optimizer_search_depth is close to the number of tables in the query. At the same time, if compiled with optimizer_search_depth equal to 3 or 4, the optimizer may compile the same query in less than a minute. If you are unsure what a reasonable value for optimizer_search_depth is, set this variable to 0 to tell the optimizer to determine the value automatically.
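Both of these are ordinary system variables and can be changed at runtime. For example, to make the optimizer search exhaustively for the current session, or to let it choose the search depth automatically, you might issue statements such as these (assuming a server of 5.0.1 or later, where the variables exist and can be set per session):

mysql> SET SESSION optimizer_prune_level = 0;
mysql> SET SESSION optimizer_search_depth = 0;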
Most of the following tests were performed on Linux with the MySQL benchmarks, but they should give some indication for other operating systems and workloads.
You get the fastest executables when you link with -static.
On Linux, you will get the fastest code when compiling with pgcc and -O3. You need about 200MB of memory to compile sql_yacc.cc with these options, because gcc/pgcc needs a lot of memory to make all functions inline. You should also set CXX=gcc when configuring MySQL to avoid inclusion of the libstdc++ library, which is not needed. Note that with some versions of pgcc, the resulting code will run only on true Pentium processors, even if you use the compiler option indicating that you want the resulting code to work on all x586-type processors (such as AMD).
By just using a better compiler and better compiler options, you can get a 10-30% speed increase in your application. This is particularly important if you compile the MySQL server yourself.
We have tested both the Cygnus CodeFusion and Fujitsu compilers, but when we tested them, neither was sufficiently bug-free to allow MySQL to be compiled with optimizations enabled.
The standard MySQL binary distributions are compiled with support for all character sets. When you compile MySQL yourself, you should include support only for the character sets that you are going to use. This is controlled by the --with-charset option to configure.
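Putting several of these recommendations together, a source build might be configured along the following lines. This is only a sketch: the exact configure options depend on your MySQL version and platform, and latin1 stands in for whatever character set you actually need.

shell> CC=pgcc CFLAGS="-O3" CXX=gcc CXXFLAGS="-O3" \
           ./configure --with-mysqld-ldflags=-all-static \
           --with-charset=latin1
shell> make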
Here is a list of some measurements that we have made:
If you use pgcc and compile everything with -O6, the mysqld server is 1% faster than with gcc 2.95.2.
If you link dynamically (without -static), the result is 13% slower on Linux. Note that you still can use a dynamically linked MySQL library for your client applications. It is the server that is most critical for performance.
If you strip your mysqld binary with strip mysqld, the resulting binary can be up to 4% faster.
For a connection from a client to a server running on the same host, if you connect using TCP/IP rather than a Unix socket file, performance is 7.5% slower. (On Unix, if you connect to the hostname localhost, MySQL uses a socket file by default.)
For TCP/IP connections from a client to a server, connecting to a remote server on another host will be 8-11% slower than connecting to the local server on the same host, even for connections over 100Mb/s Ethernet.
When running our benchmark tests using secure connections (all data encrypted with internal SSL support) performance was 55% slower than for unencrypted connections.
If you compile with --with-debug=full, most queries will be 20% slower. Some queries may take substantially longer; for example, the MySQL benchmarks ran 35% slower. If you use --with-debug (without =full), the slowdown will be only 15%. For a version of mysqld that has been compiled with --with-debug=full, you can disable memory checking at runtime by starting it with the --skip-safemalloc option. The execution speed should then be close to that obtained when configuring with --with-debug.
On a Sun UltraSPARC-IIe, a server compiled with Forte 5.0 is 4% faster than one compiled with gcc 3.2.
On a Sun UltraSPARC-IIe, a server compiled with Forte 5.0 is 4% faster in 32-bit mode than in 64-bit mode.
Compiling with gcc 2.95.2 for UltraSPARC with the -mcpu=v8 -Wa,-xarch=v8plusa options gives 4% more performance.
On Solaris 2.5.1, MIT-pthreads is 8-12% slower than Solaris native threads on a single processor. With more load or CPUs, the difference should be larger.
Compiling on Linux-x86 using gcc without frame pointers (-fomit-frame-pointer or -fomit-frame-pointer -ffixed-ebp) makes mysqld 1-4% faster.
Binary MySQL distributions for Linux that are provided by MySQL AB used to be compiled with pgcc. We had to go back to regular gcc due to a bug in pgcc that would generate code that does not run on AMD. We will continue using gcc until that bug is resolved. In the meantime, if you have a non-AMD machine, you can get a faster binary by compiling with pgcc. The standard MySQL Linux binary is linked statically to make it faster and more portable.
The following list indicates some of the ways that the mysqld server uses memory. Where applicable, the name of the system variable relevant to the memory use is given:
The key buffer (variable key_buffer_size) is shared by all threads; other buffers used by the server are allocated as needed. See the section called “Tuning Server Parameters”.
Each connection uses some thread-specific space:
A stack (default 64KB, variable thread_stack)
A connection buffer (variable net_buffer_length)
A result buffer (variable net_buffer_length)
The connection buffer and result buffer are dynamically enlarged up to max_allowed_packet when needed. While a query is running, a copy of the current query string is also allocated.
All threads share the same base memory.
Only compressed ISAM and MyISAM tables are memory mapped. This is because the 32-bit memory space of 4GB is not large enough for most big tables. When systems with a 64-bit address space become more common, we may add general support for memory mapping.
Each request that performs a sequential scan of a table allocates a read buffer (variable read_buffer_size).
When reading rows in “random” order (for example, after a sort), a random-read buffer may be allocated to avoid disk seeks (variable read_rnd_buffer_size).
All joins are done in one pass, and most joins can be done without even using a temporary table. Most temporary tables are memory-based (HEAP) tables. Temporary tables with a large record length (calculated as the sum of all column lengths) or that contain BLOB columns are stored on disk.
Before MySQL 3.23.2, one problem is that if an internal in-memory heap table exceeds the size of tmp_table_size, the error The table tbl_name is full occurs. From 3.23.2 on, this is handled automatically by converting the in-memory heap table to a disk-based MyISAM table as necessary. To work around this problem on older servers, you can increase the temporary table size by setting the tmp_table_size option to mysqld, or by setting the SQL option SQL_BIG_TABLES in the client program. See SET Syntax.
In MySQL 3.20, the maximum size of the temporary table is record_buffer*16; if you are using this version, you have to increase the value of record_buffer. You can also start mysqld with the --big-tables option to always store temporary tables on disk. However, this will affect the speed of many complicated queries.
Most requests that perform a sort allocate a sort buffer and zero to two temporary files depending on the result set size. See the section called “Where MySQL Stores Temporary Files”.
Almost all parsing and calculating is done in a local memory store. No memory overhead is needed for small items, so the normal slow memory allocation and freeing is avoided. Memory is allocated only for unexpectedly large strings; this is done with malloc() and free().
For each MyISAM and ISAM table that is opened, the index file is opened once and the data file is opened once for each concurrently running thread. For each concurrent thread, a table structure, column structures for each column, and a buffer of size 3 * N are allocated (where N is the maximum row length, not counting BLOB columns). A BLOB column requires five to eight bytes plus the length of the BLOB data. The MyISAM and ISAM storage engines maintain one extra row buffer for internal use.
For each table having BLOB columns, a buffer is enlarged dynamically to read in larger BLOB values. If you scan a table, a buffer as large as the largest BLOB value is allocated.
Handler structures for all in-use tables are saved in a cache and managed as a FIFO. By default, the cache has 64 entries. If a table has been used by two running threads at the same time, the cache contains two entries for the table. See the section called “How MySQL Opens and Closes Tables”.
A FLUSH TABLES statement or mysqladmin flush-tables command closes all tables that are not in use and marks all in-use tables to be closed when the currently executing thread finishes. This effectively frees most in-use memory.
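For example, after flushing you can watch how quickly the table cache fills up again by checking the Open_tables and Opened_tables status values (described in the section called “Server Status Variables”); the wildcard pattern here is only illustrative:

shell> mysqladmin flush-tables
mysql> SHOW STATUS LIKE 'Open%';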
ps and other system status programs may report that mysqld uses a lot of memory. This may be caused by thread stacks on different memory addresses. For example, the Solaris version of ps counts the unused memory between stacks as used memory. You can verify this by checking available swap with swap -s. We have tested mysqld with several memory-leakage detectors (both commercial and open source), so there should be no memory leaks.
When a new client connects to mysqld, mysqld spawns a new thread to handle the request. This thread first checks whether the hostname is in the hostname cache. If not, the thread attempts to resolve the hostname:
If the operating system supports the thread-safe gethostbyaddr_r() and gethostbyname_r() calls, the thread uses them to perform hostname resolution.
If the operating system doesn't support the thread-safe calls, the thread locks a mutex and calls gethostbyaddr() and gethostbyname() instead. In this case, no other thread can resolve hostnames that are not in the hostname cache until the first thread unlocks the mutex.
You can disable DNS hostname lookups by starting mysqld with the --skip-name-resolve option. However, in this case, you can use only IP numbers in the MySQL grant tables.
If you have a very slow DNS and many hosts, you can get more performance by either disabling DNS lookups with --skip-name-resolve or by increasing the HOST_CACHE_SIZE define (default value: 128) and recompiling mysqld.
You can disable the hostname cache by starting the server with the --skip-host-cache option. To clear the hostname cache, issue a FLUSH HOSTS statement or execute the mysqladmin flush-hosts command.
If you want to disallow TCP/IP connections entirely, start mysqld with the --skip-networking option.
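As with the server tuning options discussed earlier, these options can be placed in an option file so that they take effect on every startup. A minimal sketch, enabling only what matches your setup:

[mysqld]
skip-name-resolve
skip-networking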