Search This Blog

Loading...

Wednesday, July 29, 2015

MySQLTuner series part 3 - minimizing MySQL memory footprint

This is part 3 of the series on using MySQLTuner to optimize MySQL database performance and stability. Part 1 explains how to install and run MySQLTuner. Part 2 addresses the area of database defragmentation. This post illustrates how to manage MySQL memory footprint.

MySQLTuner output

MySQLTuner was used to analyze a WordPress database deployed on the LAMP platform (Linux, Apache, MySQL, PHP). The host was a VPS server with only 512 MB of memory.

$ perl mysqltuner.pl

If you scroll down to the Recommendations section of the above report, it is hard to miss the eye-catching message: 'MySQL's maximum memory usage is dangerously high. Add RAM before increasing MySQL buffer variables.'

Indeed, adding more RAM is often the cheapest and simplest solution to out-of-memory problems. By spending an extra $5 per month, I can upgrade my VPS server to have 1 GB of RAM. But, before you go spend your hard-earned money on RAM, let's explore some other ways to reduce MySQL's memory footprint.

Maximum number of database connections

Lines that begin with two exclamation marks ('!!') are warnings. Note the following lines in the above Performance Metrics section:

-------- Performance Metrics --------------------------------------
...
[--] Up for: 36d 23h 28m 39s (758K q [0.237 qps], 53K conn, TX: 6B, RX: 98M)
[--] Total buffers: 192.0M global + 2.7M per thread (151 max threads)
[!!] Maximum possible memory usage: 597.8M (120% of installed RAM)
...
[OK] Highest usage of available connections: 8% (13/151)

According to the above warning, MySQL could potentially use up to 597.8 MB of RAM. Where did the number come from?

The number was derived from the preceding line. MySQL required 192MB globally and 2.7 MB per connection to the database. By default, the maximum number of connections was 150+1. (The 151st connection would be restricted to database administration only.) Hence, the maximum memory usage was 192 + 150 * 2.7, equaling 597.

Should you allow for 150 connections? Keep in mind that each connection, even in the idle state, will take up some memory. MySQLTuner can help you answer the question with confidence.

MySQLTuner reports the highest number of concurrent connections since the last MySQL restart (13 in the above example). The database should be up for a minimum of 24 hours before you run MySQLTuner. In fact, the longer the time elapses since the last restart, the more trustworthy is the statistic.

You can find out from the MySQLTuner report how long MySQL has been up. Go back to the first line under the Performance Metrics section. In the above example, MySQL had been up for 36 days since the last restart.

Although MySQL was configured for accepting 150 connections, the highest number of concurrent connections made in the past 36 days was only 13 (8% of the maximum). In light of that knowledge, we could lower the maximum number of connections allowed, therefore, reducing the total memory footprint for MySQL.

Before we go ahead to reconfigure MySQL, we will consider the wait-timeout threshold which affects how long idle connections stay alive before timing out.

Wait timeout

One of the General recommendations in the above example was:

'Your applications are not closing MySQL connections properly.'

In other words, database connections were opened but not properly closed after queries or updates were already completed. These idle connections would hang around until a predefined timeout threshold was reached. The default timeout threshold was 8 hours. So, if a query completed in 2 seconds, but because the connection was not closed properly, the connection would live for another 28,798 seconds before timing out. In the meantime, the idle connections continued to consume resources including counting toward the maximum number of open connections.

The culprit was easily identified in the above case: the database was used exclusively for WordPress, an application written in PHP. However, solving the problem can be out of your reach, unless you are a PHP developer.

The good news is that you can reduce the timeout interval by adjusting a MySQL configuration parameter. By making idle connections time out faster, there will be less concurrent connections. For WordPress/PHP applications, I set the wait timeout to be 60 seconds.

It is also worth mentioning that because there are less idle connections due to quicker timeout, you can further reduce the maximum number of connections.

Re-configuring MySQL

To change the maximum number of connections or the wait timeout threshold, edit the MySQL configuration file as follows.

$ sudo vi /etc/mysql/my.cnf

The configuration variables of interest are max_connections, and wait_timeout. Enter a value for each variable using the following syntax:

max_connections = 50
wait_timeout = 60

For the above configuration changes to take effect, a restart of the MySQL daemon is needed.

For non-systemd systems, run the following command:

$ sudo service mysql restart

For systemd-enabled systems, run:

$ sudo systemctl restart mysql

Alternatively, you can dynamically change the configuration variables, thereby avoiding the database restart. To do that, issue the following commands.

$ mysql -u root -p -e "SET GLOBAL max_connections=50;"
$ mysql -u root -p -e "SET GLOBAL wait_timeout=60;"

Note that modifying the MySQL configuration file is still required if you want the changes to persist after future system restarts.

What's next?

MySQLTuner is not something you run once and forget about it. Your web traffic pattern changes over time. You should schedule to run it regularly and examine the output. Please refer back to Part 1 of this series for instructions on how to schedule a run.

The more knowledgeable you are about database optimization, the more effective you become on using the information provided by MySQLTuner. I recommend the following videos if you want to learn more about MySQL optimization:

You may also be interested in the following MySQL-related articles from this blog:

For Apache optimization, please refer to my next blog post.

Tuesday, July 7, 2015

MySQLTuner series part 2 - defragmenting a MySQL database

Part 1 of this series spells out how to install and run MySQLTuner, a script which recommends MySQL configuration changes. The goal is to optimize database performance and stability. This post describes how to interpret and use MySQLTuner output, specifically in the area of database defragmentation.

Proceed with caution

A word of caution is warranted before I plunge into the details of implementing MySQLTuner's suggestions. MySQLTuner does not excuse you from learning the basic database optimization principles and following industry best practices. Following a MySQLTuner recommendation without researching and understanding its ramifications is a gamble that may end up worsening your database performance and reliability.

Optimizing MySQL configuration is not a trivial matter, and must be done in a controlled manner. You should change only one MySQL configuration variable at a time. After every change, monitor the system to verify that the expected outcome is achieved without any negative side effect.

General comments

MySQLTuner is a PERL script which you can invoke like this:

$ perl mysqltuner.pl

The following is the MySQLTuner output for a low-memory VPS server running on the LAMP platform (Linux, Apache, MySQL, PHP). The VPS is dedicated for running a WordPress blog.

One is often tempted to bypass the first several sections of the report on database metrics, and head straight to the Recommendations section. But, the metrics provide the crucial context for the recommendations that follow, and should be read carefully.

Storage engine statistics

The Storage engine statistics section of the report summarizes the total number and size of InnoDB and MyISAM tables in your database.

In the above example, 18 InnoDB and 4 MyISAM tables were detected. But the report does not identify the tables. If you want to list all InnoDB tables, execute the command below.

$ echo 'select concat(table_schema,".",table_name) from information_schema.tables where engine="InnoDB";'|mysql -u root -p

To list all MyISAM tables, replace InnoDB with MyISAM in the above command.

The key actionable statistic in this section is the total number of fragmented tables (20 in the example). Fragmentation occurs during normal database operations when records are inserted and deleted, leaving behind 'gaps' in the database.

MySQLTuner does not report the size of the 'gaps' or unused space in the fragmented tables. You can find out by running the following MySQL statement.

mysql> select ENGINE, TABLE_NAME,               \  
          DATA_LENGTH, INDEX_LENGTH, DATA_FREE, \
          DATA_FREE / (INDEX_LENGTH + DATA_LENGTH) as frag_ratio  \
       from information_schema.tables                             \
       where DATA_FREE > 0 order by frag_ratio desc;   
+-------+-----------+-----------+------------+---------+----------+  
| ENGINE| TABLE_NAME|DATA_LENGTH|INDEX_LENGTH|DATA_FREE|frag_ratio|  
+-------+-----------+-----------+------------+---------+----------+  
...  
| InnoDB| wp_options|   1179648 |      16384 | 11534336|  9.6438  |  
...  
+-------+-----------+-----------+------------+---------+----------+  

The DATA_LENGTH and INDEX_LENGTH variables contain respectively the size of the data and the index for a table. DATA_FREE is the size of the unused space in a table. The fragmentation ratio is the amount of unused space to the sum of the used data and index space.

If your tables are large, you can round up the output length variables to megabytes (MB) by using the following SQL statement:

mysql> select ENGINE, TABLE_NAME,                          \
          round(DATA_LENGTH  /1024 / 1024) as data_length, \
          round(INDEX_LENGTH /1024 /1024) as index_length, \
          round(DATA_FREE / 1024 /1024) as data_free,      \
          data_free / (index_length + data_length) as frag_ratio \
       from information_schema.tables                      \
       where DATA_FREE > 0 order by frag_ratio desc;  

Database Defragmentation

If you scroll down to the Recommendations section of the report, you will see that the first general recommendation is 'Run OPTIMIZE TABLE to defragment tables for better performance'. You may execute the OPTIMIZE TABLE SQL statement for each of the 22 tables. Alternatively, you can run the mysqlcheck command as follows:

$ mysqlcheck -Aos --auto-repair -u root -p

Notes:

  1. Optimizing a table will lock it up. In other words, no update to the table is allowed while the operation is being performed. For a large production table, the substantial downtime is something that the database administrator should consider before deciding to optimize a table.

  2. Optimizing a table does not necessarily reclaim its free space. This is especially true for InnoDB tables. Prior to MySQL version 5.6, all InnoDB tables are by default stored in a single file. This behavior is controlled by the MySQL configuration variable innodb_file_per_table. Optimizing InnoDB tables stored together in a single file may inadvertently produce the undesirable effect of increasing the file size.

  3. InnoDB tables fragment differently than the legacy MyISAM tables. mysqlcheck optimizes an InnoDB table by recreating it. For each InnoDB table that it optimizes, mysqlcheck generates the following informational message: 'Note : Table does not support optimize, doing recreate + analyze instead'. You can safely ignore those messages.

  4. The mysqld server process must be running for mysqlcheck to execute.

  5. -A (--all-databases)

    With -A specified, all tables of all databases are optimized.

    If you want to defragment only a specific table of a specific database, customize the following command.

    $ mysqlcheck -os <database> <table> -u root -p

  6. -o (--optimize)

    This option specifies that the optimize operation is to be performed.

  7. -s (--silent)

    -s enables silent mode: only error messages are displayed.

  8. --auto-repair

    If MySQLTuner finds a target table which is corrupted, it will try to repair it.

What's next?

Part 3 of this series continues the discussion on MySQLTuner output, specifically about the management of database memory footprint.

Friday, June 19, 2015

Improving MySQL stability & performance using MySQLTuner - part 1

MySQL is the database engine behind many web applications on the Internet today. While it is relatively straightforward to install, configuring MySQL to best support your particular application requires expertise and the right tools. This post introduces MySQLTuner, a command-line program which offers suggestions to optimize MySQl performance and stability.

MySQLTuner is a read-only script: it won't actually write to the MySQL configuration file. Based on your database's past usage, it recommends new values to assign to specific MySQL configuration variables. It is your responsibility to understand each recommended change and its possible ramifications, select the changes you want to make, and to make them in a controlled manner.

Installing MySQLTuner

Before you install MySQLTuner, make sure that it supports your MySQL version. You can find the up-to-date compatibility information on its website.

To identify the MySQL version on your database server, run this command:


$ mysqladmin -u root -p version
...
Server version 5.5.43-0+deb7u1
...

MySQLTuner is a PERL script that you can install from the standard Debian and Ubuntu repositories. You can install it using the following command:

$ sudo apt-get install mysqltuner

The prepackaged MySQLTuner may not be the latest release available. If you want the latest, or you run a Linux distro other than Debian/Ubuntu, you can install the up-to-date version by downloading it directly. Simply download the PERL script to a directory of your choice using the command:


$ wget http://mysqltuner.pl/ -O mysqltuner.pl

Running MySQLTuner

Your database should be up longer than 1 day before you run MySQLTuner. This is because MySQLTuner bases its recommendations on past database usage. The more data it has to analyze, the more accurate is its recommendations. If MySQLTuner is run on a database that has been restarted in the last day, you will get a warning message: 'MySQL started within last 24 hours - recommendations may be inaccurate'.

To run the script, enter the following:


$ perl mysqltuner.pl

Analyzing output

MySQLTuner reports statistics about the database, and makes tuning recommendations. The top section of the report gives you useful database metrics, many of them actionable. The bottom section provides tuning suggestions for the MySQL configuration file.

You should thoroughly research a suggested configuration change before deciding to implement it. To change a configuration variable, edit the file /etc/mysql/my.cnf.

After you make a MySQL configuration change, restart the MySQL service.


$ sudo service mysql restart

Scheduling runs

Database tuning is not a 'once and done' type of task. Conditions change over time. A good practice is to schedule regular MySQLTuner runs using crontabs.

The following schedules MySQLTuner to run every Thursday at 4:30 am.


$ (crontab -l; echo "30 04 * * 4 perl <path-to-script>/mysqltuner.pl --nocolor 2>&1") | crontab -

Please refer to my earlier post for details about crontabs.

By default, MySQLTuner prompts the user for the database login credentials. For a cronjob to run MySQLTuner, you may provide the database account and password in the user-specific MySQL configuration file.


$ cat >> /home/peter/.my.cnf <<< '
[client]
user=<db_user>
password=<db_pass>'

Finally, protect the db login credentials by tightening up access permissions for the .my.cnf file.


$ chmod 600 /home/peter/.my.cnf

What's next?

The remainder of this series will guide you through taking actions based on MySQLTuner recommendations.

Tuesday, June 9, 2015

Double asterisk brings double expedience to pathname expansion


If you are a Linux command-line user, most likely, you are familiar with the use of the single asterisk ('*') in pathname expansion (aka globbing). How the asterisk behaves is standardized across all shells (bash, zsh, tcsh, etc). For example, the ls * command lists the files and the immediate sub-directories of the current directory.


$ ls *

The single asterisk, however, is not recursive: it does not traverse beyond the target directory. You may use the find command to generate a recursive listing of pathnames. A simpler solution is the use of the double asterisk ('**').

Unlike the single asterisk, the double asterisk is not standardized. Different shells introduced the feature at different times with slightly different behavior. This post focuses on the use of '**' for the bash shell.

The double asterisk feature for bash first appears with bash4. To find out which bash version you are running, execute the following command:


$ bash --version
GNU bash, version 4.2.37(1)-release (x86_64-pc-linux-gnu)
...

Before you use '**', you must first enable the globstar shell option:


$ shopt -s globstar

With globstar enabled, you may use '**' for pathname expansion.


$ ls **/abc.txt

In the above example, the ls command returns any occurrence of the file abc.txt in the current directory and sub-directories.

Notes:

  1. By default, the double asterisk does not expand to include a hidden file. For example, the following command will not find .htaccess because it is a hidden file.


    $ ls **/.htaccess

    To allow hidden files in '**' output, enable the dotglob shell option:


    $ shopt -s dotglob

  2. When you do a pathname expansion using '*' or '**', you run the risk that a returned filename is the same as a command-line flag, e.g., -r. To mitigate that risk, precede '**' with '--' as below. The double dash marks the spot where command-line flags end, and positional parameters begin.


    $ ls -- **

  3. Under bash, '**' expands to follow symbolic links. This behavior, however, is shell-specific. For zsh, expanding the double asterisk does not follow a symbolic link.

The double dash is a useful tool to add to your everyday command-line usage.

Tuesday, May 26, 2015

Useful sed tricks to customize configuration files



A typical software installation goes like this. You install the software using apt-get install or yum install.
Then, you manually edit the software's configuration file in order to satisfy your requirements. If you have to repeat the install on multiple machines, this quickly becomes tedious.


Instead of manually editing the file, I run a text manipulation command such as sed or awk to make the required changes. Then, I script the procedure by inserting the commands in a bash script file.


The scripting of configuration changes serves multiple purposes:


  • It is a permanent record of the configuration changes.

  • It is readily repeatable on the same or a different machine.


Below, I illustrate 2 sed tricks to make configuration changes to the Apache webserver. The target configuration file is /etc/apache2/apache2.conf.


Before you make any change, please first backup the original configuration file.



$ sudo cp /etc/apache2/apache2.conf /etc/apache2/apache2.conf.orig


Replacing first occurrence of a string


The default apache2.conf file contains the following line:


Timeout 300


Below is the sed command to change the first occurrence of Timeout in the file to 100.



$ sudo sed -i "0,/^Timeout\s/ s/^Timeout\s\+[0-9]\+/Timeout 100/" /etc/apache2/apache2.conf


The -i parameter tells sed to edit the file in place - that is, directly in apache2.conf.


0,/^Timeout\s/ specifies the range of lines over which the sed command is to be executed. In this example, the starting line is the first line (line 0). The finishing line is the line returned by a search for the word Timeout which appears at the beginning of a line (^) and followed by a whitespace (\s).


The line range parameter limits the change to only the first occurrence of Timeout in the file. If you leave out the line range, each occurrence of Timeout in the file will be modified. In many scenarios, leaving it out is OK because the parameter occurs only once in the configuration file.


For some configuration files, a parameter can occur multiples times, in different sections. Next, I illustrate how to limit the change to within a particular section in the configuration file.


Replacing a string within a target section


The MaxClients parameter occurs in 3 sections within the apache2.conf file:


  • mpm_prefork_module

  • mpm_worker_module

  • mpm_event_module


I want to change the MaxClients parameter within the mpm_prefork_module only.


The default mpm_prefork_module is like this:



<IfModule mpm_prefork_module>
StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 150
MaxRequestsPerChild 0
</IfModule>


Note that a section is delimited by the opening <IfModule> and closing </IfModule> statements.


The following sed command modifies the value of MaxClients to 18 within the mpm_prefork_module section.

$ sudo sed -i "/<IfModule mpm_prefork_module>/,\@</IfModule>@ s/MaxClients\s\+[0-9]\+/MaxClients 18/" /etc/apache2/apache2.conf


The line range is defined by the /<IfModule ... >/,\@</IfModule>@ clause in the above statement. The opening line in the line range is specified by a search for the <IfModule ... > pattern. The closing line is specified by the search pattern \@</IfModule>@.


An explanation of the closing line pattern is warranted. The slash (/) character is part of the search pattern for the closing line (</IfModule>). However, the slash is also the default delimiter for sed. Therefore, we must use a different delimiter (@) for the closing-line search pattern. Note that the first @ is escaped (\@).


The s/MaxClients.../MaxClients 18/ clause changes the value of MaxClients to 18.


Conclusion


The above are examples of how you can use sed to script common scenarios of changing configuration files. You can achieve the same result using other tools such as awk or perl. Please use the comment system to let us know your own examples.


If you are interested to learn more about sed, please read my earlier posts on the tool: