Bit68 Bit68 - 5 months ago 17
MySQL Question

Django can't connect to database every once in a while (Centos)

I'm having a very weird issue, I have an Apache server running with mod_wsgi. The website runs fine but every once in a while I get the

IOError: failed to write data


error on all the pages of the website. I then gets solved with

sudo service mysqld restart


Since the website can't be down for long, I have not time to debug this problem and I just run the command every time this happens. I only see the error in the logs that's why I can't really debug it, and it has no clear replication steps, it just occurs randomly.

Any help would be appreciated and let me know if you need me to post any configuration files.

Edit: The exact error displayed by django is:

(2002, "Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)")


I saved the error message and it is hosted here. (passwords edited out)

Edit:
Here is an extract from the mysql server error log.

160610 10:51:53 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
160610 10:51:53 [Note] /usr/libexec/mysql55/mysqld (mysqld 5.5.46) starting as process 7658 ...
160617 14:35:47 [Note] /usr/libexec/mysql55/mysqld (mysqld 5.5.46) starting as process 32054 ...
160617 14:35:47 [Note] Plugin 'FEDERATED' is disabled.
160617 14:35:47 InnoDB: The InnoDB memory heap is disabled
160617 14:35:47 InnoDB: Mutexes and rw_locks use GCC atomic builtins
160617 14:35:47 InnoDB: Compressed tables use zlib 1.2.8
160617 14:35:47 InnoDB: Using Linux native AIO
160617 14:35:47 InnoDB: Initializing buffer pool, size = 128.0M
InnoDB: mmap(137363456 bytes) failed; errno 12
160617 14:35:47 InnoDB: Completed initialization of buffer pool
160617 14:35:47 InnoDB: Fatal error: cannot allocate memory for the buffer pool
160617 14:35:47 [ERROR] Plugin 'InnoDB' init function returned error.
160617 14:35:47 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
160617 14:35:47 [ERROR] Unknown/unsupported storage engine: InnoDB
160617 14:35:47 [ERROR] Aborting


I saved the full mysqld.log here.

Answer

Why it crashes

This seems to be the culprit

160617 14:35:47 InnoDB: Initializing buffer pool, size = 128.0M
InnoDB: mmap(137363456 bytes) failed; errno 12

This condition is caused by the server running out of physical memory. That's what errno 12 stands for in the kernel errors.

This topic has been discussed at both stackoverflow and dba.stackexchange. If you want to simulate low memory situations or try to manually reproduce the error try some of these tools: http://unix.stackexchange.com/questions/99334/how-to-fill-90-of-the-free-memory

Quick Solution

If you are able to upgrade memory you can try that. If not, you can try creating a large swap file. It's possible that you don't have a swap at all. Some AWS EC2 instanes don't have one by default. You can find out by typing top in the shell. If you don't see swap near the top, that means you don't have one.

A swap file would make the queries a lot slower but at least it's better than the site getting offline.

You might be tempted to try to modify the systemd files to make mysql auto start. Update: @PeterBrittain points out that mysql is auto restarting anyway as shown by the logs. Sometimes though databases can take a bit of time to restart and if the data get's corrupted it will refuse to restart.

Why is memory being exhausted?

If you don't have any other server running on it, 2GB will be more than enough to host a site that serves 6000 pages per day. It could be that you have some rather heavy queries that put an unnecessary load on the db. There are some remedial actions that can be taken

  1. Use django-debug-toolbar to identify pages that execute many queries and see if select_related or prefetch_related can be used to reduce the number.
  2. use mysql slow query log to find the queries that take a long time to execute and optimize them.
  3. Use caching to save the results of complex queries.
Comments