Saturday, December 7, 2013

A fast way to find a LAMP memory leak

Today we had a memory leak on one of our Linux Apache servers, which runs approximately 10 PHP applications in different virtual hosts.

So, we are going to explain a fast way to locate the leak in a LAMP stack.
First of all, check this link if you don't know how memory works on Linux, or if you just need a reminder.



To start debugging, enable mod_status in Apache if it is not enabled yet.
On CentOS this module is installed by default but not activated, so you have to edit httpd.conf, uncomment the block below, and add at least one host to the "Allow from" ACL. For example, you can add localhost:

###############################
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from localhost 
</Location>
###############################

This line has to be uncommented too if you want to see the extended status (very useful for our purpose):
ExtendedStatus On

Now restart or reload Apache to apply these changes.
You can then access the server status over HTTP with a text browser such as links or lynx, so you need at least one of these packages installed. We recommend links, because with it you can directly execute the "service httpd fullstatus" command. If you use lynx, you have to open this URL instead: http://servername/server-status

If you execute the command "service httpd fullstatus", you will get a full status like this:

Apache Server Status for localhost

   Server Version: Apache/2.2.15 (Unix) DAV/2 PHP/5.3.3 mod_ssl/2.2.15
   OpenSSL/1.0.0-fips mod_perl/2.0.4 Perl/v5.10.1

   Server Built: Feb 13 2012 22:31:42

     ----------------------------------------------------------------------

   Current Time: Thursday, 05-Dec-2013 18:03:21 CET

   Restart Time: Thursday, 05-Dec-2013 17:10:59 CET

   Parent Server Generation: 0

   Server uptime: 52 minutes 22 seconds

   Total accesses: 352 - Total Traffic: 11.0 MB

   CPU Usage: u35.48 s4.35 cu0 cs0 - 1.27% CPU load

   .112 requests/sec - 3675 B/second - 32.0 kB/request

   1 requests currently being processed, 10 idle workers

 _W_________.....................................................
 ................................................................
 ................................................................
 ................................................................

   Scoreboard Key:
   "_" Waiting for Connection, "S" Starting up, "R" Reading Request,
   "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
   "C" Closing connection, "L" Logging, "G" Gracefully finishing,
   "I" Idle cleanup of worker, "." Open slot with no current process

Srv   PID    Acc   M CPU  SS  Req  Conn Child Slot    Client              VHost                                     Request
0-0  16771 0/31/31 _ 4.10 165 1164 0.0  0.69  0.69 200.31.XX.YY XXXX.com             GET
                                                                                          /images/stories/XXXX-Congress-
1-0  16772 0/38/38 W 3.97 0   0    0.0  1.00  1.00 127.0.0.1     file.XXXXX.es GET /server-status HTTP/1.1
2-0  16773 0/37/37 _ 2.95 140 2759 0.0  1.27  1.27 200.31.XX.YY XXXXX.com             GET
                                                                                          /images/stories/XXX-Congress-
3-0  16774 0/36/36 _ 3.59 181 937  0.0  0.56  0.56 200.31.XX.YY7 XXXXX.com             GET
                                                                                          /images/stories/XXXXX-Congress-
4-0  16775 0/36/36 _ 4.51 114 227  0.0  0.91  0.91 93.90.30.27   XXXX.es      GET / HTTP/1.0
5-0  16776 0/27/27 _ 2.96 169 1401 0.0  0.62  0.62 200.31.XX.YY XXXXX.com             GET
                                                                                          /images/stories/XXXX-Congress-
6-0  16777 0/33/33 _ 3.47 78  242  0.0  0.46  0.46 200.31.XX.YY  XXX.es      GET / HTTP/1.0
7-0  16778 0/33/33 _ 3.70 150 1637 0.0  0.67  0.67 200.31.XX.YY XXXX.com             GET
                                                                                          /images/stories/XXXX-Congress-
8-0  17091 0/26/26 _ 2.80 157 239  0.0  0.34  0.34 200.31.XX.YY  XXX.es      GET / HTTP/1.0
9-0  17092 0/26/26 _ 4.62 53  204  0.0  0.48  0.48 200.31.XX.YY XXX.es      GET / HTTP/1.0
10-0 17093 0/29/29 _ 3.16 172 697  0.0  4.02  4.02 200.31.XX.YY XXXX.com             GET
                                                                                          /images/stories/XXXX-Congress-

     ----------------------------------------------------------------------

    Srv  Child Server number - generation
    PID  OS process ID
    Acc  Number of accesses this connection / this child / this slot
     M   Mode of operation
    CPU  CPU usage, number of seconds
    SS   Seconds since beginning of most recent request
    Req  Milliseconds required to process most recent request
   Conn  Kilobytes transferred this connection
   Child Megabytes transferred this child
   Slot  Total megabytes transferred this slot

     ----------------------------------------------------------------------

   SSL/TLS Session Cache Status:
   cache type: SHMCB, shared memory: 512000 bytes, current sessions: 0
   subcaches: 32, indexes per subcache: 133
   index usage: 0%, cache usage: 0%
   total sessions stored since starting: 0
   total sessions expired since starting: 0
   total (pre-expiry) sessions scrolled out of the cache: 0
   total retrieves since starting: 0 hit, 0 miss
   total removes since starting: 0 hit, 0 miss

Here you can see the URLs that were being served at the moment the command was executed, and the PID associated with each URL.
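As a rough sketch of how the worker table above could be read by a script, the Python snippet below extracts PID, virtual host and request from the text dump. It assumes the column order shown in the output above and that worker rows start at the beginning of the line; the names Worker and parse_fullstatus are made up for this example and are not part of Apache.

```python
import re
from typing import Dict, NamedTuple, Optional


class Worker(NamedTuple):
    pid: int
    mode: str
    client: str
    vhost: str
    request: str


# A worker row starts with "<slot>-<generation>", e.g. "0-0" or "10-0".
ROW_START = re.compile(r"^\d+-\d+\s")


def parse_fullstatus(text: str) -> Dict[int, Worker]:
    """Map PID -> Worker from the text dump of the extended status page."""
    workers: Dict[int, Worker] = {}
    last_pid: Optional[int] = None
    for line in text.splitlines():
        if ROW_START.match(line):
            last_pid = None
            fields = line.split()
            # Srv PID Acc M CPU SS Req Conn Child Slot Client VHost Request...
            if len(fields) < 12 or not fields[1].isdigit():
                continue  # slot without a running process
            pid = int(fields[1])
            workers[pid] = Worker(pid, fields[3], fields[10], fields[11],
                                  " ".join(fields[12:]))
            last_pid = pid
        elif last_pid is not None and line[:1].isspace() and line.strip():
            # Indented line: the wrapped remainder of the previous request.
            w = workers[last_pid]
            joined = (w.request + " " + line.strip()).strip()
            workers[last_pid] = w._replace(request=joined)
        else:
            last_pid = None  # blank line or any other text ends the row
    return workers


SAMPLE = """\
0-0  16771 0/31/31 _ 4.10 165 1164 0.0  0.69  0.69 200.31.XX.YY XXXX.com             GET
                                                        /images/stories/XXXX-Congress-
1-0  16772 0/38/38 W 3.97 0   0    0.0  1.00  1.00 127.0.0.1     file.XXXXX.es GET /server-status HTTP/1.1
"""

for worker in parse_fullstatus(SAMPLE).values():
    print(worker.pid, worker.vhost, worker.request)
```

Keep in mind that each row only shows the most recent request handled by that worker, so the virtual host listed next to a PID is a hint, not proof.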

Now we can sort the PIDs by memory size with one of these commands:

ps -ylC httpd --sort:rss
ps aux --sort rss | grep httpd

The output of the first command is similar to this:
S   UID   PID  PPID  C PRI  NI   RSS    SZ WCHAN  TTY          TIME CMD
S     0 16769     1  0  80   0 14140 78462 poll_s ?        00:00:00 httpd
S    48 16773 16769  0  80   0 22372 105167 semtim ?       00:00:03 httpd
S    48 17091 16769  0  80   0 28832 106926 semtim ?       00:00:02 httpd
S    48 17093 16769  0  80   0 49148 112022 semtim ?       00:00:03 httpd
S    48 16776 16769  0  80   0 49672 112099 semtim ?       00:00:03 httpd
S    48 16775 16769  0  80   0 49760 111986 semtim ?       00:00:04 httpd
S    48 16771 16769  0  80   0 49804 112110 ep_pol ?       00:00:04 httpd
S    48 16778 16769  0  80   0 49868 112026 semtim ?       00:00:03 httpd
S    48 16774 16769  0  80   0 53460 112998 semtim ?       00:00:04 httpd
S    48 17092 16769  0  80   0 53848 113049 semtim ?       00:00:04 httpd
S    48 16772 16769  0  80   0 53884 113018 semtim ?       00:00:04 httpd
S    48 16777 16769  0  80   0 53888 112942 semtim ?       00:00:03 httpd
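If you want to process this listing in a script instead of reading it by eye, a minimal Python sketch could look like the one below. It assumes the header line contains PID and RSS columns, as in the output above; parse_ps_rss is a name chosen for this example.

```python
from typing import List, Tuple


def parse_ps_rss(text: str) -> List[Tuple[int, int]]:
    """Return (pid, rss_kb) pairs from ps output, largest RSS first."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    # Locate the columns by name instead of hard-coding their positions.
    pid_col = header.index("PID")
    rss_col = header.index("RSS")
    rows = []
    for ln in lines[1:]:
        fields = ln.split()
        rows.append((int(fields[pid_col]), int(fields[rss_col])))
    return sorted(rows, key=lambda row: row[1], reverse=True)


PS_SAMPLE = """\
S   UID   PID  PPID  C PRI  NI   RSS    SZ WCHAN  TTY          TIME CMD
S     0 16769     1  0  80   0 14140 78462 poll_s ?        00:00:00 httpd
S    48 16773 16769  0  80   0 22372 105167 semtim ?       00:00:03 httpd
S    48 16777 16769  0  80   0 53888 112942 semtim ?       00:00:03 httpd
"""

for pid, rss_kb in parse_ps_rss(PS_SAMPLE):
    print(pid, rss_kb)
```

The function only parses text, so you can feed it the saved output of the ps command from each run.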

The memory size (RSS) column, in kilobytes, is in ascending order. Now match the PIDs from the Apache fullstatus output with the PIDs from the ps output to find out which virtual host is leaking memory or using the most memory.
Repeat this three or four times to see how the memory of each PID grows over time.
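The matching and repeating described above can be sketched in Python as follows. This is only an illustration: rss_growth is a hypothetical helper, the first sample reuses RSS values from the ps output above, and the second sample contains made-up numbers to show the idea.

```python
from typing import Dict, List, Tuple


def rss_growth(samples: List[Dict[int, int]],
               vhosts: Dict[int, str]) -> List[Tuple[int, str, int, int]]:
    """Compare the first and last RSS samples (PID -> kB).

    Returns (pid, vhost, last_rss_kb, growth_kb) tuples, biggest growth first.
    """
    if len(samples) < 2:
        return []
    first, last = samples[0], samples[-1]
    report = []
    for pid, end_kb in last.items():
        if pid not in first:
            continue  # worker started after the first sample: no baseline
        report.append((pid, vhosts.get(pid, "?"), end_kb, end_kb - first[pid]))
    report.sort(key=lambda row: row[3], reverse=True)
    return report


# PID -> virtual host, as read from the fullstatus output.
vhosts = {16773: "XXXXX.com", 16777: "XXX.es"}

samples = [
    {16773: 22372, 16777: 53888},                # values from the ps output
    {16773: 22400, 16777: 61000, 17200: 30000},  # hypothetical later run
]

for pid, vhost, rss_kb, growth_kb in rss_growth(samples, vhosts):
    print(pid, vhost, rss_kb, growth_kb)
```

A worker whose RSS keeps growing between runs is the one to look at, but remember that an Apache worker serves requests for several virtual hosts during its life, so check the virtual host on every run rather than only once.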

Of course, there are other ways to find and debug a memory leak. We are going to name some more sophisticated alternatives without going into detail. Maybe in the future we will write a new article about PHP APM and more sophisticated ways to look for a memory leak.

Other "fast ways": 
  • Debugging with Valgrind
  • Debugging with strace
APM-focused alternatives:
Open source alternatives:
Non-open-source alternatives:


