A little while ago I set up Trac, which is a python application, on http://dev.dejardin.org, along with Subversion, which runs as its own Apache dav module. Both are running on the same slicehost instance as Where’s Lou, which is of course a Wordpress php application.

Unfortunately the net result of this was to bloat each instance of the Apache process to 21 megs, and about ten of them would be running at any given time. I believe I was also experiencing a condition where Apache processes were not correctly terminating on a reset. In any case my slice, with 256 megs of ram and 512 megs of swap space, was running with all ram used up and over 260 megs swapped at any given time.

I’m not concerned about the use of virtual memory in general; that’s what it’s there for. But at this point some noticeable request delays were appearing. That’s when I decided something needed to be done, because there’s no point driving people away from your web site with water-torture-esque inconsistent response delays.

Slicehost even sent me an email:

This is an automatic notification to let you know that your slice, xxxx, is showing a considerable amount of consistent swapping activity. Quite often this is an indicator that your application or database are not as efficient as they could be. It also may indicate that you need to upgrade your slice for more RAM.

Here are a couple of relevant articles that may help you debug your issue:

System monitoring with top:
http://articles.slicehost.com/2007/9/7/system-monitoring-with-top

Memory management with free:
http://articles.slicehost.com/2007/9/7/memory-management-with-free

—-
Slicehost Support
support@slicehost.com

So hats off to them for providing active monitoring of a non-fatal condition like this. Of course in some respects this is also an email that could result in an account upgrade so I shouldn’t be surprised I suppose.

Anyhoo. I was on the form with a $20/month to $37/month bump about to click okay, but I just didn’t want to part with $200/year without putting at least some token effort into optimization.

Enter: Nginx! It was mentioned in the Slicehost tutorials in the context of standing up an Nginx/Mongrel/Rails stack, but there were other resources that described using it with Wordpress.

Now I’m not about to describe the step-by-step process, because to do that right I would need to install a clean server and I don’t have the spare time for that at the moment. Also, everything I did was based on Googled resources to begin with, so that would be redundant. But I can talk about how Nginx is different from Apache and provide links to the resources I used.

What is Nginx? It’s a quarterback.

If Apache could be compared to IIS, Nginx could be compared to http.sys. When the tcp stack hikes a request to Nginx, it looks through the fairly simple set of rules you give it to find out which process it should throw the request to. If the request is for a path of static files, or for something that exists on disk, bam, done: Nginx serves it itself, which is one of its strengths, and that can be set up so the request never even goes to php, python, etc. Or say it’s a php app like Wordpress: you’ll configure Nginx to pass the request over FastCGI to the php-cgi processes via a localhost port.

It’s a case by case basis, really.

Trac, for example, provides a tracd process, which is its own little web server you can run as a daemon. So instead of cgi you’ll tell Nginx to simply web-proxy the tracd localhost port as if it were an upstream web server. Subversion’s dav module will only work in Apache, so that’s a similar use of web-proxy: you’d configure Nginx and Apache to run on different ports, and the svn url on Nginx will web-proxy to the Apache port.
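
To make the web-proxy idea concrete, here’s a rough sketch of what such a virtual server could look like. This is not my actual config: the host name, the /svn path, and both back-end ports (8000 for tracd, 8080 for Apache) are assumptions made up for the illustration, so substitute whatever your daemons really listen on.

```nginx
server {
  listen 80;
  server_name dev.example.org;   # hypothetical host name

  # Trac: hand everything to the tracd daemon,
  # assumed here to be listening on localhost port 8000
  location / {
    proxy_pass http://127.0.0.1:8000;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
  }

  # Subversion: hand /svn to Apache + dav_svn,
  # assumed here to have been moved to localhost port 8080
  location /svn {
    proxy_pass http://127.0.0.1:8080;
    proxy_set_header Host $host;
  }
}
```

Passing the original Host header along lets the back end see the public host name instead of 127.0.0.1, which usually matters for any absolute links it generates.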

Now hang on, you’re saying. You’ve just added a ton of new processes to try and save memory. You must be stupid and ugly. But wait! You’re not thinking about how much memory you’re saving in the process!

Nginx for me is running one root process with 870k and a second as www-data with 2.2m resident memory. And that should be taking care of all the static file access too. I’ve set php-cgi to spawn 4 processes and they’re 15-18m each, so that’s fewer instances total and less memory each than apache and should only be hit for actual php calls so the pool has more focus. (Which is good so like trac or svn won’t choke out the blog). Tracd I’m only running one instance (though it’s 35m!).

And here’s the kicker - even though I’m still running Apache it’s also a much more focused resource pool. It’s only used for svn, so a) the apache python and php modules are disabled, b) the number of instances is dialed down to two-four at a time, and c) only svn requests go there so that memory can fall out to the swap file and stay there for days and not affect the memory needs of nginx or php-cgi which I expect to stay resident all the time.

Finally let’s talk about the “how-to” of a rolling Apache to Nginx deployment. By rolling I mean getting them running side-by-side first and giving yourself a quick trigger to pull to swap them over.

Step 1: Open up port 81 and install Nginx on there so it’s world routable.

Step 2: Get your php, trac, etc. daemons running on internal ports and configure virtual servers in Nginx to forward to them. At this point you can verify all of the resources you need at yourdomain.com:81 (Nginx + php-cgi) are working the same as at yourdomain.com:80 (Apache + php5_module).
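
As a rough sketch of the Step 2 state, a virtual server on the temporary port might look something like this. The yourdomain.com name and paths are placeholders, port 9000 is just the php-cgi port used in the whereslou.com file further down, and the fastcgi_params path assumes a source install under /usr/local/nginx like mine.

```nginx
server {
  listen 81;                    # temporary test port; Step 3 changes this to 80
  server_name yourdomain.com;   # placeholder name

  root /path/to/sites/yourdomain.com/public;   # placeholder path
  index index.php index.html;

  # php requests go to the php-cgi pool over FastCGI
  location ~ \.php$ {
    fastcgi_pass  127.0.0.1:9000;
    include       /usr/local/nginx/conf/fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
  }
}
```

With something like that in place, the only line that has to change at swap time is the listen directive.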

Step 3: Swap port 80/81 in all your apache and nginx config files. This won’t have any effect on the running instances - so as long as you’re just editing files your web server will still be running normal apache like it always has. (You may want to copy your apache configs to make a potential rollback simpler.)

Here’s where you pull the trigger: Stop them both and start them both and you should have Nginx as your web server.

Step 4: Verify everything on 80 is working and is now coming from Nginx. Close down port 81 again. If you don’t need Apache for anything like dav_svn you can stop it and remove its startup on boot.

As an example, here’s the resulting whereslou.com Nginx file.

server {
  listen 80;
  server_name whereslou.com;

  access_log /path/to/sites/whereslou.com/logs/access.nginx.log;
  error_log /path/to/sites/whereslou.com/logs/error.nginx.log;

  location / {
    root /path/to/sites/whereslou.com/public;
    index index.php index.html index.htm;

    # this serves static files that exist without running other rewrite tests
    if (-f $request_filename) {
      expires 30d;
      break;
    }

    # this sends all non-existing file or directory requests to index.php
    if (!-e $request_filename) {
      rewrite ^(.+)$ /index.php?q=$1 last;
    }
  }

  location ~ \.php$ {
    fastcgi_pass   127.0.0.1:9000;  # port where FastCGI processes were spawned
    fastcgi_index  index.php;
    fastcgi_param  SCRIPT_FILENAME    /path/to/sites/whereslou.com/public$fast$

    include /usr/local/nginx/conf/fastcgi_params;
  }
}

Here are some resources I remember using as I installed Nginx. I installed from source instead of using the Ubuntu apt packages, but I’m not sure I’d recommend it, since the default installation ends up at /usr/local/nginx instead of /etc/nginx, which is a more conventional Gutsy location.

The last one is important because you’ll want to ensure the php-cgi daemon starts on reboot. Adding an Ubuntu “service” is pretty simple once you’ve dropped the controlling script into the init.d folder:


sudo chmod +x /etc/init.d/php-fastcgi
sudo update-rc.d php-fastcgi defaults

Now you can run sudo /etc/init.d/php-fastcgi start or restart or stop. The update-rc.d command gets the script to be executed with start or stop at the correct time during the Linux startup and shutdown sequence.

Remarkable how low-tech some Linux stuff is really. Windows has services, there’s extensive registry configuration, a service control manager process, and an entire api for service management on the outside and service implementation on the inside. Linux? It’s a folder filled with shell scripts and a handful of conventions.