Open source projects currently taking my interest

For a long time my stack has been Lighttpd (as an alternative to Apache), Ubuntu, MySQL and PHP.

It looks like I might be mixing it up soon, swapping out some pieces and adding a couple of extras. Some of these I've been able to mess around with; others I am excited about but have not yet had the chance to try...

PHP-FPM

PHP-FPM is just a patch for PHP, and a well-received one. I consider it important enough to showcase here. It matures the FastCGI SAPI and adds a couple of performance enhancements, graceful reloading, Apache-like process management for adaptive process spawning, and a much-needed suexec-ish capability which will save me many headaches. It will hopefully be merged into PHP core once Andrei considers it "feature complete" and changes the license to be compatible. I can't wait. It can already be used today, and I've installed it without an issue on all my boxes - some serving up over a million PHP requests per day under different uid/gids. (Note: use Google Translate for the Russian pages on the project site; the docs have not been translated to English yet. Google does a good job on them and they read just fine.)

CouchDB

Throwing out all the concepts of structured databases and building a new data store from the ground up with interoperability and scalability in mind? You've got me sold. It seems a lot of people are nervous about scaling MySQL (with good reason) and CouchDB might be a good alternative. Using RESTful URLs for everything and JSON as a lightweight (compared to SOAP/XML/etc.) transport language, it seems like we'll have plenty of options and usage models. I think I also heard in a presentation that it will support files of any size, which potentially means it could be used not only as an RDBMS replacement (even though it says it isn't one, I'm sure plenty of apps could use it that way), but also as a distributed document storage system (which it might already be considered). Added bonuses: designed for high traffic, supports disconnected computing, self-healing replication, optimistic locking... I can't wait to play with this.
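To make the "RESTful URLs plus JSON" idea a bit more concrete, here is a minimal Python sketch of what storing a document might look like. The host, the database name (blog) and the document ID (post-1) are made up for illustration, and the request is only built, never sent, so nothing here depends on a running server.

```python
import json
import urllib.request


def build_put_document(base_url, db, doc_id, doc):
    """Build (but do not send) an HTTP PUT that stores `doc` as JSON.

    Assumes a document is addressed as /<database>/<document id>.
    Every name passed in below is hypothetical.
    """
    url = f"{base_url}/{db}/{doc_id}"
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical server address, database and document.
req = build_put_document(
    "http://localhost:5984", "blog", "post-1",
    {"title": "Open source projects", "tags": ["nginx", "php"]},
)
```

The point is how little machinery is involved: a URL names the document, the body is plain JSON, and any language with an HTTP client can take part.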

MogileFS

A distributed document storage system. I've thought about trying this out in the past, but I was mainly looking for a drop-in replacement for standard file storage. MogileFS may have some wacky method to do that via FUSE someday, but in the meantime, it can be leveraged for all application-based file storage, which I'd say is 95%+ of the files I deal with. Just like CouchDB, it leverages standards for communication: WebDAV for actual file access, plus a simple text-based socket protocol which can be used from any programming language that supports sockets. Currently I am successfully running it with nginx serving up the WebDAV portion instead of the standard Perl-based webserver. It was too easy. I most definitely plan on trying to leverage this on xMike for all of the image uploads and other user assets. I like how it doesn't require any special low-level support - it simply spreads files over N hosts and uses a tracker to determine which host(s) have which file(s) - and includes replication management so a broken node does not mean a broken file.
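The tracker idea can be illustrated with a toy, in-memory Python sketch. This is not the MogileFS API or protocol: the class name, method names and host names are all invented here, and it only shows the bookkeeping (which hosts hold which key, and which copies survive a dead host), not the actual file transfer or re-replication.

```python
import zlib


class ToyTracker:
    """In-memory stand-in for a tracker: remembers which hosts hold each key."""

    def __init__(self, hosts, copies=2):
        if copies > len(hosts):
            raise ValueError("need at least as many hosts as copies")
        self.hosts = list(hosts)
        self.copies = copies
        self.locations = {}  # key -> list of hosts holding a copy
        self.down = set()    # hosts currently considered dead

    def store(self, key):
        """Pick `copies` distinct hosts for `key` and record them."""
        # A checksum of the key gives a deterministic starting host.
        start = zlib.crc32(key.encode("utf-8")) % len(self.hosts)
        chosen = [self.hosts[(start + i) % len(self.hosts)]
                  for i in range(self.copies)]
        self.locations[key] = chosen
        return chosen

    def mark_down(self, host):
        """Record that a host has failed."""
        self.down.add(host)

    def get_paths(self, key):
        """Return the live hosts that still have a copy of `key`."""
        return [h for h in self.locations.get(key, []) if h not in self.down]


# Hypothetical storage hosts and file key.
tracker = ToyTracker(["store1", "store2", "store3"], copies=2)
placed = tracker.store("user/42/avatar.png")
tracker.mark_down(placed[0])
survivors = tracker.get_paths("user/42/avatar.png")
```

With two copies per file, losing the first host still leaves one live location for the key, which is the "broken node does not mean a broken file" property in miniature.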

nginx

Powering a handful of extremely busy sites, nginx is the tiny webserver you may or may not have heard of. Every account of it I have read does nothing but rave about it, and I'm in the process of converting all the servers I manage over to it. With the complaints about memory leaks in Lighttpd and about Apache being bloated, I think it's prime time for nginx to get more attention as a viable option. It's still "beta", but what isn't nowadays? It has been running in production for over 2.5 years on the main site it was developed for, and I'm sure on many others. The configuration file syntax is extremely simple. It has a couple of neat little additions, like the built-in 1x1 transparent GIF (for all those web bugs and spacer images), which is served directly from memory so you no longer have to host the file yourself. While that's a little beyond the basic needs of a webserver, it seems extremely useful to someone who has had to deal with those for years. Anyway, don't let the old "Mother Russia" style logo on the English wiki scare you. It's worth a shot, and could even replace Pound, other reverse proxies and Layer 7 capable load balancing solutions. I'm sure someone could even write a replacement for Squid and Varnish using it, by enhancing the proxy module to save local cached copies of the content...

I'm sure I might be missing a couple. It's getting late. I'd add memcached to the list, but I'm already using it. It's no longer "taking my interest" as I've been able to fully integrate it now 🙂

It's interesting to note that all these products (especially MogileFS and CouchDB) are capable of being distributed, allow for transparent node failures and were designed to be run on unreliable/commodity hardware. nginx basically does the same as well, since it is a web node (and I run the FastCGI processes on those nodes too) and can already scale horizontally. It does kind of make me wonder, though, if I am slowly becoming deprecated by fully automated cloud computing-based solutions (like RightScale, 3Tera, Mosso, Elastra and even DIY Scalr).

Categories: nginx, PHP, Software
  1. mike
    April 15th, 2008 at 16:42 | #1
  2. Kiril Angov
    April 21st, 2008 at 17:49 | #2

    Take a look at ncache (a Google Code project) as a Squid replacement. ncache is a variant of nginx built for that purpose.

  3. mike
    April 21st, 2008 at 17:55 | #3

    Yeah, I saw it on the nginx mailing list.

    I am actually not necessarily going to implement a reverse caching proxy. I just need load balancing, which I am using nginx for as well at the moment. I am also using it to gzip my content, and if I ever need SSL, being able to handle it on the proxy server would be a bonus.

    The thing lacking in nginx for load balancing: healthchecks. What if the file store is not mounted? You'll get a 404, which means you have to set proxy_next_upstream http_404 as well; but if the file is truly a 404, then you're wasting a bunch of resources having it cycle through each server. It needs some sort of healthcheck with an expected response. That's outside the scope of the core nginx server though, so I'm not really complaining.

    I am thinking about looking at HAProxy, or possibly just going back to LVS+ldirectord. LVS+ldirectord has the healthchecking and the notification when a server drops out that I am looking for, but it does not have any proxy_next_upstream behavior internally. That's where I am hoping HAProxy might be the best of both worlds for a load balancer...

    I could do my own sanity check, have an include file of servers that nginx reads for upstreams, and if one fails, remove it and then reload nginx. But that seems kinda messy.
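    The include-file idea could be sketched roughly as follows in Python. The function name, the upstream name and the server addresses are all hypothetical, and the health check is passed in as a plain callable, so how a backend is actually probed (for example, fetching a known URL and comparing against an expected response) is left open.

```python
def render_upstream(name, servers, is_healthy):
    """Return an nginx `upstream` block listing only servers that pass the check.

    `is_healthy` is any callable taking a "host:port" string and returning
    True or False; how it probes the backend is up to the caller.
    """
    alive = [s for s in servers if is_healthy(s)]
    if not alive:
        # nginx refuses an upstream block with no servers, so do not
        # produce one; the caller should keep the previous include file.
        raise RuntimeError("no healthy servers; keeping the previous include")
    lines = [f"upstream {name} {{"]
    lines += [f"    server {s};" for s in alive]
    lines.append("}")
    return "\n".join(lines) + "\n"


# Hypothetical backends; pretend the second one failed its check.
backends = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
config = render_upstream("backends", backends, lambda s: s != "10.0.0.2:8080")
```

    The resulting text would be written to the include file only when it differs from what is already there, followed by a reload signal (HUP) to the nginx master process. It is still a workaround rather than a real healthcheck, but it keeps the messy part in one small script.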

  4. fak3r
    May 8th, 2008 at 06:21 | #4

    These are some very promising projects. I'm building some sites that will need to scale and be distributed globally, so I'm currently in the building-blocks phase, even though I've done some of this before. I've only recently gotten behind nginx, after a few years with Lighttpd, and would like to see its load balancing mature a bit, for the reasons you mention. There was an article in the online mag 03 (Issue 6, 2007) on how to use nginx to do global server load balancing by using multiple upstream farms that sounds pretty promising. As for reverse proxies, have you worked with Varnish, the caching reverse proxy ("HTTP accelerator", they call it)? While its load balancing is also still new (it uses round-robin, which can be combined with a simple weight method for preference), I have high hopes for it. I've put up some of my thoughts and configuration on my site, but overall I've been very impressed, especially by how it crushed Squid in my testing.

    I'm not sure how Varnish would work with memcached, which I used to have running, but I think those two, along with nginx, are going to be part of my future scaling efforts. I want to work with MogileFS soon as well - sounds v.cool. If anything uses distributed nodes that can become the master upon another node's failure, I'm all about it. Along those lines, I was thinking of a way I could use BitTorrent to do permanent file stores this way too: globally distributed, all mirrored automatically, with every node a tracker, seeder, client and server at the same time.

    Thanks for the info!

    fak3r
