Archive for the ‘PHP’ Category

Updates on the HTTP file upload front

November 26th, 2008

Right now I'm happy. Why's that? Because I have a neat Google Gears-based file uploader that uses hardly any code and has no special server or code requirements. It is *almost* everything I've been trying to accomplish with my behind-the-scenes efforts to scare up some development... and it came together so quickly it was scary.

Big props to Raghava @ work. He did the legwork of putting together the base code that makes Gears work properly, which was enough for me to start playing around with it. Between the two of us I think we'll be able to conquer the rest of it.

Okay, so what exactly was I hoping for?

  • No special server requirements - PHP is all that is needed, data comes via POST
  • Not one single huge request - yes, the files are split up in memory at runtime as needed into configurable-sized chunks
  • Large file support - yes, as large as any component allows, at least
  • Multiple file support - yes, uploads happen in a serial fashion
  • Friendly file selection UI - yes, it uses the OS's file browser like a normal file upload
  • Friendly web UI - yes - since it's in JavaScript, anything can be done
  • Transparent HTTP/HTTPS support - have not tested it on an SSL-enabled box, but I see no reason why it wouldn't work (it's basically like an XHR request)
  • Retransmission on failure - no, I believe the Gears developers need to implement additional settings/capabilities into the "httprequest" class - this is the biggest gap
  • Pause/resume support - no, I don't think this is supported quite yet in our implementation
  • Persistence - being able to resume later on after power outage, browser closing, etc. - no, but possibly an option if integrated into the Gears local database
  • Parallel processing (not required, might be neat) - possibly with a Gears worker pool, but not really a big enough deal to bother
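
To make the "configurable-sized chunks" item concrete, here is a rough sketch of how a file's byte ranges could be planned before uploading. It is written in modern JavaScript and is not our actual uploader code or anything from the Gears API; `planChunks` and `chunkSize` are names made up for illustration.

```javascript
// Sketch only: planChunks and chunkSize are illustrative assumptions,
// not part of the Gears API or the uploader described in this post.

// Split a file of `fileSize` bytes into consecutive byte ranges of at most
// `chunkSize` bytes. Each range is { index, start, end }, `end` exclusive.
function planChunks(fileSize, chunkSize) {
  if (!Number.isInteger(fileSize) || fileSize < 0) {
    throw new RangeError("fileSize must be a non-negative integer");
  }
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks = [];
  for (let start = 0, index = 0; start < fileSize; start += chunkSize, index++) {
    // The last chunk is simply whatever is left over.
    chunks.push({ index, start, end: Math.min(start + chunkSize, fileSize) });
  }
  return chunks;
}

// Example: a 2.5 MB file in 1 MB chunks yields three ranges.
const MB = 1024 * 1024;
const plan = planChunks(2.5 * MB, 1 * MB);
console.log(plan.length); // 3
```

Each range would then be sent as its own POST, one after another, which is what keeps any single request small.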

What we've got right now meets most of these needs. The biggest gap is re-transmission: making sure a failed chunk is attempted again. Even without that this is still very cool, but automatic re-transmission would be one of the major benefits this exercise could offer.
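
For what it's worth, the re-transmission logic itself is not complicated once the transport reports failures reliably. A minimal sketch in modern JavaScript, assuming a hypothetical caller-supplied `sendChunk` function that returns a Promise (this is not a Gears API, and not what we have running):

```javascript
// Sketch only: sendChunk is a hypothetical function supplied by the caller.
// It should return a Promise that rejects when the transfer fails.

// Call sendChunk(chunk, attempt) until it succeeds or maxAttempts is used up.
async function sendWithRetry(sendChunk, chunk, maxAttempts) {
  if (!Number.isInteger(maxAttempts) || maxAttempts < 1) {
    throw new RangeError("maxAttempts must be a positive integer");
  }
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await sendChunk(chunk, attempt);
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError; // every attempt failed
}

// Upload chunks one at a time (serially), retrying each one on failure.
async function uploadSerially(chunks, sendChunk, maxAttempts = 3) {
  const results = [];
  for (const chunk of chunks) {
    results.push(await sendWithRetry(sendChunk, chunk, maxAttempts));
  }
  return results;
}
```

A real version would probably also want a delay between attempts and a way to tell a dropped connection apart from a server-side rejection.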

Perhaps soon I will post some code. I'll want to ask Raghava if he cares first. Maybe some JavaScript/Gears gurus could even clean it up or add more functionality.

Categories: PHP

Open source projects currently taking my interest

April 13th, 2008

For a long time I've been using Lighttpd (as an alternative to Apache), Ubuntu, MySQL and PHP.

It looks like I might be mixing it up soon, swapping out some pieces and adding a couple of extras. Some of these I've been able to mess around with; others I am excited about but have not yet had the chance to try...

PHP-FPM

PHP-FPM is just a patch for PHP, and a well-received one. I consider it important enough to showcase here. It matures the FastCGI SAPI and adds a couple of performance enhancements, graceful reloading, Apache-like process management for adaptive process spawning, and a much-needed suexec-ish capability which will save me many headaches. It will hopefully be merged into PHP core when Andrei considers it "feature complete" and changes the license to be compatible. I can't wait. In the meantime it can already be used, and I've installed it without an issue on all my boxes - some serving up over a million PHP requests per day under different uid/gids. (Note: use Google Translate for the Russian URLs on the site above; the docs have not been translated to English yet. Google does a good job on it and it reads just fine.)

CouchDB

Throwing out all the concepts of structured databases and building a new data store from the ground up with interoperability and scalability in mind? You've got me sold. It seems a lot of people are nervous about scaling MySQL (with good reason) and CouchDB might be a good alternative. With RESTful URLs for everything and JSON as a lightweight (compared to SOAP/XML/etc.) transport format, it seems like we'll have plenty of options and usage models. I think I also heard in a presentation that it will support files of any size, which potentially means it could be used not only as a possible RDBMS replacement (even though it says it isn't one, I'm sure plenty of apps could use it that way), but also as a distributed document storage system (which it might already be considered). Added bonuses: designed for high traffic, supports disconnected computing, self-healing replication, optimistic locking... I can't wait to play with this.
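
To give a feel for the "RESTful URLs plus JSON" idea, here is a small sketch that only builds the description of a document write without sending anything. CouchDB stores a document with a PUT to `/database/document-id` and uses a `_rev` field in the body for optimistic locking; the helper name `buildPutRequest`, the `photos` database, the document ID and the revision string are placeholders I made up.

```javascript
// Sketch only: buildPutRequest, the database name, the document ID and the
// revision value are illustrative placeholders. Nothing is sent over the wire.

// Describe the HTTP request that would create or update one document.
// Pass `rev` when updating, so the server can reject the write if someone
// else changed the document first (optimistic locking).
function buildPutRequest(baseUrl, db, docId, doc, rev) {
  const body = rev === undefined ? { ...doc } : { ...doc, _rev: rev };
  return {
    method: "PUT",
    url: `${baseUrl}/${encodeURIComponent(db)}/${encodeURIComponent(docId)}`,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// 5984 is CouchDB's default port; everything else here is made up.
const req = buildPutRequest(
  "http://localhost:5984", "photos", "user-42", { title: "Sunset" }, "1-abc"
);
console.log(req.url); // http://localhost:5984/photos/user-42
```

If the `_rev` sent does not match the document's current revision, the server answers with a conflict instead of silently overwriting, which is the whole point of the optimistic locking mentioned above.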

MogileFS

A distributed document storage system. I've thought about trying this out in the past, but I was mainly looking for a drop-in replacement for standard file storage. MogileFS may have some wacky method to do that via FUSE someday, but in the meantime, it can be leveraged for all application-based file storage, which I'd say is 95%+ of the files I deal with. Just like CouchDB, it leverages standards for communication: WebDAV for actual file access, and a simple text-based socket protocol which can be used from any programming language that supports sockets. Currently I am successfully running it with nginx serving up the WebDAV portion as opposed to the standard Perl-based webserver. It was too easy. I definitely plan on trying to leverage this on xMike for all of the image uploads and other user assets. I like how it doesn't require any special low-level support - it simply spreads files over N hosts and uses a tracker to determine which host(s) have which file(s) - and includes replication management so a broken node does not mean a broken file.
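
The tracker idea is easy to picture with a toy model. To be clear, the following is not the MogileFS protocol or API, just a few lines of JavaScript (with invented names like `ToyTracker`) showing why keeping more than one copy means a dead host does not take the file with it.

```javascript
// Toy model only: this is NOT the MogileFS protocol or API. It just shows
// a tracker mapping keys to several hosts so one dead host loses nothing.
class ToyTracker {
  constructor(hosts, copies) {
    if (copies < 1 || copies > hosts.length) {
      throw new RangeError("copies must be between 1 and the number of hosts");
    }
    this.hosts = hosts.slice();
    this.copies = copies;
    this.locations = new Map(); // key -> array of hosts holding a copy
    this.down = new Set();      // hosts currently considered broken
    this.next = 0;              // round-robin starting point
  }

  // Record `copies` distinct hosts for this key, rotating the start host.
  store(key) {
    const chosen = [];
    for (let i = 0; i < this.copies; i++) {
      chosen.push(this.hosts[(this.next + i) % this.hosts.length]);
    }
    this.next = (this.next + 1) % this.hosts.length;
    this.locations.set(key, chosen);
    return chosen;
  }

  markDown(host) {
    this.down.add(host);
  }

  // Return only the hosts that hold the key and are still alive.
  locate(key) {
    return (this.locations.get(key) || []).filter((h) => !this.down.has(h));
  }
}

const tracker = new ToyTracker(["store1", "store2", "store3"], 2);
tracker.store("avatar:42");               // placed on store1 and store2
tracker.markDown("store1");               // simulate a broken node
console.log(tracker.locate("avatar:42")); // [ 'store2' ]
```

A real system also has to notice the lost copy and re-replicate it somewhere else, which is the replication management part that this toy leaves out.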

nginx

Powering a handful of extremely busy sites, nginx is the tiny webserver you may or may not have heard of. Every account of it I have read has done nothing but rave about it; I'm in the process of converting all the servers I manage over to it. With complaints about memory leaks in Lighttpd and bloat in Apache, I think it's prime time for nginx to get more attention as a viable option. It's still "beta" but what isn't nowadays? It's been running for over 2.5 years in production on the main site it was developed for and I'm sure many others. The configuration file syntax is extremely simple. It has a couple of neat little additions, like the built-in 1x1 transparent GIF (for all those webbugs and spacer images), which is served directly from memory so you no longer have to host the file yourself. While that's a little bit off the basic needs of a webserver, it seems extremely useful to someone who has had to deal with those for years. Anyway, don't let the old "Mother Russia" style logo on the English wiki scare you. It's worth a shot, and could even replace Pound, other reverse proxies and Layer 7 capable load balancing solutions. I'm sure someone might even be able to write a replacement for Squid and Varnish using it too, by enhancing the proxy module to save local cached copies of the content...
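
As an example of how simple the configuration syntax is, this is roughly all it takes to serve the in-memory 1x1 GIF. The `empty_gif` directive is the real one from nginx's empty GIF module; the `/_.gif` path is just a name I picked, and the fragment is assumed to sit inside an existing `server` block.

```nginx
# Assumed path: any location name works. Goes inside a server { ... } block.
location = /_.gif {
    empty_gif;   # respond with a 1x1 transparent GIF straight from memory
}
```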

I'm sure I might be missing a couple. It's getting late. I'd add memcached to the list, but I'm already using it. It's no longer "taking my interest" as I've been able to fully integrate it now 🙂

It's interesting to note that all of these products (especially MogileFS and CouchDB) are capable of being distributed, allow for transparent node failures, and were designed to be run on unreliable/commodity hardware. nginx basically does the same as well, since it is a web node (and I run the FastCGI processes on them too) and can already scale horizontally. It does kind of make me wonder, though, if I am slowly becoming deprecated by fully automated cloud computing-based solutions (like RightScale, 3Tera, Mosso, Elastra and even the DIY Scalr).

Categories: nginx, PHP, Software