Archive for the ‘PHP’ Category

Jérôme Loyet is a saint!

December 15th, 2009 mike No comments

I'd like to announce that Jérôme Loyet stepped up this weekend and hacked up the first round of code to get dynamic process management going. Antony committed it (see below) and it's on its way to being part of PHP core as well. So two major events in only a week and a half or so!

See http://news.php.net/php.internals/46414.

Good job Jérôme. Where were you a couple months ago, buddy? :)

Categories: PHP, PHP-FPM Tags:

PHP-FPM brought in to PHP core - interesting surprise

December 5th, 2009 mike No comments

Read it here: http://news.php.net/php.internals/46277.

First off, big thanks to Antony Dovgal. I've exchanged words with him in the past about PHP-FPM (and actually other PHP things) but was completely unaware he was working on this.

So, we've got a blessing but also an interesting dilemma on our hands. We've got a wishlist and some bugs to work out. I have a feeling that if Antony updated some of the CGI internals, it may have resolved some of those bugs. Not sure. I'm trying to get some specifics now: what version of PHP-FPM he brought in, how the community can still support it, how difficult it may be to submit patches, and whether he thinks a separate management daemon makes more sense than keeping it glued inside of the SAPI. (It seems out of place to me for a SAPI to require a proprietary configuration file and daemon .pid, log, etc. files...)

Hopefully I can get ahold of him soon and discuss some of this. I was already mid-discussion with another PHP core developer about what they think the best approach would be for getting PHP-FPM aligned with PHP core (they leaned more towards a separate SAPI too.)

My main goal is to make it as easy as it can be for PHP-FPM to become an official package or be included with PHP, so that people who use PHP from repositories on their favorite distributions and such can enjoy the benefits of PHP-FPM without patches or separate downloads. If the management portion does split off, I fully intend on making sure it is aligned properly and is as simple as "apt-get install php5-fpm" or something of that nature. Still easily installed and everything.

Anyway, we'll see how things go. This caught me off guard and now I have to figure out where we are with development. Jérôme Loyet has expressed interest in trying to convert the configuration file to nginx style - something Andrei had told me he had wanted to do. The XML throws some people off, thinking it's an actual XML parsed document with XML include support and such... Also, if done right, this will allow PHP-FPM's configuration to support includes, and who knows, maybe variables some day. But for now it would be a lot cleaner to read, and it seems the majority of PHP-FPM users are nginx users already anyway :)

Categories: PHP, PHP-FPM Tags:

Who ever said open source software was perfect?

December 14th, 2008 mike No comments

Typically, updates to open source packages work without a hitch. However, my upgrade last weekend on my servers from Ubuntu Hardy to Intrepid wound up creating a couple major headaches, and at the same time, I noticed a handful of other snafus happening to open source packages I use daily.

This resulted in server instability, client annoyance, and 20-30 solid hours of trial-and-error compiling, testing, debugging, etc. Even right now, if I forget to hold back the libgpac-dev package from being updated, all videos being converted lose their sound due to MP4Box crashing.

Categories: PHP, Software, nginx Tags:

Updates on the HTTP file upload front, part 2

December 3rd, 2008 mike 2 comments

Continuing from http://michaelshadle.com/2008/11/26/updates-on-the-http-file-upload-front/ ...

I've been doing some research and more hacking. Code should find its way out there sometime soon. Here are my notes since the last installment of "As The File Upload World Turns":

  • Figured out the appropriate nginx configuration so there is no buffering to disk of the request.
  • Did a rudimentary test of browser memory while uploading a 220 MB file - it did not appear to use much more than normal (which leads me to believe that Gears is slicing the file up efficiently enough and only grabs the bytes it needs)
  • HTTP authentication will not work. From what I could understand from the code and some random comments, authentication information is not supported by Gears' XHR object.
  • Pause/resume is possible; I've changed the PHP server side piece to accept the byte offset Gears tells it to start at; originally was having issues until I determined why it wasn't fseek()'ing properly :) However this still requires a more stateful approach on the client side. Will probably have to implement a local Gears database and possibly a worker pool setup. This will allow for persistence and other neat things.
  • I've got a decently functioning JavaScript UI which seems to calculate out the average speed, estimated time remaining, etc.
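
For reference, the offset-based write on the PHP side can be sketched roughly like this. It is a minimal sketch, not the actual upload handler: write_chunk() and its parameters are made-up names, and it assumes the client sends the starting byte offset along with each chunk. One well-known pitfall (not necessarily the one hit here) is opening the file in append mode ('a'), where writes always go to the end of the file no matter what fseek() says; 'c+' avoids that without truncating the file.

```php
<?php
// Hypothetical helper (name and signature are assumptions, not the real handler):
// write one uploaded chunk into $path starting at byte $offset and return the
// offset the next chunk should start at.
function write_chunk(string $path, int $offset, string $data): int
{
    // 'c+' = read/write, create if missing, do NOT truncate, pointer at start.
    $fh = fopen($path, 'c+');
    if ($fh === false) {
        throw new RuntimeException("cannot open $path");
    }
    try {
        // fseek() returns 0 on success and -1 on failure.
        if (fseek($fh, $offset) !== 0) {
            throw new RuntimeException("cannot seek to offset $offset");
        }
        $written = fwrite($fh, $data);
        if ($written === false) {
            throw new RuntimeException("cannot write to $path");
        }
        return $offset + $written;
    } finally {
        fclose($fh);
    }
}
```

Returning the next offset gives the client something to compare against when it resumes, so both sides can agree on where the upload left off.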

The only thing missing is a better attempt to see if Gears will retry the upload on a failure. I believe it is possible when dealing with a worker pool but this is -very- basic XHR usage at the moment. Perhaps since it is JavaScript-based we can add in our own re-transmission code. That's the next piece I'm going to mess around with.

Stay tuned for the results... (and code, most likely!)

Categories: PHP Tags:

Quick code snippet: normalizing a URI (for friendly URLs, etc.)

December 3rd, 2008 mike 1 comment

When you enter the realm of "friendly URLs", "slugs", "nice names" or whatever else you call them, it can make everything a lot better looking. However, if done incorrectly, you can get some duplicate indexed pages and the like. I couldn't sleep and wanted to try approaching this again a different way, and every URI I've thrown at it comes out how I want it.

Why does this matter? Typically, without rewrites and using normal webserver, directory and file semantics, a request for "/foo" should make the webserver bounce you to "/foo/" - but when dealing with rewritten URLs, there is no enforcement of this behavior. A lot of the time (at least with the stuff I'm currently dealing with) the same page shows up with "/foo" or "/foo/" and both are considered unique to a search engine. It's duplication of data which violates the normalization devil in me! Even worse, certain apps might not even process the request the same. "/foo" could load one page, and "/foo/" could load another, or an error. That's worse; when people send URLs out, sometimes they take artistic license with what they look like. This is to thwart all that and force search engines, users, etc. to all view the same URL. I chose to enforce the URL structure ending with "/" as I think it helps establish that "final" signoff of the non-query string portion of the URI.

You could go about this different ways; however, due to the way I have my nginx rewrites done, I can't rely on $_SERVER['QUERY_STRING'] which is how I had originally written it - and I wondered why I was getting some weird behavior. Now I realize I need the function to handle any string that is passed to it, and then I can make this behave appropriately.

Below is the function and some sample code to run it. There are probably a few opcodes that could be saved in trade for a little bit of memory by calculating the length of the URI, the position of the "?" if there is one, etc. However, I really hate defining variables that get used only once (a major pet peeve is when people create a new variable for absolutely no reason; this one would at least save a couple CPU cycles...) - so this is my least-amount-of-code-possible version. Enjoy.

# some example URIs
$uris = array(
   "/bar/",
   "/bar",
   "/bar/ee",
   "/bar/index.html",
   "/bar/index.php",
   "/bar/index.php?fds",
   "/bar/index.php?f=bar&fbahd=3",
   "/bar/index.php?http://www.foo.com",
   "/bar/index.php/bark",
   "/bar/index.php!meow|fJG)*#)$*J:g",
   "?somehow",
   ""
);

foreach($uris as $uri) {
   echo $uri." => ".normalize_uri($uri)."\n";
}

function normalize_uri($uri) {
   # if there is a query string, we want to chop it off and put it aside
   # (strpos() !== false, so a '?' at position 0 still counts)
   if(strpos($uri, '?') !== false) {
      $query = substr($uri, strpos($uri, '?'));
      $uri = substr($uri, 0, strpos($uri, '?'));
   }

   # scrub any index.* stuff off the end: the dot is escaped, the extension
   # may be up to 4 characters (so index.html is covered), and "index" must
   # start a path segment (so "/myindex.php" is left alone)
   $uri = preg_replace('#(^|/)index\.\w{0,4}$#', '$1', $uri);

   # if it doesn't end with a '/', then add one
   if(substr($uri, -1) !== '/') {
      $uri .= '/';
   }

   # finally, put the query string back on
   if(!empty($query)) {
      $uri .= $query;
   }

   return $uri;
}

You'd tie this in with something like:

header('Location: http://'.$_SERVER['HTTP_HOST'].$uri, true, 301);
exit();

That makes sure it redirects with a 301 (search engine friendly) header. Don't hardcode the scheme, though; use https or http depending on what your site uses. Something like this should work:

if(isset($_SERVER['HTTPS']) && strtolower($_SERVER['HTTPS']) == 'on') {
   $scheme = 'https://';
} else {
   $scheme = 'http://';
}

This is how I would throw it all together:

# fill in $_SERVER['REQUEST_URI'] here with whatever is holding the original URI
if(isset($_SERVER['REQUEST_URI'])) {
   $normalized = normalize_uri($_SERVER['REQUEST_URI']);
   # redirect only if normalizing actually changed something; checking just
   # the last character would redirect forever on URIs like "/foo/?a=1"
   if($normalized !== $_SERVER['REQUEST_URI']) {
      if(isset($_SERVER['HTTPS']) && strtolower($_SERVER['HTTPS']) == 'on') {
         $url = 'https://';
      } else {
         $url = 'http://';
      }
      $url .= $_SERVER['HTTP_HOST'].$normalized;
      header('Location: '.$url, true, 301);
      exit();
   }
}

Now, it is 3:30am and I am trying to compose this inside of WordPress, but I believe that will work.

I apologize - for some reason the indentation is not showing up... remind me to add in that neat code sample plugin soon.

Categories: PHP Tags:

Updates on the HTTP file upload front

November 26th, 2008 mike No comments

Right now I'm happy. Why's that? Because I have a neat Google Gears-based file uploader that uses hardly any code and no special server requirements or code requirements. It is *almost* everything I've been trying to accomplish with my behind-the-scenes type efforts to scare up some development... and this was accomplished so quickly it was scary.

Big props to Raghava @ work. He did the legwork of putting together the base code that makes Gears work properly and enough for me to play around with. Between the two of us I think we'll be able to conquer the rest of it.

Okay, so what exactly was I hoping for?

  • No special server requirements - PHP is all that is needed, data comes via POST
  • Not one single huge request - yes, the files are split up in memory at runtime as needed into configurable-sized chunks
  • Large file support - yes, as large as any component allows, at least
  • Multiple file support - yes, uploads happen in a serial fashion
  • Friendly file selection UI - yes, it uses the OS's file browser like a normal file upload
  • Friendly web UI - yes - since it's in JavaScript, anything can be done
  • Transparent HTTP/HTTPS support - have not tested it on an SSL-enabled box, but I see no reason why it wouldn't work (it's basically like an XHR request)
  • Retransmission on failure - no, I believe the Gears developers need to implement additional settings/capabilities into the "httprequest" class - this is the biggest gap
  • Pause/resume support - no, I don't think this is supported quite yet in our implementation
  • Persistence - being able to resume later on after power outage, browser closing, etc. - no, but possibly an option if integrated into the Gears local database
  • Parallel processing (not required, might be neat) - possibly with a Gears worker pool, but not really a big enough deal to bother

Right now what we've got pretty much meets most of the needs. The biggest gap here is making sure it attempts re-transmission. Without that, this is still very cool, but that will be one of the major benefits this exercise could offer.
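
The chunking idea from the list above boils down to simple offset arithmetic. The real slicing happens client-side in Gears; the sketch below only illustrates the math in PHP, and chunk_ranges() is a hypothetical name, not part of Gears or of our uploader.

```php
<?php
// Hypothetical illustration: compute the (offset, length) pairs that a file of
// $fileSize bytes would be split into, given a configurable $chunkSize.
function chunk_ranges(int $fileSize, int $chunkSize): array
{
    if ($chunkSize < 1) {
        throw new InvalidArgumentException('chunk size must be at least 1 byte');
    }
    $ranges = [];
    for ($offset = 0; $offset < $fileSize; $offset += $chunkSize) {
        $ranges[] = [
            'offset' => $offset,
            // the last chunk may be shorter than the others
            'length' => min($chunkSize, $fileSize - $offset),
        ];
    }
    return $ranges;
}
```

Each pair maps to one POST, which is also what would make re-transmission and pause/resume tractable: only the failed or missing range needs to be sent again.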

Perhaps soon I will post some code. I'll want to ask Raghava if he cares first. Maybe some JavaScript/Gears gurus could even clean it up or add more functionality.

Categories: PHP Tags:

Open source projects currently taking my interest

April 13th, 2008 mike 4 comments

For a long time I've been using Lighttpd (as an alternative to Apache), Ubuntu, MySQL and PHP.

It looks like I might be mixing it up soon, swapping out some and adding a couple extras. Some of these I've been able to mess around with, others I am excited but have not yet had the chance...

PHP-FPM

PHP-FPM is just a patch for PHP, and a well-received one. I consider it important enough to showcase here. It matures the FastCGI SAPI and adds a couple of performance enhancements, graceful reloading, Apache-like process management for adaptive process spawning, and a much needed suexec-ish capability which will save me many headaches. It will hopefully be merged into PHP core when Andrei considers it "feature complete" and changes the license to be compatible. I can't wait. Right now it can still be used, and I've installed it without an issue on all my boxes - some serving up over a million PHP requests per day under different uid/gids. (Note: use Google Translate for the Russian URLs on the site above, the docs have not been translated to English yet. Google does a good job on it and it reads just fine.)

CouchDB

Throwing out all the concepts of structured databases and building a new system from the ground up with interoperability and scalability in mind as a data store? You've got me sold. It seems a lot of people are nervous about scaling MySQL (with good reason) and CouchDB might be a good alternative. Using RESTful URLs for everything and JSON as a lightweight (compared to SOAP/XML/etc.) transport language, it seems like we'll have plenty of options and usage models. I think I heard in a presentation as well that it will support files of any size, which potentially means it could be used not only as a possible RDBMS replacement (even though it says it isn't, I'm sure plenty of apps could use it), but also as a distributed document storage system (which it might already be considered.) Added bonuses: designed for high traffic, supports disconnected computing, self-healing replication, optimistic locking... I can't wait to play with this.

MogileFS

A distributed document storage system. I've thought about trying this out in the past, but I was mainly looking for a drop-in replacement for standard file storage. MogileFS may have some wacky method to do that via FUSE someday, but in the meantime, it can be leveraged for all application-based file storage, which I'd say is 95%+ of the files I deal with. Just like CouchDB, it leverages standards for communication (WebDAV for actual file access) and simple text-based socket communication which can be used from any programming language that supports a socket. Currently I am successfully running it with nginx serving up the WebDAV portion as opposed to the standard Perl-based webserver. It was too easy. I most definitely plan on trying to leverage this on xMike for all of the image uploads and other user assets. I like how it doesn't require any special low level support - it simply spreads files over N hosts and uses a tracker to determine which host(s) have which file(s) - and includes replication management so a broken node does not mean a broken file.

nginx

Powering a handful of extremely busy sites, nginx is the tiny webserver you may or may not have heard of. Every account of it I have read has done nothing but rave about it; I'm in the process of converting all the servers I manage over to it. With complaints about memory leaks in Lighttpd and Apache being bloated, I think it's prime time for nginx to get more attention as a viable option. It's still "beta" but what isn't nowadays? It's been running for over 2.5 years in production on the main site it was developed for and I'm sure many others. The configuration file syntax is extremely simple. It has a couple neat little additions, like the built-into-memory 1x1 transparent gif support (for all those webbugs and spacer images) so you no longer have to host it yourself and it serves it directly from memory. While that's a little bit off the basic needs of a webserver, that seems extremely useful as someone who has had to deal with those for years. Anyway, don't let the old "Mother Russia" style logo on the English wiki scare you. It's worth a shot, and could even replace Pound, other reverse proxies and Layer 7 capable load balancing solutions. I'm sure someone might even be able to write a replacement for Squid and Varnish using it too, by enhancing the proxy module to save local cached copies of the content...

I'm sure I might be missing a couple. It's getting late. I'd add memcached to the list, but I'm already using it. It's no longer "taking my interest" as I've been able to fully integrate it now :)

Interesting to note that all these products (especially MogileFS and CouchDB) are capable of being distributed, allow for transparent node failures, and were designed to be run on unreliable/commodity hardware. nginx basically does the same as well, since it is a web node (and I run the FastCGI processes on them too) and can scale horizontally already. It does kind of make me wonder, though, if I am slowly becoming deprecated by fully automated cloud computing-based solutions (like RightScale, 3Tera, Mosso, Elastra and even DIY Scalr).