Archive for the ‘PHP’ Category

How would I change PHP?

September 22nd, 2010 3 comments

Anyone who knows me knows I am a PHP fanboy. I use PHP for everything - web applications, web scraping, batch scripting, if there is an itch that software can fix, I try to scratch it with PHP. I dreamed of a PHP scripting plugin for Eggdrop IRC bots, so I didn't have to fuss with TCL. Anywhere PHP could be adopted, I've hoped someone was working on a way it could be.

However, if you talk to people who know the internals of PHP they'll tell you there's a lot of ugly stuff in there. That it's a language based on macros, etc. I don't necessarily care about that. My experience is from a user perspective, not an internals one. That being said, just from my higher level interaction with the language, these are some of the things I'd love to change.

  • Make function name conventions consistent. Some functions have underscores, some don't. strpos vs. str_replace, html_entity_decode vs. htmlentities, etc.
  • Make argument order consistent for similar types of functions. Depending on what you're doing, it's one or the other. in_array($needle, $haystack) vs. strstr($haystack, $needle), etc.
  • Optimize the core. Strip the core down more, and push more things into modules. Enable some of them by default, fine. But when it comes down to it, I don't need easily 30-40% of the functions that PHP has built in.
  • Combine similar functions and use arguments to define the behavior. For example addslashes() and addcslashes(). Make it one function with a constant to define its behavior.
  • Disable magic quotes (preferred) or enable it and don't give any option to change it. As far as I'm concerned as long as you pick one route, you can guarantee universal compatibility, whether that means using magic quotes, or not using them and expecting developers to understand input sanitization, sanity checking/type checking/all that jazz. Which I don't think is a bad thing.
  • Implement a "strict" mode. "PHP is lazy" as Rasmus says which is fine and all, but I don't like the PHP name shamed with terms like "insecure" - any code can be insecure in any language, however, PHP is so easy to pick up and get things going that it makes it too easy to write crappy and insecure code. Specifics on a "strict mode"? I've got none. It's late and I can't think of how I would enforce better coding practices in core...
  • Get rid of $_REQUEST. I've advocated this for years and even unset($_REQUEST) in my code. To me it's a lazy person's workaround for coding and introduces some of the same vectors that were closed when disabling register_globals. If you -really- want to have a $_REQUEST type mechanism in your code, just array_merge($_GET, $_POST, $_COOKIE, etc) in whatever order you want. I dislike using software that uses $_REQUEST by default but doesn't actually need the flexibility of POST vs. GET vs. COOKIE and such. Know which input stream your data is coming from; if nothing else, it will at least make replay attacks and such much harder for people to craft.
  • Get rid of objects and OO stuff. Yeah, I said it. Everyone loves OOP. Why? While I see the power of being able to extend classes, I also see it being the most troublesome when it comes to compatibility checks; all the APC crashing and odd bugs I've suffered from were due to it. If you look at something like Drupal, they've figured out how to extend or override using procedural code quite well. Sadly, even they're converting more things to OO as well. IMO, OOP is more suited for longer-running applications, perhaps something event driven where a new object to represent a connection is created (however, C's been doing this without dealing with objects forever, it doesn't HAVE to be OO...) Those are the two main examples I see for using OO. Disclaimer: I wasn't raised in an OO environment, this is all based on personal experience and preference. 🙂
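
For what it's worth, the array_merge() idea from the $_REQUEST bullet might look something like this. It's only a sketch: build_request() is a made-up helper name, the sample values are invented, and the merge order shown (later sources win) is just one possible choice.

```php
<?php
// Hypothetical helper (not part of PHP): build a $_REQUEST-style array
// by hand, so the precedence between input sources is an explicit decision.
function build_request(array $get, array $post, array $cookie) {
    // array_merge(): for string keys, later arrays overwrite earlier ones,
    // so here POST beats GET and COOKIE beats both.
    // Caveat: integer keys are renumbered and appended, not overwritten.
    return array_merge($get, $post, $cookie);
}

// In a real script you would pass $_GET, $_POST and $_COOKIE.
$input = build_request(
    array('id' => '1', 'page' => '2'),  // stand-in for $_GET
    array('id' => '42'),                // stand-in for $_POST
    array()                             // stand-in for $_COOKIE
);
```

Here $input['id'] comes from the POST stand-in, because it was merged after GET.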

I've memorized the function list for what I use pretty well (like I said, I probably only use a subset of the functions in PHP) however the most annoying thing is when it comes to the needle vs. haystack argument positioning. I usually have to reference for it. Sometimes I can trial and error though. In an ideal world, I wouldn't have to.
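
To make the needle vs. haystack gripe concrete, here is a tiny illustration using only built-in functions. The sample values are made up.

```php
<?php
$haystack_array  = array('apple', 'banana');
$haystack_string = 'apple banana';

// Array functions: needle first, haystack second.
$found_in_array = in_array('banana', $haystack_array);

// String functions: haystack first, needle second.
$found_in_string = strstr($haystack_string, 'banana'); // the match and everything after it
$position        = strpos($haystack_string, 'banana'); // zero-based offset of the match
```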

It would be great if something like PHP 6.0 would adopt some of these practices, since it is a major version change. Perl, Ruby and Python I believe have all done similar things where a major change really was a dramatic change and required conversion of code to meet its new requirements.

I'm sure this list could grow, and I may add to it. Who knows.

Categories: Development, PHP

Cleanest configuration for the new PHP-FPM?

August 26th, 2010 9 comments

When examining the PHP-FPM configuration, I realized that I only tweak a few key pieces per pool, so I decided to share my approach to minimize redundancy and keep things simple (more files, but simpler to manage).

Because I still have to maintain PHP 5.2.x for clients, I have decided to try building everything self-contained in an /opt/php53 directory. So consider that my $prefix, and change appropriately.

I have the following setup, and it seems to work great so far.


log_level = notice
error_log = /opt/php53/var/log/php-fpm.log
pid = /opt/php53/var/run/
emergency_restart_threshold = 10
emergency_restart_interval = 1m
process_control_timeout = 5s
daemonize = yes

; pools
include = /opt/php53/etc/fpm.d/pools/*.conf

One file per pool, for example, a pool named "mike" -


listen =
user = mike
group = mike
request_slowlog_timeout = 5s
slowlog = /opt/php53/var/log/slowlog-mike.log
pm.max_children = 5
pm.start_servers = 3
pm.min_spare_servers = 2
pm.max_spare_servers = 4
pm.max_requests = 500
include = /opt/php53/etc/fpm.d/common.conf

Common elements for each pool (if these could be inherited globally, which they MIGHT be, I could just toss them in the main php-fpm.conf. Perhaps a feature request. Will post on the mailing list...)

Remember that rlimit_files needs to be something set in your sysctl.conf or on the system level or you'll get that RLIMIT_NOFILE warning. Also, depending on how you want to limit resources per pool/client, you may want to tweak things, such as request_terminate_timeout.


listen.backlog = -1
listen.allowed_clients =
pm = dynamic
pm.status_path = /status
ping.path = /ping
ping.response = pong
request_terminate_timeout = 120s
rlimit_files = 131072
rlimit_core = unlimited
catch_workers_output = yes
env[PATH] = /bin:/usr/bin:/usr/local/bin:/usr/local/sbin:/sbin:/usr/sbin:/opt/php53/bin:/opt/php53/sbin
env[TMP] = /tmp
env[TMPDIR] = /tmp
env[TEMP] = /tmp

As always, YMMV.

Categories: PHP, PHP-FPM

The magic that is... nginx

August 18th, 2010 2 comments

I've told this story countless times, but I've never publicly documented it. This happened over a year ago, but I feel obligated to share it. nginx is the main reason for the success and deserves bragging rights.

We had an upcoming product release, and another team was handling all the communication, content, etc. I was simply the "managed server" administrator type person. I helped set up nginx and PHP-FPM, configure packages on the server, stuff like that.

The other team managed the web applications and such. I had no clue what we would be in for that day.

There is a phrase, "too many eggs in one basket" - well, this was our first system and we were using it for many things, it hadn't gotten to the point where we needed to split it up or worry about scaling. Until this day, of course.

To me, the planning could have been a bit better, with the ISO image we were providing from the origin being staged on other mirrors, using torrents to offload traffic to P2P, etc. However, that wasn't done proactively, only reactively.

The story

Launch day occurs. I get an email about someone unable to SSH to a specific IP. I check it out - the IP is unreachable, but the server is up. I submit a ticket to the provider, and they let me know why - apparently the server had triggered their DDoS mitigation system once it hit something like ~500mbit; it was automatically flagged as probably being under attack, and the IP was blackholed.

Once we informed them this was all legit, they instantly worked to put us onto a switch that was not DDoS protected, and we resumed taking traffic. This was all done within 15 or 20 minutes, if I recall. I've never seen anything so smoothly handled - no IP changes, completely transparent to us.

I believe the pocket of no traffic (seen in the graph below) was when we were moved off their normal network monitoring and into a Cisco Guard setup. We were definitely caught off guard; like I said, I knew we'd get some traffic, but not filling the entire gigabit port. Not a lot of providers would be so flexible about this and handle it with such grace. There were some reports of it being slow, and that was literally because the server itself had too much going on. PHP was trying to handle all the Drupal traffic, and during the night, the disk I/O was at 100% for a long period of time. Oh yeah - since this was the origin, the servers mirroring us had to start pulling stuff from us too 🙂

Luckily we're billed by the gigabyte there, not by 95th like most places, or this would be one heck of a hosting bill. We wound up able to reroute the bandwidth fast enough to not even be charged ANY overage for it!

All in all, without nginx in the mix, I doubt this server would have been able to take the pounding. There was no reverse proxy cache, no Varnish, no Squid, nothing of that nature. I am not sure Drupal was even set up to use any memory caching, and I don't believe memcached was available. There were a LOT of options to reduce load - the main one was just cutting down the bandwidth usage to open up space in the pipe, which was eventually done by removing the ISO file from the origin server and pushing it to a couple of mirror sites. Things calmed down then.

However, it attests to how amazingly efficient nginx is - the entire experience wound up taking only 60-something megabytes of RAM for nginx.

Want the details? See below.

The hardware

  • Intel Xeon 3220 (single CPU, quad-core)
  • 4GB RAM
  • single 7200RPM 250GB SATA disk, no RAID, no LVM, nothing
  • Gigabit networking to the rack, with a 10GbE uplink

The software - to the best of my memory (remember, all on a single box!)

  • nginx (probably 0.7.x at that point)
    • proxying PHP requests to PHP-FPM with PHP 5.2.x (using the patch)
      • Drupal 6.x - as far as I know, no advanced caching, no memcached, *possibly* APC
    • proxying certain hosts for CGI requests to Apache 2.x (not sure if it was 2.2.x or 2.0.x)
    • This server was also a yum repo for the project, serving through nginx
  • PHP-FPM - for Drupal, possibly a couple other apps
  • Postfix - the best MTA out there 🙂
    • I believe amavisd/clamav/etc. was involved for mail scanning
    • Integration with mailman, of course
  • MySQL - 5.0.x, using MyISAM tables by default, I don't believe things were converted to InnoDB
  • rsync - mirrors were pulling using the rsync protocol

The provider

  • SoftLayer - they just rock. Not a paid placement. 🙂

The stats

nginx memory usage
During some of that time... only needed 60 megs of physical RAM. 240 megs including virtual. At 2200+ concurrent connections... eat that, Apache.

root     13023  0.0  0.0  39684  2144 ?        Ss   03:56   0:00 nginx: master process /usr/sbin/nginx
www-data 13024  2.0  0.3  50148 14464 ?        D    03:56   9:30 nginx: worker process
www-data 13025  1.1  0.3  51052 15256 ?        D    03:56   5:38 nginx: worker process
www-data 13026  1.3  0.3  50760 15076 ?        D    03:56   6:13 nginx: worker process
www-data 13027  1.3  0.3  50584 14900 ?        D    03:56   6:22 nginx: worker process
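
Those totals can be double-checked by summing the VSZ and RSS columns of the ps output above (both are in kilobytes). A quick sketch, with the numbers copied by hand from that listing:

```php
<?php
// VSZ and RSS in KB, copied from the ps output (master + 4 workers).
$procs = array(
    array('vsz' => 39684, 'rss' => 2144),
    array('vsz' => 50148, 'rss' => 14464),
    array('vsz' => 51052, 'rss' => 15256),
    array('vsz' => 50760, 'rss' => 15076),
    array('vsz' => 50584, 'rss' => 14900),
);

$vsz_kb = 0;
$rss_kb = 0;
foreach ($procs as $p) {
    $vsz_kb += $p['vsz'];
    $rss_kb += $p['rss'];
}

// Convert KB to MB (1 MB = 1024 KB here).
$rss_mb = round($rss_kb / 1024, 1);
$vsz_mb = round($vsz_kb / 1024, 1);

printf("RSS: %.1f MB, VSZ: %.1f MB\n", $rss_mb, $vsz_mb);
```

Summing RSS counts any shared pages once per process, so if anything this overstates what nginx was really using.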

nginx status (taken at some random point)

Active connections: 2258
server accepts handled requests
711389 711389 1483197
Reading: 2 Writing: 2040 Waiting: 216

Bandwidth (taken from the provider's switch)

Exceeded Bits Out: 1001.9 M (Threshold: 500 M)
Auto Manage Method: FCR_BLOCK
Auto Manage Result: SUCCESSFUL

Exceeded Bits Out: 868.1 M (Threshold: 500 M)
Auto Manage Method: FCR_BLOCK
Auto Manage Result: SUCCESSFUL

To give you an idea of the magnitude of growth, this is the amount of gigabytes the server pushes on a normal day:

  • 2009-05-18 155.01 GB
  • 2009-05-17 127.48 GB
  • 2009-05-16 104.21 GB
  • 2009-05-15 152.42 GB
  • 2009-05-14 160.12 GB
  • 2009-05-13 148.6 GB

On launch day and the spillover into the next day:

  • 2009-05-19 2036.37 GB
  • 2009-05-20 2481.87 GB
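
For a sense of scale, the daily totals above can be turned into a growth factor and an average bit rate. This is back-of-the-envelope math only; it assumes 1 GB = 10^9 bytes, which may not be how the provider counts, and avg_mbit_per_sec() is just a throwaway helper.

```php
<?php
// Daily totals in GB, copied from the two lists above.
$normal_days = array(155.01, 127.48, 104.21, 152.42, 160.12, 148.6);
$launch_days = array('2009-05-19' => 2036.37, '2009-05-20' => 2481.87);

$normal_avg = array_sum($normal_days) / count($normal_days);

// Assumption: 1 GB = 1,000,000,000 bytes; 86400 seconds in a day.
function avg_mbit_per_sec($gb_per_day) {
    return $gb_per_day * 1e9 * 8 / 86400 / 1e6;
}

printf("normal day: %.1f GB on average\n", $normal_avg);
foreach ($launch_days as $date => $gb) {
    printf("%s: %.1fx a normal day, about %.0f Mbit/s averaged over 24h\n",
        $date, $gb / $normal_avg, avg_mbit_per_sec($gb));
}
```

The 24-hour average hides the peaks, of course; the switch numbers above show the port itself was saturated at times.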

The pretty pictures


Hourly traffic graph
nginx rocks!
(Note: I said 600M; apparently the threshold on their router is 500M)

Weekly traffic graph
nginx rocks!

The takeaway

Normally I would never think a server could get slammed with so much while it is having to service so much. Perhaps if it was JUST a PHP/MySQL server, or JUST a static file server, but no - we had two webservers, a mailing list manager, Drupal (which is not the most optimized PHP software), etc. The server remained responsive enough to service requests, on purely commodity hardware.

Categories: nginx, PHP, PHP-FPM

Happy day! PHP-FPM included with PHP 5.3.3RC1, and WordPress 3.0

June 17th, 2010 3 comments

Officially in the 5.3.3RC1 distribution. Sweet! From the NEWS file:

17 Jun 2010, PHP 5.3.3 RC1
- Added FastCGI Process Manager (FPM) SAPI. (Tony)

and on an unrelated note:

- Added support for JSON_NUMERIC_CHECK option in json_encode() that converts numeric strings to integers. (Ilia)

Shouldn't this be called JSON_NUMERIC_CONVERT? or JSON_FORCE_INTEGER? It's not just a "check" - guess it's too late now? 🙂

WordPress 3.0...
WordPress 3.0 came out today. Tonight I'll probably upgrade this site and see how well it works. I'm going to check it in to Subversion first so I can roll back if needed.

Some key changes I wanted to talk about...

  • One thing that was highlighted is the option to use custom header images - which can easily be done right now. I did it well over a year ago in a theme. With post meta you can always load metadata about a post and use it in the theme, so this update seems a bit specific to me, since themes were already customizable. Why build a feature that is so specific? Same with background images/colors...
  • Custom menus/menu editor - this could get cool, the menu editor is the more exciting piece as it will allow a visual way to manage the taxonomy. Not sure how it will mix in with tags and categories though, guess that's "I'll see it when I upgrade."
  • MU merge - finally, I can run multiple installs off the same WP install, hopefully, without wp-config.php hacks. How exactly it works I will have to find out.
  • Custom post types - now all of a sudden you can make any sort of object with custom attributes, which opens the door to things such as the item below.
  • WP e-Commerce says they're going to change from using all their extra tables to using core WP schema. That's awesome.

A couple bones to pick...

  • It's not a rewrite. It's still a blogging tool that is being extended further to be a full-featured tool that can handle "anything" - however the tables are still named "posts" even though now you can create an arbitrary type of item. I'd like to see it renamed and normalized.
  • All the plugins and themes and such are procedural code, but some inner workings such as the DB layer are OO. That seems amateur to me, and unnecessary.

I'd love to see WP get rewritten. It has a LOT of overhead and includes built in that need calling and a lot of other cruft that I stumble across. Go back to the drawing board with building a list of every feature it has, and look at it from a longer term perspective. It's great to see something keep growing, but when it comes down to it, it is still a fork of b2, which was made for blogging, not for anything and everything.

It's got the right idea with extensibility and such, but to me the core has a lot of code - and lots of code means more complicated execution paths, more "I'll just add this in instead of refactor this old code," more cruft. I'm quite sure I could get as much extensibility out of a fresh rewrite with less than half the code under the hood. Things like text styling for example should be moved to a plugin (I disable all the wptexturize filters for example... throw those in an include and make it enabled by default instead!)

Of course, WordPress does have millions of users so it has a proven track record. I can't complain that much, I do use it myself. For blogging, it's the best tool out there. For other things, it typically leverages plugins which may or may not have decent UIs or APIs to interact with. That's where it shows signs of weakness. It also isn't as strict as Drupal when it comes to code conventions either, which would greatly increase the usability of a lot of plugins.

Categories: PHP, PHP-FPM, WordPress

PHP-FPM and nginx upstart scripts

May 21st, 2010 4 comments

Upstart is becoming the de facto standard for everything in Ubuntu, and I do enjoy its process management and re-spawning capabilities.

(Actually, before PHP-FPM I used to use upstart jobs to su - $user php-cgi -b $port :))

These are VERY simple but effective scripts, and could actually be beefed up to be more intelligent (chaining nginx to start after PHP-FPM, for example). However, if you do not need PHP, then that's a useless chain, so I kept it simple. I suppose you could add a short delay to start nginx then...

Note: make sure PHP-FPM is set to daemonize = yes.

Second note: this works for PHP 5.2.x w/ the PHP-FPM patch, on Ubuntu Lucid Lynx (10.04) - anything else YMMV. I am not using PHP-FPM w/ PHP 5.3 yet since I have no environments that I know will support 5.3 code. When I finally get one, I will look for the same opportunity.


description "nginx"

start on (net-device-up and local-filesystems)
stop on runlevel [016]

expect fork
exec /usr/sbin/nginx


description "PHP FastCGI Process Manager"

start on (net-device-up and local-filesystems)
stop on runlevel [016]

expect fork
exec /usr/local/bin/php-cgi --fpm --fpm-config /etc/php-fpm.conf

Once you've done this you can remove all the /etc/init.d/php-fpm, /etc/init.d/nginx and /etc/rc?.d/[K,S]??php-fpm and /etc/rc?.d/[K,S]??nginx symlinks and files. This takes care of all of that.

Feel free to comment and leave better tips, tricks and ideas!

Categories: Development, nginx, PHP, PHP-FPM

Jérôme Loyet is a saint!

December 15th, 2009 No comments

I'd like to announce that Jérôme Loyet stepped up this weekend and hacked up the first round of code to get dynamic process management going. Antony committed it (see below) and it's on its way to being part of PHP core as well. So two major events in only a week and a half or so!


Good job Jérôme. Where were you a couple months ago, buddy? 🙂

Categories: PHP, PHP-FPM

PHP-FPM brought in to PHP core - interesting surprise

December 5th, 2009 No comments

Read it here:

First off, big thanks to Antony Dovgal. I've exchanged words with him in the past about PHP-FPM (and actually other PHP things) but was completely unaware he was working on this.

So, we've got a blessing but also an interesting dilemma on our hands. We've got a wishlist and some bugs to work out. I have a feeling if Antony updated some of the CGI internals it may have resolved some of those bugs. Not sure. I'm trying to get some specifics now - what version of PHP-FPM he brought in, how the community can still support it and how difficult it may be to submit patches, if he thinks a separate management daemon makes more sense than keeping it glued inside of the SAPI (it seems out of place to me for a SAPI to require a proprietary configuration file and daemon .pid, log, etc. files...)

Hopefully I can get ahold of him soon and discuss some of this. I was already mid-discussion with another PHP core developer about how they think the best approach would be to get PHP-FPM aligned with PHP core (they leaned more towards a separate SAPI too.)

My main goal is to make it as easy as it can be for PHP-FPM to become an official package or included with PHP so that people who use PHP from repositories on their favorite distributions and such can enjoy the benefits of PHP-FPM without patches or separate downloads. If the management portion does split off, I fully intend on making sure it is aligned properly and is as simple as "apt-get install php5-fpm" or something of that nature. Still easily installed and everything.

Anyway, we'll see how things go. This caught me off guard and now I have to figure out what point we're at with development. Jérôme Loyet has expressed interest in trying to convert the configuration file to nginx style - something Andrei had told me he had wanted to do. The XML throws some people off, thinking it's an actual XML parsed document with XML include support and such... also if done right, this will allow PHP-FPM's configuration to support includes, and who knows, maybe variables some day. But for now it would be a lot cleaner to read, and it seems the majority of PHP-FPM users are nginx users already anyway 🙂

Categories: PHP, PHP-FPM

Who ever said open source software was perfect?

December 14th, 2008 No comments

Typically, updates on the open source packages work without a hitch. However, my upgrade last weekend on my servers from Ubuntu Hardy to Intrepid wound up creating a couple major headaches, and at the same time, I noticed a handful of other snafus happening to open source packages I use daily.

This wound up in server instability, client annoyance, and 20-30 hours solid of trial-and-error compiling, testing, debugging, etc. Even right now, if I forget to hold back the libgpac-dev package from being updated, all videos being converted lose their sound due to MP4Box crashing.

Categories: nginx, PHP, Software

Updates on the HTTP file upload front, part 2

December 3rd, 2008 4 comments

Continuing from ...

I've been doing some research and more hacking. Code should find its way out there sometime soon. Here are my notes since the last installment of "As The File Upload World Turns":

  • Figured out the appropriate nginx configuration so there is no buffering to disk of the request.
  • Rudimentarily tested browser memory while uploading a 220 meg file - did not appear to use much more than normal (which leads me to believe that Gears is slicing the file up efficiently enough and only grabs the bytes it needs)
  • HTTP authentication will not work. From what I could understand from the code and some random comments, authentication information is not supported by Gears' XHR object.
  • Pause/resume is possible; I've changed the PHP server side piece to accept the byte offset Gears tells it to start at; originally was having issues until I determined why it wasn't fseek()'ing properly 🙂 However this still requires a more stateful approach on the client side. Will probably have to implement a local Gears database and possibly a worker pool setup. This will allow for persistence and other neat things.
  • I've got a decently functioning JavaScript UI which seems to calculate out the average speed, estimated time remaining, etc.
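
The server-side piece from the pause/resume bullet could be sketched roughly like this. It is not the code from this project: write_chunk() and its parameters are hypothetical, and how the offset and bytes actually arrive from Gears is left out. One general thing worth knowing is that PHP's append modes ('a', 'a+') always write at the end of the file regardless of fseek(), so this sketch opens the file with 'c+b' instead (available since PHP 5.2.6).

```php
<?php
// Hypothetical helper: write one chunk of an upload at a given byte offset.
// Returns the number of bytes written, or false on failure.
function write_chunk($path, $offset, $data) {
    // 'c+b': open for read/write, create if missing, do NOT truncate.
    $fp = fopen($path, 'c+b');
    if ($fp === false) {
        return false;
    }

    // fseek() returns 0 on success and -1 on failure.
    if (fseek($fp, $offset, SEEK_SET) !== 0) {
        fclose($fp);
        return false;
    }

    $written = fwrite($fp, $data);
    fclose($fp);
    return $written;
}

// Simulate an upload that is interrupted and then resumed at byte 5.
$tmp = tempnam(sys_get_temp_dir(), 'upl');
write_chunk($tmp, 0, 'hello');
write_chunk($tmp, 5, ' world');
$result = file_get_contents($tmp);
unlink($tmp);
```

A real handler would also want to check that the offset matches what is already on disk before writing, so a confused client can't leave holes in the file.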

The only thing missing is a better attempt to see if Gears will retry the upload on a failure. I believe it is possible when dealing with a worker pool but this is -very- basic XHR usage at the moment. Perhaps since it is JavaScript-based we can add in our own re-transmission code. That's the next piece I'm going to mess around with.

Stay tuned for the results... (and code, most likely!)

Categories: PHP

Quick code snippet: normalizing a URI (for friendly URLs, etc.)

December 3rd, 2008 1 comment

When you enter the realm of "friendly URLs", "slugs", "nice names" or whatever else you call them, it can make everything a lot better looking. However, if done incorrectly, you can get some duplicate indexed pages and the like. I couldn't sleep and wanted to try approaching this again a different way, and every URI I've thrown at it comes out how I want it.

Why does this matter? Typically, without rewrites and using normal webserver, directory and file semantics, a request for "/foo" should make the webserver bounce you to "/foo/" - but when dealing with rewritten URLs, there is no enforcement of this behavior. A lot of the time (at least with the stuff I'm currently dealing with) the same page shows up with "/foo" or "/foo/" and both are considered unique to a search engine. It's duplication of data which violates the normalization devil in me! Even worse, certain apps might not even process the request the same. "/foo" could load one page, and "/foo/" could load another, or an error. That's worse; when people send URLs out, sometimes they take artistic license with what they look like. This is to thwart all that and force search engines, users, etc. to all view the same URL. I chose to enforce the URL structure ending with "/" as I think it helps establish that "final" signoff of the non-query string portion of the URI.

You could go about this different ways; however, due to the way I have my nginx rewrites done, I can't rely on $_SERVER['QUERY_STRING'] which is how I had originally written it - and I wondered why I was getting some weird behavior. Now I realize I need the function to handle any string that is passed to it, and then I can make this behave appropriately.

Below is the function and some sample code to run it. There's probably a few opcodes that could be saved in trade for a little bit of memory by calculating the length of the URI, position of the "?" if there is one, etc. However, I hate defining variables so much that get used only once (a major pet peeve is when people create a new variable for absolutely no reason, this one would at least save a couple CPU cycles...) - so this is my least-amount-of-code-possible version. Enjoy.

# some example URIs (illustrative)
$uris = array(
   '/foo',
   '/foo/',
   '/foo/index.php',
   '/foo?bar=baz',
   '/foo/index.php?bar=baz',
);

foreach($uris as $uri) {
   echo $uri." => ".normalize_uri($uri)."\n";
}

function normalize_uri($uri) {
   # if there is query string, we want to chop it off and put it aside
   if(strpos($uri, '?') !== false) {
      $query = substr($uri, strpos($uri, '?'));
      $uri = substr($uri, 0, strpos($uri, '?'));
   }

   # scrub any index.* stuff off the end (the dot is escaped so it only matches a literal '.')
   $uri = preg_replace('/index\.(\S{0,3})$/', '', $uri);

   # if it doesn't end with a '/', then add one
   if(substr($uri, -1) != '/') {
      $uri .= '/';
   }

   # finally, put the query string back on
   if(!empty($query)) {
      $uri .= $query;
   }

   return $uri;
}

You'd tie this in with something like a:

header('Location: http://'.$_SERVER['HTTP_HOST'].$uri, true, 301);

To make sure that it is redirecting with a 301 (search engine friendly) header. (Don't hardcode the scheme - https or http, depending on what your site uses.) Something like this should work:

if(isset($_SERVER['HTTPS']) && strtolower($_SERVER['HTTPS']) == 'on') {
   $scheme = 'https://';
} else {
   $scheme = 'http://';
}

This is how I would throw it all together:

# only redirect when normalizing actually changes the URI; a plain
# "does it end with /" check would redirect "/foo/?a=b" to itself forever
if(isset($_SERVER['REQUEST_URI']) && normalize_uri($_SERVER['REQUEST_URI']) != $_SERVER['REQUEST_URI']) {
   if(isset($_SERVER['HTTPS']) && strtolower($_SERVER['HTTPS']) == 'on') {
      $url = 'https://';
   } else {
      $url = 'http://';
   }
   # fill in $_SERVER['REQUEST_URI'] here with whatever is holding the original URI
   $url .= $_SERVER['HTTP_HOST'].normalize_uri($_SERVER['REQUEST_URI']);
   header('Location: '.$url, true, 301);
   exit;
}

Now, it is 3:30am and I am trying to compose this inside of WordPress, but I believe that will work.

I apologize - for some reason the indentation is not showing up... remind me to add in that neat code sample plugin soon.

Categories: PHP