Archive for the ‘Development’ Category

Little-known URI shorthand - the "network-path" reference

July 21st, 2010 mike No comments

I've seen this before, and it was mentioned earlier today at OSCON, but I never knew if it was a browser behavior or a standard. Looks like I got it with some help from IRC.

Say you pull assets from a third-party host and you don't want to figure out whether the page is on http:// or https:// just to request those assets with the matching scheme and avoid a mixed-content warning. You can use a syntax defined in RFC 3986, specifically section 4.2:

A relative reference that begins with two slash characters is termed a network-path reference; such references are rarely used. A relative reference that begins with a single slash character is termed an absolute-path reference. A relative reference that does not begin with a slash character is termed a relative-path reference.

Which means you can do this:

<img src="//foo.com/bar.jpg" />

and your browser will request http://foo.com/bar.jpg or https://foo.com/bar.jpg, depending on the scheme of the page that contains the reference.
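To see the resolution in action, here is a minimal sketch using the standard URL class (available in modern browsers and in Node.js), which resolves references the same way. The example.com page addresses and the function name are made up for illustration; only //foo.com/bar.jpg comes from the example above.

```javascript
// Resolve a network-path reference against the address of the page that contains it.
// The example.com page URLs are placeholders for illustration.
function resolveAgainstPage(reference, pageUrl) {
  return new URL(reference, pageUrl).href;
}

const onHttp = resolveAgainstPage("//foo.com/bar.jpg", "http://example.com/index.html");
const onHttps = resolveAgainstPage("//foo.com/bar.jpg", "https://example.com/index.html");

console.log(onHttp);  // http://foo.com/bar.jpg
console.log(onHttps); // https://foo.com/bar.jpg
```

The reference keeps its own host and path and only borrows the scheme from the page.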

I was hesitant at first to consider it "okay", but since it is defined in the RFC and Chromium has fixed bugs relating to it, it does appear to be a properly supported method that could save you a few keystrokes. Let me know if it doesn't work for you! Be sure to give browser/OS information and conditions to reproduce.

Oh yeah, and the other host needs to be on https as well, of course. I shouldn't really have to say that, though :)

Categories: Development Tags:

PHP-FPM and nginx upstart scripts

May 21st, 2010 mike 4 comments

Upstart is becoming the de facto standard for everything in Ubuntu, and I do enjoy its process management and re-spawning capabilities.

(Actually, before PHP-FPM I used to use upstart jobs to su - $user php-cgi -b $port :) )

These are VERY simple but effective scripts, and could be beefed up to be more intelligent (chaining nginx to start after PHP-FPM, for example). However, if you do not need PHP, then that's a useless chain, so I kept it simple. I suppose you could add a short delay before starting nginx instead...

Note: make sure PHP-FPM's daemonize setting is set to yes.

Second note: this works for PHP 5.2.x w/ the PHP-FPM patch, on Ubuntu Lucid Lynx (10.04) - anything else YMMV. I am not using PHP-FPM w/ PHP 5.3 yet since I have no environments that I know will support 5.3 code. When I finally get one, I will look for the same opportunity.

/etc/init/nginx.conf:

description "nginx"

start on (net-device-up and local-filesystems)
stop on runlevel [016]

expect fork
respawn
exec /usr/sbin/nginx

/etc/init/php-fpm.conf:

description "PHP FastCGI Process Manager"

start on (net-device-up and local-filesystems)
stop on runlevel [016]

expect fork
respawn
exec /usr/local/bin/php-cgi --fpm --fpm-config /etc/php-fpm.conf

Once you've done this you can remove /etc/init.d/php-fpm and /etc/init.d/nginx, along with the /etc/rc?.d/[KS]??php-fpm and /etc/rc?.d/[KS]??nginx symlinks. The upstart jobs take care of all of that.

Feel free to comment and leave better tips, tricks and ideas!

Categories: Development, PHP, PHP-FPM, nginx Tags:

Getting Capistrano's SSH method to work behind a SOCKS proxy

April 30th, 2010 mike No comments

At work we've got a bit of an annoying proxy setup. It does allow outbound SSH through a SOCKS 5 proxy (no auth), and after a while of Googling (I know neither Capistrano nor Ruby, but the Ruby app developer does) we were able to figure it out.

Assuming you have the Net::SSH::Proxy::SOCKS5 class installed on your system (which may come as part of net-ssh?) it's actually quite simple. Getting the ssh settings right took a while of looking around, but it wound up being quite easy, and since I could not find anything on the net about this, I wanted to post our findings.

It's as simple as adding the following to the application's config/deploy.rb:

require 'net/ssh/proxy/socks5'

sshproxy = Net::SSH::Proxy::SOCKS5.new('proxy.com', 1080)
set :ssh_options, { :proxy => sshproxy }

Voilà.

Obviously there are more options, but we had a hell of a time trying to correlate the options and get them fed into the SSH module (if that's the right term) properly. It'd probably be much simpler if I had known Ruby, or if this problem was in PHP :) Anyway, I try to share what I can when I get it... (this post is only over four months overdue...)

Enjoy!

Categories: Development Tags:

Really?

April 19th, 2010 mike No comments

Google's usually pretty good about pushing standards and best practices but this is just lame.

http://code.google.com/p/chromium/issues/detail?id=41467

From my understanding, if you type in http://www.foo.com in your browser it will now remove the http:// from the address bar. Not that it matters when typing, since the browser prepends the scheme if you leave it out, but it's not intuitive at all now, and it's not like it was a change that needed to be made.

This is just going to make more people copy/paste the wrong URLs into pages, not get autolinked properly, etc. How many <a href="www.foo.com/"> type links will there be if this starts spreading... I like being able to copy/paste directly from the address bar.
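To make the copy/paste problem concrete, here is a small sketch using the standard URL class (available in browsers and in Node.js) that shows what a scheme-less href turns into. The example.com page address is a placeholder for illustration.

```javascript
// Contrast a scheme-less "URL" with a real one, using the standard URL class.
// The example.com page address is a placeholder for illustration.
const page = "http://example.com/blog/post.html";

// Without a scheme, "www.foo.com/" is just a relative path on the current site.
const withoutScheme = new URL("www.foo.com/", page).href;

// With the scheme, it points at the intended host.
const withScheme = new URL("http://www.foo.com/", page).href;

// On its own (no page to resolve against), the scheme-less string does not parse at all.
let parsesStandalone = true;
try {
  new URL("www.foo.com/");
} catch (err) {
  parsesStandalone = false;
}

console.log(withoutScheme);    // http://example.com/blog/www.foo.com/
console.log(withScheme);       // http://www.foo.com/
console.log(parsesStandalone); // false
```

So a pasted www.foo.com/ link quietly becomes a broken link into your own site rather than a link to foo.com.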

I don't care if people don't put it in when they type in "yahoo.com" - the issue here is the output and reuse and the general idea of "this is a legitimate URL" - considering that the RFC for URLs (RFC 1738) does maintain that you need a scheme!

"A URL contains the name of the scheme being used (<scheme>) followed by a colon and then a string (the <scheme-specific-part>) whose interpretation depends on the scheme."

Categories: Development, Software Tags:

The UK's got it right - government-collected data access for free

January 21st, 2010 mike 3 comments

"A new website, data.gov.uk, will offer reams of public sector data, ranging from traffic statistics to crime figures, for private or commercial use.

The target is to kickstart a new wave of services that find novel ways to make use of the information."

Awesome. I had an idea like this for infoporn.org a while back - generic data being available for consumption - but I really don't have any origin feeds that aren't already exposed. I'd just be re-syndicating them. But this would be awesome to have; just think of all the mashups you could create depending on what data is exposed.

Ref: http://news.bbc.co.uk/2/hi/technology/8470797.stm

Categories: Consumerism, Development Tags:

I hate wikis

November 16th, 2008 mike No comments

I dislike wikis. At work we were using them. Now we're changing from wikis to an article management system. The idea sounded great on paper, but I'm realizing the pitfalls of the implementation.

There are some traits that wikis have that I like. There are also plenty of bad ones. Also, I'm thinking of MediaWiki when I say this stuff.

Pros:

  • Interlinking and updating
  • Anyone can edit pages by design
  • Templates and categorization can be done by anyone, inline
  • Every page has a discussion option

Cons:

  • Lots of overhead
  • Semi-proprietary syntax (= makes h1, == makes h2, * makes one level lists, ** makes it a two level list, etc.)
  • Inconsistent syntax - some HTML works, some bbcode-ish wiki stuff works, some __HEREDOC__ stuff works
  • Forces everything to be CamelCase
  • No attachments per page, only global
  • By default, wikis have no page-level security, since their mantra is "anyone can edit"

So now that I've said that - I have to say that the ideal approach would be to take the existing CMS approach but add in a few wiki features. Specifically the interlinking. We already have a comment feature on every page. Attachment management is per-page too.

Late night half dazed thoughts:

Link syntax couldn't be something like foo:bar, since javascript:foo would conflict. Perhaps something like [foo:bar] would be a good idea? For example, [file:fileID] for links to file attachments, or [article:slug-title] for links to other articles. This would make it simpler to track which files are orphans, and the system could display information about the file (file type, size, etc.) inline in the documents.
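A rough sketch of how that bracketed syntax could be parsed and used for orphan tracking. Only the file and article link types come from the idea above; the function names, the pattern for allowed IDs, and the return shapes are assumptions made for illustration.

```javascript
// Find [type:target] links in article text. The "file" and "article" types come from
// the idea above; the allowed ID characters and function names are hypothetical.
const LINK_PATTERN = /\[(file|article):([A-Za-z0-9_-]+)\]/g;

function findLinks(text) {
  const links = [];
  for (const match of text.matchAll(LINK_PATTERN)) {
    links.push({ type: match[1], target: match[2] });
  }
  return links;
}

// Attachments that no article links to are orphans.
function findOrphanFiles(allFileIds, articleTexts) {
  const referenced = new Set();
  for (const text of articleTexts) {
    for (const link of findLinks(text)) {
      if (link.type === "file") referenced.add(link.target);
    }
  }
  return allFileIds.filter((id) => !referenced.has(id));
}

const sample = "See [file:42] and [article:slug-title], but not javascript:foo.";
const links = findLinks(sample);
const orphans = findOrphanFiles(["42", "99"], [sample]);

console.log(links);
console.log(orphans); // [ '99' ]
```

Because the brackets are required, a bare javascript:foo in the text is left alone.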

Categories: Development, Software Tags:

Large file uploads over HTTP - the final solution (I think)

August 26th, 2008 mike 3 comments

Problem statement: HTTP sucks for file uploads.

You know it. I know it. The problems?

  • No resuming
  • POST multipart/form-data bloats the size of the file due to encoding
  • Slow connections typically time out on large files
  • Any server resets or other network "burps" on the path from client to server effectively kill the upload
  • People have had moderate success by tuning their webserver and PHP to accept large POSTs, and in general, it works - but not for everyone and it suffers from everything previously noted.

What would the ideal file upload experience support?

  • It should resume after manual pause, a server reset, a timeout, or any other error.
  • It should allow for multiple files being uploaded at once.
  • It should work transparently over HTTP - which means proxies will support it like any normal web request, it can be done over HTTPS (SSL), it will reuse cookies and standard HTTP authentication methods.
  • (Ideally!) the browser would handle this itself without requiring Java, Flash, or any other applets.

With all this in mind, I somehow stumbled across the idea (roughly posted here) based on the time-tested learnings from Usenet and NZB files, and BitTorrent. The main idea? Splitting the file up into manageable segments. There's also some other logic too, but that's the main idea.

Why do I claim this is the final solution?

  • It can reuse the same HTTP/HTTPS connection, so proxies and HTTP authentication can be honored.
  • It doesn't care about the speed of your connection - because the segments are small, it's easier to get them to the server, and each piece can be confirmed one step at a time. No more having to start from the beginning due to a failure or timeout.
  • It will support multiple files at once. The server could (although we might not implement it this way) support multi-threaded uploading of the same file, too, just like BitTorrent or Usenet downloading - upload multiple pieces at the same time and assemble them in the end. We're trying to make a decision whether or not we want to do that right now. The fundamental difference is an implementation detail on the server end.
  • It allows for any client that can split a file up, hash it, encode it and upload it via POST
  • It will still require an applet, since browsers have no support for anything but standard file upload semantics (Although this would be a neat thing to get into a specification)

What's required, how does it work?

As of right now, this is what I have down (it has changed already since the PHP post):

  1. The client contacts the server to begin the transaction. It supplies the following information:
    • Action = "begin"
    • Final filesize
    • Final filename
    • Final file hash (SHA256 or MD5, still haven't determined which one)
    • A list of all the segments - their segment id, byte size, hash (again SHA256 or MD5) - XML or JSON or something
  2. Server sends back "server ready, here's your $transaction_id"
  3. Client starts sending the file, one segment at a time, with the following information:
    • Action = "process"
    • Transaction ID = $transaction_id
    • Segment ID = $segment_id
    • Content = base64 or uuencoded segment (for safe transit)
  4. Server replies back "segment received, transaction id $transaction_id, segment id $segment_id, checksum $checksum"
  5. Client compares the checksum for $segment_id, if it matches, move on to the next segment. If not, retransmit.
  6. When the client is done sending all the segments, client sends message to the server:
    • Action = "finish"
    • Transaction ID = $transaction_id
  7. Server assembles all the segments (if they're separate) and sends to the client:
    • Transaction ID = $transaction_id
    • Checksum = $checksum of the final file
  8. Client compares the checksum to its own checksum. If it matches, client sends message to server:
    • Action = "complete"
    • Transaction ID = $transaction_id

Voilà, done. I think the "protocol" transmits some extra information that isn't needed, so some of this might need to be cleaned up. This is the initial idea though. Props to Newzbin for inventing NZB files, which were a big influence on this concept.
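The client side of steps 1 through 5 can be sketched roughly as follows, here in Node.js rather than in an applet. This is only an illustration under stated assumptions: SHA-256 and base64 are picked arbitrarily from the options listed above, the message field names and the 8-byte segment size are made up, and the server is simulated by a local function instead of a real HTTP POST.

```javascript
const crypto = require("node:crypto");

// Hash helper. SHA-256 is used here; the choice between SHA256 and MD5 is still open above.
function sha256Hex(buffer) {
  return crypto.createHash("sha256").update(buffer).digest("hex");
}

// Split the file's bytes into fixed-size segments, each with an id, size and hash.
function splitIntoSegments(fileBuffer, segmentSize) {
  const segments = [];
  for (let offset = 0; offset < fileBuffer.length; offset += segmentSize) {
    const bytes = fileBuffer.subarray(offset, offset + segmentSize);
    segments.push({ id: segments.length, size: bytes.length, hash: sha256Hex(bytes), bytes });
  }
  return segments;
}

// Step 1: the "begin" message. Field names are hypothetical.
function buildBeginMessage(filename, fileBuffer, segments) {
  return {
    action: "begin",
    filesize: fileBuffer.length,
    filename,
    filehash: sha256Hex(fileBuffer),
    segments: segments.map(({ id, size, hash }) => ({ id, size, hash })),
  };
}

// Step 3: one "process" message per segment, base64-encoded for transit.
function buildProcessMessage(transactionId, segment) {
  return {
    action: "process",
    transactionId,
    segmentId: segment.id,
    content: segment.bytes.toString("base64"),
  };
}

// Steps 4-5: a stand-in for the server, which decodes and hashes what it received
// so the client can compare checksums before moving on to the next segment.
function serverChecksum(processMessage) {
  return sha256Hex(Buffer.from(processMessage.content, "base64"));
}

const file = Buffer.from("hello, segmented upload"); // 23 bytes of sample data
const segments = splitIntoSegments(file, 8);
const begin = buildBeginMessage("hello.txt", file, segments);
const message = buildProcessMessage("tx-1", segments[0]);
const segmentOk = serverChecksum(message) === segments[0].hash;

console.log(begin.segments.length, segmentOk); // 3 true
```

A real client would loop over all segments, retransmit any segment whose checksum does not match, and then send the "finish" and "complete" messages.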

I'm somewhat rushing this post out, hopefully it solicits some feedback. I'm going to be revising this and working with a Java developer to work on a client written in Java. Hopefully someday we'll get one with less overhead. I'll post PHP code as I write it too to handle the server portion of it.

Categories: Development Tags:

Google AppEngine - 1) develop site, 2) ?? [get bought by Google], 3) profit

April 10th, 2008 mike No comments

A co-worker and I were discussing the lack of languages in Google's AppEngine program. I'd want PHP, he'd probably prefer .NET... there seems to be a lot of "please add XYZ!" in the bug tracker for it.

Anyway, I have a sneaking suspicion that this is a covert operation for Google to provide its preferred architecture, authentication, etc. so people create the next generation of massively scalable sites (and/or convert their existing ones or create clones of existing ones) that run on their platform... and then Google can swoop in and buy them. The total headache and cost of bringing it in house is essentially nil, since there are no user conversions, data conversions or anything else to deal with. Just slap Google on it, and voilà.

I could be wrong, and I suppose we will see as Google starts addressing these requests. Perhaps they'll be fine with -any- language and still want people to make apps they can easily consume. I guess people will report out as they begin developing and growing their sites and Google approaches them (or not...)

Categories: Development, Software Tags:

Disappearing text in IE6 and "dead" JavaScript links

December 1st, 2007 mike 2 comments

We discovered a couple little annoyances with IE6, and I thought it would be useful to publish their workarounds.

Issue #1:

Random text would "disappear" on a page. It actually was still there and would show sometimes after hovering over it. The fix turns out to be setting the CSS for the text in question to "height: 0.01%" - it's so simple but so needless. It shouldn't have to be done. It should just work.

Issue #2:

<a href="javascript:anything" onclick="something()">

Won't work. It will just act like a dead link (or perhaps just ignore the onclick...) This will work however:

<a href="javascript:something()">

Note: Typically I try to keep my hrefs as plain "javascript:;" if I need to use anchor-based links, and then chain events off of the link using the element's ID and jQuery.

Handling events pre-$(document).ready() in jQuery

October 30th, 2007 mike 3 comments

jQuery is the best thing since velcro. No doubt. However, there is one thing that I've stumbled across that other people seem to have wondered about too. jQuery's got this great $(document).ready() capability to let you know when the DOM is ready and jQuery is loaded. However, what about those events (like a user quickly clicking on something) prior to this happening? If any of those require jQuery's functionality, you're SOL.

For right now, what I figured out is just doing this in the HTML:

<a href="javascript:;" id="somediv">Some link</a>

In the JavaScript, it would be this:

// Wait until the DOM is ready and jQuery is loaded before binding the handler.
$(document).ready(function() {
   $("#somediv").click(function() {
      // ... your actions here ...
   });
});

This is currently the only way I could figure out how to do it. This guy had a neat idea: basically creating a cache of the events to trigger the minute the DOM is ready. I was thinking of just blindly applying this to all $("a"), but that still requires jQuery to be available, and that's the whole problem to begin with.
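For what it's worth, the caching idea can be sketched in plain JavaScript with no jQuery dependency, so it could sit inline at the top of the page. This is my own rough take on it, not that author's code; every name in it (earlyQueue, whenReady, flushQueue) is hypothetical, and the "clicks" are simulated so the sketch runs on its own.

```javascript
// A tiny call queue that can live inline in the page head, before jQuery loads.
// All names here (earlyQueue, whenReady, flushQueue) are hypothetical.
const earlyQueue = [];
let libraryReady = false;

// Inline handlers call this instead of touching jQuery directly.
function whenReady(action) {
  if (libraryReady) {
    action();                // library is loaded: run immediately
  } else {
    earlyQueue.push(action); // too early: remember it for later
  }
}

// Call this once from inside $(document).ready().
function flushQueue() {
  libraryReady = true;
  while (earlyQueue.length > 0) {
    earlyQueue.shift()();    // replay early actions in the order they happened
  }
}

// Simulated page lifetime: two clicks arrive early, one arrives after ready.
const log = [];
whenReady(() => log.push("early click 1"));
whenReady(() => log.push("early click 2"));
flushQueue();
whenReady(() => log.push("late click"));

console.log(log);
```

In a real page, each link's handler would wrap its jQuery-dependent work in whenReady(...), and the $(document).ready() callback would call flushQueue(). Whether replaying a click a second or two late is acceptable depends on the action.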

This issue makes me nervous because I'm trying to play within the rules that Yahoo! has worked out for optimal performance. However, when you put JavaScript files at the bottom of the page, that means it will be even longer before jQuery is loaded. The more I try to get the pages working with JavaScript files loading at the bottom, the more apparent it becomes that jQuery should still be called at the top/as soon as possible. Unless someone else has figured out a way to make both sides happy...

Categories: Development Tags: