The Life and Times of Michael Shadle

I hate wikis

November 16th, 2008 mike No comments

I dislike wikis. At work we were using them. Now we're changing from wikis to an article management system. The idea sounded great on paper, but I'm realizing the pitfalls in the implementation.

There are some traits that wikis have that I like. There are also plenty of bad ones. Note that I'm thinking of MediaWiki when I say this stuff.

Pros:

Interlinking and updating
Anyone can edit pages by design
Templates and categorization can be done by anyone, inline
Every page has a discussion option

Cons:

Lots of overhead
Semi-proprietary syntax (= makes h1, == makes h2, * makes one level lists, ** makes it a two level list, etc.)
Inconsistent syntax - some HTML works, some bbcode-ish wiki stuff works, some __HEREDOC__ stuff works
Forces everything to be CamelCase
No attachments per page, only global
By default, wikis have no page-level security, since their mantra is "anyone can edit"

So now that I've said that - I have to say that the ideal approach would be to take the existing CMS approach but add in a few wiki features. Specifically the interlinking. We already have a comment feature on every page. Attachment management is per-page too.

Late night half dazed thoughts:

Link syntax couldn't be something like foo:bar - since javascript:foo would conflict. Perhaps something like [foo:bar] would be a good idea? For links to file attachments, [file:fileID], and for links to other articles, [article:slug-title]. This would make it simpler to track which files are orphans, and the system could display information about the file (file type, size, etc.) in-line in the documents.
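To make the bracket idea a bit more concrete, here is a minimal sketch of how the links could be pulled out of an article body and used to spot orphaned files. Only the [file:...] and [article:...] forms come from the thoughts above; the function names, the allowed characters in an ID or slug, and the shape of the data are all assumptions for illustration.

```javascript
// Hypothetical pattern: [file:ID] or [article:slug]. The allowed characters
// for the ID/slug part are an assumption, not something decided above.
const LINK_PATTERN = /\[(file|article):([A-Za-z0-9_-]+)\]/g;

// Return every bracket link found in a piece of article text.
function extractLinks(text) {
  const links = [];
  for (const match of text.matchAll(LINK_PATTERN)) {
    links.push({ type: match[1], target: match[2] });
  }
  return links;
}

// A file is an orphan if no article body references it with [file:ID].
function findOrphanFiles(allFileIds, articleBodies) {
  const referenced = new Set();
  for (const body of articleBodies) {
    for (const link of extractLinks(body)) {
      if (link.type === "file") referenced.add(link.target);
    }
  }
  return allFileIds.filter((id) => !referenced.has(id));
}
```

Because the pattern requires the square brackets, a plain javascript:foo in the text is left alone, which was the reason for not using a bare foo:bar form.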

Categories: Development, Software

Large file uploads over HTTP - the final solution (I think)

August 26th, 2008 mike 3 comments

Problem statement: HTTP sucks for file uploads.

You know it. I know it. The problems?

No resuming
POST multipart/form-data bloats the size of the file due to encoding
Slow connections typically time out on large files
Any server resets or any other network "burps" on the path from client to server effectively kills the upload
People have had moderate success by tuning their webserver and PHP to accept large POSTs, and in general, it works - but not for everyone and it suffers from everything previously noted.

What would the ideal file upload experience support?

It should resume after manual pause, a server reset, a timeout, or any other error.
It should allow for multiple files being uploaded at once.
It should work transparently over HTTP - which means proxies will support it like any normal web request, it can be done over HTTPS (SSL), it will reuse cookies and standard HTTP authentication methods.
(Ideally!) the browser would handle this itself without requiring Java, Flash, or any other applets.

With all this in mind, I somehow stumbled across the idea (roughly posted here) based on the time-tested learnings from Usenet and NZB files, and BitTorrent. The main idea? Splitting the file up into manageable segments. There's also some other logic too, but that's the main idea.

Why do I claim this is the final solution?

It can reuse the same HTTP/HTTPS connection, so proxies and HTTP authentication can be honored.
It doesn't care about the speed of your connection - because the segments are small, it's easier to get them to the server, and each piece can be confirmed one step at a time. No more having to start from the beginning due to a failure or timeout.
It will support multiple files at once. The server could (although we might not implement it this way) support multi-threaded uploading of the same file, too, just like BitTorrent or Usenet downloading - upload multiple pieces at the same time and assemble them in the end. We're trying to make a decision whether or not we want to do that right now. The fundamental difference is an implementation detail on the server end.
It allows for any client that can split a file up, hash it, encode it and upload it via POST
It will still require an applet, since browsers have no support for anything but standard file upload semantics (Although this would be a neat thing to get into a specification)

What's required, how does it work?

As of right now, this is what I have down (it has changed already since the PHP post):

The client contacts the server to begin the transaction. It supplies the following information:
- Action = "begin"
- Final filesize
- Final filename
- Final file hash (SHA256 or MD5, still haven't determined which one)
- A list of all the segments - their segment id, byte size, hash (again SHA256 or MD5) - XML or JSON or something
Server sends back "server ready, here's your $transaction_id"
Client starts sending the file, one segment at a time, with the following information:
- Action = "process"
- Transaction ID = $transaction_id
- Segment ID = $segment_id
- Content = base64 or uuencoded segment (for safe transit)
Server replies back "segment received, transaction id $transaction_id, segment id $segment_id, checksum $checksum"
Client compares the checksum for $segment_id, if it matches, move on to the next segment. If not, retransmit.
When the client is done sending all the segments, client sends message to the server:
- Action = "finish"
- Transaction ID = $transaction_id
Server assembles all the segments (if they're separate) and sends to the client:
- Transaction ID = $transaction_id
- Checksum = $checksum of the final file
Client compares the checksum to its own checksum. If it matches, client sends message to server:
- Action = "complete"
- Transaction ID = $transaction_id

Voilà, done. I think the "protocol" transmits some extra information that isn't needed, so some of this might need to be cleaned up. This is the initial idea though. Props to Newzbin for inventing NZB files, which were a big influence on this concept.
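The client side of the steps above can be sketched roughly like this. It is only a sketch under assumptions: it runs under Node.js, picks SHA-256 (the choice between SHA256 and MD5 is still open), uses base64 for the content, and invents the field names (transactionId, segmentId and so on) since nothing about the wire format is settled yet.

```javascript
const crypto = require("crypto");

// Assumption: SHA-256 hex digests. MD5 would be a one-word change here.
function sha256(buffer) {
  return crypto.createHash("sha256").update(buffer).digest("hex");
}

// Build the "begin" message: final size, name, hash, and the segment list.
function buildBeginRequest(filename, data, segmentSize) {
  const segments = [];
  for (let offset = 0, id = 0; offset < data.length; offset += segmentSize, id++) {
    const chunk = data.subarray(offset, offset + segmentSize);
    segments.push({ id, size: chunk.length, hash: sha256(chunk) });
  }
  return { action: "begin", filename, filesize: data.length, hash: sha256(data), segments };
}

// Build one "process" message carrying a single base64-encoded segment.
function buildProcessRequest(transactionId, data, segmentSize, segmentId) {
  const start = segmentId * segmentSize;
  const chunk = data.subarray(start, start + segmentSize);
  return { action: "process", transactionId, segmentId, content: chunk.toString("base64") };
}

// Compare the checksum in the server's reply with the one from the manifest.
// false means the segment should be retransmitted.
function segmentConfirmed(beginRequest, reply) {
  const expected = beginRequest.segments[reply.segmentId];
  return expected !== undefined && expected.hash === reply.checksum;
}
```

The actual POSTs, the "finish" and "complete" messages, and the retry loop are left out. They would wrap these helpers: send a segment, check segmentConfirmed, and either move on or resend.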

I'm somewhat rushing this post out, but hopefully it solicits some feedback. I'm going to be revising this and working with a Java developer to build a client written in Java. Hopefully someday we'll get one with less overhead. I'll also post PHP code for the server portion as I write it.

Categories: Development

Google AppEngine - 1) develop site, 2) ?? [get bought by Google], 3) profit

April 10th, 2008 mike No comments

A co-worker and I were discussing the lack of languages in Google's AppEngine program. I'd want PHP, he'd probably prefer .NET... there seems to be a lot of "please add XYZ!" in the bug tracker for it.

Anyway, I have a sneaking suspicion that this is a covert operation for Google to provide its preferred architecture, authentication, etc. so people create the next generation of massively scalable sites (and/or convert their existing ones or create clones of existing ones) that run on their platform... and then Google can swoop in and buy them. The total headache and cost of bringing it in house is essentially nil, since there are no user conversions, data conversions or anything else to deal with. Just slap Google on it, and voilà.

I could be wrong, and I suppose we will see as Google starts addressing these requests. Perhaps they'll be fine with -any- language and still want people to make apps they can easily consume. I guess people will report back as they begin developing and growing their sites and Google approaches them (or not...)

Categories: Development, Software

Disappearing text in IE6 and "dead" JavaScript links

December 1st, 2007 mike 2 comments

We discovered a couple little annoyances with IE6, and I thought it would be useful to publish their workarounds.

Issue #1:

Random text would "disappear" on a page. It actually was still there and would show sometimes after hovering over it. The fix turns out to be setting the CSS for the text in question to "height: 0.01%" - it's so simple but so needless. It shouldn't have to be done. It should just work.

Issue #2:

<a href="javascript:anything" onclick="something()">

Won't work. It will just act like a dead link (or perhaps just ignore the onclick...) This will work however:

<a href="javascript:something()">

Note: Typically I try to keep my hrefs as plain "javascript:;" if I need to use anchor-based links, and then chain events off of them using the element's ID and jQuery.

Categories: Development

Handling events pre-$(document).ready() in jQuery

October 30th, 2007 mike 3 comments

jQuery is the best thing since velcro. No doubt. However, there is one thing that I've stumbled across that other people seem to have wondered about too. jQuery's got this great $(document).ready() capability to let you know when the DOM is ready and jQuery is loaded. However, what about those events (like a user quickly clicking on something) prior to this happening? If any of those require jQuery's functionality, you're SOL.

For right now, what I figured out is just doing this in the HTML:

<a href="javascript:;" id="somediv">Some link</a>

In the JavaScript, it would be this:

$(document).ready(function() {
   $("#somediv").click(function() {
      // ... your actions here ...
   });
});

This is currently the only way I could figure out. This guy had a neat idea: basically creating a cache of the events to trigger the minute the DOM is ready. I was thinking of just blindly applying this to all $("a"), but that still requires jQuery to be available, and that's the whole problem to begin with.
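For what it's worth, the "cache of events" idea doesn't strictly need jQuery for the caching part. Here is a rough sketch in plain JavaScript, not the code from that post: a tiny queue that remembers which element IDs were clicked early and replays them once the real handlers are in place. The names (createEarlyEventQueue, record, ready) are made up for illustration, and wiring record() to the links would still need a small inline handler in the HTML.

```javascript
// Hypothetical helper: remember early events, replay them once handlers exist.
function createEarlyEventQueue() {
  const pending = [];   // element ids seen before ready() was called
  const handlers = {};  // element id -> handler function
  let isReady = false;

  return {
    // Register what should happen for a given element id.
    on(id, handler) {
      handlers[id] = handler;
    },
    // Call this from a small inline handler on the link.
    // Before ready() it only queues; afterwards it dispatches immediately.
    record(id) {
      if (isReady) {
        if (handlers[id]) handlers[id](id);
      } else {
        pending.push(id);
      }
      return false; // lets an inline handler cancel the default link action
    },
    // Call once the DOM and the library are available: replay queued events.
    ready() {
      isReady = true;
      while (pending.length > 0) {
        const id = pending.shift();
        if (handlers[id]) handlers[id](id);
      }
    },
  };
}
```

The idea would be to define the queue in a small inline script at the top of the page, register handlers whenever the main script loads at the bottom, and call ready() from inside $(document).ready().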

This issue makes me nervous because I'm trying to play within the rules that Yahoo! has worked out for optimal performance. However, when you put JavaScript files at the bottom of the page, that means it will be even longer before jQuery is loaded. The more I try to get the pages working with JavaScript files loading at the bottom, the more apparent it becomes that jQuery should still be called at the top/as soon as possible. Unless someone else has figured out a way to make both sides happy...

Categories: Development

jQuery Spy improvements

October 29th, 2007 mike 2 comments

I've made a diff against spy 1.4 to make sure it will not allow multiple spy instances on the page with the same object ID. If .spy() is called again, the old timer will be cleared and the new one (with new settings) will take over. It should still allow multiple *different* spies on the same page, just not two of the same thing (I was having an issue where it would keep reloading the same ID over and over because of a .click() event changing the configuration settings) - this allows real-time changing of the settings (say, the AJAX URL) without spawning additional timers.

I also added in a Math.random() parameter to force reloads every call and changed $.post to $.get - those can easily be removed if desired. 🙂 Essentially I just create an array that has a list of all the spy IDs that are called. If a dupe is detected, the old timer is disabled and the new one starts like normal. Oh, and I changed the epoch behavior, because for some reason on my browser it wasn't reporting the right time. I don't see why it needed the spy.epoch calculation at all.
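The dedupe logic boils down to something like the following. This is a stripped-down sketch of the idea in plain JavaScript, not the patch itself and not spy's real internals: the names spyTimers, startSpy, stopSpy and cacheBust are made up for illustration, and it uses an object keyed by ID rather than an array.

```javascript
// Hypothetical registry: at most one polling timer per element id.
const spyTimers = {};

function startSpy(id, callback, intervalMs) {
  // A repeat call for the same id replaces the old timer instead of adding one.
  if (spyTimers[id] !== undefined) {
    clearInterval(spyTimers[id]);
  }
  spyTimers[id] = setInterval(callback, intervalMs);
  return spyTimers[id];
}

function stopSpy(id) {
  if (spyTimers[id] !== undefined) {
    clearInterval(spyTimers[id]);
    delete spyTimers[id];
  }
}

// Append a random query parameter so every poll bypasses the browser cache,
// similar in spirit to the Math.random() change described above.
function cacheBust(url) {
  const separator = url.includes("?") ? "&" : "?";
  return url + separator + "_=" + Math.random();
}
```

Calling startSpy twice with the same ID and different settings leaves exactly one timer running, which is the behavior I was after with the .click() handler.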

Here's the patch file:
spy-1.4-diff.patch.txt

Feel free to submit your comments. This was the cleanest method I could figure out.

Categories: Development

jQuery's no-conflict mode: yet another reason why it's the best

July 3rd, 2007 mike 42 comments

It took me a bit to find out why jQuery (now bundled with WordPress) was not working as I expected inside of the WP admin area. The script was being called, but my code like $("#foo") was not working. I really had no clue where to begin, since it still has all those old JS libraries/frameworks being called as well. It was due to Prototype being packaged with it and conflicting with the "$" shortcut.

Long story short, jQuery already planned for library conflicts and has a quick solution. The no-conflict mode allows you to define your own shortcut, and of course it works like a charm.

It's easy to do - just put this line in your code somewhere:

var J = jQuery.noConflict();

Now $("#foo") will be J("#foo") and it will not conflict with any other libraries that may be installed. I hope WP gets rid of all the other stuff and goes with pure jQuery and plugins soon enough though 🙂

Categories: Development, WordPress
