September 19th, 2007
mike
You'd think keeping up to date is a good idea; however, the latest version of RDC now has an annoying bug. It seems that if you use the Windows-L combination to lock your machine (in person), the next time you connect via Remote Desktop the system acts like the Windows key is stuck. I don't know if this happens 100% of the time, but it has been happening to me consistently for over a week now, and I finally remembered to check whether it is truly a known issue.
Sure enough, tons of other people have complained, but there doesn't seem to be a permanent fix. There are a few options: a "soft hack" method where you use the On-screen Keyboard and just click the Windows key on/off, a registry hack that will disable the Windows keys altogether, and quite possibly downgrading to the slightly older RDC version.
Here are some relevant threads. The OSK idea works simply enough; however, it shouldn't be an issue in the first place, especially since the Windows key is not passed through RDC anyway.
Hopefully there will be a proper fix soon. I'd rather not have to disable keys or manually have to click buttons every time I connect via RDC...
September 14th, 2007
mike
Wow, once again NTFS has ruined my day by invalidating an entire external USB drive's data. The filesystem looks fine, passes all checks, I can copy the data even using Linux but it acts like it is corrupt. So I try some NTFS recovery software and it's showing me all this "EFS" stuff - like this stuff has some Encrypted File System related issue. Sure enough, I picked up an EFS decryption tool (if you know the password) and it shows every file as being encrypted. Did I ever encrypt it? No. I don't even have EFS readily available in my right-click menu. Even if it was encrypted for some odd reason it would have been using one of only a couple passwords I've used on my desktops... and none of them work.
Meanwhile now I am stuck with a perfect directory listing of these thousands of files that are now digital rubbish. The least NTFS could have done was just corrupt itself like usual and let me try to recover pieces of them. Always something new out of Redmond. Sigh.
Updating the definitions keeps failing on my Active Virus Shield installation. Why? Quickly Googling it, I find out that it's suspended and is being replaced with a special version of McAfee VirusScan. This comes only a couple of months after I removed Norton from all my computers and those of the people I support (family, friends), since it was becoming more of a resource hog as time went on.
Originally I discovered AVS through CPU Magazine, and in their tests Kaspersky Anti-Virus ranked #2 (IIRC) - which was a bonus - it was free, and it was basically the best. Now if I want a free "supported" copy from AOL, I have to change to McAfee, which is already scaring me with its 25 meg download and extra programs like "SecurityCenter" - at least I was able to unselect the firewall.
Instead of a single program with a smaller footprint, I now have SecurityCenter AND VirusScan running. I highly doubt this will run as clean as AVS did. Currently it is even struggling to update its windows. Feels like Norton all over again... sigh.
Mozy's got what seems to be a nicely integrated Windows client (not sure how good their Mac client is) - their service seems simple enough, and you can't go wrong with two gigs free or unlimited (I've been questioning how much it takes before it becomes "abuse") for an extremely low $4.95/month.
However, going back to my blog post from a while ago about the end-all backup solution, it is still missing one key thing. While the transfer is encrypted and the file contents are (optionally) encrypted, the file and directory names are not. When I asked support, the response was "how would you know what the file is then?" To me, it's simple. You have an encryption key; why can't the filenames be encrypted with the same key and decrypted on demand? I understand this complicates things (how do you store junky filenames, etc.) - you'd most likely need a customized filesystem or some virtual layer between the two to do the translation. However, that would basically make Mozy the king in my book for Windows and, I assume, Macs (read: any level of savvy end-user who wants their data backed up.)
I would still be shopping, though, for one to use to back up my servers. Duplicity still seems like the best, as it will compress, encrypt, and do differential/incremental backups, and due to the nature of how it works, it will also mask the file contents so only the user can see them. Rsync.net has recently announced funding and support to help pump some life back into the project, which is promising. It needs native Windows support (which may be tricky... it needs a POSIX-compliant backend) and proper S3 integration without patches or external libraries and hacks.
Of course, I was having fun developing my own little PHP interface to S3, which could eventually let me create my own tools; however, do I really trust my own coding with my critical data and extremely large file sizes? Because it's over HTTP, it's tricky (or maybe impossible) to byte-serve the data in PHP without first caching it in memory... ah well.
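To make the "translation layer" idea a bit more concrete, here is a minimal Python sketch. It is not how Mozy works, and it swaps real filename encryption for something simpler: each path is turned into an opaque token with HMAC-SHA256, and a client-side index maps tokens back to real names. The class name, the key, and the sample path are all hypothetical.

```python
import hashlib
import hmac


def opaque_name(key: bytes, path: str) -> str:
    """Derive a stable, meaningless-looking name for a path using HMAC-SHA256."""
    return hmac.new(key, path.encode("utf-8"), hashlib.sha256).hexdigest()


class NameIndex:
    """Client-side map between real paths and the opaque names stored remotely."""

    def __init__(self, key: bytes):
        self._key = key
        self._to_real = {}  # opaque token -> real path

    def hide(self, path: str) -> str:
        """Return the opaque token for a path and remember the mapping."""
        token = opaque_name(self._key, path)
        self._to_real[token] = path
        return token

    def reveal(self, token: str) -> str:
        """Translate an opaque token back to the real path."""
        return self._to_real[token]


# Hypothetical key and path, for illustration only.
index = NameIndex(b"hypothetical-secret-key")
remote = index.hide("Documents/taxes/2007.xls")
assert index.reveal(remote) == "Documents/taxes/2007.xls"
```

The provider would only ever see the hex token. The catch is that the index itself then becomes critical data: it would have to be encrypted and backed up too, which is exactly the complication mentioned above.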
Someone needs to create an efficient C-based one that can be compiled in Windows, OS X, and Linux. Any takers? I'll pay...
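On the memory concern above, the general technique is to read the file in fixed-size blocks instead of all at once. A small Python sketch of just the reading side (it does not talk to S3 or HTTP at all; the function names and the 64 KB chunk size are my own choices):

```python
import hashlib
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield successive blocks from a binary stream until it is exhausted."""
    while True:
        block = stream.read(chunk_size)
        if not block:
            break
        yield block


def streamed_sha256(stream: BinaryIO, chunk_size: int = 64 * 1024) -> str:
    """Hash a stream of any size while holding only one chunk in memory."""
    digest = hashlib.sha256()
    for block in iter_chunks(stream, chunk_size):
        digest.update(block)
    return digest.hexdigest()


# An in-memory stream stands in for a large file opened with open(path, "rb").
data = io.BytesIO(b"x" * 200_000)
chunks = list(iter_chunks(data))
```

Whether an upload can consume blocks like this depends on the HTTP client accepting a streamed body, which is the part I couldn't get working in PHP.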
So I've been looking around for the perfect solution for backups - both my home machines and my server farm (including my customers' data) - I think I've figured out the final destination for the data - Amazon's S3 service. However, how to get the data there has been the question.
I have a few basic requirements:
- Encrypted: the content must be encrypted - in transit and at rest. However, if it is encrypted before it is transferred, I think encryption during transit would be optional.
- Nondescript: the third party storage provider should have no knowledge of the file contents NOR the file names; the files should be generic, gibberish or otherwise indistinguishable.
- Compressed: to save bandwidth costs and storage space, the content should be compressed. bzip2 seems to be the winner although a little more CPU intensive. If all else fails, we'll go with gzip.
- Incremental: in addition to compressing the data to save storage space, doing an incremental/differential backup will transfer and store only the files that have changed (and possibly only the content inside of those files, too)
- Versioned: backups are only as good as their source data. If something goes corrupt, or is altered incorrectly, it will overwrite the "good" copy from the last backup. By versioning the files, you can go back in history at any point.
- Cross-platform: I want to run this on my Linux servers as well as in Windows - running as a command-line (crontab in Linux, Scheduled Tasks in Windows - or even better as a service) - note if anything elegant was produced from this venture, I'd share the results so it could be reused anywhere (including you folks on OS X)
- Cheap: I'm sure there are some very expensive enterprise solutions, including using services like rsync.net (more expensive than S3, but allows for using rsync, for instance) and many other options, but for this I need something cost-effective.
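For the compression requirement, a quick way to compare gzip and bzip2 on your own data is a few lines of Python using the standard library modules. This is only a sketch: the sample data is made up and very repetitive, so real backups will compress far less well, and it says nothing about CPU time.

```python
import bz2
import gzip
from typing import Dict


def compressed_sizes(data: bytes, level: int = 9) -> Dict[str, int]:
    """Return the size in bytes of the data raw, gzipped, and bzip2-compressed."""
    return {
        "raw": len(data),
        "gzip": len(gzip.compress(data, compresslevel=level)),
        "bzip2": len(bz2.compress(data, compresslevel=level)),
    }


# Made-up sample data; substitute the bytes of a real file to get useful numbers.
sample = b"backup me, backup me again\n" * 4000
sizes = compressed_sizes(sample)
print(sizes)
```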
The programs/scripts/utilities/whathaveyou I've looked at:
- DAR - supports encryption, incremental archiving
- rsync - supports transfer encryption, incremental transfers
- rdiff-backup - supports incremental archiving, versioning
- Subversion - supports incremental archiving, versioning
- backup-manager
- duplicity - seems to solve nearly everything (?)
Services:
... edited later ...
I've been messing around with this too long now, and it's time to look at the two best options.
So far the best choice seems to be Duplicity:
- Encrypted: yes, all archives are encrypted (and signed) using GPG prior to transfer.
- Non-descript: yes, it packages up the files prior to sending to the third party:
- The remote location will not be able to infer much about the backups other than their size and when they are uploaded.
- Compressed: yes, by default gzip level 6; easy enough to change it to 9 though.
- Differential: yes, on the file and intra-file level - uses the rsync algorithm.
- Incremental: yes, even makes restoring easy too:
- Restoring traditional incremental backups can be painful if one has to apply each incremental backup by hand. Duplicity automatically applies the incrementals to the full backup without any extra intervention.
- Versioned: it appears so.
- Cross-platform: not quite; requires a POSIX-like system, which means running inside of a full Cygwin environment. If this meets my needs, I will look into trying to make it a standalone executable (using py2exe perhaps?) bundled with the Cygwin DLLs; otherwise I may go to rent-a-coder and pay someone to port it.
- Cheap: the software is FOSS - so most definitely yes.
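To illustrate what "automatically applies the incrementals to the full backup" means, here is a toy Python model. It is not duplicity's format or code: it works on whole files held in dictionaries rather than rsync-style intra-file deltas, and every name in it is hypothetical.

```python
from typing import Dict, List, Optional

Snapshot = Dict[str, bytes]          # path -> file content
Delta = Dict[str, Optional[bytes]]   # path -> new content, or None if deleted


def restore(full: Snapshot, incrementals: List[Delta],
            upto: Optional[int] = None) -> Snapshot:
    """Rebuild the file set by replaying the first `upto` incrementals
    (all of them when `upto` is None) on top of the full backup."""
    state = dict(full)
    for delta in incrementals[:upto]:
        for path, content in delta.items():
            if content is None:
                state.pop(path, None)   # file was deleted in this increment
            else:
                state[path] = content   # file was added or changed
    return state


full = {"a.txt": b"1", "b.txt": b"2"}
incrementals = [
    {"a.txt": b"1 edited"},              # first incremental: a.txt changed
    {"b.txt": None, "c.txt": b"3"},      # second: b.txt deleted, c.txt added
]
latest = restore(full, incrementals)
after_first = restore(full, incrementals, upto=1)
```

Choosing `upto` is also what gives you versioning: any point in the chain can be rebuilt without touching the stored archives.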
The other option is a somewhat DIY solution (which in hindsight resembles this one):
- First, use DAR to create the archive - it supports differential, incremental, versioning and compression.
- Next, use GPG to encrypt the DAR archive files.
- Finally, send it to S3 - this is still the open question. I almost have a very simple, clean, procedural PHP script for S3 interaction; however, it is not that efficient (from what I can tell, there is no way to stream the upload at the moment, which means the file content is read into memory completely by the PHP script - unless the files are small enough, that might not be an option at all). There are also some other options for sending the files, and I will visit those if needed.
Another idea I had was to use Subversion as a transparent change control layer, since it handles versioning properly, and do an svnadmin dump, compress and encrypt that, and send that to S3. That would require an extra copy of all the files to be stored in change control though, as well as (it appears) a local SVN server running on the machine.
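The first two steps of the DIY pipeline can be sketched as a small Python wrapper that only builds the command lines (it does not run anything, and the S3 upload is left out entirely). The flags are the ones I understand from the dar and gpg man pages, so check them against your installed versions; the paths and the recipient address are hypothetical.

```python
from typing import List, Optional


def dar_command(basename: str, root: str,
                reference: Optional[str] = None) -> List[str]:
    """Build a dar call: -c creates an archive, -R sets the directory to back up,
    -z turns on compression, -A names an earlier archive to diff against."""
    cmd = ["dar", "-c", basename, "-R", root, "-z"]
    if reference is not None:
        cmd += ["-A", reference]
    return cmd


def gpg_command(slice_path: str, recipient: str) -> List[str]:
    """Build a gpg call that encrypts one archive slice to a public key."""
    return ["gpg", "--batch", "--encrypt", "--recipient", recipient,
            "--output", slice_path + ".gpg", slice_path]


def plan(basename: str, root: str, recipient: str,
         reference: Optional[str] = None) -> List[List[str]]:
    """Return the commands in order: archive first, then encrypt the first slice."""
    slice_path = basename + ".1.dar"  # dar names its first slice <basename>.1.dar
    return [dar_command(basename, root, reference),
            gpg_command(slice_path, recipient)]


# Hypothetical paths and key, for illustration only.
steps = plan("/backups/home-monday", "/home/mike", "me@example.com",
             reference="/backups/home-full")
for step in steps:
    print(" ".join(step))
```

Each list could be handed to `subprocess.run(step, check=True)`; a multi-slice archive would need one gpg call per slice, which this sketch does not handle.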
UPDATE: Things look good. I think this is what I've been looking for (except for the Windows support) - however, I do have a couple suggestions for the duplicity team in the meantime:
- Add support for bzip2 (can be through GPG; I couldn't figure it out myself)
- Add support for altering the compression level using a command line option (for either compressor)
- Add support for defining max file size (5MB is the default) - it's easy enough to edit, should be easy enough to make it a command line parameter (again, I'd do this, but I'm not a Python guru nor want to fuss with implementing it incorrectly)
- Make it run properly in Windows (I don't care how!)
The only question left unanswered, which may apply to any backup tool, is how it deals with corrupt files. Will a corrupt file be transferred, and will the corruption overwrite the good previous version? Can you still access the previous version without the corruption affecting it? I assume that if a file is corrupt, its checksum will have changed, so it will be treated like a new version, allowing us to revert back to the pre-corruption file (assuming it's easy to view the history and see when it became corrupt.)
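My assumption about checksums can be modeled in a few lines of Python. This is a toy in-memory store, not how duplicity or any real tool behaves; it only shows that if a changed checksum creates a new version instead of overwriting, the pre-corruption copy stays retrievable. All names are hypothetical.

```python
import hashlib
from typing import Dict, List


class VersionStore:
    """Keep every distinct version of each file, oldest first."""

    def __init__(self) -> None:
        self._history: Dict[str, List[bytes]] = {}

    def backup(self, path: str, content: bytes) -> bool:
        """Store content as a new version if its checksum differs from the
        latest stored version. Returns True when a new version was added."""
        versions = self._history.setdefault(path, [])
        if versions and (hashlib.sha256(versions[-1]).digest()
                         == hashlib.sha256(content).digest()):
            return False  # unchanged since the last backup
        versions.append(content)
        return True

    def get(self, path: str, version: int = -1) -> bytes:
        """Fetch a version by index (0 is the oldest, -1 the latest)."""
        return self._history[path][version]

    def count(self, path: str) -> int:
        return len(self._history[path])


store = VersionStore()
store.backup("report.doc", b"good data")
store.backup("report.doc", b"good data")        # unchanged, nothing stored
store.backup("report.doc", b"g\x00\x00d d@ta")  # corrupted copy, new version
```

The store cannot tell corruption from a legitimate edit; it just never discards the older version, so finding the last good copy is still a matter of browsing the history.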