Technical Blog
Posts
A PHP Shell Script for Backups
On Mac servers I love a neat feature called Time Machine. Time Machine does incremental backups without your having to deal with third-party software or custom shell scripts like the one I'm about to show. Unfortunately, on the Linux side there's always some degree of elbow grease required. That isn't to say you can't have Time Machine-like backups on Linux. You certainly can; it just takes a more concentrated technical effort, and you don't get the nifty GUI.
The following is a custom PHP shell script that I wrote for doing automated backups using the rsync command.
<?php
// Backup script extraordinaire!

// rsync docs
// http://optics.ph.unimelb.edu.au/help/rsync/rsync.html

// This requires PHP CLI (Command Line Interface)
// http://www.php.net/

// Just call it from the command line like this:
// php -q /path/to/this/script.php (CGI version)
// php /path/to/this/script.php (CLI version)
//
// Use the -q switch to suppress the HTTP headers that the CGI version outputs by default.

// The folder or disk to back up
$source_volume = '/Volumes/Lothlorien';

// The folder or disk to back up to
$destination_volume = '/Volumes/Backup';

// The number of backups
$backup_count = 31;

//////////////////////////////////////////////////////////////////////////////////////
// Don't edit below this line!
//////////////////////////////////////////////////////////////////////////////////////

// Make sure the max execution time is unlimited.
ini_set('max_execution_time', 0);

// If the destination and the source exist
if (file_exists($destination_volume))
{
    if (is_writable($destination_volume))
    {
        if (file_exists($source_volume))
        {
            // Get the contents of the cursor file
            if (file_exists($destination_volume.'/cursor.txt'))
            {
                $cursor = @file_get_contents($destination_volume.'/cursor.txt');
            }
            else
            {
                $cursor = null;
            }

            if (!empty($cursor))
            {
                $cursor = explode('|', $cursor);
                $counter = (int) $cursor[0]; // The counter is before the vertical bar
                $date = isset($cursor[1]) ? (int) $cursor[1] : 0; // The date is after the vertical bar

                // Guard against a damaged cursor file or a reduced backup count
                if ($counter < 1 || $counter > $backup_count)
                {
                    $counter = 1;
                }
            }
            else
            {
                // There is no cursor yet
                $counter = 1;
                $date = 0;
            }

            // If the date in the cursor and the date now don't match,
            // go ahead and do a backup. This ensures that only one
            // daily backup is done. The full date is compared, so a run
            // exactly one month later still counts as a new day.
            if (date('Y-m-d', $date) != date('Y-m-d') || empty($date))
            {
                // Open the cursor file for writing
                $f_cursor = fopen($destination_volume.'/cursor.txt', 'w+');

                // Make the Backups directory, if it doesn't exist
                if (!file_exists($destination_volume.'/Backups'))
                {
                    mkdir($destination_volume.'/Backups', 0755);
                }

                // Set up individual backup directories on the first run
                for ($i = 1; $i <= $backup_count; $i++)
                {
                    if (!file_exists($destination_volume.'/Backups/'.$i))
                    {
                        mkdir($destination_volume.'/Backups/'.$i, 0755);
                    }
                }

                // Output which backup is being written to the terminal
                echo "Active backup: {$destination_volume}/Backups/{$counter}\n";

                // Run the rsync command, and sync up the backup.
                // -a (archive) recurses and preserves modification times, which
                // rsync needs in order to skip unchanged files on the next run.
                // The paths are escaped in case they contain spaces.
                shell_exec(
                    'rsync -a --delete --ignore-errors '.
                    escapeshellarg($source_volume).' '.
                    escapeshellarg($destination_volume.'/Backups/'.$counter)
                );

                // Increment the counter for tomorrow's backup.
                $counter++;

                // Reset the counter back to one if the counter has
                // exceeded the backup count.
                if ($counter > $backup_count)
                {
                    $counter = 1;
                }

                // Write the cursor to a file on the backup volume, so it can be
                // referenced for next time, and so you can determine which
                // backup is the most up to date. (mktime() without arguments
                // is no longer allowed in current PHP; time() is equivalent.)
                fwrite($f_cursor, $counter.'|'.time());
                fclose($f_cursor);
            }
            else
            {
                // Otherwise, it's still today, and today's backup has already run.
                // Since today's backup has already run, sync today's backup again.
                // Go back one, so you're not syncing to tomorrow's backup.
                if ($counter > 1)
                {
                    $counter--;
                }
                else
                {
                    $counter = $backup_count;
                }

                // Output which backup is being written to the terminal
                echo "Active backup: {$destination_volume}/Backups/{$counter}\n";

                // Run the rsync command, and sync up the backup
                shell_exec(
                    'rsync -a --delete --ignore-errors '.
                    escapeshellarg($source_volume).' '.
                    escapeshellarg($destination_volume.'/Backups/'.$counter)
                );
            }
        }
        else
        {
            echo "Error: The source volume {$source_volume} does not exist or is not mounted.\n";
        }
    }
    else
    {
        echo "Error: The destination volume {$destination_volume} is not writable.\n";
    }
}
else
{
    echo "Error: The destination volume {$destination_volume} does not exist or is not mounted.\n";
}
?>
The backup script makes 31 uncompressed backups of the same content. I prefer uncompressed copies because they make restoring from backup much easier. The script could easily be modified to archive and compress each backup instead, though not with rsync alone: its -z switch only compresses data in transit, so you'd hand the copy to something like tar. The script could also be modified to make more or fewer than 31 backups. I find a month a good round number, and if you combine this script with multiple external hard drives, you can go back in time as far as you like.
This script backs up to the same copy for the whole calendar day. The first run on a new day moves on to the next copy, until it gets to 31, and then it starts over at 1 again. I typically set this script to run as a cron job on Linux. On a Mac, you can link to it from an AppleScript, then set the AppleScript up as an event in iCal. Alternatively, you can automate it via launchd if the iCal approach isn't feasible.
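The rotation itself boils down to a little wrap-around arithmetic. Here's a minimal sketch of it pulled out into two functions. The names next_backup_slot() and previous_backup_slot() are made up for illustration; they aren't part of the script above.

```php
<?php
// Hypothetical helpers (not in the script above) that isolate the slot rotation.

// The slot after $counter, wrapping from $backup_count back around to 1.
function next_backup_slot(int $counter, int $backup_count): int
{
    return ($counter % $backup_count) + 1;
}

// The slot before $counter, wrapping from 1 back around to $backup_count.
function previous_backup_slot(int $counter, int $backup_count): int
{
    return $counter > 1 ? $counter - 1 : $backup_count;
}

echo next_backup_slot(30, 31), "\n";     // 31
echo next_backup_slot(31, 31), "\n";     // 1
echo previous_backup_slot(1, 31), "\n";  // 31
```

The first function is what happens after a fresh daily backup, and the second is the "go back one" step the script takes when it is run again on the same day.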
Using rsync for the backup process makes the backup more efficient over time, since it only copies what has changed since the backup it is writing to was last synced.
A crafty onlooker might see a lot of opportunity for adding command line arguments to this script, so you could pass the configuration through from the shell. I could have done that too, but the script suited my needs as written, so I didn't bother.
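For the curious, here's roughly what that could look like. This is only a sketch, not something I've wired into the script: the backup_config() function and the --source, --destination, and --count option names are my own invention for the example. PHP's built-in getopt() does the parsing.

```php
<?php
// Hypothetical helper: lay command-line options over the script's defaults.
// Unknown keys are ignored, and the backup count is forced to at least 1.
function backup_config(array $options): array
{
    $defaults = array(
        'source'      => '/Volumes/Lothlorien',
        'destination' => '/Volumes/Backup',
        'count'       => 31,
    );

    $config = array_merge($defaults, array_intersect_key($options, $defaults));
    $config['count'] = max(1, (int) $config['count']);

    return $config;
}

// getopt() returns false on failure, so fall back to an empty array.
$options = getopt('', array('source:', 'destination:', 'count:'));
$config  = backup_config($options ? $options : array());

$source_volume      = $config['source'];
$destination_volume = $config['destination'];
$backup_count       = $config['count'];
```

Assuming the script were saved as backup.php, you'd call it like php backup.php --source=/Volumes/Lothlorien --count=7, and anything you leave off falls back to the defaults.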
Et voilà! A PHP shell script for backups. Help yourself.
Creating a Custom Downtime Message with mod_rewrite
mod_rewrite is a very useful and very complicated Apache module that lets you manipulate incoming requests. Uses for this module vary from redirecting one URL to another, for the purpose of maintaining SEO and legacy URLs, to simply making your URLs more readable and understandable and not so weighed down with technical luggage, as it were.
mod_rewrite's documentation is pretty cryptic, but in that page lies most of what you need to know to use it, although you may have to read it twenty or thirty times to truly understand it.
One use that lends itself well to mod_rewrite's magic is redirecting traffic during a period of planned downtime. In a perfect world we'd never have to deal with downtime, but once in a while it just can't be avoided. The following technique allows you to look more professional while you are down, and gives you a chance to let your users know that the downtime is temporary.
There are a few ways that you could approach this. One method is to use a custom 404 error document, but that may not be so good, since your HTTP server automatically sets the 404 response code when serving that document, and that can be bad for SEO. Instead, you want content to appear to exist, while all incoming requests are routed to your downtime message.
The Apache configuration required to do this is surprisingly simple.
<IfModule mod_rewrite.c>
    RewriteEngine On
    #RewriteLog logs/rewrite.log
    #RewriteLogLevel 0
    RewriteRule ^(.*)logo\.gif$ /images/logo.gif [L]
    RewriteRule ^(.*) /index.downtime.html [L]
</IfModule>
The preceding configuration can go inside a Virtual Host container, or it can appear in the main httpd.conf configuration file in lieu of any virtual host configuration, which will cause all traffic for all virtual hosts to encounter your custom error message.
There are two rules in play here. The first looks to see if the request ends with the file name of your logo, logo.gif, and serves it from /images/logo.gif, which is the path from your DocumentRoot. Without this rule, you couldn't include your logo in the downtime message, since the request for the logo would itself be answered with the downtime message. The second rule maps every other incoming path, that is, any path that doesn't end with "logo.gif", to /index.downtime.html, which again is the path from your DocumentRoot.
Two of the directives have been commented out; they allow you to specify a rewrite log and the log level for that log. The directory that you specify for the log must exist, or Apache will fail to start with an error. For more information on the log and the log level, see the mod_rewrite documentation.
Woe is Me oh MobileMe
It seems some are not quite having a smashing experience with Apple's new MobileMe service.
Well, I think both the customers and Apple are to blame for this catastrophic meltdown. To the customers: why, pray tell, would you sign up for a brand-spanking-new web-based service of such great complexity and not expect there to be glitches or bugs? Oh, that's right, you're all ignorant of the pains of programming and development and couldn't care less about such things. The onus could never fall on you, because you expect all technology to be perfect all the time and are intolerant of faults, though fallibility is central to human nature.
Clearly Apple knew the limits of their service. You cannot roll out a new, global, web-driven service for a multi-billion dollar company and not think about things like load and capacity. Perhaps they based capacity on their former, far inferior dot mac service, and then simply guessed how many more people would be interested in signing up. Point being, there was a ceiling on the number of subscribers that could log on at a given time, set by the number of servers they had on hand to service those subscribers. As it turns out, demand was grotesquely underestimated.
Apple should have done a staggered rollout by invitation only, as Google did with Gmail. Google knew that being the first to offer a fantastically improved, free, web-based mail service would create a great deal of demand, so they limited the number of people who could sign up by making them jump through hoops with a "by invitation only" strategy, until supply finally caught up with demand and the restriction was lifted.
Apple should have seen that the new iPhone 3G, combined with rolling out their new service for Windows users, combined with a vastly improved service would cause a steep spike in demand.
But another flaw in Apple's execution of this new service is their culture of secrecy, which is a double-edged sword. Their secrecy allows them to create a great deal of hype and anticipation for new products and services; they say so little that when they say anything at all you hang on their every word. But that same secrecy also raises your expectations fantastically, such that any flaw you discover is magnified, since you've already set your expectations at nothing short of complete perfection. This works both for and against them. In my opinion, complex services like this one must be disclosed sooner in the development cycle, and must have an extensive beta testing period that is open to the public. Allowing only developers to beta test your software won't uncover as many bugs, since developers by their very nature will forgive some imperfections, are more willing to experiment and look for workarounds, and are generally more savvy. Allowing the general public into beta testing increases the number of issues that you will uncover, simply through naivety and sheer probability.
Personally, I have nothing but sympathy for Apple, although I must disclose that I have not signed up for this new service, nor will I until I have seen that it is demonstrably stable. My sympathy comes not from the heart of an unrepentant fanboy, but from the heart of a web developer who has also recently launched a complex website before it was really and truly ready, and who has spent the hours and days since that launch sifting through his share of bugs and complaints.
Our culture has been groomed to demand instant gratification. Companies like Apple, and even Microsoft, and so on, are trying to oblige with the latest features and services. These are features and services that are ridiculously complicated, created by small teams of developers with lives just like yours and mine. The fact that the engineers at Apple make gobs of money and work for a multi-billion dollar conglomerate doesn't make them any less human than you or me. We have come to expect more from these companies because somewhere along the line we have been taught that excessive wealth will make anything better, and we assume that with such wealth surely a company like Apple can keep throwing money at a problem until it is perfect. But this simply isn't the case. You can only have so many cooks in the kitchen; only so many developers can work together on a project and still create anything we would consider useful.
I've hit a few bugs in Leopard, and I've hit a few bugs in my new iPhone 3G. I've hit bugs in iTunes and my AppleTV, and Mac OS X Server. Some of them quite perplexing and infuriating. Even despite having to deal with all of that, I still love Apple products, not because I'm a fanboy, but because I can respect the human element involved in development. Having said that, I expect better from Apple, and my patience has its limits too, and if Apple can't get on the ball and focus more on quality rather than quantity, I too may look for alternative products that better suit my needs. In the meantime, I'll generously give this company the benefit of the doubt and remember the human element involved.
As a Mac, Windows, and Linux user, I can see that there is plenty of humble pie to go around for everyone.
Gzipping content may make Firefox unbearably slow
I set out yesterday on an enormous task, to improve the performance of my PHP framework, Hierophant. I've been working on Hierophant since early 2005. One area that I never put much focus into was in the area of download and page load performance. It was always on the todo list, but kept getting pushed back for one reason or another. So yesterday I set out to patch this gaping hole in my framework's functionality.
I began with the most obvious enhancement, supporting gzip for various dynamic HTML documents, CSS, and JavaScript. Gzip can reduce the download size of most documents by somewhere in the neighborhood of 67%. For example, I took Base2's already compressed base2-dom-fp.js (originally compressed with Dean Edwards's Packer) from 47KB to 15KB. Stylesheets went from 10KB to 2KB.
In addition to gzip, I implemented aggressive caching, setting the expiration times of scripts, stylesheets, and most images to 10 years.
So, aggressive caching and gzipping should alleviate a hefty chunk of performance issues, right? They did, in fact, in IE6, IE7, and Safari. But not in Firefox. Firefox 2 and the latest Firefox 3 Beta 5 were slower with gzip turned on than with it disabled. Not just a little bit slower, a lot slower. It takes what feels like eons for Firefox to load gzip'd content.
Safari and even, gasp, IE6 and IE7 run circles around Firefox with gzip'd content. On the other hand, disabling gzip in Firefox brought its performance back on par with Safari and IE, which is completely baffling to me.
Ultimately, I did the only thing I could have done and that was to disable gzip for Gecko-based user agents, and left it enabled for all else.
So word to the wise, if you're going to experiment with gzip, Firefox and its Gecko-based brethren may not be up to the task.
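For what it's worth, the check looks something like the following. It's a sketch rather than the code from my framework: should_gzip() is a name invented for this example. The rule of thumb it relies on is that real Gecko browsers put "Gecko/" followed by a build date in their user agent string, while Safari only says "like Gecko". Treat that as an assumption and verify it against the browsers you care about.

```php
<?php
// Hypothetical helper: decide whether to gzip a response for this client.
function should_gzip(string $user_agent, string $accept_encoding): bool
{
    // Never gzip for a client that didn't ask for it.
    if (stripos($accept_encoding, 'gzip') === false)
    {
        return false;
    }

    // Assumption: Gecko browsers send "Gecko/<build date>"; WebKit's
    // "(KHTML, like Gecko)" has no slash, so it doesn't match.
    return strpos($user_agent, 'Gecko/') === false;
}

$user_agent      = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$accept_encoding = isset($_SERVER['HTTP_ACCEPT_ENCODING']) ? $_SERVER['HTTP_ACCEPT_ENCODING'] : '';

if (should_gzip($user_agent, $accept_encoding))
{
    // ob_gzhandler is PHP's built-in gzip output callback (requires zlib).
    ob_start('ob_gzhandler');
}
```

Keeping the decision in one small function means that if a later Firefox release fixes the slowdown, there's only one place to change.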
Going back to caching, how do you update content that is set to expire 10 years in the future? In my framework I opted to append the last modified time of the file to every path output to HTML. So, for a script you'd see a path that looks something like this:
/Library/base2/lib/base2-dom-fp.js?hFileLastModified=1207280248
That causes the document to be stored indefinitely, per the client's settings, until the path changes. When the timestamp changes, the file automatically updates on the client side. I'm taking the same approach for images. This method allows you to utilize external scripts and stylesheets to their full potential, and as a result makes browsing blazing fast. Combine it with various other optimization techniques, and you can make your site even faster than fast!
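Generating those paths takes only a few lines. Here's a rough sketch; versioned_path() and its $document_root parameter are made up for this example and aren't Hierophant's actual API.

```php
<?php
// Hypothetical helper: append a file's last modified time to its URL path,
// so the URL changes whenever the file does.
function versioned_path(string $document_root, string $path): string
{
    $file = $document_root.$path;

    // Leave the path alone if the file can't be found.
    if (!is_file($file))
    {
        return $path;
    }

    return $path.'?hFileLastModified='.filemtime($file);
}

$document_root = isset($_SERVER['DOCUMENT_ROOT']) ? $_SERVER['DOCUMENT_ROOT'] : '.';

echo versioned_path($document_root, '/Library/base2/lib/base2-dom-fp.js'), "\n";
```

Since the timestamp comes straight from the file system, saving a new version of the script is all it takes to bust the cache. There's no version number to remember to bump.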
As a side note, if you've never heard of Hierophant before, I'll be blogging about that more in the future.
OK, "em" Units Just Might Be Useful...
A bit of an about-face in my thinking has taken place recently. A while back I wrote a blog post about how I think "em" units are worthless for their intended purpose, that is to say, scalable font design that can stretch or contract to accommodate changes to the font size.
I still think that em units are probably never going to see the kind of use that some standards advocates hope they will. That's just one cold, hard truth of the web... you can make the tool available, you can outline its benefits in no uncertain terms, but the majority of so-called developers and designers out there are either completely uninformed, don't care, or both. So I don't believe that the usability portion of the em unit will ever be realized.
But after some thought, I did have a change of heart about the em unit itself. You can do some very cool things with em units. One example would be animation. You could design a box so that its dimensions are all em-based, then with a little JavaScript you could animate its size by doing nothing more than changing the font size of the containing element.
Don't get me wrong folks, I long for the days of resolution-independent, scalable design, vector graphics, and accessibility. I just don't think em units fulfill the usability role very well, since most developers won't use them. Now that all of the major browsers are implementing the zoom feature, I think that will make the em unit obsolete for that particular application. And yes, zoom isn't perfect; it's clear that it will need to be refined too. But the em unit is an incredibly useful feature, and I recant saying otherwise.
First Impressions of IE8 Beta 1
It's pretty buggy. Lots of stuff coded purely to standards and not to a particular browser isn't rendering correctly. All in all though, that's expected; this is not the final product, and they're still hammering away on some pretty big things.
On the topic of CSS, I'm happy to see some big things from CSS2 that are finally being supported. The table display types, the box-sizing property, generated content. I'm disappointed that they still haven't done much with CSS3 though. Some things in CSS3 seem pretty trivial to me to support, like most CSS3 selectors, for example, or the :target pseudo-class. JavaScript libraries on the client-side have been supporting CSS3 selectors for ages in their various Selector API implementations.
On the JavaScript front, I'm pleased to see that IE8 will support the querySelector() and querySelectorAll() methods from the new W3C Selectors API draft specification. That makes Webkit and IE (of all browsers) the first native implementations. That's a prediction I made that I'm pleased to see fulfilled.
I'm puzzled by some baffling omissions in the JavaScript arena as well though. Why don't we finally have W3C events? Where are things like DOMContentLoaded, forEach, etc.? The IE team, in my opinion at least, could deliver more here, and again, it seems pretty trivial to me for them to add support for most of these things. So why are the improvements so slim?
Perhaps we can hope that the IE team is not yet finished with their improvements, but IE team, if you're listening, please add support for as many of these trivial things as possible. I appreciate the big, complicated bug fixes, but in my mind I see a lot of little things that you could easily add support for, but for whatever reason have not. I'm not going to make blanket statements like "why don't you support x specification entirely," nor am I going to spit in the face of what you've accomplished. I can see that what you do have is pretty significant, but try not to overlook the smaller things that make everybody's life easier, that you could probably get an intern to do on a Sunday afternoon.
Satan Needs a New Winter Coat
Temperatures plummeted to new record lows in hell today on the recent news of an about-face to IE8's standards mode default. If you haven't been living under a rock, I'm talking about the recent news that IE8 will, in fact, by default, act like IE8.
If you still need IE7 compatibility in IE8, fret not, there is an out for you in the form of Microsoft's <meta> tag/HTTP header proposal.
Seriously though, this is no less than awesome. The amount of goodwill and thanks shown by the development community on the IE team blog is unprecedented.
I woke up this morning with a strange, unyielding desire to hug Microsoft. It looks like the web is finally going to move forward in a big way.
This makes me feel like it's the end of Return of the Jedi, when, on Endor, Han and his Ewok friends finally managed to bring down the Empire's shield generator so the Death Star and the Galactic Empire could be obliterated once and for all, and then they were, and a victorious intergalactic celebration ensued! Showing my true nerd colours.
Why I don't bother with "em" units anymore
I used to think that "em" units were the bee's knees. The cat's meow. The way it should be done.
Then I encountered weird bugs and design glitches and spent hours trying to figure out how to fix them. IE 5.5 was a real bear to whip into submission. IE 6.0 has some trouble with em units too.
Then I saw how Opera handled text zooming by zooming the whole page, rather than just increasing the text size. IE7 followed suit. And now Firefox 3 will support it too.
So I came to a rather obvious conclusion. The simple fact of the matter is that the overwhelming majority of websites do not implement designs in em units and never will. And even among the people implementing bona fide, best-practice-oriented, standards-driven designs, only a fraction will bother with em units because of the design complications they introduce.
On the surface it's a cool idea, and makes you feel good because you're promoting an accessible design. But it doesn't work in the real world, and is a feature of CSS that is doomed to failure because it puts the burden on the website designer/developer rather than the browser, and it's a burden you can completely ignore, so most people do.
When you give it any thought you realize that page zooming is the correct way to approach the problem, because it works for every website, even the bloated, non-standard crap infesting this web's hallowed halls, rather than only the handful that take text resizing into account.
Visual disabilities are pretty common. There are a lot of people who have trouble reading text. However, the "em" battle is a battle that cannot be won, and page zooming is a superior solution to the problem, one that helps people with visual disabilities surf the whole web without hindering a single site. If I knew someone with a visual disability who needed to be able to zoom text, I'd recommend Firefox 3, IE7, or the latest Opera.
Page zooming should be standard. em units (and similar units) should at the very least be abandoned for this purpose, if not deprecated altogether.
A Shrink-to-fit Epic: You Too Can Make IE Your Bitch
Yes, you too can make IE your bitch, if you're determined and crass enough.
Throw your sensibilities about logic and standards out the window, because you don't need them here.
Recently, I set out to implement a simple, seemingly remedial technique for custom borders and backgrounds on tabs in none other than everyone's favorite browser, IE6.
It's worth mentioning, IE7 also gave resistance to this technique, but was slightly easier to whip into logic.
A battle of wits was to ensue. Would I let it defeat me? Would I let it send me back to the bowels of table layout or less-semantically reasonable markup? No, damn you, IE, you will not take me without a fight!
This technique is a rather easy exercise in floating elements and custom borders... I set up a list... not unlike this one:
<ul id='tmpProductTabs'>
    <li id='tmpProduct-Overview' class='tmpProductTab tmpProductTabSelected'>
        <div class='tmpProductTabWrapper'>
            <div class='tmpProductTabTopOuter'>
                <div class='tmpProductTabTopMiddle'>
                    <div class='tmpProductTabTopInner'>
                    </div>
                </div>
            </div>
            <div class='tmpProductTabOuter'>
                <div class='tmpProductTabMiddle'>
                    <div class='tmpProductTabInner'>
                        <span>Overview</span>
                    </div>
                </div>
            </div>
        </div>
    </li>
    <li id='tmpProduct-Warranty' class='tmpProductTab'>
        <div class='tmpProductTabWrapper'>
            <div class='tmpProductTabTopOuter'>
                <div class='tmpProductTabTopMiddle'>
                    <div class='tmpProductTabTopInner'>
                    </div>
                </div>
            </div>
            <div class='tmpProductTabOuter'>
                <div class='tmpProductTabMiddle'>
                    <div class='tmpProductTabInner'>
                        <span>Warranty</span>
                    </div>
                </div>
            </div>
        </div>
    </li>
    <li id='tmpProduct-ApplicationNotes' class='tmpProductTab'>
        <div class='tmpProductTabWrapper'>
            <div class='tmpProductTabTopOuter'>
                <div class='tmpProductTabTopMiddle'>
                    <div class='tmpProductTabTopInner'>
                    </div>
                </div>
            </div>
            <div class='tmpProductTabOuter'>
                <div class='tmpProductTabMiddle'>
                    <div class='tmpProductTabInner'>
                        <span>Application Notes</span>
                    </div>
                </div>
            </div>
        </div>
    </li>
    <li id='tmpProduct-Firmware' class='tmpProductTab'>
        <div class='tmpProductTabWrapper'>
            <div class='tmpProductTabTopOuter'>
                <div class='tmpProductTabTopMiddle'>
                    <div class='tmpProductTabTopInner'>
                    </div>
                </div>
            </div>
            <div class='tmpProductTabOuter'>
                <div class='tmpProductTabMiddle'>
                    <div class='tmpProductTabInner'>
                        <span>Firmware</span>
                    </div>
                </div>
            </div>
        </div>
    </li>
</ul>
OK, so that's a lot of markup for a simple aesthetic enhancement. Namely, I just want rounded corners on my tabs! For the love of God, why does it have to be such a chore?
I covered a similar technique in my second book, CSS Instant Results, but admittedly, it was just a tad more remedial.
To get the custom border effect, I divide an image created in Photoshop, or your graphics editor of choice, into nine slices: the top left corner, top middle border, top right corner, and so on. In HTML, I need three block-level elements for each row of slices and, depending on what copy is being inserted, a separate inline or block container for the actual content. Then, using nothing more than CSS background properties, I specify a background image for each element.
Still with me so far?
So, in this tidbit, I have the top border elements prefixed tmpProductTabTop. To make the slices work, all you do is put the left slice in the outer element (non-repeating, positioned, ahem, top, left), the right slice in the middle element (non-repeating, positioned, top, right), and then in the inner element, you specify the top middle slice, and repeat it along the x-axis, positioned to the top. Then you set the left and right margins of that element equal to the widths of the left and right sides, respectively. Finally, and this is important, if the element has no content, you need to give it a height equal to the height of your slices, otherwise it will be invisible.
Semantically speaking, this is all completely meaningless. It clogs up your page size and bandwidth, but it's pretty safe, in my opinion.
Now, for our old friend IE. To make this lot into tabs, you float the <li> elements and give them a bit of margin on the left or right side to separate each tab.
When you float an element, its sizing model becomes shrink-to-fit, aka shrink wrap. The element expands only enough to accommodate its content. However, IE has this mechanism called layout, a vile, evil device designed to torture web developers. Whenever it is triggered, Microsoft actually slaughters a baby kitten, and the world turns just a little more gray. Layout is triggered when you use a select handful of properties, one of which is explicitly setting a width or height on an element.
In terms of floated elements in IE, when a floated element contains a block element, and that block element has "layout", the floated element's head spins off and it switches from shrink-to-fit mode, to expand-to-fit (against the standards, of course). In other words, your layout is screwed, your kids start doing drugs, and Bob's your uncle.
In my case, the little, completely logical thought process of giving that inner element an explicit height wreaked havoc with my layout. How stupid of me to consider that a possibility...
Sigh... but... once you know of this evil plot, you too can overcome it with a quick well-thought-out flank of the enemy's position. Instead of height, set the padding. Padding will not trigger layout.
Ah, but Richard, my layout, it's still screwed. Worry not! For there is more for you to learn! If your sliced images' height is less than the default font size or line-height, you will have to zero out those properties as well to get layout serenity.
But wait, my elements aren't even visible! For the love of all that's holy, give each element a relative position... and they will materialize again.
Thankfully, these woeful requirements do not affect the rendering splendor of better browsers. You may be left with some minor aesthetic glitches in IE6, however; if that is the case, use the star html hack to fix them.
I would like to share the actual CSS with you, but I'm too lazy to format it, and I don't yet have a code formatting algorithm for this nifty blog. If popular demand requires it, I shall accommodate.
See also: http://www.satzansatz.de/cssd/onhavinglayout.html
Go in peace.
I Heart Webkit
Let's all have a love-in for Webkit, the first browser to officially support the new W3C Selectors API.

