Pointbeing.net

Some Thoughts on Testing Developers

For reasons I can't quite fathom, I've been thinking a lot about testing developers recently. That's testing developers as part of the hiring process, as opposed to developer testing (which I do bang on about rather a lot, to be fair).

I say I can't fathom the reasons, because we're not actively recruiting right now, nor am I looking to be recruited (though if you have your air conditioning switched on you may be in luck).

So anyway, it's fair to say that before you hire a developer, you want to find out if they're any good at developing, right? And therein lies the problem: how on earth do you measure the candidate's skill level?

I've seen, and used, a few approaches myself, so I'll go over a few of them and see what drops out the other end.

Approaches I've Used

At PropertyMall we gave candidates a test with a few questions that they could do in their own time, with all the resources - books, internet, cups of tea - that they would have in a real developer role.

The test started off with some really simple stuff about printing out some numbers (so you knew that if they got those wrong you could stop reading). It then went through a few relatively straightforward OO PHP and design pattern examples, and ended with a very open question which gave the candidate a chance to just write some code.

One of the questions I liked most gave the candidate a snippet of code and asked them which design pattern it exemplified. The question was multiple choice, the motivation being that we didn't necessarily need a candidate who could recognise a composite (or whatever it was) straight off, but if they could Google the four terms, most likely find a Java or C++ or Smalltalk example, and relate it to the PHP code in front of them, they probably had the makings of a decent developer.

We hired some bloody good people at PropertyMall, so we were presumably doing something right.

I think the key thing is always to get the candidate writing code. At PlayPhone we just cut to the chase and give a fifty-or-so word written description of some behaviour, and send them home to write the code. If it doesn't suck, we invite them back to discuss it - and that's probably the most important part of the test: having the developer explain their approaches, thought processes and motivations.

Approaches I've Come Up Against

I've changed jobs a handful of times myself, so I've come across a few different approaches to testing developers.

Like absolutely every other PHP developer in London I've been put through Brainbench by Allegis, who recruit for IPC. Brainbench is taken online (so with access to the web a requirement, rather than an option!) and is time-limited. I think it was about 45 minutes, and the questions come thick and fast. At the end you're left with a cold, hard numerical score, which is presumably for the benefit of managers and other folks who can't grasp anything complex or organic.

The vendors claim that the test is smart, so the better you're doing, the harder the questions get. I did the PHP and the Perl ones, and I seem to remember the results were quite complimentary about my PHP. However, the system is clearly flawed, as it put me in the top few percent of the nation's Perl developers despite me blindly guessing my way through most of it, grimacing occasionally as I struggled to recall the few dozen lines of Perl I wrote as a student.

Another approach I've come across was that the company would email over a document with a bunch of questions or exercises, you'd have a crack at it, and 45 minutes later they would ring you and discuss it all. That's quite a nice combination of time-limited and open-book, it incorporates the all-important discussion stage, and doesn't waste too much of anyone's time.

Well, I say that, but it turned out that the company involved were such amateurs that they twice failed to keep the appointment, before I got fed up and told them to quit the hell wasting my time. But I liked the idea.

Closed-book can be interesting too. Before I started my current job I interviewed with a promising startup who were looking for a lead developer to head up the technical side of the company. That's quite a challenge and a lot of responsibility, so they were keen to make sure that they got the strongest developer they could find.

The result was a very challenging but rather satisfying time-limited, closed-book test covering everything from hardware performance to regular expressions, dependency injection and even DSLs - not your typical PHP web scripting stuff, for sure.

What surprised me was not my score, but the fact that it was expressed as a percentage. I think I managed something in the low 80s, but the implication that they thought that any of that stuff had a right or wrong answer really set the alarm bells ringing.

It got worse though, and the post-mortem was painful as hell. Now, since the company didn't yet have a technical team (that would have been my job to remedy), they had hired a local "guru" (surely I'm not the only one who raises an eyebrow when that word crops up?) in a kind of consultancy role, to set the test.

In the post-test discussion and interview, it emerged that he'd marked me down for doing a command line svn merge using a syntax with which he was unfamiliar. He was also completely unaware of svnserve, and rather assumed I was making it up; another black mark there. I was given a dressing down for suggesting that PHP makes for a perfectly good templating language (that's only what it was invented for, mate) and again for once having written my own framework. Conversation later turned to the framework he was writing.

The whole thing left a bitter aftertaste, and after a good night's sleep I'd decided I really, really didn't want the job. Conveniently enough, I didn't get it.

But I digress. I don't want to turn this into a personal rant, because - believe it or not - I do actually have a point here, which I can best sum up as:

Who tests the testers?

Really, what reason does the candidate have to believe that the tester is any better than them, or even that they have a clue what they're doing? As you climb the experience ladder over the years, that question becomes more and more pertinent with every rung.

I think this is why I'm uncomfortable with handing out tests that are supposed to have a right and a wrong answer. On the other hand, I don't think that personal opinion (as in the templating example above) has a role to play in testing developers either.

Conclusions, If Any

So you can see what a minefield this can be, and why after eight years of interviewing and being interviewed, this is a nut I'm still to crack.

I guess I've picked up a few crumbs of wisdom over the years though. I definitely feel that open-book tests are the way to go, since they more closely simulate the environment in which developers actually work. I think all tests should be done with an internet connection, at the very least.

I also think it's utterly vital that a test doesn't seek to trick a candidate, or catch them out. You want to know what they can do for you, after all, so give them space to show it.

I think you also want to find out if the candidate is bright, and so you'll want to throw something in there that tests their brain, rather than their experience of a specific programming language. Not for nothing is Joel's book titled "Smart and Gets Things Done".

So what about you? How does your company test developers, and what experiences have you had? Finally, what role does professional certification, such as Zend Certification, play in all of this?

Posted on Sunday, the 10th of August, 2008 | permalink | comment

"PHP Tools for Mobile Web Development" Published

The cover of php|architect's July issue

This is just a quick heads up to say that my article, "PHP Tools for Mobile Web Development" has today been published, and is currently gracing the cover of July's php|architect magazine.

Of course, I jinxed things a little by blogging that it would be published in June, but never mind, we got there.

Big thanks must go to Ciaran for giving the initial draft the once over (on a related note, check out Ciaran's post about web development for the iPhone). Thanks also to my occasional colleague Gerard for clueing me in to the fact that the damn thing had been published.

For what it's worth, php|architect is recommended reading even when I'm not in it, so get yourself over there and get subscribed!

Ok...now to crack on with that second article...

Posted on Tuesday, the 29th of July, 2008 | permalink | comment

Mobilising a Website, Part 2: Strategies

In Mobilising a Website, Part 1: The Problem I noted that this site is practically unusable when viewed using the browser on a mobile handset, and that I'd like to do something about that.

This time around, I'd like to size up some of the approaches and strategies that developers can take in order to make an existing website mobile-friendly.

Option 1: Do Nothing

It may sound flippant, but in the world of development, the option of doing nothing often has to be seriously considered. Every developer out there can recount tales of ridiculous amounts of time and expense being invested in harebrained projects developing systems that simply were not needed.

This strategy is unparalleled in its cost-effectiveness and simplicity and is naturally very, very easy to estimate and plan.

In this particular case, we could point to the fact that, as we saw, some high-end handsets do a decent enough job of displaying the site as it stands. We can confidently expect mobile browser technology to improve immeasurably over the next couple of years, so there's a lot to be said for holding off.

Unfortunately, this strategy does not solve the original problem. I want my site to work nicely for mobile users now, and it will be a long time before browsers as capable as that of the iPhone dribble down to the majority of consumers.

Moreover, I'm as interested in the process of achieving a mobile-friendly site as I am in the finished article, and that's why we're here: this series of blog posts would be rendered rather brief and anti-climactic if I were to choose the "do nothing" route!

Option 2: Add a Mobile Stylesheet

Quite a few of the problems we saw in part one, in particular those which confounded the Nokia 6230i, were related to the site's stylesheet. For example, both the fixed width of the page and the large banner images are defined in there. The former is the result of the rule width: 790px; being applied to a number of the elements within the page; the latter is specified by applying a background-image property to a <div />.

Fortunately, HTML and XHTML provide a way of specifying that a stylesheet only applies to a certain class of device. This is all achieved by applying a media attribute to the <link /> element which calls in the stylesheet.

By way of an example, the following code extract is designed to send one stylesheet to "full" or desktop web browsers (media="screen") and an alternative stylesheet to mobile browsers (media="handheld"):

<link rel="stylesheet" type="text/css"
		media="screen" href="full.css">
<link rel="stylesheet" type="text/css"
		media="handheld" href="mobile.css">

The full list of possible values for the media attribute can be found here.

Once we're serving a different stylesheet to mobile devices, we can really start to think about customising the user experience for them. For example, a mobile stylesheet could also be used to "hide" certain elements from mobile browsers, by applying a display: none; style rule to them. The long list of links in the sidebar which made the site so difficult to use with the W880i seems like a good candidate for hiding.

This strategy has the compelling benefit of simplicity. By adding one line of HTML, plus a small additional stylesheet to the site we may be able to deliver a mobile-friendly experience. Let's knock up a prototype mobile stylesheet and see how things look.

Here's a small CSS file that "hides" some of the more spurious elements of the page. I'll also specify that links should be displayed in green, purely to make it unambiguous as to whether the new stylesheet is being loaded and applied.

.sidebar {
	display: none;
}

a {
	color: green;
}

Remember that simply by not specifying widths, colours and background images in there, we're sidestepping a lot of the decorative fluff that was causing problems for mobile browsers.

Viewed on the Sony Ericsson, things are looking up:

Fig 1: Pointbeing.net with a handheld stylesheet, viewed on a Sony Ericsson W880i

We're successfully displaying only the main navigation, an introductory heading and paragraph, 10 links to blog posts and a footer. It's actually a really clean user experience for very little effort - I like it a lot.

Unfortunately, life isn't quite so cheerful for users of the other handsets. The experience on the Nokia 6230i and the iPhone is all but unchanged, but probably for different reasons. The Nokia does occasionally display links in green, so my hunch is that it simply does not support the media attribute, and so loads and applies both stylesheets. There's no corresponding property in WURFL to confirm it, but it certainly looks that way.

Conversely, I'm certain that the iPhone fully understands the media attribute, but considers itself to be a "real" computer. The jury's out on that one [1]. My gut feeling is that screen size should be the deciding factor in these cases, and based on that, the iPhone sits squarely under the heading of a handheld device. It's only fair to point out that the iPhone is not alone in this behaviour: the Nokia N95 and my LG KU990 behave in much the same way.

So the handheld stylesheet seemed like a great idea, but we can summarise several of the drawbacks to this approach as follows:

  • Some devices do not support the media attribute at all
  • Many high-end handsets completely disregard the media attribute and opt for the "full web" stylesheet
  • In the cases where devices both support and honour the media attribute, we're relying on them to support CSS consistently. This is not something we can safely expect from mobile devices. For example, my first attempt was to use visibility: hidden; instead of display: none; but this was disregarded by the W880i
  • Even if we hide elements by using CSS, the full page still has to be downloaded, which is likely to cost the user both time and money. Expecting the user to download reams of markup which we don't actually want them to render seems like rather poor form. Furthermore, those full web pages may be larger than the maximum deck size which the device can support
  • Pages are still not tailored to the mobile experience: Cameron Moll talks a lot about contextual relevance, and it's hard to see how my enthusiastic post about The Get Up Kids, consisting of three multi-megabyte YouTube videos embedded in the page, is at all relevant to the mobile context

So this strategy isn't ideal. But for so little effort I've actually managed to make the site perfectly usable on a number of handsets. With little to lose, and with Early And Regular Delivery in mind, I'm actually going to put the handheld stylesheet in place right now, while I ponder alternative strategies.

Option 3: Allow the Site Automatically to Adapt to Devices

The principle behind this strategy is one known as "adaptive rendering". In other words, device detection would be done server-side (in my case using PHP) and the client would be sent markup and content tailored specifically to the device. I can think of a couple of ways to achieve this, although I'm sure there are plenty more.

The first option is afforded to us by the fact that the site is based firmly on the MVC pattern, which is fairly common in web sites and applications these days. MVC dictates that the business logic and data (Model), display logic (View) and application flow (Controller) be arranged into discrete components, so as to be independent of each other. In this case we're primarily concerned with the display logic, so it seems feasible to take advantage of the separation and swap in a different View component for mobile devices.

I happen to be using Zend Framework, so this could be achieved by specifying the directory in which the View should look for view scripts and helpers. I imagine it would not be difficult to do this dynamically using methods such as Zend_View_Abstract's setBasePath(), setScriptPath() and setHelperPath().
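Here's a minimal sketch of that idea in plain PHP, without the framework: pick a view script directory per request, based on the User-Agent header. The function names, the keyword list and the directory names are all assumptions of mine for illustration. A real site would more likely lean on a device database such as WURFL than on substring matching, and the chosen path would then be handed to the View via something like setScriptPath().

```php
<?php

// Hypothetical helper: a deliberately naive check that looks for a few
// substrings commonly found in mobile User-Agent headers.
function isMobileUserAgent(string $userAgent): bool
{
    $keywords = ['iphone', 'symbian', 'nokia', 'sonyericsson', 'opera mini'];
    $userAgent = strtolower($userAgent);

    foreach ($keywords as $keyword) {
        if (strpos($userAgent, $keyword) !== false) {
            return true;
        }
    }

    return false;
}

// Hypothetical helper: returns the directory from which the View should
// load its scripts. The directory names are made up for this example.
function viewScriptPath(string $userAgent): string
{
    return isMobileUserAgent($userAgent)
        ? 'views/mobile/scripts'
        : 'views/full/scripts';
}

echo viewScriptPath('Nokia6230i/2.0 (03.25) Profile/MIDP-2.0'), "\n";
```

Keeping the decision in one small function means the rest of the application never needs to know which flavour of the site it is serving.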

Another approach would be to adopt the "two-step view" pattern. This pattern is nicely documented by Martin Fowler in his classic Patterns of Enterprise Application Architecture, so I'll quote liberally from there:

Two Step View [splits] the transformation into two stages. The first transforms the model data into a logical presentation without any specific formatting; the second converts that logical presentation with the actual formatting needed. This way...you can support multiple output looks and feels with one second stage each

That sounds rather like what I need. I guess in terms of implementation, I'd be looking at having the first stage generating some common XML format, and then perhaps using XSL Transformations server-side in order to transform the XML into the markup which the device prefers. At the same time, I can opt not to include certain elements, such as the long list of links, in the finished pages.
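To make the two stages concrete, here's a rough sketch in plain PHP. It swaps the XML and XSLT route for ordinary arrays and functions, purely to keep the example self-contained, and every name and URL path in it is invented for illustration.

```php
<?php

// Stage one: turn model data into a logical presentation, with no markup.
// The structure and field names here are assumptions for this sketch.
function buildLogicalPage(array $posts): array
{
    $links = [];
    foreach ($posts as $post) {
        $links[] = ['text' => $post['title'], 'href' => '/blog/' . $post['slug']];
    }

    return ['heading' => 'Recent posts', 'links' => $links];
}

// Stage two, desktop flavour: render the logical page as an HTML list.
function renderFull(array $page): string
{
    $html = '<h2>' . htmlspecialchars($page['heading']) . "</h2>\n<ul>\n";
    foreach ($page['links'] as $link) {
        $html .= '<li><a href="' . htmlspecialchars($link['href']) . '">'
            . htmlspecialchars($link['text']) . "</a></li>\n";
    }

    return $html . "</ul>\n";
}

// Stage two, mobile flavour: same logical page, leaner markup, fewer links.
function renderMobile(array $page, int $maxLinks = 5): string
{
    $html = '<p><b>' . htmlspecialchars($page['heading']) . "</b></p>\n";
    foreach (array_slice($page['links'], 0, $maxLinks) as $link) {
        $html .= '<p><a href="' . htmlspecialchars($link['href']) . '">'
            . htmlspecialchars($link['text']) . "</a></p>\n";
    }

    return $html;
}

$page = buildLogicalPage([
    ['title' => 'Mobilising a Website, Part 1', 'slug' => 'mobilising-part-1'],
    ['title' => 'Benchmarking Zend Download Server', 'slug' => 'benchmarking-zds'],
]);

echo renderMobile($page);
```

The point to notice is that buildLogicalPage() knows nothing about devices: supporting another class of handset means adding another second-stage function, and nothing more.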

Again, this can all be done within the View component of the site's code, which is rather gratifying. Still, that does seem like quite a lot of work, and I'm not sure I want to get into what would effectively be writing my own templating engine.

Option 4: Build a Separate Mobile Site

One surefire way of getting myself a working mobile version of the site would be to simply build a standalone mobile site. I've already registered the pointbeing.mobi domain name, and quite honestly do not have any better ideas for what to do with it.

The benefits of this approach would be that I could tailor the mobile site exactly the way I want it, I could roll out features incrementally, and I wouldn't risk making the main Pointbeing.net site's code any more complex than it need be. Admittedly it's pretty simple stuff right now, but I'd like to keep it that way.

I could detect mobile devices as they hit the main site (perhaps using Tera-WURFL) and forward them across to the mobile version. This is a strategy used by a fair number of large sites, such as Flickr (mobile version) and Facebook (mobile version), so it seems like I'd be in good company.

The downsides would be that I would have two sites to maintain, and that I still would not have solved the original problem, that of Pointbeing.net being a bit of a dog when viewed on a mobile browser. That said, if I leave the handheld stylesheet in place, I'd be catering to most cases.

The important thing to remember will be to always provide a link back to the full site for users who feel confident that their browsers will cope with it. I'm not interested in forbidding any users from accessing any content whatsoever: that's too much of a throwback to the dark days of the desktop web when sites would block, say, Opera users, demanding that they "upgrade" to IE5 in order to gain access.
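The forwarding logic, including the escape hatch back to the full site, might look something like the sketch below. The "full" query parameter, the function name and the crude User-Agent test are all assumptions of mine (a proper implementation would use something like Tera-WURFL for detection); only the pointbeing.mobi domain comes from the plan above.

```php
<?php

// Hypothetical helper: returns the URL a request should be forwarded to,
// or null if the visitor should stay on the full site.
function mobileRedirectTarget(string $userAgent, string $path, array $query): ?string
{
    // Visitors who followed the "full site" link are never forwarded.
    if (isset($query['full'])) {
        return null;
    }

    // Naive stand-in for real device detection.
    if (!preg_match('/iphone|nokia|sonyericsson|symbian/i', $userAgent)) {
        return null;
    }

    return 'http://pointbeing.mobi' . $path;
}

// Only act when running under a web server.
if (PHP_SAPI !== 'cli') {
    $target = mobileRedirectTarget(
        $_SERVER['HTTP_USER_AGENT'] ?? '',
        parse_url($_SERVER['REQUEST_URI'] ?? '/', PHP_URL_PATH) ?: '/',
        $_GET
    );

    if ($target !== null) {
        header('Location: ' . $target, true, 302);
        exit;
    }
}
```

The mobile pages would then link back to the main site with the opt-out parameter appended, so that nobody is ever locked out of the full version.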

Conclusions

Creating a standalone mobile-friendly site - either under the .mobi domain or under an "m." subdomain - and forwarding mobile devices on to it is an appealing option, and it's the one towards which I'm leaning right now. I think...

I may well change my mind and opt for the adaptive rendering path. While that route feels a little more ambitious, I certainly would be quite proud if I could pull it off, and have the site adapt itself to devices as if by magic.

Either way, I'm going to have to put some thought into which tools and libraries I'm going to use to create mobile-friendly pages. There are a few out there, some of which I've covered on this site, some in a recent piece I wrote for php|architect, and yet others with which I'm not at all familiar.

Part 3 of this series seems like a good time to start making some decisions about toolkits, and this will in turn entail making some architectural decisions about the code behind the mobile version of pointbeing.net.

Footnotes

[1] Similarly, the debate about whether to deliver "full web" content or a mobile tailored version to such devices continues. A useful piece on the subject appeared recently over at WAP Review.

Many thanks to my colleague Dan Gent, whose remarkably well-timed loan of Cameron Moll's Mobile Web Design helped to inform this post.

Posted on Saturday, the 26th of July, 2008 | permalink | comment

Response to "10 Things a Developer Should Never Ignore"

Earlier this week, I stumbled across Bill Stronge's recent 10 Things a Developer Should Never Ignore over on TechRepublic. It's recommended reading, as it's an interesting piece, filled with useful advice for developers, especially those just getting started in their programming career.

Still, a couple of the points jarred with me a little, and there were a couple which I felt could have been taken further. So here's my response to Bill's 10 Things.

#1: Clarifying User Requirements

I agree with this point in principle, but it's important not to get bogged down in the requirements gathering phase. Try to understand your customer, but don't expect them always to know what they want before they actually see a product coming together.

Furthermore, expect requirements to change over time. This is a constant source of frustration for inexperienced developers. I know because I've been there myself, but over time you will come to expect it and embrace change.

I think this is where agile processes win hands down. Agile urges you to keep it simple, roll out small, incremental changes to a system, and adapt to changing requirements over time. The result is almost always a more valuable product, happier customers, and more satisfied developers.

#2: Collaborating

Agreed. We've all worked with the "hero" programmer, who churns out thousands of lines of code, and jealously guards it from the eyes and - heaven forbid - input of his or her colleagues.

Unfortunately, the only possible outcome of that mindset is libraries that no one else wants to use, and code that no one else wants to maintain.

You don't have to go as far as pair programming (though there's a lot to recommend it), but just don't let your pride get in the way of talking your ideas through with other developers, asking questions and listening to advice.

#3: Version Control

Agreed. Get your code, configuration files and even documentation into version control and feel the weight lift from your shoulders.

Furthermore, learn how to use version control well. Learn how to use tags and branches properly, and read up on more advanced features, such as svn externals. On this subject, I strongly recommend the Pragmatic Version Control books.

Finally, if you're only familiar with a graphical VC client, such as Tortoise, learn how to do it all from the command line. You'll find it quicker, simpler, and a great deal more powerful.

#4: Basic System Testing

I would take this point a lot further and recommend unit testing and Test-Driven Development. In short, unit testing is writing code to test your code, and Test-Driven Development (TDD) is simply writing the tests before the code.
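As a tiny illustration of working test-first, the sketch below uses nothing but PHP's built-in exceptions, so as not to assume any particular testing framework. The function under test, and its expected behaviour, are invented for this example: the expectations in the test function come first, and the function is then filled in until they pass.

```php
<?php

// Step 1: the test, written before the code it exercises. It throws if
// any expectation is not met.
function testSlugify(): void
{
    $cases = [
        'Hello World'             => 'hello-world',
        '  Testing, Developers! ' => 'testing-developers',
        'PHP'                     => 'php',
    ];

    foreach ($cases as $input => $expected) {
        $actual = slugify($input);
        if ($actual !== $expected) {
            throw new RuntimeException("slugify('$input') gave '$actual', expected '$expected'");
        }
    }
}

// Step 2: the simplest implementation that makes the test pass.
function slugify(string $title): string
{
    $slug = strtolower(trim($title));
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);

    return trim($slug, '-');
}

testSlugify();
echo "All tests passed\n";
```

In practice a framework such as PHPUnit takes care of running the tests and reporting failures, but the rhythm is the same.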

Bill states that most developers hate testing, but I believe that they only think they do. Programmers love programming, and I like to think of unit testing/TDD as a programming technique rather than a testing technique. In fact, designing and coding a thorough and yet flexible test suite can often be as satisfying a challenge as the project you're actually delivering.

The investment of a little time up front to write some small tests typically pays back a hundredfold in terms of productivity. Time spent debugging is slashed, and you have the pride of delivering code in which you have complete confidence. Furthermore, code built using TDD tends to be cleaner, terser and more flexible than testless code.

Once you become what's known as "test-infected" you're very unlikely ever to voluntarily return to the pain of untested and untestable (also known as detestable) code!

#5: Usability

It's hard to argue with this, but it's even harder to define 'usable'.

Programmers are notoriously hopeless at usability, especially the ones who think they're great at it. You only have to look as far as the clunky Ajax-heavy interfaces of trendy Web 2.0 sites such as Spoonfed [1] for evidence of that. Someone clearly had a whale of a time coding that site, but it's painful for the end user.

Being a programmer myself, I don't have a great number of pearls of wisdom to share when it comes to usability. The best advice I can give is to exercise some self-restraint, keep it simple - boring even - and your users will be far less likely to want to strangle you.

#6: System Performance

Nobody likes slow sites and applications. But please, please pay close attention to the three rules of optimization: Don't, Don't yet, and Profile Before Optimizing.

I don't know if I'd go as far as the commentator who is quoted as saying:

More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason - including blind stupidity

After all, blind stupidity is pretty prevalent. But I certainly have seen some terrible code written, and some deeply regrettable architectural decisions made in the name of optimization. Quite often those "optimizations" tend to lead to far worse performance.

In my experience, two criteria must be in place before you can justify attempting any kind of optimization: i) you must be able to prove that you have a performance problem - anything else is a waste of your time and a waste of your employer's money; and ii) you must be able to measure the problem. If you can't measure the problem, you can't prove that your whizzy optimizations haven't just made things worse.
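A crude way to satisfy the second criterion is simply to time the code in question before and after a change. The helper below is a sketch of my own rather than a substitute for a proper profiler, and the two snippets being compared are arbitrary examples.

```php
<?php

// Hypothetical helper: runs $fn $iterations times and returns the mean
// wall-clock time per call, in milliseconds.
function meanMillis(callable $fn, int $iterations = 1000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }

    return (microtime(true) - $start) * 1000 / $iterations;
}

$words = array_fill(0, 1000, 'word');

// Two ways of building the same string; measure rather than guess.
$concat = function () use ($words) {
    $out = '';
    foreach ($words as $word) {
        $out .= $word . ' ';
    }
    return $out;
};
$implode = function () use ($words) {
    return implode(' ', $words) . ' ';
};

printf("concatenation: %.4f ms per call\n", meanMillis($concat));
printf("implode():     %.4f ms per call\n", meanMillis($implode));
```

Numbers like these only mean something when they're taken on realistic data and repeated a few times, but even a rough measurement beats a hunch.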

#7: Comments in Your Code

Time and time again, we're told that comments are a Good Thing, full stop. But the situation is a little less black and white. Comments need to be appropriate.

We've all seen this kind of thing trotted out as an example of poor commenting:

<?php

$i += 1; // add 1 to $i

That's plainly stupid, but examples like this tend to miss the real point.

As a rule of thumb, if the code you're writing is complex or obscure enough that you need comments simply to explain what it does, then that piece of code has bigger problems than can be solved just by adding a comment.

In his now classic book, Refactoring, Martin Fowler lists comments as one of his "bad smells in code" for this exact reason. It's much better to strive for shorter, clearer classes and methods. Give them expressive, meaningful names and signatures, and they can become almost self-documenting.

It is however valid to use comments to explain why code does what it does. That line of code that adds 1 to $i may do that to fix a critical bug. In this case a brief comment highlighting the fact, perhaps with a reference to a bug tracker id, will signal to the programmers who have to work with this code in future exactly why that line is there.
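By way of contrast, here is the sort of comment I have in mind. The scenario and the bug tracker id are made up for the example.

```php
<?php

// Hypothetical example: convert a zero-based array offset into the page
// number shown to the user.
function displayPageNumber(int $offset): int
{
    $i = $offset;

    // Page numbers shown to users start at 1, not 0. Without this,
    // the first page was labelled "Page 0" (see bug #1234).
    $i += 1;

    return $i;
}

echo displayPageNumber(0), "\n";
```

The comment says nothing about what the line does, because the code already says that; it records the reason, which the code cannot.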

#8: Logging

I would urge that Bill's advice to build some helpful logging solutions into the code be followed with great caution. It may help in some specific cases, but spurious logging code can clog up the real application code and make it a great deal more difficult to read.

Furthermore, there's very little that you can do to anger your systems guy more than sending pointless logging code into production, and eating up disk space with log files that nobody ever reads.

#9: Keeping Your Skills Up-to-date

Agreed. Strive for the deepest possible understanding of the tools and technologies you're using, but also read around the subject. Pick up a new language every now and again, read about design patterns and get the hang of some Unix tools, such as sed and awk. Play with benchmarking tools and try to make it to developer conferences and local user groups.

You'll be ten times the programmer you would be otherwise, and you'll thank yourself in that next job interview, because "what do you do to keep your skills up-to-date?" is a question that's guaranteed to come up.

#10: Taking Pride in Your Work

Agreed. See points 1-9 for more details!

Footnotes

[1] It's only fair to point out that Spoonfed has been redesigned since I linked to it. The new version is greatly improved in some ways, but unfortunately still makes hopelessly gratuitous use of Ajax, which greatly detracts from its usability.

Posted on Saturday, the 12th of July, 2008 | permalink | comment

Benchmarking Zend Download Server

Recently I've started looking into ways that the PHP dev team in which I work can make better use of our Zend Platform installation.

For that reason, the recent Ibuildings/Zend seminar in London on the subject of "Enterprise PHP" was well timed, as it included a pretty detailed run through of a lot of what Platform has to offer.

One feature which really struck me as having the potential to bring performance benefits to one of our systems was the Zend Download Server. Back at the office, I looked into the feature, and ran a few benchmarks. Oddly though, the results don't seem to flatter Zend Download Server.

Zend Download Server

The premise behind Zend Download Server (ZDS) is that tying up valuable Apache HTTPD threads purely to serve static content is overkill, and far from efficient.

This is the same reason why lightweight webservers such as lighttpd are becoming popular. Lightweight webservers are typically run alongside a more powerful server such as Apache, and are dedicated to serving static content, leaving more Apache threads free to deal with the dynamic - for example PHP-based - requests.

ZDS follows that principle, although it works a little differently to lighttpd: it runs as a standalone process, but it hijacks a single Apache thread, thus allowing Apache to delegate the relevant requests down to ZDS.

ZDS can be utilised in a couple of ways. Firstly, there's 'transparent mode', whereby the administrator configures Platform in advance, telling it to hand specific downloads (say, all JPGs and GIFs over 128KB) off to ZDS.

The second option is 'manual mode', whereby the developer hooks directly into ZDS using a simple call to the proprietary zend_send_file() function. zend_send_file() is designed as a drop in replacement for functions such as fpassthru(), which simply read in the contents of a file and send them to output (in this case the HTTP response).

zend_send_file() seemed ideal for my needs, but not wishing to break the third rule of optimization, I decided to do a little benchmarking before I got too excited.

Benchmarking ZDS

I compared the two simplest possible scripts I could come up with. This first example uses the built-in PHP function, fpassthru():

<?php

$file = fopen('cat2.jpg', 'r');
fpassthru($file);

And here's the amended version of the script, using zend_send_file() to deliver the file:

<?php

zend_send_file('cat2.jpg');

Pretty straightforward stuff, all in all. Both scripts are delivering the same file, a JPG image of slightly less than 500KB.
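Since zend_send_file() only exists where Platform is installed, one way to keep a single script usable on both kinds of machine is a small wrapper along the following lines. This is a sketch rather than what I benchmarked, and the wrapper's name is my own invention.

```php
<?php

// Hypothetical wrapper: hand the file to Zend Download Server where the
// proprietary zend_send_file() function is available, and fall back to
// plain fpassthru() everywhere else.
function sendFile(string $path): void
{
    if (function_exists('zend_send_file')) {
        zend_send_file($path);
        return;
    }

    $file = fopen($path, 'rb');
    if ($file === false) {
        throw new RuntimeException("Unable to open $path");
    }

    fpassthru($file);
    fclose($file);
}
```

Calling sendFile('cat2.jpg') would then behave like the first script on a stock PHP install, and like the second wherever ZDS is present.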

I threw some load at the script using the ab benchmarking tool that ships with Apache HTTPD. Here's an example of the kind of command I ran:

./ab -n 200 -c 10 http://platform/deliver_file.php

The -n argument specifies the total number of requests, while -c specifies the number of concurrent requests that ab will try to make.

Here's an abridged version of the output using fpassthru():

Time taken for tests:   8.849252 seconds

Requests per second:    22.60 [#/sec] (mean)
Time per request:       442.463 [ms] (mean)
Time per request:       44.246 [ms] (mean, across all concurrent requests)
Transfer rate:          10818.09 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        2   87 514.0     14    4082
Processing:    96  337 182.3    281    1127
Waiting:        9   65  64.4     41     366
Total:        106  425 540.2    309    4364

Percentage of the requests served within a certain time (ms)
  50%    309
  66%    383
  75%    463
  80%    495
  90%    613
  95%    785
  98%   3241
  99%   4224
 100%   4364 (longest request)

And the output using zend_send_file():

Time taken for tests:   8.886843 seconds

Requests per second:    22.51 [#/sec] (mean)
Time per request:       444.342 [ms] (mean)
Time per request:       44.434 [ms] (mean, across all concurrent requests)
Transfer rate:          10721.58 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1   82 530.4     15    4420
Processing:    62  338 183.1    285    1069
Waiting:       12   41  20.1     38     148
Total:         63  420 595.1    304    5022

Percentage of the requests served within a certain time (ms)
  50%    304
  66%    370
  75%    500
  80%    545
  90%    640
  95%    741
  98%   1096
  99%   4992
 100%   5022 (longest request)

I tried a few combinations of the various parameters available to ab, and quite honestly couldn't find any conclusive difference in performance between using ZDS and not using it. In fact, under low or very high load, zend_send_file() seemed to slow things down a little.
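To put a number on "no conclusive difference", here's the arithmetic for the two runs shown above. It's a throwaway sketch: percent_change() is just a helper I've named for the occasion, and the figures are copied by hand from the ab output.

```php
<?php

// Relative change from $before to $after, as a percentage
function percent_change($before, $after)
{
    return ($after - $before) / $before * 100;
}

// fpassthru() figure first, zend_send_file() figure second
printf("Requests per second: %+.1f%%\n", percent_change(22.60, 22.51));
printf("Mean total time:     %+.1f%%\n", percent_change(425, 420));
```

That works out at roughly -0.4% on throughput and -1.2% on mean total time. Given that the standard deviations on total time run to several hundred milliseconds, both are well within the noise of a single pair of runs.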

Conclusions

I'm pretty surprised - enough to doubt the validity of my tests, I admit. I don't believe for a minute that Zend would make false claims for features of their flagship product, so I must be doing something wrong. But what?

I'm aware that I'm measuring network speed as much as anything, and that the claimed benefits of ZDS centre around reduced load on the server side. But still, who cares if load is down when, at the end of the day, performance doesn't improve?

The one concrete lesson that I can offer up from all of this is that it's always valuable to follow the third rule, and Profile Before Optimizing.

Posted on Saturday, the 12th of July, 2008 | permalink | comment

Fighting Spam and Digitising Books with reCAPTCHA

When I added a comment form to this blog, I wondered how long it would be before I started getting comment spam. Then I wondered if I was flattering myself to think that spam bots would even be interested in my site.

So it's with mixed emotions that I have to admit that right now the number of spam comments I'm receiving is outstripping the number of genuine comments by a ratio of about 10:1.

The time has come to add a CAPTCHA to the comment form.

The Wikipedia article describes the CAPTCHA concept adequately, so I'll merely summarise that a CAPTCHA is a simple test that the poster of the comment is human. I show you a picture of some wonky-looking text, and you type the words you see into the box provided.

Fig 1: Some wonky-looking text

If you correctly identify the words, I'll assume that you're a real person, and not an evil bot. And your comment will get posted. Simple as that.

reCAPTCHA

I had been meaning to have a play with reCAPTCHA since it caught my eye a few months back. It's a great idea: a totally free CAPTCHA tool, developed by Carnegie Mellon University, that anyone can use on their website.

What makes reCAPTCHA special is that at the same time as you're reading that wonky text and entering the words in the box, you're playing your part in a global effort to digitise pre-computer era books, by deciphering the words that OCR software struggles with. There's a more detailed overview of the project here.

It's kind of a cool idea, so I'm going to co-opt reCAPTCHA to help me fend off those evil spammers. I won't be alone: reCAPTCHA counts sites as large as Facebook, Twitter and StumbleUpon among its users [1].

Implementation

The first step in using reCAPTCHA is to drop in at the reCAPTCHA site and get yourself an account. Of course, you'll have to fill in a reCAPTCHA to do this!

As part of the signup process, you'll be prompted to request a key for your first domain (each key is restricted for use on only one domain, apparently for security reasons). In fact, you receive both a public and a private key, and we'll see how to use those shortly. The whole process takes about two minutes.

Once you're signed up, you're free to start implementing reCAPTCHA. For us PHP users, this is delightfully simple, as the reCAPTCHA guys have thoughtfully knocked up a small library to wrap their API. You can download the library from the project's Google Code pages.

Simply download the code, and unzip it somewhere sane and accessible on the webserver. I'll refer to the installation directory as /path/to/recaptcha for the purposes of this post.

To begin using reCAPTCHA, we'll start by adding some HTML to the comment form in order to display the reCAPTCHA challenge box. The library generates all the HTML we need:

<?php

require_once '/path/to/recaptcha/recaptchalib.php';

// public key as provided during the signup process
$publickey = '...';

echo recaptcha_get_html($publickey);

It really is that simple, and the reCAPTCHA challenge box shows up as if by magic. With its default theme, it looks like so:

Fig 2: The default reCAPTCHA challenge box

Drop that HTML into the appropriate place in whichever form you want to protect from spam. Once the form is submitted, you can check the validity of the submission as follows:

<?php

require_once '/path/to/recaptcha/recaptchalib.php';

// private key as provided during the signup process
$privatekey = '...';

$resp = recaptcha_check_answer(
            $privatekey,
            $_SERVER['REMOTE_ADDR'],
            $_POST['recaptcha_challenge_field'],
            $_POST['recaptcha_response_field']);

if ( $resp->is_valid ) {

    // assume the user is human
    // so post the comment

} else {

    // CAPTCHA was not entered correctly
    // so redisplay the form
}
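One wrinkle worth guarding against: a bot that POSTs straight to the form handler may not send the two reCAPTCHA fields at all, in which case reading them from $_POST will raise undefined index notices. A small check before the verification call avoids that. This is a sketch of my own rather than anything from the reCAPTCHA library, and has_recaptcha_fields() is a made-up name.

```php
<?php

// Hypothetical helper: true only if both reCAPTCHA fields were
// submitted as non-empty strings
function has_recaptcha_fields(array $post)
{
    $fields = array('recaptcha_challenge_field', 'recaptcha_response_field');

    foreach ($fields as $field) {
        if (!isset($post[$field])
            || !is_string($post[$field])
            || trim($post[$field]) === '') {
            return false;
        }
    }

    return true;
}
```

If has_recaptcha_fields($_POST) comes back false, there's no need to call out to the reCAPTCHA servers at all: just redisplay the form.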

Job done, basically. You can theme the actual reCAPTCHA box - to an extent - quite easily, which is nice as the default beige and maroon jarred a little with my fetching grey and turquoise getup. To do that, add a small snippet of Javascript to the form page:

<script type="text/javascript">
var RecaptchaOptions = {
   theme : 'white'
};
</script>

There is also a 'custom' theme which gives you a lot more control over the look and feel, but for the time being I stuck with 'white'. The whistles and bells can wait!

That's really all there is to it. If you like, you can see the finished comment form, replete with reCAPTCHA, for this post. Time will tell what effect this has on the amount of spam I receive.

Footnotes

[1] http://news.bbc.co.uk/1/hi/technology/7023627.stm

Posted on Saturday, the 5th of July, 2008 | permalink | comment

Mobilising a Website, Part 1: The Problem

It hasn't escaped my notice that if one happens to visit Pointbeing.net - this very site - using the browser on a mobile phone, the experience is more than a little painful. In fact, more often than not, the site is simply unusable.

The reason for this is that the site does not adapt itself in any way to the smaller screens, slower connection speeds, and idiosyncratic navigation methods found in mobile devices.

In my defence, this is not unusual right now: many, many sites are in the same position (have you ever tried to visit LinkedIn on a mobile?). However, given my faith in the future of the mobile web, and also given what I do for a living [1], this is something of an embarrassment. The time has come to mobilise Pointbeing.net.

The Project: Mobilising an Existing Website

Happily enough, this is an interesting problem, and not one which I've actually solved before. I've built a number of mobile web sites and applications from scratch, but I don't have that luxury here. I'm going to have to honour existing content, URLs and users.

And it's not entirely obvious how to go about it! There's a wide range of tools which we may choose to co-opt. Some of these tools, such as WURFL and Tera-WURFL, have been discussed here before, whilst others will be entirely new to me.

Furthermore, there's a variety of approaches which we can take: options include simply creating a mobile-only stylesheet, attempting to adapt the site to various devices, or creating an entirely separate mobile-specific site.

I guess that's why I'm writing this: as I mentioned, I'm not the only developer in this position, and I expect that the process is going to be a bit of an adventure. So this is the first in what I hope will become a series of several posts concerning the project. Future posts will discuss the various tools and approaches in more depth, and will follow the development process through to testing, validation and - fingers crossed - a mobile-friendly Pointbeing.net.

In this first part though, let's get an understanding of the extent of the problem, by viewing the site on a small selection of commonplace mobile handsets.

Trial by Handset: Nokia 6230i

Let's start by visiting the site using the popular Nokia 6230i. This is a simple candybar handset with a 208x208 pixel colour screen [2]. The 6230i was one of the ten best-selling handsets of 2006, so we can assume that there's a reasonable number of users out there.

Fig 1: Pointbeing.net viewed on a Nokia 6230i

Good lord, what happened there? It looks like the Nokia is doing a reasonable job of rendering the site, and is honouring the stylesheet too. The problem is that the site's dimensions are so fixed that the 208x208 screen can only display a tiny portion of the page at a time. This is known as "keyhole" mode browsing - at least on devices that let you opt out of it. Unfortunately the Nokia doesn't have this option - not least because I haven't provided any way for it to do so.

I'm afraid that for the time being, this site is effectively unusable for Nokia 6230i owners. Sorry guys.

Trial by Handset: Sony Ericsson W880i

Maybe we were unlucky, and simply made a poor choice of handset. Ever optimistic, let's try a different device from a different manufacturer: this time the Sony Ericsson W880i [3].

Fig 2: Pointbeing.net viewed on a Sony Ericsson W880i

Here we see very different behaviour indeed. It's interesting that the W880i happens to honour the media="screen" attribute which I've applied to my stylesheet's <link /> tag. (It's perhaps even more interesting that the Nokia did not honour that attribute). The result is that the W880i displays the whole page, with no styling rules applied. In a way, that's a pretty sane strategy.

Unfortunately, I'm not providing a user-friendly experience to owners of this handset either: the user has to scroll through list after list of links - to my Flickr photos, to Martin Fowler's "Bliki", and to other parts of Pointbeing.net - before they reach the meaty content of the page. That's no fun at all. Sorry, W880i users, I appear to have failed you too.

Trial by Handset: Apple iPhone

You'll notice that I've so far made a distinctly pedestrian choice of handsets, and I think that I'm right to: whilst your marketing department may all be kitted out with iPhones and Blackberries, the 6230i and the W880i are the kind of phones that are in the hands of real users - especially amongst youngsters and in the developing world, the two key demographics which are adopting the mobile internet in droves.

Still, for the sake of completeness, let's have a look at the site using the Jesus phone itself: Apple's fabled iPhone [4]. I'll connect over wifi, since the £369 iPhone doesn't have 3G. (And don't get me started on those headphones).

Fig 3: Pointbeing.net viewed on an Apple iPhone

As much as I'd love to criticise the iPhone, I have to admit that it does a pretty swell job of displaying the front page, albeit somewhat squashed into a few hundred pixels. I don't have a teenager present, so I can't work out how to use the multi-touch interface to zoom in. But I'm led to believe that it's quite possible.

Still, the fact remains that rather than providing a tailored, mobile-friendly experience ourselves, we're relying on the device to take care of the usability, and we're expecting the user to drag, point and zoom their way around the page. I consider myself pretty switched on when it comes to gadgetry, and yet I can't work out how to do it.

All the same, the iPhone admittedly provides by far the best experience of the three handsets. I won't get too excited though, because as it stands, the only person ever to have visited this site using an iPhone is - you guessed it - me.

Conclusions

Well, that was fun. We've learned that the site is effectively unusable on all but the priciest of mobile handsets, and even then it's a bit of a chore. My ego has taken something of a blow, so I need to crack on with mobilising the site.

I do hope you'll join me for part 2 where I'll weigh up the various strategies and approaches which can be employed to make an existing web site mobile-friendly.

Footnotes

[1] If you hadn't guessed, I'm a programmer. I'm currently working for PlayPhone, a major player in the mobile entertainment field. Needless to say, a large part of my day involves working on mobile web projects.

[2] Full details of the Nokia 6230i's specs from Tera-WURFL.

[3] Full details of the Sony Ericsson W880i's specs from Tera-WURFL.

[4] Full details of the Apple iPhone's specs from Tera-WURFL.

Posted on Sunday, the 29th of June, 2008 | permalink | comment

Zend_Search_Lucene Quick Start

I recently had a spontaneous urge to add a search form to my weblog - this one you're reading right now - and it seemed like a good opportunity to have a look at Zend_Search_Lucene.

I'm really impressed with the simplicity and power of the module. Sadly the documentation, whilst extensive, isn't particularly clear - so here's a quick overview of getting Zend_Search_Lucene up and running.

For the uninitiated, Apache Lucene is an open-source indexing and search tool written in Java, and Zend_Search_Lucene is the purely PHP5 implementation of Lucene [1] that ships with Zend Framework.

Indexing

Before we can do any searching, we need to initialise an index. This is done through the Zend_Search_Lucene::create() method. Indexes are stored on disk, so we will need to create a directory which is readable and writeable by whichever user the script will run as. I've imaginatively called that /path/to/index for the purposes of this post.

Here's an example script which initialises the index, and adds three documents to it, ready for searching:

<?php

require_once 'Zend/Search/Lucene.php';

$index = Zend_Search_Lucene::create('/path/to/index/');

$doc = new Zend_Search_Lucene_Document();
$doc->addField(
    Zend_Search_Lucene_Field::unIndexed('title', 'Item number 1'));
$doc->addField(
    Zend_Search_Lucene_Field::text('contents', 'cow elephant dog hamster'));
$index->addDocument($doc);

$doc = new Zend_Search_Lucene_Document();
$doc->addField(
    Zend_Search_Lucene_Field::unIndexed('title', 'Item number 2'));
$doc->addField(
    Zend_Search_Lucene_Field::text('contents', 'cow aardvark dog hamster'));
$index->addDocument($doc);

$doc = new Zend_Search_Lucene_Document();
$doc->addField(
    Zend_Search_Lucene_Field::unIndexed('title', 'Item number 3'));
$doc->addField(
    Zend_Search_Lucene_Field::text('contents', 'cow elephant dog esquilax elephant'));
$index->addDocument($doc);

$index->commit();

It's important not to overlook that final call to commit() - nothing will work without that. The 'title' field is unIndexed as we won't be searching on it, merely displaying it in our list of results. The 'contents' field is text, and this will be indexed for searching.

Where you get your document data from is completely up to you. It might be an RSS feed, a website crawler or - as in my case - a tiny PHP cron script which queries the weblog table in my database.

Either way, that's our index created. Since an index is no use unless you query it, let's have a look at how we can do that.

Searching

Here's about the simplest search you can possibly do with Zend_Search_Lucene:

<?php

require_once 'Zend/Search/Lucene.php';

$index   = Zend_Search_Lucene::open('/path/to/index/');
$results = $index->find('contents:elephant');

foreach ( $results as $result ) {
    echo $result->score, ' :: ', $result->title, "\n";
}

The 'contents:elephant' query specifies that we wish to search for documents whose 'contents' field contains the term 'elephant'. That runs in a flash, and produces the following output:

0.61871843353823 :: Item number 3
0.5 :: Item number 1

As you can see, the two Zend_Search_Lucene_Document objects which contain the word 'elephant' are returned, ordered by descending 'score'. Item 3 contains the word twice, which is why it receives the highest score.
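For the curious, those two scores can be roughly reproduced by hand. The sketch below assumes that Zend_Search_Lucene scores a single-term query the way classic Apache Lucene's default similarity does: the square root of the term frequency, times an inverse document frequency, times one over the square root of the field length. That's an assumption on my part rather than something I've checked in the Zend source, and sketch_score() is a made-up name.

```php
<?php

/**
 * Back-of-envelope score for a single-term query, assuming classic
 * Lucene default similarity. Not taken from the Zend source.
 *
 * $termFreq    times the term appears in the field
 * $fieldLength number of terms in the field
 * $numDocs     documents in the index
 * $docFreq     documents containing the term
 */
function sketch_score($termFreq, $fieldLength, $numDocs, $docFreq)
{
    $tf         = sqrt($termFreq);
    $idf        = 1 + log($numDocs / ($docFreq + 1));
    $lengthNorm = 1 / sqrt($fieldLength);

    return $tf * $idf * $lengthNorm;
}

// 'elephant' appears in 2 of the 3 documents
printf("Item number 1: %.4f\n", sketch_score(1, 4, 3, 2));
printf("Item number 3: %.4f\n", sketch_score(2, 5, 3, 2));
```

That gives exactly 0.5 for Item 1, matching the real output, and roughly 0.63 for Item 3. The second figure is a touch higher than the 0.6187 reported, which I'd guess is down to the length norm being stored at reduced precision in the index, but don't quote me on that.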

Of course, there are far more features than I've even hinted at here, so I'll more than likely return to Zend_Search_Lucene in a further post looking at some of the more advanced stuff, but for now, that's your lot.

Footnotes

[1] Incidentally, the index files created by Zend_Search_Lucene are entirely compatible with those created by Apache Lucene, allowing the two implementations to interoperate happily, should the need arise.

Posted on Tuesday, the 3rd of June, 2008 | permalink | comment

My php|architect Article to be Published in June

I purposefully didn't mention this here before now, as I didn't want to jinx anything.

But the time has come, and I'm pleasantly surprised to be able to report that my article - named something along the lines of "PHP Tools for Mobile Web Development" - is to be published in the June edition of php|architect magazine.

This will be my first ever contribution to php|architect, so it's a huge compliment that as well as being published somewhat sooner than expected, it looks like becoming the cover feature for June.

We're currently in the final stages of editing, and I'm really enjoying working with editor Steph Fox to turn this into something worth publishing. Stay tuned for further news.

Posted on Sunday, the 1st of June, 2008 | permalink | comment

Clay Shirky on the Cognitive Surplus

I came across this via a recent post on Jeremy Zawodny's blog, and found it fascinating.

I've been meaning to post something about this for a while, ideally accompanied by insightful and witty commentary. But that didn't happen so I figured I'd let Clay's presentation, from this year's Web 2.0 Expo, speak for itself.

In short, by the term Cognitive Surplus, Clay is referring to the huge amount of spare time and spare brain power that you guys have. That cognitive surplus has so far been swallowed up by the cultural black hole of TV, but little by little people are turning away from TV, and towards more interactive media, specifically the 'net. The upshot of all of this is that we might just be in the throes of something that rivals the industrial revolution in its significance.

I think he may well be right. You make up your own mind.

Posted on Tuesday, the 27th of May, 2008 | permalink | comment