Weblog

August '08 Issue of php|architect Magazine Out Now

The cover of php|architect's August 08 issue

I've just spotted that the August issue of php|architect magazine is now available for download, and it's a top quality issue as always, with articles on writing Wordpress plugins and Facebook apps, an introduction to Adobe's Flex, and finally James Cauwelier's case study of scaling out an e-Commerce site to support a million products.

I'm really pleased to have been involved with the technical editing of this issue, and there's a certain swelling of pride in spotting one's name in the editorial credits (alongside Richard Harrison, I note; Richard being the man responsible for putting ElePHPants on the desks of most of London's PHP developers).

Posted on Friday, the 29th of August, 2008

Presentations on Slideshare

I've been doing a bit of presenting at work recently, which has meant getting my head around making up slides (using OpenOffice, of course). It all feels a little bit Dilbert, in a way.

Anyway, there's nothing particularly groundbreaking or PlayPhone-specific about these slides, so I've put them up on Slideshare in case anyone fancies a look.

The first presentation I did was on Zend Platform. Apologies for the garish yellow template.

Today's was an introduction to unit testing in PHP. It's necessarily quite introductory as it's intended for developers with little to no testing experience.

The demo code I knocked up for this one is sat over on demo.pointbeing.net.

Part 2 will be somewhat hairier, and I'll look at mocks, fixtures, some testing best-practices and a few other bits and bobs.

Posted on Friday, the 22nd of August, 2008

"PHP Tools for Mobile Web Development" Published

The cover of php|architect's July 08 issue

This is just a quick heads up to say that my article, "PHP Tools for Mobile Web Development" has today been published, and is currently gracing the cover of July's php|architect magazine.

Of course, I jinxed things a little by blogging that it would be published in June, but never mind, we got there.

Big thanks must go to Ciaran for giving the initial draft the once over (on a related note, check out Ciaran's post about web development for the iPhone). Thanks also to my occasional colleague Gerard for clueing me in to the fact that the damn thing had been published.

For what it's worth, php|architect is recommended reading even when I'm not in it, so get yourself over there and get subscribed!

Ok...now to crack on with that second article...

Posted on Tuesday, the 29th of July, 2008

Benchmarking Zend Download Server

Recently I've started looking into ways that the PHP dev team in which I work can make better use of our Zend Platform installation.

For that reason, the recent Ibuildings/Zend seminar in London on the subject of "Enterprise PHP" was well timed, as it included a pretty detailed run through of a lot of what Platform has to offer.

One feature which really struck me as having the potential to bring performance benefits to one of our systems was the Zend Download Server. Back at the office, I looked into the feature, and ran a few benchmarks. Oddly though, the results don't seem to flatter Zend Download Server.

Zend Download Server

The premise behind Zend Download Server (ZDS) is that tying up valuable Apache HTTPD threads purely to serve static content is overkill, and far from efficient.

This is the same reason why lightweight webservers such as lighttpd are becoming popular. Lightweight webservers are typically run alongside a more powerful server such as Apache, and are dedicated to serving static content, leaving more Apache threads free to deal with the dynamic - for example PHP-based - requests.

ZDS follows that principle, although it works a little differently to lighttpd: it runs as a standalone process, but it hijacks a single Apache thread, thus allowing Apache to delegate the relevant requests down to ZDS.

ZDS can be utilised in a couple of ways. Firstly, there's 'transparent mode', whereby the administrator configures Platform in advance, telling it to hand specific downloads (say, all JPGs and GIFs over 128KB) off to ZDS.

The second option is 'manual mode', whereby the developer hooks directly into ZDS using a simple call to the proprietary zend_send_file() function. zend_send_file() is designed as a drop in replacement for functions such as fpassthru(), which simply read in the contents of a file and send them to output (in this case the HTTP response).

zend_send_file() seemed ideal for my needs, but not wishing to break the third rule of optimization, I decided to do a little benchmarking before I got too excited.

Benchmarking ZDS

I compared the two simplest possible scripts I could come up with. This first example uses the built-in PHP function, fpassthru():

<?php

// open the image and stream its contents straight to the output
$file = fopen('cat2.jpg', 'r');
fpassthru($file);

And here's the amended version of the script, using zend_send_file() to deliver the file:

<?php

zend_send_file('cat2.jpg');

Pretty straightforward stuff, all in all. Both scripts are delivering the same file, a JPG image of slightly less than 500KB.
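
One practical wrinkle: zend_send_file() is only defined where Zend Platform is installed, so a script that calls it unconditionally will fail anywhere else. Here's a minimal sketch of a guard, assuming the same single-argument call used above. The deliver_file() name is a hypothetical helper for illustration, not part of Platform or PHP:

```php
<?php

/**
 * Send a file to the output, preferring Zend Download Server when available.
 * deliver_file() is a hypothetical helper name used for illustration only.
 */
function deliver_file($path)
{
    if (!is_readable($path)) {
        return false;
    }

    // zend_send_file() is only defined when Zend Platform is loaded
    if (function_exists('zend_send_file')) {
        zend_send_file($path);
        return true;
    }

    // Fall back to plain PHP: stream the file to the output
    $file = fopen($path, 'rb');
    if ($file === false) {
        return false;
    }
    fpassthru($file);
    fclose($file);

    return true;
}
```

Calling deliver_file('cat2.jpg') would then behave like whichever of the two scripts above suits the machine it runs on.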

I threw some load at the script using the ab benchmarking tool that ships with Apache HTTPD. Here's an example of the kind of command I ran:

./ab -n 200 -c 10 http://platform/deliver_file.php

The -n argument specifies the total number of requests, while -c specifies the number of concurrent requests that ab will try to make.

Here's an abridged version of the output using fpassthru():

Time taken for tests:   8.849252 seconds

Requests per second:    22.60 [#/sec] (mean)
Time per request:       442.463 [ms] (mean)
Time per request:       44.246 [ms] (mean, across all concurrent requests)
Transfer rate:          10818.09 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        2   87 514.0     14    4082
Processing:    96  337 182.3    281    1127
Waiting:        9   65  64.4     41     366
Total:        106  425 540.2    309    4364

Percentage of the requests served within a certain time (ms)
  50%    309
  66%    383
  75%    463
  80%    495
  90%    613
  95%    785
  98%   3241
  99%   4224
 100%   4364 (longest request)

And the output using zend_send_file():

Time taken for tests:   8.886843 seconds

Requests per second:    22.51 [#/sec] (mean)
Time per request:       444.342 [ms] (mean)
Time per request:       44.434 [ms] (mean, across all concurrent requests)
Transfer rate:          10721.58 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1   82 530.4     15    4420
Processing:    62  338 183.1    285    1069
Waiting:       12   41  20.1     38     148
Total:         63  420 595.1    304    5022

Percentage of the requests served within a certain time (ms)
  50%    304
  66%    370
  75%    500
  80%    545
  90%    640
  95%    741
  98%   1096
  99%   4992
 100%   5022 (longest request)

I tried a few combinations of the various parameters available to ab, and quite honestly couldn't find any conclusive difference in performance between using ZDS and not using it. In fact, under low or very high load, zend_send_file() seemed to slow things down a little.
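
To put a number on "no conclusive difference", the headline figures can be recomputed from the two runs shown above. This is only a sketch of the arithmetic: it uses the total times from those runs, and the formulas are my reading of how ab derives its means from -n and -c.

```php
<?php

// Figures taken from the two ab runs above: 200 requests, concurrency of 10
$requests    = 200;
$concurrency = 10;
$runs = array(
    'fpassthru()'      => 8.849252, // total seconds for the run
    'zend_send_file()' => 8.886843,
);

$rps = array();
foreach ($runs as $label => $seconds) {
    // Requests per second (mean) = total requests / total time
    $rps[$label] = $requests / $seconds;

    // Time per request (mean) = concurrency * total time / total requests
    $perRequestMs = $concurrency * $seconds / $requests * 1000;

    printf("%-17s %.2f req/s, %.3f ms per request\n", $label, $rps[$label], $perRequestMs);
}

// Relative change in throughput when switching to zend_send_file()
$change = ($rps['zend_send_file()'] - $rps['fpassthru()']) / $rps['fpassthru()'] * 100;
printf("Change in throughput: %.2f%%\n", $change);
```

The change comes out at well under one percent, which is comfortably inside the noise you'd expect between two runs of the same script.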

Conclusions

I'm pretty surprised - enough to doubt the validity of my tests, I admit. I don't believe for a minute that Zend would make false claims for features of their flagship product, so I must be doing something wrong. But what?

I'm aware that I'm measuring network speed as much as anything, and that the claimed benefits of ZDS centre around reduced load on the server side. But still, who cares if load is down when, at the end of the day, performance doesn't improve?

The one concrete lesson that I can offer up from all of this is that it's always valuable to follow the third rule, and Profile Before Optimizing.

Posted on Saturday, the 12th of July, 2008

Fighting Spam and Digitising Books with reCAPTCHA

When I added a comment form to this blog, I wondered how long it would be before I started getting comment spam. Then I wondered if I was flattering myself to think that spam bots would even be interested in my site.

So it's with mixed emotions that I have to admit that right now the number of spam comments I'm receiving is outstripping the number of genuine comments by a ratio of about 10:1.

The time has come to add a CAPTCHA to the comment form.

The Wikipedia article describes the CAPTCHA concept adequately, so I'll merely summarise that a CAPTCHA is a simple test that the poster of the comment is human. I show you a picture of some wonky-looking text, and you type the words you see into the box provided.

Fig 1: Some wonky-looking text

If you correctly identify the words, I'll assume that you're a real person, and not an evil bot. And your comment will get posted. Simple as that.

reCAPTCHA

I had been meaning to have a play with reCAPTCHA since it caught my eye a few months back. It's a great idea: a totally free CAPTCHA tool, developed by Carnegie Mellon University, that anyone can use on their website.

What makes reCAPTCHA special is that at the same time as you're reading that wonky text and entering the words in the box, you're playing your part in a global effort to digitise pre-computer era books, by deciphering the words that OCR software struggles with. There's a more detailed overview of the project here.

It's kind of a cool idea, so I'm going to co-opt reCAPTCHA to help me fend off those evil spammers. I won't be alone: reCAPTCHA counts sites as large as Facebook, Twitter and StumbleUpon among its users [1].

Implementation

The first step in using reCAPTCHA is to drop in at the reCAPTCHA site and get yourself an account. Of course, you'll have to fill in a reCAPTCHA to do this!

As part of the signup process, you'll be prompted to request a key for your first domain (each key is restricted for use on only one domain, apparently for security reasons). In fact, you receive both a public and a private key, and we'll see how to use those shortly. The whole process takes about two minutes.

Once you're signed up, you're free to start implementing reCAPTCHA. For us PHP users, this is delightfully simple, as the reCAPTCHA guys have thoughtfully knocked up a small library to wrap their API. You can download the library from the project's Google Code pages.

Simply download the code, and unzip it somewhere sane and accessible on the webserver. I'll refer to the installation directory as /path/to/recaptcha for the purposes of this post.

To begin using reCAPTCHA, we'll start by adding some HTML to the comment form in order to display the reCAPTCHA challenge box. The library generates all the HTML we need:

<?php

require_once '/path/to/recaptcha/recaptchalib.php';

// public key as provided during the signup process
$publickey = '...';

// output the HTML for the reCAPTCHA challenge box
echo recaptcha_get_html($publickey);

It really is that simple, and the reCAPTCHA challenge box shows up as if by magic. With its default theme, it looks like so:

Screenshot of the default reCAPTCHA challenge box

Fig 2: The default reCAPTCHA challenge box

Drop that HTML into the appropriate place in whichever form you want to protect from spam. Once the form is submitted, you can check the validity of the submission as follows:

<?php

require_once '/path/to/recaptcha/recaptchalib.php';

// private key as provided during the signup process
$privatekey = '...';

// ask the reCAPTCHA service whether the answer matches the challenge
$resp = recaptcha_check_answer(
            $privatekey,
            $_SERVER['REMOTE_ADDR'],
            $_POST['recaptcha_challenge_field'],
            $_POST['recaptcha_response_field']);

if ( $resp->is_valid ) {

    // assume the user is human
    // so post the comment

} else {

    // CAPTCHA was not entered correctly
    // so redisplay the form
}
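
To make the two branches a little more concrete, here's a rough sketch of how the check might sit inside a comment handler. The handle_comment() function and the $verify callback are hypothetical names for illustration, not part of the reCAPTCHA library. In real use the callback would simply wrap the recaptcha_check_answer() call from the script above. Passing the check in as a callback also means the logic can be exercised without contacting the reCAPTCHA service.

```php
<?php

/**
 * Decide what to do with a submitted comment.
 * handle_comment() and $verify are hypothetical, for illustration only.
 * $verify receives the remote address, challenge and response, and returns
 * an object with an is_valid property, as recaptcha_check_answer() does.
 */
function handle_comment(array $post, $remoteAddr, $verify)
{
    // Missing CAPTCHA fields can never be valid, so skip the check entirely
    if (empty($post['recaptcha_challenge_field'])
        || empty($post['recaptcha_response_field'])) {
        return 'redisplay';
    }

    $resp = call_user_func(
        $verify,
        $remoteAddr,
        $post['recaptcha_challenge_field'],
        $post['recaptcha_response_field']
    );

    // Human: post the comment. Otherwise: show the form again.
    return $resp->is_valid ? 'post' : 'redisplay';
}
```

The return values 'post' and 'redisplay' are just placeholders; a real handler would save the comment or re-render the form at those points.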

Job done, basically. You can theme the actual reCAPTCHA box - to an extent - quite easily, which is nice as the default beige and maroon jarred a little with my fetching grey and turquoise getup. To do that, add a small snippet of Javascript to the form page:

<script type="text/javascript">
var RecaptchaOptions = {
   theme : 'white'
};
</script>

There is also a 'custom' theme which gives you a lot more control over the look and feel, but for the time being I stuck with 'white'. The whistles and bells can wait!

That's really all there is to it. If you like, you can see the finished comment form, replete with reCAPTCHA, for this post. Time will tell what effect this has on the amount of spam I receive.

Footnotes

[1] http://news.bbc.co.uk/1/hi/technology/7023627.stm

Posted on Saturday, the 5th of July, 2008