MySQL versus PostgreSQL: Adding a 'Last Modified Time' Column to a Table

Posted in Databases and Programming on Friday, the 14th of March, 2008.

This is the second post here detailing my ongoing adventures with PostgreSQL. This time I had a requirement to add a "timestamp" column to a table. The point of this being to allow us to track the "last modified" time of a row, without requiring that the application code manage the timestamp itself.

There's a lot of reasons why you might wish to do this. In this case it was to simplify syncing the data into a data warehouse. More specifically, to allow the DBA to easily identify rows which have changed since the last import.

Having done this a couple of times in MySQL, I assumed that the process would be straightforward. I should know better by now!

MySQL

MySQL provides a TIMESTAMP column type, for exactly this purpose. It's formatted as a standard DATETIME, but can optionally be configured to automatically set itself to the current time when the row is inserted (on by default) and/or updated (off by default). It's a breeze to add a column of this type to a table, like so:

ALTER TABLE mytable
    ADD lastmodified TIMESTAMP 
        DEFAULT CURRENT_TIMESTAMP 
        ON UPDATE CURRENT_TIMESTAMP;

Notice the crucial ON UPDATE clause, which tells MySQL to update the timestamp when the row is modified.

All rows in the lastmodified column are initially set to the 'zero' date, i.e. 0000-00-00 00:00:00. If we want to backfill the column with another value we can do that in a further step. This example sets each row's timestamp to the current date and time:

UPDATE mytable
    SET lastmodified=CURRENT_TIMESTAMP

Pretty straightforward, all in all.

PostgreSQL

Let's dive right in and view the Postgres solution in all its glory:

ALTER TABLE mytable 
    ADD lastmodified TIMESTAMP;
 
ALTER TABLE mytable 
    ALTER COLUMN lastmodified 
        SET DEFAULT CURRENT_TIMESTAMP;
 
UPDATE mytable
    SET lastmodified=CURRENT_TIMESTAMP;
 
CREATE OR REPLACE FUNCTION update_lastmodified_column()
        RETURNS TRIGGER AS '
  BEGIN
    NEW.lastmodified = NOW();
    RETURN NEW;
  END;
' LANGUAGE 'plpgsql';
 
CREATE TRIGGER update_lastmodified_modtime BEFORE UPDATE
  ON mytable FOR EACH ROW EXECUTE PROCEDURE
  update_lastmodified_column();

That's a fair deal more complex. The big obstacle here is that Postgres doesn't have the equivalent of MySQL's TIMESTAMP type. It does a have type named TIMESTAMP, but this is analogous to MySQL's DATETIME. We'll have to reverse engineer the behaviour we want.

We'll step though the process and have a closer look at what's involved. We'll start by creating the column and specifying that inserted rows should automatically default to the current date and time. As we've seen before, PostgreSQL won't allow us to do that in one step.

ALTER TABLE mytable 
    ADD lastmodified TIMESTAMP;
 
ALTER TABLE mytable 
    ALTER COLUMN lastmodified 
        SET DEFAULT CURRENT_TIMESTAMP;

Again, we may wish to backfill the column:

UPDATE mytable
    SET lastmodified = CURRENT_TIMESTAMP;

As an aside, I like that Postgres has some convenient shorthand values for commonly used dates. Alongside CURRENT_TIMESTAMP, there's such options as "yesterday", "epoch" and the cosmologically questionable "-infinity". They're documented here.

So far we've created the column, backfilled it with sane values, and any newly inserted rows will be timestamped appropriately. But what about updates to rows? This is where the fun really begins.

Lacking MySQL's ON UPDATE clause, which we noted earlier, we need to create a TRIGGER. A trigger is a kind of "event handler" which is fired off (well, triggered) in response to some action, in this case UPDATEs. To make things even more interesting, Postgres deviates from the SQL99 standard by not allowing us to run SQL directly within the trigger: any functionality we require must be defined in a stored proc:

CREATE OR REPLACE FUNCTION update_lastmodified_column() 
        RETURNS TRIGGER AS '
  BEGIN
    NEW.lastmodified = NOW();
    RETURN NEW;
  END;
' LANGUAGE 'plpgsql';

NEW is a special keyword which refers to the new version of the row. Similarly, OLD is available to us, should we need to access the previous values within the row.

Finally, we have to attach that proc to the relevant table by use of the trigger:

CREATE TRIGGER update_lastmodified_modtime BEFORE UPDATE
  ON mytable FOR EACH ROW EXECUTE PROCEDURE
  update_lastmodified_column();

MySQL's single command has succesfully been implemented as a dozen or so lines of Postgres code. I'll give round two to MySQL, but I'll give points to PostgreSQL for making my day a little more interesting than it might otherwise have been!

Comments

Posted by Daniel on Tuesday, the 10th of February, 2009.

Thanks! This was helpful.

Posted by fansipans on Friday, the 10th of April, 2009.

Unless I'm missing something, all of the trigger contortions above can be obviated by setting the column's default value to "now()".

One of the nice features of PostgreSQL is that you can have a function call as the default, and it will be evaluated on INSERT.

test=# create table foo (bar integer, baz timestamp default now());
CREATE TABLE
test=# insert into foo (bar) values (12);
INSERT 0 1
test=# select * from foo;
bar | baz
-----+----------------------------
12 | 2009-04-10 09:09:52.942531
(1 row)

Posted by fansipans on Friday, the 10th of April, 2009.

Oh dur, I didn't notice you needed it on UPDATE - right. Then you'd need a trigger ;)

Posted by Kevin F on Thursday, the 28th of May, 2009.

As an extension to the discussion, mySQL only supports one such timestamp per table. Thus, if your requirements are to track an INSERT date and an UPDATE date column, for instance, then you would also have to use mysql triggers, as in postgres. It's a strange restriction. beware.

Posted by westcoast coder on Friday, the 24th of July, 2009.

Thanks for this post. It saved my day as I've been scratching to do so since it was so easy to do in MySQL and struggling to do in postgres

Posted by Daniele Pignedoli on Tuesday, the 28th of July, 2009.

Great article, i found it very usefull... anyway i hope the postgres team will soon create a automatic function for this case (as the serial aka auto_increment)

Posted by Ben Maden on Saturday, the 26th of September, 2009.

Great article... I was looking for a way to do this across all tables. I'm sure there must be a way.... not found it but did find this...

Installed the SPI module
http://www.postgresql.org/docs/current/static/contrib-spi.html

psql -U postgres db_name < ./contrib/moddatetime.sql
(needed yum install postgres-contrib first)

So any table can now have the following trigger added....

CREATE TRIGGER table_update BEFORE UPDATE
ON table_name FOR EACH ROW EXECUTE PROCEDURE
moddatetime('updated_at');

Similar to your code but you can now pass in the update field name as required... just in case you don't have a convention.

Thanks again for the great post.

cheers,

Ben

Enter your comment: