MySQL versus PostgreSQL: Adding an Auto-Increment Column to a Table

Posted in Databases and Programming on Wednesday, the 5th of March, 2008.

The bulk of my database experience (almost eight years now) has been with the popular open-source MySQL database management system. MySQL has progressed significantly over the years, and has grown into a remarkable product. It finally has all the must-have features such as views, stored procs and referential integrity, coupled with the blistering performance for which MySQL has always been known. In short, it rocks.

But I digress. I've recently been having to get to grips with PostgreSQL (an old version of course - 7.1 or so - just to make life really interesting). It's largely intuitive, but there are quirks around most corners. This is my favourite so far: I recently needed to add an auto-incrementing integer "id" column to a table.

MySQL

This sort of thing will be second nature to MySQL developers:

ALTER TABLE mytable ADD myid INT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE;

One SQL command - not bad.

PostgreSQL

It turned out not to be so easy with our Postgres installation. For a start, there are no auto_increment columns. So, as with several other major RDBMSs, the solution is to create a "sequence", which is kind of like a pseudo-table which acts as a counter:

CREATE SEQUENCE mytable_myid_seq;

Next, we have to add our new column to the table, and specify that for each new row it should take its value from the sequence, using the NEXTVAL() function. For reasons best known to the Postgres guys, you can't do this in one step:

ALTER TABLE mytable ADD myid INT UNIQUE;

And then:

ALTER TABLE mytable ALTER COLUMN myid SET DEFAULT NEXTVAL('mytable_myid_seq');

We're getting there. We now have an auto increment column. The problem is that Postgres won't backfill this with values automatically: all pre-existing rows are currently null for this column. Let's change that:

UPDATE mytable SET myid = NEXTVAL('mytable_myid_seq');

Job done. Well, some time later, the job will be done. That final step is one of the slowest things you can possibly ask Postgres to do. For a mid-sized table (around 5,000,000 rows, with a handful of small numeric and text columns), that took about 2.5 hours on powerful hardware - so you'll want to leave this for a quiet time. Fortunately Postgres treats the UPDATE as an atomic transaction: nothing is committed until the command completes, so it will be difficult for you to leave the data in an inconsistent state.

Send to:

Comments

Posted by Paul F on Thursday, the 8th of May, 2008.

Postgres auto-increment column in 1 step:
alter table mytable add myid serial;

Type 'serial':
1. Sets type to int.
2. Creates sequence for you.
3. Populates unique values in all existing rows using sequence.
4. sets default on column to nextval of sequence.

Still 1-0 to mysql?

Posted by Simon on Saturday, the 24th of May, 2008.

Hi Paul - thanks for stopping by.

That's certainly true in the latest Postgres releases, but is in fact a relatively new feature.

Unfortunately, for the time being, I'm lumbered with 7.something, where that SQL fails with the infamous "ERROR: adding columns with defaults is not implemented" message.

Hence the long-winded workaround!

Posted by Andy Chapman on Friday, the 25th of July, 2008.

You might also say that MySQL 5 is relatively new too. So if we're comparing old databases together, MySQL 4 was not in the same ball park as Postgres 7.

Posted by Simon on Sunday, the 27th of July, 2008.

Hi Andy, thanks for your comment. It's really nice to see how many Postgres guys have stumbled upon these posts :)

MySQL5 isn't especially new, but that's hardly the point: MySQL has had auto increment columns since...well, longer than anyone cares to remember.

My post isn't about which is the better RDBMS, it's merely about my own experiences getting rudimentary functionality out of a version of Postgres which, to be fair, is still widely in production use.

Posted by James on Thursday, the 31st of July, 2008.

Apparently the SERIAL type was added in version 6.4 back in 1998. You sure it doesn't work in version 7.1?
http://www.postgresql.org/docs/6.4/static/release.htm#AEN12985

Posted by Simon on Friday, the 1st of August, 2008.

I'm afraid so. The existence of the SERIAL type is not the issue, it's the ability to add it to an existing table (see the title of the post).

Apologies if the title, post and comments don't make it clear. I would reword things but I'm failing to see the ambiguity right now.

The longer answer is that adding a SERIAL to a table is merely a shorthand for adding a column and setting a default for it in one step, which is not allowed until much later versions of Postgres.

Posted by Matthew Miller on Wednesday, the 6th of August, 2008.

I developed a system that used on Postgres 7.x almost three years ago and was able to create my auto incrementing id using SERIAL. The problem, as you pointed out, was on an existing table. How the table came to existence without an auto incrementing field is not made clear, but surely that should have been added when table was created, right? I like MySQL, especially with the progress it has made (as you noted), but I don't see it having an upper hand in this case because: 1) you're using a well outdated version of PostgreSQL, and 2) the table in question was not created with the auto incrementing field from the start. I was able to add a SERIAL field with to a table that had none and have backfill; however, this was on 8.2.5 so this particular argument doesn't carry much weight because of the version mismatch.

On a side note, why must this textarea for comments be so tiny!? Ciao!

Posted by Marko on Thursday, the 30th of October, 2008.

You are comparing two products on a single attribute (from more than thousand attributes). So before making final conclusions a fair comparison should be made.

Posted by JP on Monday, the 10th of November, 2008.

PG, the only conclusion I could find here is that its more work to add an auto-increment column to a pg db < v7.3 than to a mysql db.
Though I would have liked it if the comparison was based on the current version of both dbs and their structural/implementation differences for auto-incrementing columns were compared. Then I think the title would be more apt. IMHO.

cheers!

Join the discussion! Post your comment.