TheRarestWords | RarestNews | Suggestan | TheCraziestIdeas| SemanticKernelBot | Flim.me | My dev.blog | Йои Хаджи


20th Jul 2008

I don’t get it - real web application with PostgreSQL vs mySQL MyISAM vs mySQL InnoDB (with Django’s ORM, 2008)

UPDATE: This has been Reddit. Read the comments. The main thing to understand that those results are for default settings of both databases for my case and my priorities. Yours could (and maybe even should) be different.

Well, this and last year I hear everywhere that PostgreSQL is the way to go and that usage of mySQL in 2008 makes people puke… But without any real arguments (besides “Postgres is the way to go”).Well, I don’t usually buy into fashion-style technologies shopping (it’s when someone can’t prove something’s better that what I use) and this time it wouldn’t be an exception.

Ok, so scouring the Internet I’ve found some comparative tests. Mostly in form of “INSERT 10000 items WITH COMMIT AT THE END”. Okay, how many people actually inserted 10000 items in a real web-application (besides dumping-restoring-moving data)? Some people did, but they were both unavailable for comments :) Just kidding.

Ok, so since I’m with Django - moving to Postgres and testing my application (RarestNews) should be a snap, isn’t it? Just change the database string in settings.py and install PostgreSQL, right? Wrong! :) But there’s a time for everything step-by-step.

PostgreSQL: first steps, first falls

Okay, so the installation is pretty straightforward “yum install postgres postgres-server” then initdb, createdb, CREATE USER (BTW this is feature I find much easier to use with postgres than with mysql - I couldn’t even remember that GRANT syntax)… so here we go the database is running.

Do we have something like phpMyAdmin? We do? phpPgAdmin? Great. Let’s install that too. Okay, let’s create a table and and entry. Okay! Great. Wait a minute… Why is it “approximate number of rows” and it equals 0 even if there’s a row. Well, part of a nature’s plan I guess.

Let’s move on. So, changing the adapter in settings.py to python-psycopg2, syncdb and viola! The tables are there. So those are both SQLs, right, so let’s mysqldump and load it into PostgreSQL. Ugh-ha.. That doesn’t work? That’s two different SQLs? Okay, whatever..

Moving from mySQL to postgres with Django

So, obviously after a few miserable attempts I’ve found no easy way to move my data to Postgres. Okay, but we’re in Django! Let’s use dumpdata and JSON to move the data, like many claim to do.

Well, changing DATABASE_ENGINE = ‘mysql’, manage.py dumpdata, DATABASE_ENGINE = ‘postgresql_psycopg2′, manage.py loaddata.. and voila!

“DecodeError: expecting property name in object at position 1″

Whaaa? Very informative. 27 megabytes of JSON and that’s all?? Great! So, that obviously didn’t work.

Alternative move

Well, let’s start over and re-create from the scratch, at least the table structure is there. Firing the girl up and let’s go. After a few hours of scouring the web, let’s see the results. But first, let’s backup the data.

pg_dump rarest > dump.sql

Hmm. That wasn’t that hard. Let’s try deleting and restoring it!

psql -U < dump.sql

ERROR:  insert or update on table “news_****” violates foreign key constraint “news_****_fkey”
DETAIL:  Key (
****_id)=(34) is not present in table “news_****“.

What in the name of Saint Asparagus? I’ve just dumped that data. I mean YOU, Posgres, dumped your own data and NOW YOU REFUSE TO restore it? And the database wasn’t even accesed during dump/restore! Rock-solid super-stable never-lose-a-byte database? I REALLY don’t get it.

Okay, before that crapportunity offered me a new view on database consistency (I’ve never experienced anything like that with “unstable” mySQL - if it dumps - it reads!) I did collect the data and frankly I’m unimpressed.

Django has nice db.queries interface in Debug mode which log time take to execute each query and reset_queries() even sweetens the deal.

The table shows avg time (in ms) for 1 query to execute in RarestNews application (all queries are the same, Django ORM is used, database setup is default everywhere, no tuning, database is small enough to fit in memory). It’s an average of about ~1000 queries.

Queries are (it’s a real working web application so there are many types): selects by primary keys, a few order_by’s with indexes, a few unique constraints, joins up to 3 tables, some denormalization present.

Children
processes
mySQL
InnoDB
mySQL
MyISAM
PostgreSQL Comment
10 1.00 Not really a test, just a reference point
25 1.57 1.63 1.85 That seems consistent
50 3.90 4.01 Going nose-to-nose
75 3.20 3.40 13.00 WHaaaaa?!

Okay, let’s see - first all three engines go nose-to-nose and.. Well.. After reading some material on PostgreSQL I was expecting it to perform best under a lot of concurrency! But it’s 4 times slower!!

Well, no. Okay, actually to be fair at first the PostgreSQL average was 25ms (vs ~3 milliseconds on mySQL) on average query with high up to 85ms on a single query. But I thought - hey, let’s give it a chance and after some warm up I’ve cleaned the data and re-collected it. It was down to 6ms but at some point it began to collect to 20+ms again, then lowering and settled around 13ms at 1000 queries point.

Well.. Sorry, I still don’t buy the idea that PostgreSQL is absolutely superior to mySQL (c’mon! “20.1 is not a valid Integer and that’s why I rollback whole transaction” That’s a superior DB??! Can’t it make it just 20? Is that so hard? mySQL does it without sweating) and also like I said before it lost all my data in backup process (without hardware failure). I wouldn’t even complain about it with mySQL - it doesn’t present itself as ultra-stable replacement for other databases (although from experience - it is).

Django and PostgreSQL? Do they mix?

The developers of Django like to tell that they use only PostgreSQL and nothin else. Well.. there was one more problem. Django wouldn’t even work with PostgreSQL until I applied #3460 patch.It was showing me “current transaction is aborted, commands ignored until …” error with not even my SQL (something from Django’s insides) [just in case - yes, I use SVN version]

Django developers like to procastinate. I mean it’s been open for one year. And the patch? Change ONE NUMBER from 1 to 0! The status of a patch without which the Django doesn’t even work… a year after? “Design decision needed” Dammit, a year is a lot of time to make a decision!

I mean I love Django! It’s great! But the devs should accept more patches from community (most of reasonable patches are still “design decision needed”).

This entry was posted on Sunday, July 20th, 2008 at 7:25 pm and is filed under site.

Subscribe via RSS: or e-mail (the form in right sidebar).