my-post-engine : a postgres-engine for MySQL

Introduction

For a while, I’ve been trying to push some people toward PostgreSQL. It covers their use-cases better than MySQL, should provide greater performance, and should help me keep the hairs on my head.

But most of the time, they are relunctant unless they can really try it. Sadly, when you have huge databases, huge piece of code depending on the MySQL behaviour, you’re in a dead-end.

So I tried to find a way for this MySQL-depending code to speak to a PostgreSQL database seemlessly, in order to ease possible migrations…

What is it ?

After many failed attempts translating MySQL queries to a PostgreSQL syntax on the fly, I decided only MySQL could understand MySQL queries, and hence I decided to take the problem in a different way.

MySQL has a notion of storage engine. It allows you to decide how your tables are stored.

Most known storage engines are MyISAM (the “famous”, transaction-less, lock-full engine) and InnoDB (with transaction support, but say farewell to full-text and spatial indexing)…
Other included, but less known storages, are Blackhole (aka /dev/null) and Federated, where your MySQL server exposes tables coming from other MySQL servers.

And that’s when the crazy idea came : why couldn’t I write a PostgreSQL engine for MySQL ?

How does it work ?

Well, the storage API for MySQL is quite simple : read row, search in an index, delete row, modify row, insert row… That’s really easy to implement that with several simple SQL queries. But… it can not be done with standard-compliant SQL queries. You need to be able to identify each row of a table with a guaranteed unique identifier. Most of tables have a primary key, perfect for that. But you don’t always have that luck. And when you don’t have it, you have to start relying on other tricks, specific to your database engine. MySQL supports limit on delete/update, even without any order by statement. PostgreSQL doesn’t. Instead, we can rely on hidden, system-specific columns like the ctid in PostgreSQL… So your engine can not be really generic.

Anyway… well, using mixed tricks, the end result is quite convincing.

First step, create a simple table with the postgres engine.

CREATE TABLE `todo_pg` ( 
  `id` int(11) NOT NULL AUTO_INCREMENT, 
  `domain` varchar(255) NOT NULL, 
  `date_todo` datetime NOT NULL,  
  `state` varchar(127) NOT NULL,  
  PRIMARY KEY (`id`)
) ENGINE=POSTGRES DEFAULT CHARSET=utf8 CONNECTION='host=localhost port=5432 dbname=todo password=mypass user=mysql'

Now, you can start playing with it.

mysql> select * from todo_pg;
Empty set (0.00 sec)

mysql> insert into todo_pg (domain, date_todo, state) values ('postgresql.org', now(), 'toVisit');
Query OK, 1 row affected (1.14 sec)

mysql> select * from todo_pg;
+----+----------------+---------------------+---------+
| id | domain         | date_todo           | state   |
+----+----------------+---------------------+---------+
|  4 | postgresql.org | 2013-04-01 01:05:52 | toVisit |
+----+----------------+---------------------+---------+
1 row in set (0.00 sec)

mysql> insert into todo_pg (domain, date_todo, state) values ('mysql.org', now(), 'toVisit');
Query OK, 1 row affected (0.08 sec)

mysql> select last_insert_id();
+------------------+
| last_insert_id() |
+------------------+
|                5 |
+------------------+
1 row in set (0.00 sec)

mysql> select * from todo_pg;
+----+----------------+---------------------+---------+
| id | domain         | date_todo           | state   |
+----+----------------+---------------------+---------+
|  4 | postgresql.org | 2013-04-01 01:05:52 | toVisit |
|  5 | mysql.org      | 2013-04-01 01:06:04 | toVisit |
+----+----------------+---------------------+---------+
2 rows in set (0.00 sec)

mysql> update todo_pg set state='done' where domain like '%.org';
Query OK, 2 rows affected (0.19 sec)
Rows matched: 2  Changed: 2  Warnings: 0

mysql> select * from todo_pg;
+----+----------------+---------------------+-------+
| id | domain         | date_todo           | state |
+----+----------------+---------------------+-------+
|  4 | postgresql.org | 2013-04-01 01:05:52 | done  |
|  5 | mysql.org      | 2013-04-01 01:06:04 | done  |
+----+----------------+---------------------+-------+
2 rows in set (0.00 sec)

mysql> truncate todo_pg;
Query OK, 0 rows affected (0.23 sec)

mysql> select * from todo_pg;
Empty set (0.05 sec)

It ain’t no joke, that really works.

Current status

Well, the current implementation is experimental, lacks broad testing, has possible performance issues, and installation is a real pain because MySQL supports plugins, but don’t provide neither build instructions nor .h files to help you build such plugins… It’s a pity writing plugins for MySQL, really… Installation is sucks, only distributions could provide it along their MySQL packages :/

The code is available in gitorious : https://gitorious.org/my-post-engine

Don’t hesitate, have fun testing it. Play with it, crash it, fork it… I only hope it will be usefull to someone 🙂