My first PostgreSQL patch/hack

After almost two years of lobbying at work to promote PostgreSQL instead of MySQL, after two years of tears against poor DB typing, lack of advanced SQL features, traps in what seemed like basic needs (select * from myview where a=5, if myview is not «simple», triggers a full scan on the view result)… we are finally starting to deploy some PostgreSQL databases.

I wrote my dear sysadmin friends documentations, argumentations. Among my «it’s simpler» arguments, I promoted application_name as a way to quickly know who is hurting that bad your SQL server, along with the so simple ps faux to know who is doing what on your server… That’s the simplest monitoring possible, but when you’re in a hurry at 3AM, it can help you… Sadly, the application_name is not displayed in the ps output. I was looking for a tiny thing to patch PostgreSQL in order to discover its code a bit, it seemed like a good opportunity.

And let’s share also how I did the patch to highlight how easy patching PostgreSQL is 🙂 (the patch has been written together with this blog entry)

 

0) System setup

First thing, you need a minimal development environment for PostgreSQL. Your tools : your favorite compiler (GCC), make, git, $EDITOR, ack-grep and a bit of patience if your CPU is slow (postgresql compiles quite quickly).


~$ cd ~/projects
~/projects$ git clone git://git.postgresql.org/git/postgresql.git
~/projects$ cd postgresql
~/projects/postgresql$ git checkout -b application-name-in-proctitle

1) Exploring the code

We want to modify how the proctitle is constructed. Under BSD systems, it’s done through the setproctitle call, easier to find than the game with ARGV under linux. Soo… let’s search it.

–/projects/postgresql$ ack-grep --type cc setproctitle
src/backend/utils/misc/ps_status.c
42: * use the function setproctitle(const char *, ...)
265: * apparently setproctitle() already adds a `progname:' prefix to the ps
329: setproctitle("%s", ps_buffer);

src/include/pg_config.h
422:/* Define to 1 if you have the `setproctitle’ function. */

Well, only one call… it’s gonna be simple, let’s look at this function. I stripped out comments and ifdefs.

void
set_ps_display(const char *activity, bool force)
{
/* A few exit cases.... */
/* .................... */

/* Update ps_buffer to contain both fixed part and activity */
strlcpy(ps_buffer + ps_buffer_fixed_size, activity,
ps_buffer_size – ps_buffer_fixed_size);

ps_buffer_cur_len = strlen(ps_buffer);

setproctitle(“%s”, ps_buffer);
}

Well, this is quite simple, the activity is added to a buffer containing the fixed portion of the proc title. To prevent useless memory allocations, the same buffer is reused at each call.
Our application_name can change during one connexion, we will need it in this function to dynamically append it before the activity when available.
But where is application_name available ?

~/projects/postgresql$ ack-grep --type cc application_name

This gives us more results, but there is an obvious one :
src/include/utils/guc.h
228:extern char *application_name;

It’s a simple per process global variable…

2) Patching…

So our patch is very straigh-forward now, even for C beginners.

~/projects/postgresql$ $EDITOR src/backend/utils/misc/ps_status.c

The main part we will change is around here :
strlcpy(ps_buffer + ps_buffer_fixed_size, application_name,
ps_buffer_size - ps_buffer_fixed_size);
ps_buffer_dynamic_len = strlen(application_name);


~/projects/postgresql$ git diff

[ too long for a blog, http://pastebin.com/vF9S5CMB ]

It’s not the cleanest patch, but it’s a start.

3) Testing

Compiling and testing PostgreSQL is simple, and can be done along another running instance.

~/projects/postgresql$ ./configure --prefix $HOME/bin/psql-master --enable-debug
~/projects/postgresql$ make -j 6
~/projects/postgresql$ make -j 3 install

Et voilà…

Now that your PostgreSQL has been installed in ~/bin/psql-master, let’s create an instance and start it.
~/projects/postgresql$ cd ~/bin/psql-master/bin
~/bin/psql-master/bin$ ./initdb ../data


If you have a PostgreSQL running on port 5432, you’ll have to change the port in the ../data/postgresql.conf file. I changed it to 5435.
And now…

~/bin/psql-master/bin$ ./postgres -D ../data
LOG: database system was shut down at 2013-06-17 23:51:29 CEST
LOG: database system is ready to accept connections
LOG: autovacuum launcher started

Let’s test it, in another console :

~/bin/psql-master/bin$ ./psql -p 5435
psql (9.4devel)
Type "help" for help.
postgres=# select pg_sleep(10);

And while the sleep is running, we can do a ps :

snoopy 14791 0.0 0.0 164128 4056 ? Ss 23:53 0:00 | \_ postgres: snoopy postgres [local] psql SELECT

And if I change the application name…

snoopy 14791 0.0 0.0 164128 4320 ? Ss 23:53 0:00 | \_ postgres: snoopy postgres [local] my_buggy_script.pl SELECT

Our first patch is working !

But it’s not ready to be submitted yet… Maybe people are using the current ps output and awk/cut-ing it to extract data. Better not crash these by adding a configuration option maybe.

4) Back to code…

First, you must commit your current patch. Git is your friend, use it 🙂

~/projects/postgresql$ git add src/backend/utils/misc/ps_status.c
~/projects/postgresql$ git commit -m "Display the application_name in proctitle before the current activity"
[application-name-in-proctitle 28a6b94] Display the application_name in proctitle before the current activity
1 file changed, 17 insertions(+), 2 deletions(-)

Now, we can look at adding a new configuration option.
We know that a configuration option exists to disable update_process_title entirely. We can just look at how it works, and copy it.

Here is the code in set_ps_display :

/* update_process_title=off disables updates, unless force = true */
if (!force && !update_process_title)
return;

And where is that variable used/defined ?

~/projects/postgresql$ ack-grep --type cc update_process_title
src/backend/utils/misc/guc.c
1039: {"update_process_title", PGC_SUSET, STATS_COLLECTOR,
1043: &update_process_title,

src/backend/utils/misc/ps_status.c
35:bool update_process_title = true;
295: /* update_process_title=off disables updates, unless force = true */
296: if (!force && !update_process_title)


a few if (update_process_title)


src/include/utils/ps_status.h
15:extern bool update_process_title;

What is that guc thingy ? The file heading comment says :

* guc.c
*
* Support for grand unified configuration scheme, including SET
* command, configuration file, and command line options.

This is the file we are looking for.

Well, let’s just copy the way the update_process_title setting is created…

This is really straigh-forward, nothing interesting here. I named the setting application_in_process_title…


~/projects/postgresql$ git diff --stat src
src/backend/utils/misc/guc.c | 10 ++++++++++
src/backend/utils/misc/ps_status.c | 4 ++--
src/include/utils/ps_status.h | 2 ++
3 files changed, 14 insertions(+), 2 deletions(-)

The full patch is available here : http://pastebin.com/eg0NZGGr

And we have our configuration setting, allowing us to switch between application_name in the process title and the current behaviour.

5) Conclusion

In less than one hour, with no PostgreSQL code experience, it’s really easy to hack a simple patch like this one.
After having nightmares through the MySQL code, it’s a real pleasure to hack on PostgreSQL code : it’s simple, commented, logical…

Now I can start the review process for that patch maybe 🙂