the joe shaw seminar series

Yesterday I fixed a bug in libbeagle that was causing some strange behavior when sending asynchronous requests to the daemon.

The problem was a common bug found when using watches on glib’s GIOChannel. This misuse is so common that I need to describe it and comment on it here.

The code often looks like this:

static gboolean
watch_foo_cb (GIOChannel *channel, GIOCondition cond, gpointer user_data)
{
	if (cond & G_IO_HUP) {
		 /* Received a hangup on the socket, bail out! */
		 return FALSE;
	}

	if (cond & G_IO_IN) {
		 /* Handle reading data here! */
	}

	return TRUE;
}

static void
do_watch_foo (GIOChannel *channel)
{
	g_io_add_watch (channel,
			G_IO_IN | G_IO_HUP | G_IO_ERR,
			watch_foo_cb,
			NULL);
}

Can anyone see what the bug here is? I’ve run into it so much, it’s become immediately obvious to me: We are handling the hangup case
(G_IO_HUP) without also handling the available data case (G_IO_IN).
These two cases are not mutually exclusive!

The GIOCondition is a bitfield, not a switchable integer value. This means that:

  • Values need to be checked using logical AND.

    Wrong: if (cond == G_IO_HUP)
    Right: if (cond & G_IO_HUP)

    In the example code I showed above, we do this correctly.

  • We need to handle multiple potential conditions! It is a quite common to get both G_IO_IN and G_IO_HUP in the same callback, if the other side sent a bunch of data and then closed the socket. In our code above, we are bailing out if our connection was hung up, even if we had data to read off the channel.

    In libbeagle, this meant that instead of sending 100 responses to 100 requests, we were getting maybe 15 responoses. You can try Raphael’s test case to see this in action.

The right way to write the watch_foo_cb function is like this:

static gboolean
watch_foo_cb (GIOChannel *channel, GIOCondition cond, gpointer user_data)
{
	if (cond & G_IO_IN) {
		 /* Handle reading data here! */
	}

	if (cond & G_IO_HUP) {
		 /* Received a hangup on the socket, don't watch this channel anymore! */
		 return FALSE;
	}

	return TRUE;
}

In most cases, it’s enough to simply move your hangup or error handling to after you handle the incoming data. But YMMV.

So remember, when you set up a watch on a GIOChannel, remember to handle all of your potential conditions within your callbacks!