Python daemon threads considered harmful

Update April 2015: Reading it again years later, I regret the tone of this post. I was frustrated at the time and it comes across now as just smarmy. Still, I stand by the principal idea: that you should avoid Python’s daemon threads if you can.
Update June 2015: This is Python bug 1856. It was fixed in Python 3.2.1 and 3.3, but the fix was never backported to 2.x. (An attempt to backport to the 2.7 branch caused another bug and it was abandoned.) Daemon threads may be ok in Python >= 3.2.1, but definitely aren’t in earlier versions.

The other day at work we encountered an unusual exception in our nightly pounder test run after landing some new code to expose some internal state via a monitoring API. The problem occurred on shutdown. The new monitoring code was trying to log some information, but was encountering an exception. Our logging code was built on top of Python’s logging module, and we thought perhaps that something was shutting down the logging system without us knowing. We ourselves never explicitly shut it down, since we wanted it to live until the process exited.

The monitoring was done inside a daemon thread. The Python docs say only:

A thread can be flagged as a “daemon thread”. The significance of this flag is that the entire Python program exits when only daemon threads are left.

Which sounds pretty good, right? This thread is just occasionally grabbing some data, and we don’t need to do anything special when the program shuts down. Yeah, I remember when I used to believe in things too.

Despite a global interpreter lock that prevents Python from being truly concurrent anyway, there is a very real possibility that the daemon threads can still execute after the Python runtime has started its own tear-down process. One step of this process appears to be to set the values inside globals() to None, so that any attribute lookup on a module name raises an AttributeError, because the name now refers to None. Other variations on this cause TypeError to be raised.

The code which triggered this looked something like this, although with more abstraction layers which made hunting it down a little harder:

try:
    log.info("Some thread started!")
    try:
        do_something_every_so_often_in_a_loop_and_sleep()
    except somemodule.SomeException:
        pass
    else:
        pass
finally:
    log.info("Some thread exiting!")

The exception we were seeing was an AttributeError on the last line, the log.info() call. But that wasn’t even the original exception. The original was another AttributeError, raised when the except clause evaluated somemodule.SomeException. Because all the modules had been reset, somemodule was None too.
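To make that chain of failures concrete, here is a rough simulation. It does not reproduce real interpreter shutdown; it only mimics the effect by rebinding two module-level names to None while the "thread body" is mid-flight. Everything here (FakeModule, FakeLog, simulate_teardown, do_work) is a hypothetical stand-in, not our actual code.

```python
class SomeException(Exception):
    """Hypothetical stand-in for somemodule.SomeException."""


class FakeModule:
    """Hypothetical stand-in for an imported module object."""
    SomeException = SomeException


class FakeLog:
    """Hypothetical stand-in for a logger."""
    def info(self, msg):
        pass


somemodule = FakeModule()
log = FakeLog()


def simulate_teardown():
    # Assumption: mimic the runtime clearing module globals by
    # rebinding these names to None.
    global somemodule, log
    somemodule = None
    log = None


def do_work():
    # Teardown "happens" while the thread is in the middle of its loop,
    # and the work then fails for some reason.
    simulate_teardown()
    raise SomeException("work interrupted")


def thread_body():
    try:
        log.info("Some thread started!")
        try:
            do_work()
        except somemodule.SomeException:  # somemodule is None by now
            pass
    finally:
        log.info("Some thread exiting!")  # log is None by now


def run_demo():
    """Run the body and return the exception that escapes it."""
    try:
        thread_body()
    except AttributeError as exc:
        return exc
    return None


escaped = run_demo()
```

The exception that escapes is the one from the finally block. On Python 3 the earlier AttributeError from the except clause is still reachable through escaped.__context__, but on Python 2 there is no such chaining, which is part of why the first failure was hard to find.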

Unfortunately the docs are completely devoid of this information, at least in the threading sections which you would actually reference. The best information I was able to find was this email to python-list a few years back, and a few other emails which don’t really put the issue front and center.

In the end the solution for us was simply to make them non-daemon threads, notice when the app is being shut down, and join them from the main thread. Another possibility for us was to catch AttributeError in our thread wrapper class – which is what the author of the aforementioned email does – but that seems like papering over a real bug and a real error. Because of this misbehavior, daemon threads lose almost all of their appeal, but oddly I can’t find people really publicly saying “don’t use them” except in scattered emails. It seems like it’s underground information known only to the Python cabal. (There is no cabal.)
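A minimal sketch of the non-daemon approach, assuming a periodic worker that can be told to stop. The Monitor class, its interval parameter, and the ticks counter are all made up for illustration; our real code looked different. The idea is just to replace the sleep with a wait on a threading.Event so the thread wakes up promptly when shutdown is requested.

```python
import threading


class Monitor:
    """Hypothetical periodic worker running in a non-daemon thread."""

    def __init__(self, interval=1.0):
        self._interval = interval
        self._stop_requested = threading.Event()
        self.ticks = 0
        # A Thread created from a non-daemon thread is non-daemon by
        # default, so the interpreter waits for it before tearing down.
        self._thread = threading.Thread(target=self._run, name="monitor")

    def _run(self):
        # Event.wait() returns False on timeout and True once the event
        # is set, so it serves as both the sleep and the shutdown check.
        while not self._stop_requested.wait(self._interval):
            self.ticks += 1  # placeholder for the real periodic work

    def start(self):
        self._thread.start()

    def stop(self):
        # Ask the thread to finish, then wait for it from the caller.
        self._stop_requested.set()
        self._thread.join()


monitor = Monitor(interval=0.01)
monitor.start()
try:
    pass  # the application's main work would go here
finally:
    monitor.stop()
```

The catch is that a non-daemon thread keeps the process alive until it finishes, so the main thread has to call stop() on every exit path; the try/finally above is one way to arrange that.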

So, I am going to say it. When I went searching there weren’t any helpful hints in a Google search of “python daemon threads considered harmful”. So, I am staking claim to that phrase. People of The Future: You’re welcome.