danielcasanueva.eu

Haskell and kill signals

In Haskell (at least with GHC), asynchronous exceptions may be masked using mask. This allows the programmer to control when asynchronous exceptions are raised. The bracket function uses masking to ensure that clean-up code is run, even in the presence of exceptions. We can write some simple code to demonstrate this:

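{-# LANGUAGE NumericUnderscores #-}  -- enables the 20_000_000 literal on older GHC setups

import Control.Exception (bracket)
import Control.Concurrent (threadDelay)

main :: IO ()
main =
  bracket
    (putStrLn "Open resource")         -- acquire
    (\_ -> putStrLn "Close resource")  -- release (clean-up)
    (\_ -> threadDelay 20_000_000)     -- use: wait 20 seconds (argument is in microseconds)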

Running this code produces the following output:

Open resource
<20 seconds idle>
Close resource

The Close resource line should be printed even if an exception is thrown during the bracketed action (in this case, the 20-second wait). Additionally, since the clean-up code runs masked, asynchronous exceptions won't be delivered to it, except while it is blocked in an interruptible operation. We can trigger a UserInterrupt exception in the main thread of our program by pressing CTRL+C on the keyboard before the 20 seconds run out. Doing this produces the following output:

Open resource
^CClose resource

As expected, Close resource is printed right after interrupting the process. This works because GHC installs a signal handler for SIGINT. When a SIGINT signal is received (as when pressing CTRL+C), it is caught and a UserInterrupt exception is thrown asynchronously to the main thread. This exception interrupts threadDelay. Then bracket catches that exception, runs the clean-up code (printing Close resource), and re-raises the exception. So far so good.
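The catch, clean-up, re-raise sequence described above can be sketched with a simplified version of bracket. This is only an illustration of the masking pattern, not necessarily how your version of base defines it; the name bracketSketch is hypothetical, and an ordinary synchronous exception stands in for UserInterrupt so the example finishes immediately.

```haskell
import Control.Exception (ErrorCall (..), mask, onException, throwIO, try)

-- A simplified bracket, written out to show where masking applies.
-- (bracketSketch is a hypothetical name used only for this illustration.)
bracketSketch :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c
bracketSketch acquire release use =
  mask $ \restore -> do
    resource <- acquire                          -- runs masked
    result <- restore (use resource)             -- asynchronous exceptions allowed again
                `onException` release resource   -- on failure: clean up, then re-raise
    _ <- release resource                        -- on success: clean up, still masked
    return result

main :: IO ()
main = do
  -- The "use" action fails; the release action still runs before the
  -- exception reaches try.
  outcome <- try $ bracketSketch
    (putStrLn "Open resource")
    (\_ -> putStrLn "Close resource")
    (\_ -> throwIO (ErrorCall "boom"))
  case outcome of
    Left (ErrorCall msg) -> putStrLn ("Re-raised: " ++ msg)
    Right () -> putStrLn "Finished normally"
```

Running it prints Open resource, then Close resource, then Re-raised: boom, which is the same order of events we saw with CTRL+C.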

The problem

Now suppose our program is a long-running one. We have it running in the background. At some point, we need to restart it (maybe because of an update). So we kill it using kill, or maybe indirectly through systemd. Then we check the logs and find the clean-up code wasn't called at all! What happened?

When we tested our example above, the process was interrupted via a SIGINT signal. However, kill sends a SIGTERM signal by default, and GHC does not install a signal handler for SIGTERM. So the signal's default action applies and the process is terminated abruptly, without any exception being raised in our Haskell code.

Indeed, if we run our example and use kill instead of CTRL+C, this is the output we get:

Open resource
Terminated

The solution

To solve this problem, we need to install the signal handler ourselves. Thankfully, this is very simple. The module System.Posix.Signals from the unix package has all we need.

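{-# LANGUAGE NumericUnderscores #-}  -- enables the 20_000_000 literal on older GHC setups

import Control.Concurrent (threadDelay, myThreadId, throwTo)
import Control.Exception (bracket, AsyncException (UserInterrupt))
import System.Posix.Signals (installHandler, sigTERM, Handler (CatchOnce))

main :: IO ()
main = do
  -- Remember the main thread so the handler knows where to throw.
  mainThreadId <- myThreadId
  -- On SIGTERM, throw UserInterrupt to the main thread, as GHC does for SIGINT.
  _ <- installHandler sigTERM
         (CatchOnce $ throwTo mainThreadId UserInterrupt)
         Nothing
  bracket
    (putStrLn "Open resource")
    (\_ -> putStrLn "Close resource")
    (\_ -> threadDelay 20_000_000)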

Now, when we kill the process before the wait ends, we get:

Open resource
Close resource

Just as we wanted.

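Note that the choice of UserInterrupt here was arbitrary; any other exception may be used. Also, because the handler is installed with CatchOnce, only the first SIGTERM signal will be caught. A later SIGTERM is no longer handled by our code. To handle every SIGTERM, use the Catch constructor instead.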
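As a sketch of that variation, the handler below throws a dedicated exception instead of UserInterrupt and uses Catch so that every SIGTERM is handled. The TerminationRequested type is a hypothetical name made up for this example; everything else comes from base and the unix package.

```haskell
import Control.Concurrent (myThreadId, threadDelay, throwTo)
import Control.Exception (Exception, bracket)
import System.Posix.Signals (Handler (Catch), installHandler, sigTERM)

-- Hypothetical exception type, used only to tell SIGTERM apart from CTRL+C.
data TerminationRequested = TerminationRequested
  deriving Show

instance Exception TerminationRequested

main :: IO ()
main = do
  mainThreadId <- myThreadId
  -- Catch (rather than CatchOnce) keeps the handler installed after it fires.
  _ <- installHandler sigTERM
         (Catch $ throwTo mainThreadId TerminationRequested)
         Nothing
  bracket
    (putStrLn "Open resource")
    (\_ -> putStrLn "Close resource")
    (\_ -> threadDelay (20 * 1000000))  -- 20 seconds, in microseconds
```

Since nothing catches TerminationRequested, once bracket has run the clean-up the runtime reports the uncaught exception on stderr and the process exits with a non-zero status. Code further up can catch TerminationRequested specifically if a SIGTERM should be treated differently from a keyboard interrupt.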

Opinion

I think GHC, in addition to the SIGINT handler, should install a handler for SIGTERM by default, throwing an exception different from UserInterrupt. We don't need to have default handlers for all signals, but SIGTERM is used most of the time for long-running processes, and I've witnessed (and experienced myself) a few instances of people running into issues because they trusted their brackets.