Week 8: Signals
Signal basics
A signal is an asynchronous alert sent from one process to another (or from a process to itself). There is a fixed set of named signals, and they carry no data other than their signal number. Signals have a variety of uses:
- SIGTERM requests the process to terminate.
- SIGKILL forcibly kills the process.
- SIGCHLD is sent to a parent process when a child process exits.
- SIGSEGV is sent to a process when it triggers a segmentation fault via invalid memory access (normally this signal is not caught).
- SIGWINCH alerts a terminal program that the window has been resized.
In a process, each signal has a corresponding disposition that determines what happens when the signal is received. The possibilities are:
- Ignore the signal.
- Perform the default action, which depends on the signal. SIGTERM terminates the process, for instance, while SIGWINCH does nothing.
- Catch the signal and call a signal handler function.
The special signals SIGKILL and SIGSTOP are the exception: they cannot be ignored or caught.
A process can set a signal type to be blocked (also referred to as masked). If an incoming signal is blocked by the recipient process, the kernel will keep it pending until the process unblocks it (or sets the disposition to "ignore"). Masking is usually temporary – otherwise you would just set the signal to be ignored.
System calls
To send a signal, use the kill function, which despite its name can send any signal and does not necessarily terminate the target.
int kill(pid_t pid, int sig);
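pid is normally positive, in which case the signal is sent to the process with that PID. If pid is 0, the signal is sent to every process in the caller's process group; if it is -1, to every process the caller has permission to signal; if it is less than -1, to every process in the process group whose ID is -pid. See man 2 kill for details.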
sigaction is how you change the disposition of a signal.
struct sigaction {
    void (*sa_handler)(int);
    sigset_t sa_mask;
    int sa_flags;
    // more fields
};
int sigaction(int signum, const struct sigaction* act, struct sigaction* oldact);
You can use it like this:
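void handler(int signo) {
    // ...
}
struct sigaction act = { 0 };            // zero sa_flags and the remaining fields
act.sa_handler = handler;
sigemptyset(&act.sa_mask);               // mask no extra signals while the handler runs
int r = sigaction(SIGTERM, &act, NULL);  // 0 on success, -1 with errno set on failure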
Set sa_handler to the special value SIG_IGN to ignore the signal, or to SIG_DFL to restore the default disposition.
sa_mask in struct sigaction is the set of signals that will be temporarily masked while the signal handler is executing, in addition to the signal being handled.
sigprocmask lets you query and set the process's signal mask:
int sigprocmask(int how, const sigset_t* set, sigset_t* oset);
how can be one of:
- SIG_BLOCK: block all signals in set
- SIG_UNBLOCK: unblock all signals in set
- SIG_SETMASK: set the signal mask to set (all signals in the set are blocked; all signals not in the set are unblocked)
The first two only change the signals that are in set, while SIG_SETMASK changes every signal. If oset is not NULL, then it will be set to the signal mask that was in effect before the call. If you just wish to query and not set the mask, then pass NULL as the second argument.
sigprocmask takes two signal set (sigset_t) arguments. There are some helper functions for manipulating signal sets (see man 3 sigsetops):
int sigemptyset(sigset_t* set);
int sigfillset(sigset_t* set);
int sigaddset(sigset_t* set, int signo);
int sigdelset(sigset_t* set, int signo);
int sigismember(const sigset_t* set, int signo);
sigfillset selects all signals in the set; sigemptyset unselects them; sigaddset and sigdelset select or unselect, respectively, individual signals; sigismember tests if a signal is selected in the set.
We can wait for any signal to arrive with pause, or a particular signal (or set of signals) with sigwait:
int pause(void);
int sigwait(const sigset_t* set, int* sig);
But we'll see in the next section that these functions are tricky to use safely.
Signals are hard
Signals are one of the trickiest parts of Linux. Signals can arrive at any time, and signal handlers execute concurrently with the rest of the program, so they have all the drawbacks and difficulties of concurrency that we saw last week.
Because signals can arrive at random times, a program's data structures may be in an inconsistent state when a signal handler is invoked. Only a subset of C standard library functions are safe to call from a signal handler (see man 7 signal-safety). For example, it's not safe to call malloc from a signal handler, because the signal might have been received in the middle of a different call to malloc!
With that in mind, it's a good idea to do as little as possible in the signal handler. Ideally, you just set a flag and return, and test the flag in your program's main loop.
pause is perilous. Consider the following snippet:
sigaction(SIGCONT, &(struct sigaction){.sa_handler=my_handler}, NULL);
puts("Send me SIGCONT to continue!");
pause();
Do you see the bug?
If SIGCONT is delivered after puts but before pause, then pause will block until it is sent a second time. That is unlikely in this toy example, perhaps, but a very easy mistake to make in real programs.
We can fix it with sigsuspend, which atomically sets the signal mask and then goes to sleep, restoring the old signal mask when woken up:
sigaction(SIGCONT, &(struct sigaction){.sa_handler=my_handler}, NULL);
sigset_t set, oldset;
sigemptyset(&set);
sigaddset(&set, SIGCONT);
sigprocmask(SIG_BLOCK, &set, &oldset);
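// SIGCONT is blocked: if it arrives now, it stays pending.
puts("Send me SIGCONT to continue!");
// Atomically install oldset (SIGCONT unblocked) and sleep until a signal is handled.
sigsuspend(&oldset);
// sigsuspend has restored the blocking mask, so SIGCONT is blocked again here.
sigprocmask(SIG_SETMASK, &oldset, NULL);
// SIGCONT is unblocked.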
This example was based on the blog post "The perils of pause(2)" by Julian Squires.
Another problem is that signals can interrupt system calls: if certain syscalls are in progress when a signal is delivered, the syscall will fail with errno set to EINTR. Only system calls that can block forever can be interrupted by signals, for instance reading input from a terminal device, waiting on a lock, or listening for a network connection, but not disk I/O. This design choice is the subject of a famous essay from the 1990s.
There are two things you can do about interrupted syscalls:
- Include SA_RESTART in act.sa_flags when passed to sigaction. Syscalls interrupted by the signal will be automatically restarted, though some syscalls can't be restarted (see man 7 signal).
- Wrap any potentially blocking syscalls with a loop that retries on EINTR. This can be annoying in practice because you don't always know whether a syscall can block: for instance, calling read on a file descriptor that may point to a file on disk (won't block) or to standard input (might block).
Signals and multithreading
There are some nuances to how multithreading and signal handling interact:
- Disposition of a signal is per-process, not per-thread. However, signal masks are per-thread.
- Some signals are delivered to the entire process (in which case the signal handler, if one is registered, is run on an arbitrary thread), and some are delivered to a particular thread (for example, hardware exceptions triggered by that thread).
Homework exercises
- (★) What are the three possible dispositions for a signal?
  Solution: A signal can be (a) ignored, (b) left to its default action, or (c) caught by a signal handler.
- (★) What is the difference between "masked" and "pending" for a signal?
  Solution: If a signal is masked, then an incoming signal will be set as pending and not delivered until the signal is unmasked (or set to be ignored).
- (★) True or false: SIGTERM can be ignored.
  Solution: True. SIGKILL is the equivalent that cannot be ignored.
- (★★) Final project (web server): Your web server should shut down in an orderly fashion when it receives a SIGTERM signal. The first time it is received, it should stop accepting new connections, but continue to service existing connections. If received a second time, it should close the existing connections and exit.
- (★★) Final project (database): Depending on how you implemented entry deletion, your database could end up with wasted space for deleted entries. Write a compact() function that gets rid of unused disk space. Install a signal handler so that compact is called when SIGUSR1 is received. (But don't call it in the signal handler!) Add a command to your client program to send this signal. (You'll need a way for the client program to discover the server's PID.)
  Hint: One typical approach is for the server to write its PID to a file at a known location at start-up.
- (★★) Write a program that shows what happens when the same signal is delivered twice before the program can handle it. Is the signal handler invoked twice, or only once?
  Solution: POSIX allows for either behavior; double_delivery.c shows that on Linux, the signal is only delivered once.
- (★★) What happens when a signal handler is executing and the same signal is delivered? What if it's a different signal?
  Solution: reentrant_handler.c shows that a signal is masked while its handler is being executed, but the arrival of a different signal can cause its handler to be invoked in the middle of the first handler!
- (★★) If multiple pending signals are unblocked, what order are they delivered in?
  Solution: The order is not specified; multiple_pending.c shows that they do not necessarily arrive in the order they are sent.
- (★★) What happens to signal dispositions, masks, and pending signals upon fork? What about exec?
  Solution: Signal dispositions and masks are inherited across fork, but not pending signals. (sig_fork.c) Signal masks and pending signals are inherited across exec; SIG_IGN dispositions are inherited, but signal handlers are not. (sig_pre_exec.c and sig_post_exec.c) This can cause problems (see the question on signalfd below), but there's at least some logic to it: pending signals belong to a process, so they can be inherited across exec (which does not create a new process) but not across fork; an installed signal handler cannot be inherited across exec because the old signal handler doesn't exist in the new program.
- (★★) What interface does your programming language provide for signal handling? Are you allowed to call arbitrary functions from the signal handler?
  Solution: Python's signals API is similar to the C API, but user-defined Python signal handlers are not invoked from within the low-level C signal handler; they are queued up in the virtual machine for the next time it is convenient to invoke them. This means that the restrictions on non-reentrant functions do not apply inside Python signal handlers, though it is still probably not a good idea to do non-trivial work inside of them.
- (★★★) Look up the "self-pipe trick". What is it? When would you use it?
  Solution: The self-pipe trick entails (a) opening a pipe with the read and write end in the same process, and (b) having signal handlers that are just write(pipefd, &signo, 1). Then, you can read from the other end of the pipe when it's convenient, and handle signals synchronously.
- (★★★) Linux has an alternative interface for handling signals called signalfd. Read the man page and write an example program. What does it make easier? What are the drawbacks?
  Solution: signalfd lets you listen for signals on a file descriptor instead of using signal handlers. signalfd.c shows how it works. It avoids all the nastiness of asynchronous signal handlers, and it works nicely with select or epoll if your program has a main event loop. The article "signalfd is useless" describes some of the pitfalls: one big one is that you have to mask the signals you want to handle, and signal masks are inherited by child processes, so you have to be careful to reset the signal mask whenever you spawn a subprocess. (Though that creates a race condition…)