Detecting termination from a starter process

If you've created a set of processes using a starter process, then all those processes are children of that process, with the exception of those that have called procmgr_daemon().

If all you want to do is detect that one of those children has terminated, then a loop that blocks on wait() or sigwaitinfo() will suffice. Note that when a child process calls procmgr_daemon(), both wait() and sigwaitinfo() behave as if the child process died, although the child is still running.
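As a minimal sketch of such a loop, the following blocks in wait() until no children remain. The helper names start_children() and reap_all_children() are hypothetical, and fork() is used only as a portable stand-in for however your starter process actually creates its children:

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create `count` children; child i exits with status i.
 * fork() is a stand-in here; a real starter would launch real programs. */
static int start_children(int count)
{
    for (int i = 0; i < count; i++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0)
            _exit(i);               /* child: placeholder for real work */
    }
    return 0;
}

/* Block in wait() until every child has been reported.
 * Returns the number of children reaped, or -1 on an unexpected error. */
static int reap_all_children(void)
{
    int reaped = 0;

    for (;;) {
        int status;
        pid_t pid = wait(&status);  /* blocks until some child terminates */

        if (pid == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            if (errno == ECHILD)
                return reaped;      /* no children left to wait for */
            return -1;
        }

        if (WIFEXITED(status))
            printf("child %ld exited with status %d\n",
                   (long)pid, WEXITSTATUS(status));
        else if (WIFSIGNALED(status))
            printf("child %ld was killed by signal %d\n",
                   (long)pid, WTERMSIG(status));
        reaped++;
    }
}
```

A starter process would typically call start_children() (or its own equivalent) once and then sit in reap_all_children(), restarting or logging each child as its termination is reported.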

The wait() function blocks until any of the caller's child processes terminates. There's also waitpid(), which lets you wait for a specific child process, as well as wait3() and wait4(). Lastly, there is waitid(), which is the lowest-level of the wait*() functions and returns the most information.

The wait*() functions won't always help, however. If a child process was created using one of the spawn*() family of functions with the mode passed as P_NOWAITO, then the wait*() functions won't be notified of its termination!

What if the child process terminates, but the parent hasn't yet called wait*()? This would be the case if one child had already terminated, so wait*() returned, but then a second child terminated before the parent got back to the wait*(). In that case, some information would have to be stored away about the second child for when the parent does get around to its wait*().

This is in fact what happens. The second child's memory will have been freed, its files will have been closed, and in general its resources will have been cleaned up, with the exception of a few bytes of memory in the process manager that hold the child's process ID and its exit status (or other reason for its termination). When the second child is in this state, it's referred to as a zombie. The child will remain a zombie until the parent either terminates or finds out about the child's termination (e.g., the parent calls wait*()).

What this means is that if a child has terminated and the parent is still alive but doesn't yet know about the terminated child (e.g., hasn't called wait*()), then the zombie will be hanging around. If the parent will never care, then you may as well not have the child become a zombie. To prevent the child from becoming a zombie when it terminates, create the child process using one of the spawn*() family of functions and pass P_NOWAITO for the mode.