This version of this document is no longer maintained. For the latest documentation, see http://www.qnx.com/developers/docs.

Processes

Process creation
Process termination
Detecting process termination

As we stated in the Overview chapter, the Neutrino OS architecture consists of a small microkernel and some number of cooperating processes. We also pointed out that your applications should be written the same way -- as a set of cooperating processes.

In this chapter, we'll see how to start processes (also known as creating processes) from code, how to terminate them, and how to detect their termination when it happens.

For another perspective, see the Processes and Threads and Message Passing chapters of Getting Started with QNX Neutrino.

Starting processes -- two methods

In embedded applications, there are two typical approaches to starting your processes at boot time. One approach is to run a shell script that contains the command lines for running the processes. There are some useful utilities such as on and nice for controlling how those processes are started.

The other approach is to have a starter process run at boot time. This starter process then starts up all your other processes. This approach has the advantage of giving you more control over how processes are started, whereas the script approach is easier for you (or anyone) to modify quickly.

Process creation

The process manager component of procnto is responsible for process creation. If a process wants to create another process, it makes a call to one of the process-creation functions, which then effectively sends a message to the process manager.

Here are the process-creation functions:

exec*() family of functions: execl(), execle(), execlp(), execlpe(), execv(), execve(), execvp(), execvpe()
fork()
forkpty()
popen()
spawn()
spawn*() family of functions: spawn(), spawnl(), spawnle(), spawnlp(), spawnlpe(), spawnp(), spawnv(), spawnve(), spawnvp(), spawnvpe()
system()
vfork()

When you start a new process, it replaces the existing process if:

You specify P_OVERLAY when calling one of the spawn* functions.
You call one of the exec* routines.

The existing process may be suspended while the new process executes (control continues at the point following the place where the new process was started) in the following situations:

You specify P_WAIT when calling one of the spawn* functions.
You call system().

There are several versions of spawn*() and exec*(). The * is one to three letters, where:

l or v (one is required) indicates the way the process parameters are passed
p (optional) indicates that the PATH environment variable is searched to locate the program for the process
e (optional) indicates that the environment variables are being passed

For details on each of these functions, see their entries in the Library Reference. Here we'll mention some of the things common to many of them.

Concurrency

Three possibilities can happen to the creator during process creation:

The child process is created and runs concurrently with the parent. In this case, as soon as process creation is successful, the process manager replies to the parent, and the child is made READY. If it's the parent's turn to run, then the first thing it does is return from the process-creation function. This may not be the case if the child process was created at a higher priority than the parent (in which case the child will run before the parent gets to run again).
This is how fork(), forkpty(), popen(), and spawn() work. This is also how the spawn*() family of functions works when the mode is passed as P_NOWAIT or P_NOWAITO.
The child replaces the parent. In fact, they're not really parent and child, because the image of the given process simply replaces that of the caller. Many things will change, but those things that uniquely identify a process (such as the process ID) will remain the same. This is typically referred to as "execing," since usually the exec*() functions are used.
Many things will remain the same (including the process ID, parent process ID, and file descriptors) with the exception of file descriptors that had the FD_CLOEXEC flag set using fcntl(). See the exec*() functions for more on what will and will not be the same across the exec.
The login command serves as a good example of execing. Once the login is successful, the login command execs into a shell.
Functions you can use for this type of process creation are the exec*() and spawn*() families of functions, with mode passed as P_OVERLAY.
The parent waits until the child terminates. This can be done by passing the mode as P_WAIT for the spawn*() family of functions.
Note that what is going on underneath the covers in this case is that spawn() is called as in the first possibility above. Then, after it returns, waitpid() is called in order to wait for the child to terminate. This means that you can use any of the functions mentioned in our first possibility above to achieve the same thing if you follow them by a call to one of the wait*() functions (e.g. wait() or waitpid()).

Using fork() and forkpty()

As of this writing, you can't use fork() and forkpty() in a process that has threads. The fork() and forkpty() functions will simply return -1 and errno will be set to ENOSYS.

Many programmers coming from the Unix world are familiar with the technique of using a call to fork() followed by a call to one of the exec*() functions in order to create a process that's different from the caller. In Neutrino, you can usually achieve the same thing in a single call to one of the spawn*() functions.

Inheriting file descriptors

The documentation in the Library Reference for each function describes in detail what the child inherits from the parent. One thing that we should talk about here, however, is file-descriptor inheritance.

With many of the process-creation functions, the child inherits the file descriptors of the parent. For example, if the parent had file descriptor 5 in use for a particular file when the parent creates the child, the child will also have file descriptor 5 in use for that same file. The child's file descriptor will have been duplicated from the parent's. This means that at the filesystem manager level, the parent and child have the same open control block (OCB) for the file, so if the child seeks to some position in the file, then that changes the parent's seek position as well. It also means that the child can do a write(5, buf, nbytes) without having previously called open().

If you don't want the child to inherit a particular file descriptor, then you can use fcntl() to prevent it. Note that this won't prevent inheritance of a file descriptor during a fork(). The call to fcntl() would be:

fcntl(fd, F_SETFD, FD_CLOEXEC);

If you want the parent to set up exactly which files will be open for the child, then you can use the fd_count and fd_map parameters with spawn(). Note that in this case, only the file descriptors you specify will be inherited. This is especially useful for redirecting the child's standard input (file descriptor 0), standard output (file descriptor 1), and standard error (file descriptor 2) to places where the parent wants them to go.

Alternatively, this file-descriptor inheritance can also be done through the use of fork(), one or more calls to dup(), dup2(), and close(), and then exec*(). The call to fork() creates a child that inherits all of the parent's file descriptors. The child then uses dup(), dup2(), and close() to rearrange its file descriptors. Lastly, exec*() is called to replace the child with the process to be created. Though more complicated, this method of setting up file descriptors is portable, whereas the spawn() method is not.

Process termination

A process can terminate in one of two basic ways:

normally (e.g. the process terminates itself)
abnormally (e.g. the process terminates as the result of a signal's being set)

Normal process termination

A process can terminate itself by having any thread in the process call exit(). Returning from the main thread (i.e. main()) will also terminate the process, because the code that's returned to calls exit(). This isn't true of threads other than the main thread. Returning normally from one of them causes pthread_exit() to be called, which terminates only that thread. Of course, if that thread is the last one in the process, then the process is terminated.

The value passed to exit() or returned from main() is called the exit status.

Abnormal process termination

A process can be terminated abnormally for a number of reasons. Ultimately, all of these reasons will result in a signal's being set on the process. A signal is something that can interrupt the flow of your threads at any time. The default action for most signals is to terminate the process.

Note that what causes a particular signal to be generated is sometimes processor-dependent.

Here are some of the reasons that a process might be terminated abnormally:

If any thread in the process tries to use a pointer that doesn't contain a valid virtual address for the process, then the hardware will generate a fault and the kernel will handle the fault by setting the SIGSEGV signal on the process. By default, this will terminate the process.
A floating-point exception will cause the kernel to set the SIGFPE signal on the process. The default is to terminate the process.
If you create a shared memory object and then map in more than the size of the object, when you try to write past the size of the object you'll be hit with SIGBUS. In this case, the virtual address used is valid (since the mapping succeeded), but the memory cannot be accessed.

To get the kernel to display some diagnostics whenever a process terminates abnormally, configure procnto with multiple -v options. If the process has fd 2 open, then the diagnostics are displayed using it (stderr); otherwise, you can specify where the diagnostics get displayed by using the -D option to your startup. For example, the -D as used in this buildfile excerpt will cause the output to go to a serial port:

[virtual=x86,bios +compress] .bootstrap = {
    startup-bios -D 8250..115200
    procnto -vvvv
}

You can also have the current state of a terminated process written to a file so that you can later bring up the debugger and examine just what happened. This type of examination is called postmortem debugging. This happens only if the process is terminated due to one of these signals:

Signal	Description
SIGABRT	Program-called abort function
SIGBUS	Parity error
SIGEMT	EMT instruction
SIGFPE	Floating-point error or division by zero
SIGILL	Illegal instruction executed
SIGQUIT	Quit
SIGSEGV	Segmentation violation
SIGSYS	Bad argument to a system call
SIGTRAP	Trace trap (not reset when caught)
SIGXCPU	Exceeded the CPU limit
SIGXFSZ	Exceeded the file size limit

The process that dumps the state to a file when the process terminates is called dumper, which must be running when the abnormal termination occurs. This is extremely useful, because embedded systems may run unassisted for days or even years before a crash occurs, making it impossible to reproduce the actual circumstances leading up to the crash.

Effect of parent termination

In some operating systems, if a parent process dies, then all of its child processes die too. This isn't the case in Neutrino.

Detecting process termination

In an embedded application, it's often important to detect if any process terminates prematurely and, if so, to handle it. Handling it may involve something as simple as restarting the process or as complex as:

Notifying other processes that they should put their systems into a safe state.
Resetting the hardware.

This is complicated by the fact that some Neutrino processes call procmgr_daemon(). Processes that call this function are referred to as daemons. The procmgr_daemon() function:

detaches the caller from the controlling terminal
puts it in session 1
optionally, closes all file descriptors except stdin, stdout, and stderr
optionally, redirects stdin, stdout, stderr to /dev/null

As a result of the above, their termination is hard to detect.

Another scenario is where a server process wants to know if any of its clients disappear so that it can clean up any resources it had set aside on their behalf.

Let's look at various ways of detecting process termination.

Using the High Availability Framework

The High Availability Framework provides components not only for detecting when processes terminate, but also for recovering from that termination.

The main component is a process called the High Availability Manager (HAM) that acts as a "smart watchdog". Your processes talk to the HAM using the HAM API. With this API you basically set up conditions that the HAM should watch for and take actions when these conditions occur. So the HAM can be told to detect when a process terminates and to automatically restart the process. It will even detect the termination of daemon processes.

In fact, the High Availability Manager can restart a number of processes, wait between restarts for a process to be ready, and notify the process that this is happening.

The HAM also does heartbeating. Processes can periodically notify the HAM that they are still functioning correctly. If a specified amount of time goes by between these notifications, then the HAM can take some action.

The above is just a sample of what is possible with the High Availability Framework. For more information, see the High Availability Framework Developer's Guide.

Detecting termination from a starter process

If you've created a set of processes using a starter process as discussed at the beginning of this section, then all those processes are children of the starter process, with the exception of those that have called procmgr_daemon(). If all you want to do is detect that one of those children has terminated, then a loop that blocks on wait() or sigwaitinfo() will suffice. Note that when a child process calls procmgr_daemon(), both wait() and sigwaitinfo() behave as if the child process died, although the child is still running.

The wait() function will block, waiting until any of the caller's child processes terminate. There's also waitpid(), which lets you wait for a specific child process, wait3(), and wait4(). Lastly, there is waitid(), which is the lower level of all the wait*() functions and returns the most information.

The wait*() functions won't always help, however. If a child process was created using one of the spawn*() family of functions with the mode passed as P_NOWAITO, then the wait*() functions won't be notified of its termination!

What if the child process terminates, but the parent hasn't yet called wait*()? This would be the case if one child had already terminated, so wait*() returned, but then before the parent got back to the wait*(), a second child terminates. In that case, some information would have to be stored away about the second child for when the parent does get around to its wait*().

This is in fact the case. The second child's memory will have been freed up, its files will have been closed, and in general the child's resources will have been cleaned up, with the exception of a few bytes of memory in the process manager that contain the child's process ID and its exit status (or other reason for its termination). When the second child is in this state, it's referred to as a zombie. The child will remain a zombie until the parent either terminates or finds out about the child's termination (e.g. the parent calls wait*()).

What this means is that if a child has terminated and the parent is still alive but doesn't yet know about the terminated child (e.g. hasn't called wait*()), then the zombie will be hanging around. If the parent will never care, then you may as well not have the child become a zombie. To prevent the child from becoming a zombie when it terminates, create the child process using one of the spawn*() family of functions and pass P_NOWAITO for the mode.

Sample parent process using wait()

The following sample illustrates the use of wait() for waiting for child processes to terminate.

/* 
 * waitchild.c
 *
 * This is an example of a parent process that creates some child
 * processes and then waits for them to terminate. The waiting is
 * done using wait(). When a child process terminates, the
 * wait() function returns.
*/

#include <spawn.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char **argv)
{
    char                *args[] = { "child", NULL };
    int                 i, status;
    pid_t               pid;
    struct inheritance  inherit;

    // create 3 child processes
    for (i = 0; i < 3; i++) {
        inherit.flags = 0;
        if ((pid = spawn("child", 0, NULL, &inherit, args, environ)) == -1)
            perror("spawn() failed");
        else
            printf("spawned child, pid = %d\n", pid);
    }

    while (1) {
        if ((pid = wait(&status)) == -1) {
            perror("wait() failed (no more child processes?)");
            exit(EXIT_FAILURE);
        }
        printf("a child terminated, pid = %d\n", pid);
        
        if (WIFEXITED(status)) {
            printf("child terminated normally, exit status = %d\n",
                WEXITSTATUS(status));
        } else if (WIFSIGNALED(status)) {
            printf("child terminated abnormally by signal = %X\n",
                WTERMSIG(status));
        } // else see documentation for wait() for more macros
    }
}

The following is a simple child process to try out with the above parent.

#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    printf("pausing, terminate me somehow\n");
    pause();
    return 0;
}

The sigwaitinfo() function will block, waiting until any signals that the caller tells it to wait for are set on the caller. If a child process terminates, then the SIGCHLD signal is set on the parent. So all the parent has to do is request that sigwaitinfo() return when SIGCHLD arrives.

Sample parent process using sigwaitinfo()

The following sample illustrates the use of sigwaitinfo() for waiting for child processes to terminate.

/* 
 * sigwaitchild.c
 *
 * This is an example of a parent process that creates some child
 * processes and then waits for them to terminate.  The waiting is
 * done using sigwaitinfo().  When a child process terminates, the
 * SIGCHLD signal is set on the parent.  sigwaitinfo() will return
 * when the signal arrives.
*/

#include <errno.h>
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/neutrino.h>

void
signal_handler(int signo)
{
    // do nothing
}

int main(int argc, char **argv)
{
    char                *args[] = { "child", NULL };
    int                 i;
    pid_t               pid;
    sigset_t            mask;
    siginfo_t           info;
    struct inheritance  inherit;
    struct sigaction    action;

    // mask out the SIGCHLD signal so that it will not interrupt us,
    // (side note: the child inherits the parent's mask)
    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &mask, NULL);

    // by default, SIGCHLD is set to be ignored so unless we happen
    // to be blocked on sigwaitinfo() at the time that SIGCHLD
    // is set on us we will not get it.  To fix this, we simply
    // register a signal handler.  Since we've masked the signal
    // above, it will not affect us.  At the same time we will make
    // it a queued signal so that if more than one are set on us,
    // sigwaitinfo() will get them all.
    action.sa_handler = signal_handler;
    sigemptyset(&action.sa_mask);
    action.sa_flags = SA_SIGINFO; // make it a queued signal
    sigaction(SIGCHLD, &action, NULL);

    // create 3 child processes
    for (i = 0; i < 3; i++) {
        inherit.flags = 0;
        if ((pid = spawn("child", 0, NULL, &inherit, args, environ)) == -1)
            perror("spawn() failed");
        else
            printf("spawned child, pid = %d\n", pid);
    }

    while (1) {
        if (sigwaitinfo(&mask, &info) == -1) {
            perror("sigwaitinfo() failed");
            continue;
        }
        switch (info.si_signo) {
        case SIGCHLD:
            // info.si_pid is pid of terminated process, it is not POSIX
            printf("a child terminated, pid = %d\n", info.si_pid);
            break;
        default:
            // should not get here since we only asked for SIGCHLD
            break;
        }
    }
}

Detecting dumped processes

As mentioned above, you can run dumper so that when a process dies, dumper writes the state of the process to a file.

You can also write your own dumper-type process to run instead of, or as well as, dumper. This way the terminating process doesn't have to be a child of yours.

To do this, write a resource manager that registers the name /proc/dumper with type _FTYPE_DUMPER. When a process dies due to one of the appropriate signals, the process manager will open /proc/dumper and write the pid of the process that died. It then waits until you reply to the write with success, and then finishes terminating the process.

It's possible that more than one process will have /proc/dumper registered at the same time; however, the process manager notifies only the process that's at the beginning of its list for that name. You'll probably want both your resource manager and dumper to handle this termination. To do this, request the process manager to put you, instead of dumper, at the beginning of the /proc/dumper list by passing _RESMGR_FLAG_BEFORE to resmgr_attach(). You must also open /proc/dumper so that you can communicate with dumper if it's running. Whenever your io_write handler is called, write the pid to dumper and do your own handling. Of course, this works only when dumper is run before your resource manager; otherwise, your open of /proc/dumper won't work.

The following is a sample process that demonstrates the above:

/*
 *  dumphandler.c
 *
 *  This demonstrates how you get notified whenever a process
 *  dies due to any of the following signals:
 *
 *  SIGABRT
 *  SIGBUS
 *  SIGEMT
 *  SIGFPE
 *  SIGILL
 *  SIGQUIT
 *  SIGSEGV
 *  SIGSYS
 *  SIGTRAP
 *  SIGXCPU
 *  SIGXFSZ
 *
 *  To do so, register the path, /proc/dumper with type
 *  _FTYPE_DUMPER. When a process dies due to one of the above
 *  signals, the process manager will open /proc/dumper, and
 *  write the pid of the process that died - it will wait until
 *  you reply to the write with success, and then it will finish
 *  terminating the process.
 *
 *  Note that while it is possible for more than one process to
 *  have /proc/dumper registered at the same time, the process
 *  manager will notify only the one that is at the beginning of
 *  its list for that name.
 *
 *  But we want both us and dumper to handle this termination.
 *  To do this, we make sure that we get notified instead of
 *  dumper by asking the process manager to put us at the
 *  beginning of its list for /proc/dumper (done by passing
 *  _RESMGR_FLAG_BEFORE to  resmgr_attach()).  We also open
 *  /proc/dumper so that we can communicate with dumper if it is
 *  running.  Whenever our io_write handler is called, we write
 *  the pid to dumper and do our own handling.  Of course, this
 *  works only if dumper is run before we are, or else our open
 *  will not work.
 *
*/

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/iofunc.h>
#include <sys/dispatch.h>
#include <sys/neutrino.h>
#include <sys/procfs.h>
#include <sys/stat.h>

int io_write (resmgr_context_t *ctp, io_write_t  *msg,
              RESMGR_OCB_T *ocb);

static int  dumper_fd;

resmgr_connect_funcs_t  connect_funcs;
resmgr_io_funcs_t       io_funcs;
dispatch_t              *dpp;
resmgr_attr_t           rattr;
dispatch_context_t      *ctp;
iofunc_attr_t           ioattr;

char    *progname = "dumphandler";

int main(int argc, char **argv)
{
    /* find dumper so that we can pass any pids on to it */
    dumper_fd = open("/proc/dumper", O_WRONLY);

    dpp = dispatch_create();

    memset(&rattr, 0, sizeof(rattr));
    rattr.msg_max_size = 2048;

    iofunc_func_init(_RESMGR_CONNECT_NFUNCS, &connect_funcs,
                     _RESMGR_IO_NFUNCS, &io_funcs);
    io_funcs.write = io_write;

    iofunc_attr_init(&ioattr, S_IFNAM | 0600, NULL, NULL);

    resmgr_attach(dpp, &rattr, "/proc/dumper", _FTYPE_DUMPER,
                  _RESMGR_FLAG_BEFORE, &connect_funcs,
                  &io_funcs, &ioattr);

    ctp = dispatch_context_alloc(dpp);

    while (1) {
        if ((ctp = dispatch_block(ctp)) == NULL) {
            fprintf(stderr, "%s:  dispatch_block failed: %s\n",
                             progname, strerror(errno));
            exit(1);
        }
        dispatch_handler(ctp);
    }
}

struct dinfo_s {
    procfs_debuginfo    info;
    char                pathbuffer[PATH_MAX]; /* 1st byte is
                                                 info.path[0] */
};

int
display_process_info(pid_t pid)
{
    char            buf[PATH_MAX + 1];
    int             fd, status;
    struct dinfo_s  dinfo;
    procfs_greg     reg;

    printf("%s: process %d died\n", progname, pid);

    sprintf(buf, "/proc/%d/as", pid);

    if ((fd = open(buf, O_RDONLY|O_NONBLOCK)) == -1)
        return errno;

    status = devctl(fd, DCMD_PROC_MAPDEBUG_BASE, &dinfo,
                    sizeof(dinfo), NULL);
    if (status != EOK) {
        close(fd);
        return status;
    }

    printf("%s: name is %s\n", progname, dinfo.info.path);

    /*
     * For getting other type of information, see sys/procfs.h,
     * sys/debug.h, and sys/dcmd_proc.h
     */
     
    close(fd);
    return EOK;
}

int
io_write(resmgr_context_t *ctp, io_write_t *msg,
         RESMGR_OCB_T *ocb)
{
    char    *pstr;
    int     status;
    
    if ((status = iofunc_write_verify(ctp, msg, ocb, NULL))
        != EOK)
        return status;

    /* parenthesized: != binds more tightly than & */
    if ((msg->i.xtype & _IO_XTYPE_MASK) != _IO_XTYPE_NONE)
        return ENOSYS;

    if (ctp->msg_max_size < msg->i.nbytes + 1)
        return ENOSPC; /* not all the message could fit in the
                          message buffer */

    pstr = (char *) (&msg->i) + sizeof(msg->i);
    pstr[msg->i.nbytes] = '\0';

    if (dumper_fd != -1) {
        /* pass it on to dumper so it can handle it too */
        if (write(dumper_fd, pstr, strlen(pstr)) == -1) {
            close(dumper_fd);
            dumper_fd = -1; /* something wrong, no sense in
                               doing it again later */
        }
    }
    
    if ((status = display_process_info(atoi(pstr))) == -1)
        return status;
    
    _IO_SET_WRITE_NBYTES(ctp, msg->i.nbytes);
    
    return EOK;
}

Detecting the termination of daemons

What would happen if you've created some processes that subsequently made themselves daemons (i.e. called procmgr_daemon())? As we mentioned above, the wait*() functions and sigwaitinfo() won't help.

For these you can give the kernel an event, such as one containing a pulse, and have the kernel deliver that pulse to you whenever a daemon terminates. This request for notification is done by calling procmgr_event_notify() with PROCMGR_EVENT_DAEMON_DEATH in flags.

See the documentation for procmgr_event_notify() for an example that uses this function.

Detecting client termination

The last scenario is where a server process wants to be notified of any clients that terminate so that it can clean up any resources that it had set aside for them.

This is very easy to do if the server process is written as a resource manager, because the resource manager's io_close_dup() and io_close_ocb() handlers, as well as the ocb_free() function, will be called if a client is terminated for any reason.