slm

System launch and monitor: launch complex applications consisting of many processes that must be started in a specific order

Syntax:

slm [-avV] [-b seconds] [-D debug_mode] [-n subsystem_path]
    [-p priority] [-P search_path] [-r recovery_mode]
    [-R frequency/sec|min|hour] [-s comp_name] [-t polling_interval]
    [-T total_wait] [-x comp_name] config_file

Runs on:

QNX OS

Options:

-a
Adopt running processes. Use this option to integrate SLM with an existing system where some processes may already be running. If you place component entries for all relevant system processes in the configuration file, SLM will adopt these processes at startup as if it had launched them itself (and can thus control the processes via the command interface or restart them automatically if they terminate abnormally; see Normal vs. abnormal termination).

The adoption mechanism works only if the component in the configuration file that corresponds to the running process uses exactly the same arguments (specified using the args element) as the running process and the arguments are in the same order. Otherwise, SLM does not recognize that the component corresponds to the running process.

-b seconds
Specify a back-off period for all components. When a component terminates abnormally, SLM waits this number of seconds before it attempts to restart it. If the restart is not successful, it adds this number of seconds to the wait time for each subsequent attempt. For example, if you start SLM with -b 2, SLM waits two seconds before it attempts to restart the process for the first time, waits four seconds before the second attempt, waits six seconds before the third attempt, and so on. After SLM successfully starts a component, it resets the back-off period to its original value.

You can set a different back-off period for an individual component by specifying a repair element with a backoff=seconds attribute for the component in the SLM configuration file.

If not specified, the back-off period is 0 seconds.

-D debug_mode
Specify when to use the <SLM:debug> argument list (instead of the normal <SLM:args> list). One of: cmd (default), startup, or always. With cmd, the debug list is used only when the module is started using slmctl with -d. With startup, all components launched at startup (see the -s option) initially use the debug list, but then honor the -d option of subsequent restarts. With always, the debug list is always used.
-n subsystem_path
Set the access point (default is /dev/slm) for client applications to write control and query commands. For more information on the control and query commands, see slmctl.
-p priority
Set the priority of the SLM server thread (default is 30).
-P search_path
Set the search path for executables (default is $PATH). When launching a process, SLM looks in the search path to find the executable if the corresponding command element doesn't contain a full path.
-r recovery_mode
Set the recovery mode for components monitored by SLM. One of: none, stop, or restart (the default). The action specified with the -r option is performed when a component terminates abnormally if that component doesn't override this setting in its repair element.
-R frequency

Set how frequently SLM attempts to recover a component that has terminated abnormally. The frequency argument specifies the maximum number of recovery attempts as an integer and one of the following suffixes, separated by a forward slash: sec (seconds), min (minutes), or hour. For example, 1/min. Default is 2/min (2 times per minute).

-s comp_name
Name a component or module to launch when slm starts. For convenience, you can use the built-in pseudo-modules all and none (default is all).
-t polling_interval
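Set the global polling interval, in milliseconds, that SLM uses while it waits for a component's waitfor condition. Default is 100. A component can override this value with the polltime attribute of its waitfor element.
-T total_wait
Set the global total wait time (the timeout), in milliseconds, for a waitfor condition. Default is 50000 (50 seconds). A component can override this value with the polltime attribute of its waitfor element.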
-v
Set the output verbosity for messages written to the system log (slogger2; use slog2info to view them). The -v option is cumulative; each additional v adds a level of verbosity, up to 7 levels. Default level is warning messages.
-V
Log output messages to the console. The -V option is cumulative; each additional V adds a level of verbosity. Default level is error messages.
-x comp_name
Name a component or module to terminate when slm terminates. For convenience, you can use the built-in pseudo-modules all and none (default is all).

Description:

The System Launch and Monitor (SLM) service automates the management of complex, multi-process applications that must be started in a specific order.

A configuration file controls SLM's behavior. It specifies the processes to run, their properties, and any interprocess dependencies. You can include other XML files using XML external entities, if needed.

SLM uses the information in the configuration file to internally construct a directed acyclic graph (DAG). SLM uses the DAG to determine the order in which it starts the processes.

Similarly, when a process fails, SLM uses the DAG to determine which dependent processes it must terminate, and then restart, when it starts the failed process again.

You can mark a process as critical to system functionality, meaning that if the process dies, the system crashes (for more information, see the description of the launch attribute of the command element in the configuration file).

When you start SLM, you must make sure that slogger2 is running and specify a configuration file, but all the other parameters are optional.

Client applications can control SLM using the slmctl utility or by directly writing commands to the /dev/slm interface. For more information including the control and query commands, see slmctl.

SLM configuration file

SLM uses an XML configuration file to determine the appropriate order for starting processes. The configuration file lists all the programs for SLM to manage, any dependencies between the programs, the commands for launching the processes, and other properties.

Configuration file structure

The root XML element of the configuration file is system. All element names start with SLM:, so the root element (and the outline of the file) looks like this:

<SLM:system>
    -- component and module descriptions --
</SLM:system>

Components

A process managed by SLM is represented by a component. You must provide a component name (usually based on the process name) to use within the configuration file when specifying interprocess dependencies or membership in a module.

All component elements are children of the root element and contain other elements that describe the properties of individual components. The component element uses the following syntax:

<SLM:component name="component_name">
    <SLM:ability> ability </SLM:ability>
    <SLM:args> args </SLM:args>
    <SLM:cd> directory </SLM:cd>
    <SLM:command [launch=" launch_option[ launch_option]... "]>
                 executable_path</SLM:command>
    <SLM:debug> command_args </SLM:debug>
    <SLM:depend [state="session|stateless"]>
                component_name </SLM:depend>
    <SLM:envvar [clear="none|login|all"]>
               environment_variables </SLM:envvar>
    <SLM:groups> gid_1[,gid_2]... </SLM:groups>
    <SLM:priority> priority_algorithm </SLM:priority>
    <SLM:repair [backoff=seconds]>  default|none|stop|restart  </SLM:repair>
    <SLM:rlimit> resource:soft_limit:hard_limit[,resource:soft_limit:hard_limit,...] </SLM:rlimit>
    <SLM:runmask> component_runmask  </SLM:runmask>
    <SLM:stderr [iomode="w[+]|a[+]"]> stderr_filename </SLM:stderr>
    <SLM:stdin [iomode="r[+]"]> stdin_filename </SLM:stdin>
    <SLM:stdout [iomode="w|a"]> stdout_filename </SLM:stdout>
    <SLM:stop
        [stop="none|signal"] [child="false|true"] [timeout="timeout_time"]>
        stop_data </SLM:stop>
    <SLM:tty> tty_filename </SLM:tty>
    <SLM:type> type_name </SLM:type>
    <SLM:user> uid|:gid|uid:gid </SLM:user>
    <SLM:waitfor [wait="none|delay|pathname|exits|blocks"]
                 [polltime="poll_time:timeout_time"]> waitfor_data </SLM:waitfor>
</SLM:component>

Only the command element is mandatory—all components must have a path to the binary. The remaining elements are optional.

<SLM:ability>

<SLM:ability> ability </SLM:ability>
ability

A procnto ability to give the process. This element is equivalent to the -A option of the on command, and the syntax of the ability specification is the same. Specify an ability element for each required ability.

Using many ability specifications to launch processes is generally a bad idea; using types to configure abilities is simpler and safer.

<SLM:args>

<SLM:args> args </SLM:args>
args
The list of command-line arguments to launch the process. Use spaces to separate individual arguments. If an argument includes embedded spaces, enclose it in single or double quotes.

If you are specifying arguments for a built-in command, you separate arguments with spaces and sets of arguments with a semi-colon (;). For more information, see the <SLM:command> description.

If the component corresponds to a running process that SLM will adopt as if SLM had launched it (specified by the -a option), make sure that the arguments are identical to the ones the running process uses and that they appear in the same order.
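
As an illustration of the quoting rule, the following sketch passes three arguments to a program. The component name, executable path, and arguments are all hypothetical placeholders; only the element names come from this reference.

<!-- Hypothetical component: name, path, and arguments are placeholders -->
<SLM:component name="logger_demo">
    <SLM:command>/usr/bin/logger_demo</SLM:command>
    <!-- The second argument contains spaces, so it is enclosed in double quotes -->
    <SLM:args>-v "my log prefix" /tmp/demo.log</SLM:args>
</SLM:component>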

<SLM:cd>

<SLM:cd> directory </SLM:cd>
directory
The directory to switch into when launching the process; this directory becomes the process's working directory.

<SLM:command>

<SLM:command [launch=" launch_option[ launch_option]... "]>
             executable_path</SLM:command>
launch=" launch_option[ launch_option]..."
Specify one or more of the following options, separated by spaces:
  • builtin — The name of a built-in SLM command. The following options are valid:
    • chmod path mode — Call chmod() and pass it the specified path and mode values.
    • no_op — Does nothing. Can be used to wait for a filepath or detect whether a process started outside of SLM is ready.
    • link existing new — Call link() and pass the specified arguments to it.
    • mkdir path mode — Call mkdir() and pass it the specified path and mode values.
    • pathmgr_symlink path symlink — Call pathmgr_symlink() and pass the specified arguments to it.
    • pathmgr_unlink path — Call pathmgr_unlink() and pass the specified arguments to it.
    • remove filename — Call remove() and pass the specified argument to it.
    • symlink path symlink — Call symlink() and pass the specified arguments to it.
    • system command — Call system() and pass the specified argument to it.
    • unlink path — Call unlink() and pass the specified argument to it.
    You specify arguments to the functions in the <SLM:args> element. Separate arguments with spaces and sets of arguments with a semi-colon (;). For example, the following entry creates two sets of arguments, so pathmgr_symlink is called once for each set:
    <SLM:args>path1 symlink1; path2 symlink2</SLM:args>
    The sample configuration files provided below include a component that uses the builtin option.
  • critical — Start a process with the POSIX_SPAWN_CRITICAL flag. This flag indicates that if this process dies, the system crashes.
  • pathname — When calling posix_spawn(), pass the full pathname in argv[0] instead of truncating the value to a filename. This information is required by some utilities, such as sshd.
  • session — To start a process as a session leader, the launch attribute of the <SLM:command> element must include the value session and the <SLM:component> element must have a <SLM:tty> child element. The <SLM:tty> element value specifies where to redirect the stdin, stdout, and stderr of the process. See the examples for more information on how to use SLM to start a shell.
executable_path
Specifies the path (absolute or relative) of the binary or script to execute.
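
The launch options described above might be used as in the following sketch. Both components, their names, and all paths are hypothetical. The first combines two launch options separated by a space; the second uses the builtin option with two sets of arguments.

<!-- Hypothetical component: started as critical, with the full pathname in argv[0] -->
<SLM:component name="watchdog_demo">
    <SLM:command launch="critical pathname">/usr/sbin/watchdog_demo</SLM:command>
</SLM:component>

<!-- Hypothetical component: two sets of arguments for the built-in pathmgr_symlink command -->
<SLM:component name="make_links">
    <SLM:command launch="builtin">pathmgr_symlink</SLM:command>
    <SLM:args>/tmp/data /data; /tmp/logs /logs</SLM:args>
</SLM:component>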

<SLM:debug>

<SLM:debug> command_args </SLM:debug>
command_args
An alternative list of command-line arguments to use when SLM launches the process in debug mode. For example, including -vvvvvv in the list starts the associated process with increased verbosity when it's run in debug mode.
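
A minimal sketch of a component with both argument lists follows; the component name, path, and options are hypothetical. Because the debug list is used instead of the normal list, this sketch repeats the normal arguments in it and then adds the verbosity option.

<!-- Hypothetical component: name, path, and options are placeholders -->
<SLM:component name="media_demo">
    <SLM:command>/usr/bin/media_demo</SLM:command>
    <!-- Used for a normal launch -->
    <SLM:args>-c /etc/media_demo.conf</SLM:args>
    <!-- Used instead of the args list when the component runs in debug mode -->
    <SLM:debug>-c /etc/media_demo.conf -vvvvvv</SLM:debug>
</SLM:component>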

<SLM:depend>

<SLM:depend [state="session|stateless"]> component_name </SLM:depend>
state="session|stateless"
  • session— if a component must be stopped or restarted, SLM first stops the component that depends on it (specified by component_name). Typically, you specify session when there is a client-server relationship between the components and the server maintains client information.
  • stateless— if a component must be stopped or restarted, the component that depends on it is unaffected. For example, if there is a client-server relationship between the components but the server doesn't maintain any client information, it's not necessary to restart clients if the server is restarted.
component_name
The name of the prerequisite component. A component can have zero or more dependencies.

You must define a separate element for each dependency.

SLM won't start a component until all the prerequisites are running and any waitfors are complete.
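
The following sketch shows one component with two dependencies, each in its own depend element. All three component names and paths are hypothetical.

<!-- Hypothetical prerequisite that keeps per-client state -->
<SLM:component name="db_server">
    <SLM:command>/usr/bin/db_server</SLM:command>
</SLM:component>

<!-- Hypothetical prerequisite that keeps no client state -->
<SLM:component name="cache_server">
    <SLM:command>/usr/bin/cache_server</SLM:command>
</SLM:component>

<!-- Hypothetical dependent component: one depend element per prerequisite -->
<SLM:component name="db_client">
    <SLM:depend state="session">db_server</SLM:depend>
    <SLM:depend state="stateless">cache_server</SLM:depend>
    <SLM:command>/usr/bin/db_client</SLM:command>
</SLM:component>

With this configuration, SLM would be expected to stop db_client before it stops or restarts db_server, while a restart of cache_server would leave db_client running.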

<SLM:envvar>

<SLM:envvar [clear="none|login|all"]> environment_variables </SLM:envvar>
clear="none|login|all"
Specifies changes to environment variables. By default, the variables are inherited from SLM. The clear attribute specifies which current environment variables to clear or preserve:
  • none—Preserve all current environment variables (default).
  • login—Clear all environment variables except for any that the envvar element specifies and BAUD, DISPLAY, HZ, PHOTON, SYSNAME, TERM, TZ, HOME, LOGNAME, PATH, SHELL, and USERNAME.
  • all—Clear all current environment variables.
environment_variables
A list of environment variables to either merge with or override the current environment variables. Use the format VAR=value to specify each variable.
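
As a sketch, the following hypothetical component clears everything except the login variables and then sets one variable of its own. The component name, path, and the APP_MODE variable are made up for illustration. Only one variable is shown because this reference doesn't state how multiple variables are separated.

<!-- Hypothetical component: name, path, and variable are placeholders -->
<SLM:component name="env_demo">
    <SLM:command>/usr/bin/env_demo</SLM:command>
    <!-- Keep only the login variables, then set APP_MODE -->
    <SLM:envvar clear="login">APP_MODE=test</SLM:envvar>
</SLM:component>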

<SLM:groups>

<SLM:groups> gid_1[,gid_2]... </SLM:groups>
gid_1[,gid_2]...
A list of group IDs that specifies the group access list for the component's process.

<SLM:priority>

<SLM:priority> priority_algorithm </SLM:priority>
priority_algorithm
An alphanumeric value that indicates the priority level and scheduling policy to assign to the process (e.g., 10r). The number is the priority level, and the letter that follows it selects the scheduling policy:
  • f: SCHED_FIFO (FIFO scheduling)
  • r: SCHED_RR (round-robin scheduling)
  • o: SCHED_OTHER (other scheduling)
For descriptions of the scheduling policies, see Scheduling policies in the Programmer's Guide.
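
For example, a hypothetical component (the name and path are placeholders) that should run at priority 10 with round-robin scheduling could be sketched as:

<!-- Hypothetical component: name and path are placeholders -->
<SLM:component name="audio_demo">
    <SLM:command>/usr/bin/audio_demo</SLM:command>
    <!-- Priority 10, SCHED_RR -->
    <SLM:priority>10r</SLM:priority>
</SLM:component>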

<SLM:repair>

<SLM:repair [backoff=seconds]>  default|none|stop|restart  </SLM:repair>
Specifies the action to take if the component terminates abnormally:
  • backoff=seconds—SLM waits the specified number of seconds and then attempts to restart the failed component. If the restart is not successful, it adds this number of seconds to the wait time for each subsequent attempt. For an example, see the -b command-line option description.
  • default—SLM performs the action specified by the -r command-line option.
  • none—SLM takes no recovery action.
  • stop—SLM stops any other components that depend on the component that failed.
  • restart—SLM restarts the failed component.
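
A per-component recovery setting might look like the following sketch. The component name and path are hypothetical, and the backoff value is quoted on the assumption that the configuration file must be well-formed XML.

<!-- Hypothetical component: name and path are placeholders -->
<SLM:component name="sensor_demo">
    <SLM:command>/usr/bin/sensor_demo</SLM:command>
    <!-- Restart after abnormal termination; wait 5 s, then 10 s, then 15 s, and so on -->
    <SLM:repair backoff="5">restart</SLM:repair>
</SLM:component>

Here the component's own setting would apply instead of the values given with the -r and -b command-line options.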

<SLM:rlimit>

<SLM:rlimit> resource:soft_limit:hard_limit[,resource:soft_limit:hard_limit,...] </SLM:rlimit>
resource
A system resource whose consumption by the component's process you want to limit.

For information on possible resources and the actions taken when limits are exceeded, see prlimit() in the C Library Reference.

soft_limit
The soft limit for the resource.
hard_limit
The hard limit for the resource.

<SLM:runmask>

<SLM:runmask> component_runmask  </SLM:runmask>
component_runmask
A value that is interpreted as a bitmask, which specifies on which processors a process can run. It is a 32-bit integer and can be specified using any format that strtol() recognizes.

For example, the decimal value 5 corresponds to the bitmask 00000101, which allows the process to run on CPUs 0 and 2.

Only specify the runmask once.

A valid runmask is always inherited by children.

For more information about runmasks, see the Multicore Processing chapter of the QNX OS Programmer's Guide.
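
The decimal example above could appear in a component entry as in this sketch; the component name and path are hypothetical.

<!-- Hypothetical component: name and path are placeholders -->
<SLM:component name="worker_demo">
    <SLM:command>/usr/bin/worker_demo</SLM:command>
    <!-- Decimal 5 is binary 101, which selects CPUs 0 and 2 -->
    <SLM:runmask>5</SLM:runmask>
</SLM:component>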

<SLM:stderr>

<SLM:stderr [iomode="w[+]|a[+]"]> stderr_filename </SLM:stderr>
iomode="w[+]|a[+]"
The access mode: overwrite (w), read and overwrite (w+; default), append (a), or read and append (a+).
stderr_filename
Name of the file to which the standard error stream (stderr) is redirected.

<SLM:stdin>

<SLM:stdin [iomode="r[+]"]> stdin_filename </SLM:stdin>
iomode="r[+]"
The access mode: read only (r) or read and write (r+).
stdin_filename
Name of the file from which standard input (stdin) is read.

<SLM:stdout>

<SLM:stdout [iomode="w|a"]> stdout_filename </SLM:stdout>
iomode="w|a"
The access mode: overwrite (w) or append (a).
stdout_filename
Name of the file to which standard output (stdout) is redirected.
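
As a sketch, the following hypothetical component appends its standard output and standard error streams to two separate files. The component name and all paths are placeholders.

<!-- Hypothetical component: name and paths are placeholders -->
<SLM:component name="log_demo">
    <SLM:command>/usr/bin/log_demo</SLM:command>
    <!-- Append to the files instead of overwriting them -->
    <SLM:stdout iomode="a">/tmp/log_demo.out</SLM:stdout>
    <SLM:stderr iomode="a">/tmp/log_demo.err</SLM:stderr>
</SLM:component>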

<SLM:stop>

<SLM:stop 
        [stop="none|signal"] [child="false|true"] [timeout="timeout_time"]>
        stop_data </SLM:stop>
stop="none|signal"
The signal setting (the default) causes SLM to send a signal, specified by number or name in the element value, to the underlying process. The none setting disables the signaling; in this case, SLM takes no action to stop the process.
child="false|true"
When set to false (the default), no child of the process is terminated. When set to true, SLM uses application groups to reliably terminate all children of the process before it terminates the process itself.
timeout="timeout_time"
The maximum amount of time to try to stop a process nicely, in milliseconds. If the process can't be stopped nicely, SIGKILL is sent to it. For no timeout, specify 0 (the default).
stop_data

Contains the signal number or name to send to the process to stop it. By default, SIGTERM is sent, but you can change this to any signal. Because a signal name does not need to begin with "SIG", all of the following example values are valid:

  • 15
  • TERM
  • SIGTERM
The name string is case-insensitive.
If repeated attempts to stop the process fail, SIGKILL is sent.
This element value isn't needed when the stop attribute is set to none.
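
The attributes above might be combined as in the following sketch; the component name and path are hypothetical, and SIGINT is chosen only as an example of a signal other than the default.

<!-- Hypothetical component: name and path are placeholders -->
<SLM:component name="player_demo">
    <SLM:command>/usr/bin/player_demo</SLM:command>
    <!-- Send SIGINT, terminate child processes as well, and allow 2000 ms before SIGKILL -->
    <SLM:stop stop="signal" child="true" timeout="2000">SIGINT</SLM:stop>
</SLM:component>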

<SLM:tty>

<SLM:tty> tty_filename </SLM:tty>
tty_filename
Name of the file to which stderr, stdin, and stdout are redirected when a process is started as a session leader.

<SLM:type>

<SLM:type> type_name </SLM:type>
type_name
Name of the security type to launch the component as. The name is a label that reflects the security policy being enforced. Generally, you should pick a name based on what you're trying to launch. For information about security policies, see the Security Policies chapter in the Security Developer's Guide.

<SLM:user>

<SLM:user> uid|:gid|uid:gid </SLM:user>
uid|:gid|uid:gid
Assigns a user ID (UID), group ID (GID), or both to the underlying process. The values can be names to look up in /etc/passwd and /etc/group.
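
A minimal sketch that sets both IDs, together with a group access list from the groups element, is shown below. The component name, path, and all numeric IDs are hypothetical placeholders.

<!-- Hypothetical component: name, path, and IDs are placeholders -->
<SLM:component name="net_demo">
    <SLM:command>/usr/bin/net_demo</SLM:command>
    <!-- Run with UID 1000 and GID 1000 -->
    <SLM:user>1000:1000</SLM:user>
    <!-- Group access list for the process -->
    <SLM:groups>1001,1002</SLM:groups>
</SLM:component>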

<SLM:waitfor>

<SLM:waitfor [wait="none|delay|pathname|exits|blocks"]
                 [polltime="poll_time:timeout_time"]> waitfor_data </SLM:waitfor>
wait="none|delay|pathname|exits|blocks"
Once a component has been launched, SLM can wait for that component to set itself up before starting any dependent components. Values:
  • none (the default)—Causes SLM to start other dependent components immediately.
  • delay—SLM pauses for the specified number of milliseconds before it starts the dependent components.
  • pathname—SLM probes for the appearance of the specified pathname.
  • exits—SLM waits for the process to exit with the specified exit code. If the exit code is different from the expected one, SLM restarts the process.
  • blocks—SLM waits for a specified thread in the process to reach the RECV-blocked state.
polltime="poll_time:timeout_time"

Use with wait="pathname" or wait="exits" to specify a polling interval and total wait time (both in milliseconds) that override the global values.

For example, polltime="100:20000" results in polling every 100 milliseconds and timing out after 20 seconds.

waitfor_data
Contains data for the specified wait condition:
  • none — No data required.
  • delay — A time in milliseconds (e.g., 5000 for a 5-second delay).
  • pathname — A path.
  • exits — The expected exit code (default is 0).
  • blocks — A thread ID.
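
The wait conditions above might be used as in the following sketch. Both components, their paths, and the /dev/demo_server pathname are hypothetical.

<!-- Hypothetical server: its dependents start only after /dev/demo_server appears -->
<SLM:component name="demo_server">
    <SLM:command>/usr/bin/demo_server</SLM:command>
    <!-- Poll every 100 ms and time out after 20 seconds -->
    <SLM:waitfor wait="pathname" polltime="100:20000">/dev/demo_server</SLM:waitfor>
</SLM:component>

<!-- Hypothetical client: any components that depend on it start after a 5-second pause -->
<SLM:component name="demo_client">
    <SLM:depend>demo_server</SLM:depend>
    <SLM:command>/usr/bin/demo_client</SLM:command>
    <SLM:waitfor wait="delay">5000</SLM:waitfor>
</SLM:component>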

Modules

You can group components into modules. The processes within a module could make up a subsystem or could be used to establish a set of system states, such as a base level of operation and various higher levels. Modules must be named so they can be internally referenced. Each module must be described in an element, as follows:

<SLM:module name="device_monitors">
    -- module description --
</SLM:module>

To list the components within a module, use the member element. There are no attributes for member elements; the element values refer to member components by the internal names defined in their respective component elements. Modules cannot contain depend elements.

Note:

You can include multiple components in a module by using one member element with wildcards in the component names. For example, you can write:

<SLM:member>devb-*</SLM:member>

Components and modules may be specified in any order in the XML configuration file, but SLM raises an error if any circular dependencies are found.
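
Putting these pieces together, a module description might look like the following sketch. The module name storage and the component name fs_monitor are hypothetical, and each member is assumed to match a component defined elsewhere in the same configuration file.

<!-- Hypothetical module: groups all devb-* components and one other component -->
<SLM:module name="storage">
    <SLM:member>devb-*</SLM:member>
    <SLM:member>fs_monitor</SLM:member>
</SLM:module>

Because the -s and -x options accept module names, a module like this one could be named on the slm command line to start or terminate its members as a group.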

Reusing SLM modules and components

You can define modules and components in separate XML files and reuse them in one or more SLM configuration files.

In the SLM configuration file that reuses these modules and components, you must declare the files in which the reusable definitions reside. The syntax is as follows:

<!DOCTYPE SLM_system [
    <!ENTITY inclusion_name SYSTEM 'filename'>
]>
where inclusion_name is a name that you use in your SLM configuration file to identify the reusable entities and filename is a separate file on your system where your reusable SLM modules and components are defined.

At the point in your SLM configuration file where you want to include the reusable entities, include them by specifying the following:

&inclusion_name;

For example, suppose your system has a file called my_reusable_modules.xml that defines SLM modules and components that can be included in different SLM configuration files. In one of your SLM configuration files, you can define an entity named reuseModules and later include it:

<!DOCTYPE SLM_system [
    <!ENTITY reuseModules SYSTEM 'my_reusable_modules.xml'>
]>
...
<SLM:system>
    ...
    <!-- Include the contents of what's specified in 'my_reusable_modules.xml'
            by specifying the entity 'reuseModules' -->
    &reuseModules;
    ...
</SLM:system>

Sample configuration files

Suppose you want to automate the setup of your system's IP connectivity. This would require running io-sock, which creates an IP socket for network traffic, running if_up to wait for an interface to be ready for configuration, and then running ifconfig to bind an IP address to the socket. You can create a module that includes three components that correspond to io-sock and the two utilities. You can then describe the dependency of if_up on io-sock, and ifconfig on if_up, in the component entries. The XML file would then look like this:

<SLM:system>
    <SLM:component name="io-sock">
        <SLM:command>/sbin/io-sock</SLM:command>
        <SLM:args>-m phy -m pci -m em</SLM:args>
        <SLM:waitfor wait="pathname">/dev/socket</SLM:waitfor>
    </SLM:component>
    <SLM:component name="if_up">
        <SLM:depend>io-sock</SLM:depend>
        <SLM:command>/sbin/if_up</SLM:command>
        <SLM:args>-p em0</SLM:args>
        <SLM:waitfor wait="exits"></SLM:waitfor>
    </SLM:component>
    <SLM:component name="ifconfig">
        <SLM:depend>if_up</SLM:depend>
        <SLM:command>/sbin/ifconfig</SLM:command>
        <SLM:args>em0 192.168.1.5 up</SLM:args>
        <SLM:waitfor wait="exits"></SLM:waitfor>
    </SLM:component>
    <SLM:module name="net-setup">
        <SLM:member>io-sock</SLM:member>
        <SLM:member>if_up</SLM:member>
        <SLM:member>ifconfig</SLM:member>
    </SLM:module>
</SLM:system>

The following example shows how to use SLM to start a shell:

<SLM:component name="console"> 
    <SLM:command launch="session">/bin/ksh</SLM:command> 
    <SLM:args>-l</SLM:args> 
    <SLM:tty>/dev/ser1</SLM:tty> 
    ... 
</SLM:component>

The following example shows how sshd could be started by SLM (so that sshd could be monitored):

<SLM:component name="sshd">
    <SLM:command launch="pathname">/system/xbin/sshd</SLM:command> 
    <SLM:args>-D</SLM:args>
    ... 
</SLM:component> 

The following example shows how to use the builtin option to call a built-in SLM function (system()) and pass arguments to it using the args element:

<SLM:component name="root-fs">
    <SLM:depend>mount-fs</SLM:depend>
    <SLM:command launch="builtin">system</SLM:command>
    <SLM:args>setconf _CS_HOSTNAME __HOSTNAME__</SLM:args>
</SLM:component>

Normal vs. abnormal termination

SLM considers a process to have terminated normally in the following situations only:
  • SLM terminates a component's process because:
    • a stop action was created by executing slmctl stop component.
    • a dependency required SLM to stop the component's process.
  • The component is configured with a waitfor element that specifies wait="exits" and the component's process exits with the expected exit code.
All other process terminations are considered abnormal and cause SLM to perform the component's recovery action (by default, restarting the component's process). If a process dies more often than the recovery frequency allows (see the -R option), SLM stops trying to restart the process even though the termination is abnormal.