Controlling Resources Using the Thread Scheduler

Overview

The thread scheduler is a component of the QNX adaptive partitioning architecture. The thread scheduler helps solve the problem of controlling the consumption of resources in a system. For example, we might want to control these resources to:

prevent an application from consuming too many resources, such that it starves another application
maintain a reserve of resources for emergency purposes, such as a disaster-recovery system, or a field-debugging shell
limit well-behaved applications to a set share of the resource allocation for the system. For example, when a QNX user builds a system that serves several end users, the QNX user might want to bill their end users by the amount of throughput or capacity they are allocated on the shared system.

However, the details of controlling a resource differ greatly depending on the type of resource. This chapter covers controlling scheduling (time partitions). The following table summarizes how the thread scheduler handles this resource:

Question	Answer for the thread scheduler
When do you get more of the resource?	More time appears.
How much history of resource consumption does the adaptive partitioning system use to make decisions?	Time usage over the last 100 milliseconds (a rolling window). The window size is configurable, but it is typically short.
Hierarchy of partitions: Does the partition size limit of a parent limit the size of the child partitions?	Yes. A child partition can never be given a larger CPU share than its parent partition. When a child scheduler partition is created, we subtract the child's budget (partition size) from the size of its parent so that a child is separate from its parent. Why: The hierarchical accounting rules needed for a child partition to be a component of a parent partition are too CPU-time intensive for scheduling because scheduling operations occur thousands of times every second, and continue forever.
Is there a limit to the number of partitions?	Yes. There is a maximum of eight scheduler partitions. Why: For every scheduling operation, the thread scheduler must examine every partition before it can pick a thread to run. That may occur 50,000 times per second on a 700 MHz x86 (i.e., a slow machine), so it's important to limit the number of scheduler partitions to keep scheduler overhead to a minimum.
Is the hierarchy of partitions represented in a path namespace?	No. Scheduler partitions are named in a small flat namespace that is unique to the thread scheduler.
In what units are partitions sized?	The percentage of CPU time.
What do the terms guaranteed, minimum size, and maximum size mean for partitions?	The size, or budget, of a scheduler partition is the guaranteed minimum amount of CPU time that the threads in that partition are allowed to consume over the next 100 ms rolling window. Scheduler partitions do not have a maximum size (i.e., an amount of consumption that would cause the thread scheduler to stop running threads in a partition because they were using too much of the system's resources). Instead, the thread scheduler allows a partition to overrun, or exceed, its budget when other partitions are not using their guaranteed minimums. This behavior is specific to scheduling. It's designed to make the most possible use of the CPU at all times (i.e., keep the CPU busy if at least one thread is ready to run).
What mechanism enforces partition consumption rules? When are these rules applied?	When: On every timer interrupt (typically, every millisecond); on every message or pulse send, receive, or reply; on every signal; on every mutex operation; on every stack fault; and many times during process manager operations (creation or destruction of processes or threads, and open() operations on elements of the path namespace). Enforcement mechanism: If a partition is over budget (meaning that its consumption of CPU time over the last 100 milliseconds exceeds the partition's size, and other partitions are also demanding time) and one of its threads wants to run, the thread scheduler doesn't run that thread; it runs some other thread instead. Only when enough time has elapsed that the partition's average CPU time use (over the last 100 milliseconds) falls below the partition's size will the scheduler run the thread. However, the thread is guaranteed to run eventually.
Can we say that a partition has members? What is the member?	Yes, threads are members of scheduler partitions. We say they're running in a scheduler partition. However, a mechanism designed to avoid priority-inversion problems means that occasionally threads can temporarily move to other partitions. The different threads of a process may be in different scheduler partitions.
What utility-commands are used to configure partitions?	The `aps` command using options for scheduler partitions only.
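
The rules in the table above can be illustrated with a small, simplified model. This is only a sketch, not the QNX implementation: the class and method names (Partition, SchedulerModel, record_tick, may_run) are hypothetical, the initial "System" partition holding 100% of the CPU is an assumption of the model, and accounting in whole one-millisecond ticks is a simplification. It shows three of the rules: a child's budget is subtracted from its parent, no more than eight partitions can exist, and a partition is held back only when its usage over the rolling window exceeds its budget while other partitions are demanding time.

```python
from collections import deque

MAX_PARTITIONS = 8   # limit stated in the table above
WINDOW_MS = 100      # default rolling-window length stated in the table above


class Partition:
    """Hypothetical model of one scheduler partition."""

    def __init__(self, name, budget_percent):
        self.name = name
        self.budget_percent = budget_percent
        # One entry per millisecond tick: 1 if the partition ran, 0 if not.
        # Entries older than the window fall off automatically.
        self.history = deque(maxlen=WINDOW_MS)

    def record_tick(self, ran):
        self.history.append(1 if ran else 0)

    def usage_percent(self):
        # CPU time used over the rolling window, as a percentage of the window.
        return 100.0 * sum(self.history) / WINDOW_MS


class SchedulerModel:
    """Hypothetical model of partition creation and budget enforcement."""

    def __init__(self):
        # Assumption: the model starts with one partition owning all CPU time.
        self.partitions = {"System": Partition("System", 100)}

    def create(self, name, budget_percent, parent="System"):
        if len(self.partitions) >= MAX_PARTITIONS:
            raise ValueError("at most eight scheduler partitions are allowed")
        parent_part = self.partitions[parent]
        if budget_percent > parent_part.budget_percent:
            raise ValueError("a child cannot be larger than its parent")
        # The child's budget is carved out of the parent's budget.
        parent_part.budget_percent -= budget_percent
        child = Partition(name, budget_percent)
        self.partitions[name] = child
        return child

    def may_run(self, name, others_demanding):
        part = self.partitions[name]
        exceeds = part.usage_percent() > part.budget_percent
        # There is no maximum size: exceeding the budget only matters
        # when other partitions also want CPU time.
        return not (exceeds and others_demanding)


sched = SchedulerModel()
drivers = sched.create("Drivers", 20)   # "System" is left with 80%
for _ in range(30):                     # Drivers runs for 30 consecutive ms
    drivers.record_tick(True)
print(sched.may_run("Drivers", others_demanding=True))
print(sched.may_run("Drivers", others_demanding=False))
```

In this model, a partition that has exceeded its budget becomes eligible again simply by waiting: as idle ticks enter the window and busy ticks leave it, the computed usage drops back to the budget.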