1.3.1 Processes
Uniprocessor computers can accomplish only one task at a time. To
simulate concurrency, multitasking techniques are used. Time-sharing
systems that allow users simultaneous access have been around for a
long time. The standard Unix term for an active task in an operating
system is a process. A process can be thought of as a computer
program in action. Associated with a process are a number of system
resources, such as a memory address space, open files, and other
resources. The process also includes a program counter, a stack
pointer, and a set of registers. Many processes can run concurrently,
each in its own protected virtual address space. Limited
communication between processes is possible through secure kernel
mechanisms (signals, sockets, shared memory). The separation of
processes guarantees that one process cannot corrupt another
(Fig. 1.4).
The operating system is able to emulate multiple simultaneous tasks by
giving each process a chance to run on the CPU for a limited amount of
time. Intelligent scheduling of the processes can give the user the
impression of truly simultaneous activities. There are many occasions
when a process does not require the CPU, for example, when the program
is waiting for user input. While the process waits for input/output,
the system gives the CPU to other deserving processes. The suspending
of one process and the resuming of another is called a process
context switch. The system saves the registers, the program counter,
and the stack pointer of the suspended process, and reinstalls those
of another, runnable process. Which process is scheduled next is
determined by the scheduling algorithm.
Figure 1.4: The resources, registers, and virtual address space of a process.
1.3.2 Threads
A process executes one program. Processes run in their own address
space and have their own resources. Communication between processes is
limited, mainly to prevent one process from corrupting another. In
many cases, concurrency within a given program is desirable. The
old way of designing multiple tasks within a process was to fork off a
child process to perform a subtask. In the eighties, the general
operating system community embarked upon several research efforts
focused on threaded program designs, as typified by the Mach OS
developers at Carnegie Mellon [Loe92]. With the dawn of
the nineties, threads became established in various UNIX operating
systems.
The thread model takes a process and divides it into two parts. The
first part contains resources used across the whole program, such as
program instructions, global data, the virtual memory space, and file
descriptors. This part is still referred to as the process. The second
part contains information related to the execution state, such as the
program counter and a stack. This part is referred to as a thread
[NBPF96]. A process can contain several threads
(Fig. 1.5). Threads are the units of
execution; they act as points of control within a process. The process
provides the contained threads with an address space and other
resources.
Figure 1.5: The structure of the threads inside a process.
Not all operating systems provide threads. On systems that do not, a
thread library can be used. Language implementations that define
threads as a language construct generally do not rely on the threads
provided by the operating system but supply their own thread system.
The overwhelming reason to prefer threads over multiple processes is
that threads require less program and system overhead to run than
processes do, which translates into a performance gain. A concrete
example, taken from the Real-Time Encyclopedia: on the IRIX systems
developed by Silicon Graphics Inc., a process context switch takes
50 microseconds, while a thread context switch takes 6 microseconds.
Another advantage is that all threads share the resources of the
process in which they exist. Multiple processes can use shared memory
if it is specially set up; however, communication between processes
involves a system call and is more expensive than communication
between threads, which can usually be done in the user address space.
The operating system continuously selects a single thread to run from
a systemwide collection of all threads that are not waiting for the
completion of some I/O or blocked by some other activity. It is
beneficial to give threads that perform important tasks an advantage
over those that do background work. Whether a given thread receives
special scheduling treatment is determined by the thread's scheduling
priority, scheduling policy, and scheduling scope. When the system
needs to determine which thread will run, it should find the thread
that needs the CPU most. The thread's priority determines which
thread, relative to the other threads, gets preferential access to
the CPU. In general, the system maintains an array of priority
queues. When looking for a runnable thread, the system starts in the
highest-priority queue and works its way down to the lower-priority
queues until it finds the first runnable thread. When and how long the
thread runs is determined by the scheduling policy. Well-known
scheduling policies are first-in-first-out (FIFO), round robin, and
time sharing with priority adjustments. FIFO picks the first thread
out of the queue and lets it run until it blocks on an I/O operation.
The blocked thread is then inserted at the end of the queue. The
round-robin policy lets the thread run for a fixed amount of time,
called a quantum. If the thread does not block before its quantum
expires, the scheduler preempts the thread and places it at the end of
the queue. Time sharing with priority adjustments increases a thread's
priority if it blocks before its quantum expires. Threads blocking on
I/O are thus favored over CPU-bound threads that consume their whole
quantum.
Furthermore, threads can be scheduled within system scope or within
process scope. When scheduling occurs within system scope, the thread
competes against all runnable processes and threads at the system
level, and the system scheduler determines which thread or process
will run. When scheduling occurs in process scope, the thread competes
with the other threads within the program: the system scheduler
determines which process will run, and the process scheduler
determines which thread will run. Some systems allow the developer to
specify in which scope a thread will be scheduled. However, in many
cases threading is provided by a threads library, in which case
threads are always scheduled within process scope.
Access to shared resources must be synchronized. If uncontrolled
access were allowed, race conditions could occur: two threads access
the resource at the same time, leaving the program in a corrupted
state. Several techniques exist to ensure synchronization between
threads: semaphores, mutex variables, locks, condition variables, and
monitors. In this text we will only use the notion of critical
zones. A critical zone is a piece of code that must be executed
atomically, meaning that only one thread can be inside a critical
zone at a time. When another thread tries to enter the critical zone,
it loses the CPU and is placed in a waiting queue until the critical
zone is vacated.
1.3.3 Real-time tasks
A real-time task responds to unpredictable external events in a
timely, predictable way. After an event occurs, an action has to be
taken within a predetermined time limit; missing a deadline is
considered a severe software fault. A real-time task must be sure it
can acquire the CPU when it needs to, and that it can complete the job
within the time limit. Since several real-time tasks might be running
in the system concurrently, conflicts may arise. Since real-time tasks
require the cooperation of the system, we will briefly discuss
real-time operating systems (RTOS). According to [TM97], for an
operating system to be labeled real-time, the following requirements
should be met:
- The RTOS has to be multi-threaded and preemptible. To
achieve this, the scheduler should be able to preempt any
thread in the system and give the resource to the thread that
needs it most. The Linux kernel, for example, cannot be
preempted: all kernel threads run to completion even if
real-time threads are waiting.
- The notion of thread priority has to exist. In an ideal
situation, an RTOS gives the resources to the thread with the
closest deadline to meet. The system must then know when each
thread has to finish its job and how much time the job will
take. This is far too difficult to implement, so OS developers
introduced the concept of priority levels instead.
- The RTOS has to support predictable thread
synchronization mechanisms and a system of priority inheritance
has to exist. The combination of thread priority and resource
sharing leads to the classical problem of priority
inversion. The generally accepted solution is priority
inheritance. We will discuss this below.
- The OS behavior should be known. The time needed for
interrupt handling, context switching, memory access, or
acquiring a lock should be documented. Since real-time
application designers measure worst-case latencies, these
figures should be available to the developers.
The Real-Time Encyclopedia provides a long list of available
real-time, mostly embedded, systems. The best-known public research
project on kernel design and real-time systems is the Mach kernel with
its real-time extensions (RT-Mach, [TNR90]). The POSIX
thread proposal defines an extension for real-time threads
[POS96]. The POSIX real-time thread extensions are optional, but
several operating systems provide them.
Threads and processes are assigned a priority, which the scheduling
algorithm uses to determine which thread or process will run
next. Priority inversion can occur under the following conditions: 1)
there are three threads, one with a low, one with a medium, and one
with a high priority; 2) the low- and the high-priority threads share
a resource that is protected by a synchronization mechanism. Consider
the situation where the low-priority thread acquires exclusive access
to the shared resource. The medium-priority thread wakes up and
preempts the low-priority thread. When the high-priority thread is
resumed and tries to acquire access to the resource, it is blocked
until the low-priority thread releases the shared resource. Since the
low-priority thread has been suspended by the medium-priority thread,
the high-priority thread is forced to wait for the medium-priority
thread and could be delayed for an indefinite period of time. The
generally accepted solution to the problem is the technique of
priority inheritance [SRL87]. Priority inheritance is an
attribute of the synchronization mechanism (lock, semaphore, mutex,
etc.). Conceptually, it boosts the priority of a low-priority thread
that has acquired a shared resource whenever a high-priority thread is
waiting for that resource. The priority of the low-priority thread is
set to that of the high-priority thread, so it can run uninterrupted
by medium-priority threads until it relinquishes the shared
resource. In this way, the high-priority thread's worst-case blocking
time is bounded by the size of the critical zone.
In applications where the workload consists of a set of periodic
tasks, each with a fixed execution time, the rate monotonic scheduling
algorithm, developed by Liu & Layland, can guarantee
schedulability. The rate monotonic algorithm simply says that the more
frequently a task runs (the higher its frequency), the higher its
priority should be. If an application is naturally completely
periodic, or can be made so, then the developer is in luck, because
the rate monotonic scheduling algorithm can be employed and a correct
assertion can be made about the application meeting its
deadlines. Rate monotonic scheduling statically analyzes the
requirements, which means that the type and number of tasks are known
beforehand and that their priorities are fixed.
The Real-Time Mach kernel uses a time-driven scheduler based on a rate
monotonic scheduling paradigm. A thread can be defined for a real-time
or a non-real-time activity. A real-time thread can also be defined as
a hard or a soft real-time thread, and as periodic or aperiodic based
on the nature of its activity. The primary attributes of a periodic
thread are its worst case execution time and its period. Aperiodic
threads are generally resumed as a result of external events. Their
main attributes are the worst case execution time, the worst case
interval between two events, and their deadline.
The real-time processing of digital video and audio in multimedia
applications requires rigid latency and throughput guarantees. These
applications use several system resources simultaneously, such as CPU
time, memory, disk, and network bandwidth. In addition, the workload
of these applications may vary according to user input, and new tasks
may be added to or removed from the system dynamically. In order to
provide acceptable output, a form of resource management is necessary
that goes beyond the classical scheduling problem. Furthermore, it
should be left to the developer and end user to define the measures of
goodness for the performance of the application. Ideally, one would
provide a range of real-time services from ``guaranteed response
time'' to ``best effort'' and allow developers and users to decide
whether the extra cost of a more rigid real-time service is worth the
perceived benefit. The problem is best expressed in terms of real-time
resource allocation and control. A more recent notion in real-time
systems is quality of service. Quality of service (QoS) is used in
networking, multimedia, distributed, and real-time
systems. Applications contending for system resources must satisfy
timing, reliability, and security constraints, as well as
application-specific quality requirements. The problem is expressed in
terms of allocating sufficient resources to the different applications
in order to satisfy their quality-of-service requirements.
Requirements such as timeliness and reliable data delivery have to be
traded off against each other. Applications may contend for network
bandwidth; real-time applications may need simultaneous access to
memory cycles or disk bandwidth [Jef92,CSSL97,RLLS97]. The RT-Mach
kernel uses a QoS manager to allocate resources to an application.
Each application can operate at any resource allocation point between
minimum and maximum thresholds. Different adjustment policies are used
to negotiate a particular resource allocation. The RT-Mach QoS server
allows applications to dynamically adjust their resource needs based
on system load, user input, or application requirements.
Quality-of-service resource allocation supposes that an application
may need to satisfy many requirements and require access to many
resources. Furthermore, the application requires a minimum resource
allocation to perform acceptably. The application utility is loosely
defined as the increase in performance when the application is
allocated a larger share of the resources. Every application is given
a weight, such that the total utility of the system can be measured as
the sum of the weighted utilities. Consider, for example, an audio
application outputting its samples over the network at a rate of
8 kHz. Delivering this quality requires a certain amount of CPU time
and network bandwidth. If the application increases its sampling rate
to 16 kHz, the CPU time and bandwidth may double. However, the
increase in sound quality might justify the extra claim on the
resources: the utility is high. If the application moves the sampling
rate to 32 kHz, CPU time and bandwidth may double again, but the
utility might not be high enough to justify the increase in resource
usage. When several applications run concurrently and compete for the
resources, a negotiation determines the allocation of the resources
among the applications. Quality of service applies not only to the
usage of CPU time but to other system dimensions as well. Before a new
task is accepted by the kernel, it must specify its requirements. Once
the task is accepted, the kernel guarantees that the demanded
resources are available.