Operating systems must follow certain rules to arrange the execution of processes, balancing the needs of each process while effectively managing their priorities.

Types of Schedulers

A long-term scheduler (or job scheduler) decides which processes should be admitted to the ready queue (entering the ready state) and which are terminated. A short-term scheduler (or CPU scheduler) decides which of the ready processes should be moved to the running state (allocated the CPU).

LT-ST

Types of Scheduling

When we previously discussed process control, we mentioned a value that records scheduling information, which includes the priority information of a process or thread. Using this priority information, processes can be scheduled competitively.

Preemptive scheduling allows processes to compete with each other to run, whereas non-preemptive scheduling involves no such competition.

Note: Preemptive and non-preemptive scheduling are often referred to as抢占 (qiǎngzhàn) and 非抢占 (fēi qiǎngzhàn) scheduling in Chinese.

p-np

Scheduling Rules

Many factors determine a process’s priority, and the operating system must refer to scheduling rules to assign priorities. The following are common reference parameters:

Parameter Explanation
Arrival time The time a process arrives in the ready queue.
Departure time The time a process leaves the ready queue.
Attained CPU time The duration the process has spent in the running state.
Real time in system The total duration the process has existed in the system since creation.

Both long-term and short-term scheduling require this information, though long-term scheduling generally does not perform these operations as frequently. Additionally, for short-term scheduling, the arrival time begins when the process is placed into the ready queue (e.g., times $t_1, t_4$ in the diagram below), while for long-term scheduling, it is $t_1$. For short-term scheduling, the CPU time is the duration from when the CPU starts running it until it stops (starting from $t_2, t_5$), whereas long-term scheduling records the sum of all such running times. However, I believe that parameters and calculation methods can vary between different operating systems.

frequency-lt-st

Batch Processes

Batch processes refer to processes that run without direct human intervention. The computer retrieves the necessary data, processes them sequentially, and then outputs the results.

First-In-First-Out algorithm (FIFO)

fifo

This is the simplest scheduling algorithm. The process that arrives in the ready queue first is executed first. When process $P_1$ is running at time $t_1$, process $P_2$ enters the ready queue; $P_2$ is then delayed until $t_2$, starting execution only after $P_1$ terminates.

Shortest Job Next algorithm

sjf

The Shortest Job Next algorithm (SJN), as the name implies, is straightforward. The system estimates the CPU time required for processes and schedules the one needing the shortest CPU time first. As shown in the figure above, while $P_1$ is still running, $P_2$ and $P_3$ are placed into the ready queue. When $P_1$ finishes execution, the system selects $P_3$, which can finish faster, to run. Only then does the longer process $P_2$ run.

Shortest Remaining Time First

sjf

The Shortest Remaining Time First algorithm (SRTF) makes scheduling decisions based on the remaining time required by processes. In the figure above, when $P_1$ has executed to $t_1$, $P_3$ is placed in the ready queue. Because $P_3$ requires less remaining CPU time than $P_1$, $P_3$ is scheduled into the running state. When $P_3$ finishes running, both $P_1$ and $P_2$ are in the ready queue. Since $P_1$ requires less remaining CPU time than $P_2$, $P_1$ is prioritized for execution.

Performance of the Algorithms

Let’s compare the performance of these algorithms. Before we do, we need to understand a few concepts:

  1. Turnaround Time: The total time from when a process enters the ready state until its execution finishes.
  2. Average Turnaround Time: The average of the turnaround times of all processes.
  3. Process Starvation: A phenomenon where a process is kept in the ready state for a prolonged period relative to other processes.

algorithm_analysis

As shown in the figure above, there are three processes $P_1, P_2, P_3$ in the ready queue. We use white boxes to represent the delay time. The delay time plus the running time equals the turnaround time. From this, we can derive the average turnaround time formula:

$$ATT = \frac{1}{n} \sum_{i=1}^{n} (w_i + r_i)$$

where $w_i$ is the waiting time (delay) of process $i$ and $r_i$ is its running time. The turnaround time for process $i$ is $t_i = w_i + r_i$.

If using the FIFO algorithm, $P_2$ must delay for 3 units of time before executing, and $P_3$ must delay for 7 units of time. The Average Turnaround Time (ATT) is:

$$ATT_{FIFO} = \frac{(0+4) + (3+5) + (7+1)}{3} = 6.66667$$

Calculating similarly, we find the average turnaround times for SJN and SRTF:

$$ATT_{SJN} = \frac{(0+4) + (2+1) + (4+5)}{3} = 5.33333$$

$$ATT_{SRTF} = \frac{(1+4) + (4+5) + (0+1)}{3} = 5.00$$

We can see that in the scenario shown in the figure, the SRTF algorithm is the most efficient because the average time all processes spend from waiting, to execution, and then to blocking or finishing is the shortest. However, this does not mean that SRTF and SJN are always superior to FIFO. As seen in the graphs for SRTF and SJN, $P_2$ is delayed. If a process arrives that requires much more time than typical processes, process starvation can occur. In the worst-case scenario, $P_2$ might never get executed!

Interactive Processes

Unlike batch processes, interactive processes require interaction with the user. For instance, the terminal we frequently use is an interactive process. Different algorithms are used for scheduling these processes.

Round-Robin Scheduling Algorithm

Many people directly use the English name, Round-Robin Scheduling. It is also commonly known as 时间片轮转算法 (Shíjiān piàn lúnzhuǎn suànfǎ, Time-Slice Rotation Algorithm), RR algorithm, or simply Round-Robin.

We divide time into very small units called time slices^1 (Time Quantum, typically ranging from 10 to 100 milliseconds). The algorithm runs processes in the ready queue sequentially, allocating one time slice of execution time to each, in rotation until they complete (see the figure below; note the units on the x-axis).

RR_scheduling

Multilevel Scheduling Algorithm

While the RR algorithm treats all processes equally, the Multilevel Scheduling algorithm builds upon RR by distinguishing process priorities.

multilevel_scheduling

As shown above, all processes are assigned to a multilevel priority queue. This algorithm starts with the highest priority processes and uses RR within each level, working its way down the levels. If a new, higher-priority process enters the queue during execution, the algorithm jumps back to that highest priority level and resumes execution downwards from there.

Multilevel Feedback Algorithm

Building on multilevel scheduling, the Multilevel Feedback algorithm (MLF) can change process priorities, and the length of the time slice for each level is different. A process runs at level $i$ for a period and is then downgraded to level $i-1$. At level $i-1$, it can be run for $2 \times$ the duration allowed at level $i$, and then moves down to the next level. The lowest priority processes can run indefinitely to ensure all processes can complete.

Priority Time Slice Allocation (relative)
N 1
N-1 2
N-2 4
N-3 8
1 $\infty$

Of course, the rate of increase for time slice allocation in this algorithm also varies across different operating systems. As the table suggests, the Multilevel Feedback algorithm favors short processes—the shorter the process, the faster it can be completed—while longer processes are allocated more time in subsequent levels to allow them to finish.

Effect of Time Slices

Response Time is the duration required for a process to produce its first response. When we press a key or click a mouse, there is a response time. The length of the time slice affects a program’s response time. As shown in the figure, the red block represents response time. The longer the time slice, the longer the response time, and the lower the CPU utilization efficiency.

multilevel_scheduling

Real-Time Processes

Compared to interactive processes, real-time processes must operate continuously. They need to achieve a certain speed to maintain continuous output. Common examples include streaming media players. Imagine if input and output couldn’t happen in real-time; you would surely hear noise and see only a slideshow [laughs].

Like interactive processes, real-time processes need to divide time into segments called periods (ranging from milliseconds to microseconds).

Method Response CPU Usage
Short Time Slice Fast High
Long Time Slice Slow Low

Note: Added a table summary for clarity based on the text.

Rate-Monotonic

The Rate-Monotonic algorithm (RM) schedules processes based on their period. Under this algorithm, processes with shorter periods are given higher priority. Within a single period, certain process inputs must be handled.

rm_scheduling

As shown in the figure, $P_1$ has a period of 4 and requires 1 unit of CPU time; $P_2$ has a period of 5 and requires 3 units of CPU time. Since $P_1$’s period is shorter than $P_2$’s, it has higher priority. The Rate-Monotonic algorithm prioritizes $P_1$ for execution. After $P_1$ finishes, since it’s still within $P_2$’s period, $P_2$ can be selected to run. Soon, as the second period of $P_1$ begins, its priority is higher, so the scheduling algorithm chooses to execute $P_1$. Continuing this pattern gives us the Rate-Monotonic algorithm.

Earliest Deadline First Scheduling

Just as the name suggests, Earliest Deadline First scheduling (EDF) prioritizes processes with the earliest deadlines.

rm_scheduling

As shown above, which uses the same example parameters as the RM figure (note: the text says period of 3 and 4, the figure shows 4 and 5 like the RM example, let’s assume the figure’s parameters 4 and 5 to be consistent), let’s assume $P_1$ has period 4 (deadline $t_4$) and CPU time 1, and $P_2$ has period 5 (deadline $t_5$) and CPU time 3. At the start, $P_1$’s deadline ($t_4$) is earlier than $P_2$’s ($t_5$), so its priority is higher. After executing $P_1$, $P_2$ runs. Now consider the text’s scenario: $P_1$ period 3, $P_2$ period 4. $P_1$ runs first (deadline $t_3$). Then $P_2$ runs. During $P_2$’s run, a new $P_1$ period arrives at $t_3$ with a deadline at $t_6$. Since $P_2$’s current deadline ($t_4$) is earlier than $P_1$’s new deadline ($t_6$), $P_2$ continues to run. Continuing this pattern gives us Earliest Deadline First scheduling.

Performance of the Algorithms

For real-time processes, the ideal state is that all processes can complete their required computation within their respective periods. We say a scheduling is feasible if all processes can complete their necessary work within their periods.

By dividing the total CPU time required by processes by their respective periods, we get a CPU utilization rate^2:

$$U = \sum_{i=1}^{n} \frac{T_i}{D_i}$$

where $i$ is a specific process, $T_i$ is its required CPU time, and $D_i$ is its period (deadline). If $T$ is greater than $D$ for any process, these algorithms are not viable.

Comparison of RM and EDF

From the two diagrams above (assuming the corrected parameters for consistent comparison), we observe that under EDF, neither $P_1$ nor $P_2$ miss their deadlines, whereas under RM, the first run of $P_2$ has one unit of its required time missing within its period (as seen in the RM diagram where $P_1$’s second run preempts the first run of $P_2$ which then doesn’t finish before $t_5$, though the diagram can be interpreted differently, the standard interpretation is that EDF can often achieve higher utilization).

Integrated Solutions

General-purpose operating systems must simultaneously schedule batch processes, interactive processes, and real-time processes. Therefore, we need an integrated solution.

Two-Tier Scheduling

The Two-tier approach divides processes into two groups: real-time processes in one, and batch and interactive processes in the other.
For the first group, real-time processes, those with high priority, being fast and short, can simply use FIFO scheduling; those with lower priority use RR. The other group, comprising batch and interactive processes, can be scheduled using MLF.

2-tier

Two-Tier Scheduling with Dynamic Priorities

This method adds weight to the two-tier scheduling approach. While real-time processes remain unaffected, other types of processes are initially assigned sufficiently low priorities. After running, priorities are increased for those processes whose requested resource responses are faster. For example, if a process needs to access resources on a remote computer and the network delay is high, requiring waiting, its priority will be lowered; a process that only interacts with the keyboard, where the response speed is very fast, will have its priority raised.