Principles of Virtual Memory

Demand Paging

Virtual Memory is a collection of one or more logical address spaces. The size of each logical address space can exceed the size of physical memory. These logical address spaces are known as virtual addresses. Typically, each process has its own virtual memory. The pages within this virtual memory must be loaded into physical memory before they can be used. Since virtual memory is much larger than physical memory, the operating system requires a robust scheduling mechanism to maintain efficient memory utilization. Demand paging is a principle for loading pages; it dictates that pages are loaded into memory only when needed, rather than loading all pages of a process into physical memory at once.

To manage this, a marker is needed. The valid-invalid bit, sometimes called the present bit, indicates whether a page is in disk or memory. When a page is in physical memory, the bit is set to 1; otherwise, it is 0. When a process attempts to access a page that is not in memory, a page fault occurs.

We add this valid-invalid bit to the standard page table to track page location. If a process executes and encounters a page where the present bit is 0, a page fault is generated. The missing page is then loaded from virtual memory into physical memory, and the present bit in the page table is updated to 1. After this, the page can be accessed.

present bit

As shown above, a present bit is added to the page table. It indicates if a page is loaded into memory. For example, the present bit for p3 is 0, meaning this page is not currently in memory and the system must load it.

Note that the primary role of the page table is to map pages to frames. A process is allocated a fixed amount of logical memory, but it doesn’t necessarily use all pages across that entire space. The page table doesn’t decide which page to load; rather, the system decides which page is loaded and then updates the corresponding present bit in the page table (see below).

present bit 2

Page Replacement

Virtual memory is significantly larger than physical memory. When all frames are occupied and a new page fault occurs, one of the pages currently in physical memory must be removed to free space for the new frame. Page replacement is the process of swapping an existing page in memory for a new one.

Moving pages between memory and disk is very time-consuming, so methods are needed to minimize this overhead. If a page removed from memory has not been modified during its stay, it does not need to be written back to disk. To implement this optimization, we use a modified bit (also known as the m-bit or dirty bit) to track if a page has been altered. If the m-bit is 1, the page has been modified; otherwise, it has not.

m-bit

Multi-Level Page Tables

As discussed, virtual memory can be very large, potentially exceeding physical memory size significantly. Not all processes require such large address spaces, which can make a single, flat page table for virtual memory extremely wasteful. To optimize this, we introduce another layer of page tables. Consequently, the segmented-paged memory management scheme (s, p, w) evolves into (s, p1, p2, w), and potentially more levels.

second level page table

This means that when a frame in physical memory needs to be replaced, the system must first find the corresponding page table entry p2 for that frame, then use p2 to find p1, and so on, to trace back the full logical address structure (s, p1, p2, w) and ensure size consistency before writing the content to virtual memory.

Fixed-Frame Page Replacement

Memory has a fixed number of frames. One approach to sharing frames among multiple processes is to allocate a fixed segment of frames to each process. When a page fault occurs and no free frames are available, the operating system must replace one of the pages currently in memory. A reference string is the sequence of page numbers that a running program requests over time. By analyzing the resulting page faults, reference strings are used to compare the effectiveness of different page replacement algorithms. For instance, in the previous page table diagram with two programs: Program 1’s reference string might be p1, p3, and Program 2’s could be p1, p2, p5. These strings represent their page access patterns at that specific point. We will now use reference strings to illustrate several replacement algorithms.

Optimal Page Replacement

The optimal page replacement algorithm selects the page for replacement that will not be used for the longest period of time in the future. This algorithm guarantees the lowest possible page fault rate among all replacement strategies.

FIFO Page Replacement

The First-In-First-Out (FIFO) page replacement algorithm selects the page that has been in memory for the longest time. Do not confuse this with optimal replacement; while optimal replacement discards the page that won’t be used for the longest future time, FIFO discards the page with the longest past duration in memory.

Least Recently Used (LRU) Algorithm

The Least Recently Used (LRU) page replacement algorithm replaces the page that has not been referenced for the longest period. This algorithm requires maintaining a queue of length n, where n is the number of frames in memory. This queue contains all pages currently residing in memory.

When a page already in memory is referenced, it is moved to the head of the queue. This ensures that the queue is always ordered with the most recently used pages at the front. When a page not in memory is referenced, it is added to the head of the queue. If the queue length is already n, the page at the very tail of the queue (the least recently used) must be removed.

Approximate LRU Algorithms

The true LRU algorithm requires updating the queue structure on every single memory access, which imposes extremely high overhead and is generally not considered practical in hardware. Several other algorithms have been designed to approximate the behavior of LRU.

Aging Page Replacement Algorithm

A referenced bit (or r-bit) is used to track page accesses and is typically implemented at the hardware level.

An aging register is associated with each page. Periodically, this register shifts right by 1 bit, unless the most significant bit (MSB) is set to 1. Pages essentially “age” as the values in their registers decrease with each right shift.

The Aging replacement algorithm avoids maintaining a strictly ordered queue like LRU. Instead, it groups pages whose referenced bit was marked as 1 within a single time interval. One shift of the aging register defines one such time interval or cycle.

The algorithm proceeds with the following steps, executed every time d pages have been referenced (where d defines the period):

All aging registers are shifted right by one bit, discarding the least significant bit.
The value of the reference bit is copied into the most significant bit.
The reference bit is reset to 0.

When a page fault occurs:

The page with the smallest aging register value is selected and replaced.
If multiple pages have the same minimum value, one is chosen randomly.

Note: Creating GIFs is time-consuming, so please bear with these visual aids.

aging

Second-Chance Algorithm

The Second-Chance page replacement algorithm is another LRU approximation. It uses the r-bit to classify pages into two categories: recently accessed and not recently accessed. The algorithm aims to find and replace a page from the “not recently accessed” category.

Similar to the FIFO algorithm, this approach maintains a pointer that cyclically traverses all pages to locate a suitable page for replacement. Due to this cyclic search behavior, it’s also known as the Clock algorithm.

When a page fault occurs, the algorithm repeats these steps until a page is selected and replaced:

If the pointer indicates a page p and its r-bit is 0, then replace p and move the pointer to the next page.
Otherwise (r-bit is 1), reset the r-bit to 0 and advance the pointer to continue the search.

second chance page replacement algorithm

Third-Chance Algorithm

The Third-Chance algorithm extends the concept by offering yet another “chance,” meaning it tracks four states using two 1-bit flags ($2^2 = 4$). The Third-Chance page replacement algorithm is also known as the Not Recently Used (NRU) page replacement algorithm. It classifies pages into four states represented by combinations of two 1-bit flags (referenced and modified).

When a page fault occurs, the algorithm repeats the following steps until a page is selected and replaced:

If the pointer points to a page where both reference bit r and modified bit m are 0, select that page for replacement and advance the pointer.

Otherwise, update the r and m bits according to the state transition table below.

State Transition	Type	Description
11 -> 01	Referenced Page	Page was recently used and modified. Give second chance, reset `r`.
01 -> 00	Referenced Page	Page was recently used but not modified. Give second chance, reset `r`.
10 -> 00	Unreferenced Page	Page was not recently used but modified. Give second chance, reset `m`.

Working Set Model

A program typically needs a specific set of memory-resident pages to run efficiently. If there aren’t enough frames available, page faults will occur frequently. Constantly swapping these pages in and out will significantly degrade performance.

A process’s optimal working set refers to the collection of pages it needs immediately, which should ideally be pre-loaded into physical memory. The size of an optimal working set varies based on the process’s current needs. To determine a process’s optimal working set at a specific time t, a working-set window of size d is placed over the reference string. The unique pages within that window constitute the working set elements. The size of the set and its specific elements change over time. The window size itself is determined by the program, and its behavior during initial loading defines this size.

optimal working set

Working Set Replacement Algorithm

In reality, the optimal working set cannot be predicted because we can’t know which page will be referenced next. Therefore, we rely on suboptimal but practical solutions. While we can’t predict future page requests, we do know which pages have been referenced in the past. A process’s working set (WS) refers to the set of pages accessed during the process’s last d memory operations. The Working Set page replacement algorithm (or WS algorithm) uses this window of size d to monitor the reference string, dynamically adjusting the size and composition of the working set.

As time t progresses, the working set algorithm operates as follows:

Advance the window by 1 unit to encompass the current reference string segment.
Retain only the pages included within this window in physical memory.

WS algorithm

Unlike the optimal working set model (which looks ahead), the Working Set algorithm looks back at past page usage. In the example above, at times t=-1 and t=0, the working set algorithm hasn’t truly started; it only begins recording once the window can be fully populated with required pages.

Page-Fault Frequency Replacement Algorithm

While the Working Set algorithm is effective at keeping page fault rates low, it’s not fully efficient because every window movement triggers a potentially costly set update. A better approach to this problem is the Page-Fault Frequency (PFF) replacement algorithm. This algorithm directly controls the page fault frequency (PFF) by adjusting the resident set of pages. Critically, this algorithm is only activated when a page fault actually occurs.

When a page fault happens, the algorithm performs the following:

Add the newly needed page to the resident set.
Check the time elapsed since the last page fault. If this interval is greater than a predefined threshold d, remove all pages from the resident set that have not been referenced since that last page fault.

It’s important to define the resident set here: it refers to the collection of pages currently loaded in physical memory at a given time t.

Let’s examine an example:

pff

In the diagram above, red bars indicate page requests that caused a fault. Page faults occurred at times 1 and 2, but since the interval from the previous fault was less than 2 in both cases, the PFF replacement algorithm was not activated (beyond adding the needed page). At time 4, accessing P3 triggered both a page fault and the PFF algorithm, as the time elapsed since the previous fault was greater than 2. The PFF algorithm was activated, but since all pages (P1, P2) had been referenced during this interval, they were all retained. At time 7, referencing P4 triggered a page fault and the PFF algorithm again. Since only P3 was referenced during the interval from time 4 to time 7, pages P1 and P2 were removed from the resident set. The remaining two page requests (at times 8 and 9) did not activate the full removal logic because the interval since the previous fault was 2 or less.

The key advantage of PFF over the standard Working Set algorithm is that it only runs when a page fault occurs, not on every memory access. Furthermore, when it does run, it proactively removes pages that haven’t been referenced for the longest time, ensuring that ideal memory space is cleared for future page loads.

Complexity Analysis of Virtual Memory

Cost of Demand Paging

Virtual memory represents a trade-off between time and space complexity. The price for having a virtual address space larger than physical memory is the page faults that occur. Each page fault involves read/write operations to both memory and disk. Since disk I/O is much, much slower than memory access, the primary goal is to minimize the page fault rate.

Local Control and Thrashing

When we discuss local control, we are referring to decisions made about “how many processes should be executed simultaneously” at any given moment. Thrashing occurs when the system spends more time paging than executing instructions. Imagine a scenario where local control is poor: a process reaches a point where it needs more frames, causing a page fault. If memory is full, the system might have to take frames from other processes to resolve the fault. If those other processes then need those same frames to execute, they will also fault, leading to a cascade of faults as processes constantly steal frames from each other.

cpu-utilization

As shown above, with only one process running, page requests leading to I/O operations cause low CPU utilization. As more programs are loaded into memory to run, CPU utilization initially increases steadily. However, this growth is not indefinite due to limited physical memory. After reaching a peak, CPU utilization drops sharply as the number of processes increases. This is because with more processes, more processes are simultaneously waiting for I/O operations to complete, causing CPU utilization to approach zero.

As the number of programs to be executed grows, the page fault rate increases. The operating system must strike a balance between the number of page faults and CPU utilization. CPU utilization is maximized when the average interval between page faults (L) equals the average time required to service a page fault (S). The operating system can use this principle to decide how many programs can reside in memory simultaneously. If L < S, the operating system cannot service page faults quickly enough and must consequently reduce the number of programs in memory. Conversely, if L > S, the time between page faults is long, and the OS can increase the number of concurrently running programs in memory to improve CPU utilization.

Determining Page Size

Page size has a profound impact on both time and space complexity. Some processes thrive with larger pages, while others require smaller ones. Choosing an inappropriate size can drastically affect the page fault rate. Common page sizes range from 512 to 4096 bytes, but this number has been increasing in recent years. Some operating systems even support multiple page sizes to accommodate the diverse needs of different programs.

Impact of Page Size on Space

Consider an original page size of 4 words (8 bytes). A single page table entry could precisely manage these 4 words. If we halved the page size to 2 words, we would need twice as many pages to cover the same amount of memory. Consequently, the page table itself would also have to double in size to track all these new pages. Therefore, smaller page sizes generally require more space for page table structures.

Impact of Page Size on Time

Assume it takes time m to read one page from disk. If we halve the page size, we effectively double the number of pages. Since there are more pages, the number of individual disk reads required will also increase. Therefore, overall, smaller page sizes tend to result in longer cumulative I/O time.

Principles of Virtual Memory and Page Replacement Algorithms