现充|junyu33

(archived) OS Notes

Only the theoretical portion is provided for final exam review.

Computer System Overview

More focused on computer organization, not tested in the exam.

Operating System Overview

OS: a program that controls the execution of application programs and acts as an interface between applications and the computer hardware.

Interrupts: Early computer models did not have this capability. This feature gives the OS more flexibility in relinquishing control to and regaining control from user programs.

Process Description and Control

Process

Interleave, Strategy, Deadlock.

Definition:

Components:

Threads

Thread: Lightweight process.

In the classic textbook treatment, Linux implements threads at the process level (via clone, without distinct kernel thread objects), while Windows provides kernel-level threads. The advantages of not using kernel threads include being lightweight and easier to manage.

Concurrency

Terminology

Characteristics of single-program systems:

Characteristics of multi-program systems:

algorithm

Dekker

Peterson's Algorithm
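A minimal threaded sketch of Peterson's algorithm for two threads in Python. It relies on CPython's GIL for a sequentially consistent view of `flag` and `turn`, so it is illustrative rather than production synchronization; the iteration count `N` is arbitrary.

```python
import threading

flag = [False, False]   # flag[i]: thread i wants to enter
turn = 0                # whose turn it is to yield
counter = 0             # shared state protected by the algorithm
N = 20000

def worker(i):
    global turn, counter
    other = 1 - i
    for _ in range(N):
        # entry protocol
        flag[i] = True
        turn = other
        while flag[other] and turn == other:
            pass                      # busy wait
        counter += 1                  # critical section
        flag[i] = False               # exit protocol

threads = [threading.Thread(target=worker, args=(i,)) for i in (0, 1)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)  # 2 * N: no increments are lost
```

With mutual exclusion guaranteed, the final count is exactly `2 * N`; without the entry/exit protocol, increments could be lost.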

Hardware Disabling

Disadvantages:

Special Instructions

Advantages: No limit on the number of processes, simple and easy to verify, supports multiple critical sections.

Disadvantages: Busy waiting, starvation, deadlock.

Semaphores

Terminology

Operations

Steps

Semaphore(a, b, c)   // declare the semaphores
{
  a = 1;             // initialize each semaphore to its starting value
  b = 2;
  c = 3;
}

void p_1...p_n()     // the concurrent processes
{
  // do something (semwait/semsignal on a, b, c as needed)
}

void main()
{
  Parbegin(p(1), p(2), p(3), ..., p(n));  // run p_1..p_n concurrently
}

Problems

Producer-Consumer Problem:

// semaphores: empty = N (free slots), product = 0 (filled slots), mutex = 1
void producer() {
  while (1) {
    produce();
    semwait(empty);
    semwait(mutex);
    // put item in store
    semsignal(mutex);
    semsignal(product);
  }
}
void consumer() {
  while(1) {
    semwait(product);
    semwait(mutex);
    // take item from store
    semsignal(mutex);
    semsignal(empty);
    use();
  }
}
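The pseudocode above can be exercised with Python's `threading.Semaphore`; names mirror the pseudocode, and the buffer capacity and item count are made up for the example.

```python
import threading
from collections import deque

CAPACITY = 3
N_ITEMS = 10

store = deque()
empty = threading.Semaphore(CAPACITY)   # free slots
product = threading.Semaphore(0)        # filled slots
mutex = threading.Semaphore(1)          # protects the store
consumed = []

def producer():
    for item in range(N_ITEMS):
        empty.acquire()        # semwait(empty)
        mutex.acquire()        # semwait(mutex)
        store.append(item)     # put item in store
        mutex.release()        # semsignal(mutex)
        product.release()      # semsignal(product)

def consumer():
    for _ in range(N_ITEMS):
        product.acquire()
        mutex.acquire()
        consumed.append(store.popleft())   # take item from store
        mutex.release()
        empty.release()

p = threading.Thread(target=producer)
c = threading.Thread(target=consumer)
p.start(); c.start()
p.join(); c.join()
print(consumed)  # all items arrive, in FIFO order
```

With a single producer and consumer the output order is deterministic; `empty` blocks the producer when the store is full, `product` blocks the consumer when it is empty.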

Reader-Writer Problem:

Mutual exclusion between readers and writers, and between writers and writers.

Mutual exclusion among readers is needed only on the shared reader count (to avoid a race condition), not on the file itself.

Reader-Priority:

Semaphore mutex = 1; // File mutual exclusion
Semaphore read_mutex = 1; // Reader count mutual exclusion

int counter = 0;

void reader() {
  semwait(read_mutex);
  counter++;
  if (counter == 1) semwait(mutex); // the first reader competes with writers
  semsignal(read_mutex);            // the check must stay inside the counter lock
  read();
  semwait(read_mutex);
  counter--;
  if (counter == 0) semsignal(mutex); // the last reader lets writers in
  semsignal(read_mutex);
}

void writer() {
  semwait(mutex);
  write();
  semsignal(mutex);
}
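A threaded sketch of the reader-priority scheme in Python (thread counts are arbitrary; note the `counter` checks happen while `read_mutex` is held, which is required to avoid a race on the count).

```python
import threading

mutex = threading.Semaphore(1)       # file mutual exclusion
read_mutex = threading.Semaphore(1)  # reader count mutual exclusion
counter = 0
shared = {"value": 0}
seen = []

def reader():
    global counter
    read_mutex.acquire()
    counter += 1
    if counter == 1:                 # first reader competes with writers
        mutex.acquire()
    read_mutex.release()
    seen.append(shared["value"])     # read()
    read_mutex.acquire()
    counter -= 1
    if counter == 0:                 # last reader lets writers in
        mutex.release()
    read_mutex.release()

def writer():
    mutex.acquire()
    shared["value"] += 1             # write()
    mutex.release()

threads = [threading.Thread(target=reader) for _ in range(5)] + \
          [threading.Thread(target=writer) for _ in range(3)]
for t in threads: t.start()
for t in threads: t.join()
print(shared["value"])  # 3: every write ran under mutual exclusion
```

Readers overlap freely with each other; each writer holds `mutex` alone, so no write is lost.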

Equal-Priority:

Semaphore mutex = 1; // File mutual exclusion
Semaphore read_mutex = 1; // Reader count mutual exclusion
Semaphore queue = 1; // Reader and writer queuing mutual exclusion

int readcount = 0;

void reader() {
  semwait(queue);
  semwait(read_mutex);
  readcount++;
  if (readcount == 1) semwait(mutex); // checked while holding read_mutex
  semsignal(read_mutex);
  semsignal(queue);
  read();
  semwait(read_mutex);
  readcount--;
  if (readcount == 0) semsignal(mutex);
  semsignal(read_mutex);
}
void writer() {
  semwait(queue);
  semwait(mutex);
  write();
  semsignal(mutex);
  semsignal(queue);
}

Monitors

Composition

Inherently mutually exclusive. (At most one active procedure at any given time.)

Message Passing

send(dest, msg)

receive(src, msg)

Inherently synchronous.

Synchronous blocking: You're hungry, ask your mom to cook, and then you wait continuously.

Synchronous non-blocking: You're hungry, ask your mom to cook, you go play games, and occasionally ask if the food is ready.

Asynchronous blocking: You're hungry, ask your mom to cook, you go play games, your mom tells you when the food is ready, and you go eat.

Asynchronous non-blocking: You're hungry, ask your mom to cook, you go play games, your mom tells you the food is ready and brings it to you.

Advantage: Cross-host communication, usable in network programming.

Deadlock & Starvation

There is no universally effective solution.

Why Deadlock Occurs

Resources

Requirements

resolve

Allow Deadlock

Specific algorithm for deadlock detection:

Example:

  1. Mark every process whose row in the Allocation matrix is all zeros.
  2. Set w = v (the available vector).
  3. Find an unmarked process whose row in the Request matrix is less than or equal to w; mark it and update w by adding that process's Allocation row.
  4. Repeat step 3 until no such row remains (the unmarked processes are deadlocked) or all processes are marked (no deadlock).
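The steps above can be sketched directly in Python; the matrices below are a made-up example, not from the textbook.

```python
def detect_deadlock(allocation, request, available):
    """Return the set of deadlocked process indices (empty set: no deadlock)."""
    n = len(allocation)
    w = list(available)                                   # step 2: w = v
    # step 1: mark processes that hold no resources (all-zero Allocation row)
    marked = [all(a == 0 for a in row) for row in allocation]
    progress = True
    while progress:                                       # steps 3-4
        progress = False
        for i in range(n):
            if not marked[i] and all(r <= wj for r, wj in zip(request[i], w)):
                # the process can finish: release its allocation back into w
                w = [wj + a for wj, a in zip(w, allocation[i])]
                marked[i] = True
                progress = True
    return {i for i in range(n) if not marked[i]}

# hypothetical 3-process, 3-resource example
allocation = [[1, 0, 1], [1, 1, 0], [0, 1, 0]]
request    = [[0, 1, 0], [1, 0, 1], [0, 0, 1]]
print(detect_deadlock(allocation, request, [0, 0, 1]))  # every process can finish
print(detect_deadlock(allocation, request, [0, 0, 0]))  # nothing available: all stuck
```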

Do Not Allow

Banker's Algorithm Details:

Example:

  • a) 6+2+0+4+1+1+1 = 15, 3+0+1+1+0+1+0 = 6 — simply verify the totals for all of A, B, C, D.
  • b) Need = C − A (the Claim matrix minus the Allocation matrix).
  • c) (Safety detection algorithm) Compare each row of C − A with the V vector. If V is greater or equal, update V = V + A (adding that process's Allocation row), until no more rows qualify (unsafe) or all processes are cleared (safe).
  • d) (Banker's algorithm) Add the new request to the A matrix (here, add it to P5 to get [4, 2, 4, 4]), update the V vector to [3, 1, 2, 1], and run step c again.
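The safety check in step c can be sketched as follows; the two-process matrices are hypothetical, chosen so one available vector is safe and the other is not.

```python
def is_safe(claim, allocation, available):
    """Banker's safety check: is there an order in which every process can finish?"""
    # Need = C - A
    need = [[c - a for c, a in zip(crow, arow)]
            for crow, arow in zip(claim, allocation)]
    v = list(available)
    done = [False] * len(claim)
    progress = True
    while progress:
        progress = False
        for i, row in enumerate(need):
            if not done[i] and all(n <= vj for n, vj in zip(row, v)):
                v = [vj + a for vj, a in zip(v, allocation[i])]  # V = V + A
                done[i] = True
                progress = True
    return all(done)

# hypothetical 2-process, 2-resource example
claim      = [[3, 2], [1, 3]]
allocation = [[1, 0], [1, 1]]
print(is_safe(claim, allocation, [1, 2]))   # safe: the second process, then the first
print(is_safe(claim, allocation, [1, 1]))   # unsafe: neither need fits in V
```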

Dining Philosopher's Problem

Initial Solution (Incorrect)

Problem: Possible deadlock (each of the five philosophers picks up one fork and waits forever for the other)

sem p[i] = 1 from 0 to 4
void philosopher(int i) {
  think();
  wait(p[i]);
  wait(p[(i+1)%5]);
  eat();
  signal(p[(i+1)%5]); // routine
  signal(p[i]);
}

Solution 1:

sem p[i] = 1 from 0 to 4;
sem room = 4; // only four philosophers won't have deadlock
void philosopher(int i) {
  think();
  wait(room);
  wait(p[i]);
  wait(p[(i+1)%5]);
  eat();
  signal(p[(i+1)%5]); // routine
  signal(p[i]);
  signal(room);
}
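Solution 1 can be run with Python semaphores; the meal count is arbitrary, and `room = N - 1` is what guarantees the run terminates without deadlock.

```python
import threading

N, MEALS = 5, 3
forks = [threading.Semaphore(1) for _ in range(N)]
room = threading.Semaphore(N - 1)   # at most 4 philosophers contend: no deadlock
meals = [0] * N

def philosopher(i):
    for _ in range(MEALS):
        room.acquire()              # wait(room)
        forks[i].acquire()          # wait(p[i])
        forks[(i + 1) % N].acquire()
        meals[i] += 1               # eat()
        forks[(i + 1) % N].release()
        forks[i].release()
        room.release()

threads = [threading.Thread(target=philosopher, args=(i,)) for i in range(N)]
for t in threads: t.start()
for t in threads: t.join()
print(meals)  # every philosopher ate MEALS times
```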

Solution 2:

sem p[i] = 1 from 0 to 4;
void philosopher(int i) {
  think();
  min = min(i, (i+1)%5); // confines philosopher 0 and 4 (won't eat together)
  max = max(i, (i+1)%5); // so at most 4 procs
  wait(p[min]);
  wait(p[max]);
  eat();
  signal(p[max]); 
  signal(p[min]);
}

Q: What about 2 dynamic algorithms?

(TODO)

Assume there are $m$ processes, and the maximum request of process $i$ is $x_i$. Then a minimum of $\sum_{i=1}^{m}(x_i - 1) + 1$ resources will not cause deadlock.
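As a quick numeric check of the bound (the request vector is a made-up example):

```python
def min_resources(max_requests):
    # sum(x_i - 1) + 1: even in the worst case where every process already
    # holds x_i - 1 resources, one extra resource lets some process finish,
    # after which its resources are released and the rest can proceed.
    return sum(x - 1 for x in max_requests) + 1

print(min_resources([2, 2, 2]))  # 3 processes, each needing at most 2
```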

Memory Management

Fixed Partitioning

Dynamic Partitioning

Generates external fragmentation (solved using compaction algorithms)

Placement algorithms (Best Fit, First Fit, Next Fit)

A process may occupy different memory locations during its lifecycle (due to compaction and swapping techniques).

Buddy System

Similar to a binary tree

Superior to fixed and dynamic partitioning, but internal fragmentation exists

Simple Paging

Definitions: Frame, Page, Segment

Each process has a page table, which indicates the location of the corresponding frame (physical) for each page (logical) of the process.

A logical address consists of a page number and an offset within that page. The page number is used to look up the frame number (first 6 bits) in the page table, which is then concatenated with the offset (10 bits) to form the physical address.
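The lookup-and-concatenate step can be shown with bit operations; the 6-bit frame numbers and the page table contents are hypothetical, matching the bit widths in the notes.

```python
PAGE_BITS = 10                             # 10-bit offset within a page
page_table = {0: 0b000101, 1: 0b000011}    # page -> 6-bit frame number (made up)

def translate(logical):
    page = logical >> PAGE_BITS            # high bits: page number
    offset = logical & ((1 << PAGE_BITS) - 1)
    frame = page_table[page]               # page table lookup
    return (frame << PAGE_BITS) | offset   # concatenate frame number and offset

logical = (1 << PAGE_BITS) | 0x2A          # page 1, offset 0x2A
print(hex(translate(logical)))             # frame 3 concatenated with 0x2A
```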

A small amount of internal fragmentation exists.

Simple Segmentation

Unlike process page tables which only maintain frame numbers, process segment tables maintain both the length and base address of segments.

A logical address consists of a segment number and an offset. The segment number is used to look up the corresponding base address (16 bits) in the segment table, which is then added to the offset (12 bits) to obtain the physical address.
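Unlike paging, the segment base is added to the offset (not concatenated), and the stored length allows a bounds check. A sketch with hypothetical segment table values:

```python
# segment table: segment number -> (length, 16-bit base address); values made up
segment_table = {0: (0x500, 0x1000), 1: (0x200, 0x4000)}

def translate(logical, offset_bits=12):
    seg = logical >> offset_bits
    offset = logical & ((1 << offset_bits) - 1)
    length, base = segment_table[seg]
    if offset >= length:
        raise MemoryError("segment overflow")  # offset beyond the segment length
    return base + offset                       # base + offset: addition, not concatenation

print(hex(translate((1 << 12) | 0x10)))        # segment 1, offset 0x10
```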

External fragmentation exists.

Memory Management Requirements

Homework:

Correction: Due to how placement works here (if X was originally at that location and is later freed, a subsequent malloc will put the new block back at the same location instead of shifting everything left), 7.6.a remains 8MB. The second answer should be corrected to [3M, 8M].

Virtual Memory

  • Fixed partitioning, dynamic partitioning: fully loaded, non-partitioned.
  • Simple paging, simple segmentation: fully loaded, partitioned.
  • Virtual memory paging, virtual memory segmentation: partially loaded, partitioned.

Virtual Memory Paging

In simple terms, it involves moving a portion of memory to the hard disk and loading it back into memory when needed.

If the page to be read is located on the disk, a page fault interrupt is triggered, blocking the process. After the page is loaded from the disk into memory, another interrupt is generated to resume execution. (Thus, a single page fault results in two interrupts. If the interrupt time exceeds the instruction execution time, it can cause "thrashing.")

For a 32-bit system with a 4GB virtual address space, the operating system keeps a 4KB root page table in memory (1024 entries of 4 bytes each). Each root entry points to a 4KB user page table (again 1024 four-byte entries), giving 1,048,576 user page table entries in total and occupying 4MB of virtual memory. Each user page table entry maps one 4KB page (a 12-bit offset), so the 1,048,576 entries cover the entire 4GB virtual address space.

For pwners: This is why the last three hex digits (12 bits) of an address remain unchanged under ASLR. ASLR only randomizes at the page level, not the offset within a page.

The method for converting a virtual address to a physical address is as follows: First, use the first 10 bits as an offset into the root page table to locate the corresponding root page table entry. Then, determine whether the user page table (the 4MB one) corresponding to that root page table entry is in memory. If it is, use the next 10 bits to locate the corresponding user page table entry, and combine the page frame number from that entry with the offset to obtain the physical address. If it is not in memory, a page fault interrupt is triggered.
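The two-level walk can be sketched with dictionaries standing in for in-memory tables; the table contents are made up, and a missing entry plays the role of "not in memory" (page fault).

```python
PT_BITS, OFFSET_BITS = 10, 12

# hypothetical tables: root entry 0 points to one user page table; page 5 -> frame 7
user_table = {5: 7}
root_table = {0: user_table}   # a missing entry means "not resident": page fault

def translate(vaddr):
    root_idx = vaddr >> (PT_BITS + OFFSET_BITS)              # first 10 bits
    pt_idx = (vaddr >> OFFSET_BITS) & ((1 << PT_BITS) - 1)   # next 10 bits
    offset = vaddr & ((1 << OFFSET_BITS) - 1)                # last 12 bits
    if root_idx not in root_table or pt_idx not in root_table[root_idx]:
        raise LookupError("page fault")                      # trap to the OS
    frame = root_table[root_idx][pt_idx]
    return (frame << OFFSET_BITS) | offset

vaddr = (5 << OFFSET_BITS) | 0x123    # root entry 0, page 5, offset 0x123
print(hex(translate(vaddr)))
```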

The Translation Lookaside Buffer (TLB) can cache some page number-to-frame number mappings, thereby avoiding page table lookups and improving memory access speed.

As page size grows, the page fault rate first increases (larger pages dilute locality) and then decreases (eventually a few pages hold the entire process). The page fault rate is monotonically decreasing in the number of page frames (if every page had a corresponding frame, there would be no page faults).

Virtual Memory Segmentation

A virtual address consists of a segment number and an offset. To convert it into a physical address, the segment number is used as an index into the segment table to find that segment's base address, which is then added to the offset of the virtual address to obtain the physical address.

OS Level

Read Strategies

Placement Strategy (Same as the Previous Placement Algorithm)

Determines the physical location where a process block resides.

Page Replacement Policies

Frame Locking: Pages that cannot be swapped out (e.g., kernel, etc.)

CLK Implementation Code:

pageSize = 3
pagePointer = 0
pages = [[None, 0] for _ in range(pageSize)]  # [page, use bit]; None = empty frame

accessSeq = [2, 3, 2, 1, 5, 2, 4, 5, 3, 2, 5, 2]

def printPages():
    print('pagePointer:', pagePointer)
    for page, used in pages:
        print(page, used)

for newPage in accessSeq:
    # hit: just set the use bit
    hit = False
    for frame in pages:
        if frame[0] == newPage:
            frame[1] = 1
            hit = True
            break
    if hit:
        printPages()
        continue

    # miss: sweep the clock hand, clearing use bits, until a frame with
    # use bit 0 is found; replace that frame and advance the hand
    while pages[pagePointer][1] == 1:
        pages[pagePointer][1] = 0
        pagePointer = (pagePointer + 1) % pageSize

    pages[pagePointer] = [newPage, 1]
    pagePointer = (pagePointer + 1) % pageSize
    printPages()

Resident Set Management

Clearance Strategies

Load Control

Affects the number of processes residing in memory (system concurrency) and processor utilization.

As the level of concurrency increases, processor utilization also increases. Beyond a certain point, the average resident set size becomes insufficient, page faults increase, and processor utilization subsequently drops.

Processor Scheduling

Types

Algorithm Design

Requirements

Preemptive: a process can be dispatched even if the currently executing process has not yet finished. Non-preemptive: the running process keeps the CPU until it finishes or blocks.

Performance:

Fairness & starvation

algos

Real-time scheduling

EDF

Earliest Deadline First

works for non-periodic tasks

RM

Rate Monotonic (higher rate/frequency first)

for periodic tasks

I/O

ALL OPERATIONS ARE NON-PREEMPTIVE!!

Buffering

Aim: Reduce the time processes spend blocked on I/O, improving efficiency and smoothing I/O performance.

Type

Disk Scheduling

$T = T_{seek} + T_{rotation} + T_{transfer}$

rotation (fixed)

$T_{rotation} = \frac{1}{2r}$ (on average, half a revolution at $r$ revolutions per second)

transfer (fixed)

$b$ stands for the number of bytes in a sector
$N_s$ stands for the number of sectors in a track
$B = N_s b$ stands for the number of bytes in a track
$T_{transfer} = \frac{b}{Br} = \frac{1}{N_s r}$
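Plugging in numbers for a hypothetical drive (7200 rpm, 4 ms average seek, 512-byte sectors, 500 sectors per track, all made up for the example):

```python
r = 7200 / 60                    # rotations per second
b, Ns = 512, 500                 # bytes per sector, sectors per track
T_seek = 0.004                   # average seek time, seconds
T_rotation = 1 / (2 * r)         # average rotational delay: half a revolution
T_transfer = b / (Ns * b * r)    # = 1 / (Ns * r): time to pass over one sector
T = T_seek + T_rotation + T_transfer
print(round(T * 1000, 3), "ms per sector access")
```

Note that rotation dominates transfer here, which is why disk scheduling focuses on seek and rotational delay.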

Disk Scheduling Algorithms

Policies:

File

What

An abstraction to manage and access data.

Components

Field:

Record:

File:

Block:

requirements

File Operation

Basic: create, delete, open, close, read, write

directory

abstraction to organize files

Single Level

Two-Level Structure

hierarchical

Access Control

Typical levels:

Access rights:

In *nix systems

  • Access modes: Read, Write, eXecute (RWX)
  • Access rights: Owner, group, public
  • Access control grant: rwxrwxrwx

File Organization

Criteria:

Pile File

Easy to insert, difficult to search

Sequential File

Performance is still low

Index Sequential

Indexed

After an update operation, all of the indexes need to be modified

Direct File or Hashed File

Storage Management

Goal: Manage the disk space to support a stream of file operations

Two basic tasks:

File Allocation

FAT: File Allocation Table

Free Space Management

Review Outline

2

Concept of an Operating System (OS), and its distinction from application programs
Functions and objectives of an operating system
Core components of an operating system, with a brief description of their functions
Evolution of operating systems (batch processing, time-sharing, real-time systems)
Concepts of time-sharing systems and batch processing systems, and their differences
Roles of interrupts and exceptions in time-sharing and batch processing systems
How to understand the operating system as a virtual machine

3

What hardware support (interrupts and exceptions) does the CPU need to implement processes?
What is a process, and what is its relationship and difference with a program?
The concept and function of the Process Control Block (PCB), and a description of its composition
Describe how a program becomes a process from the perspective of its lifecycle
What primitives are related to process control?
The five-state diagram of a process (diagram-related)
Explain process execution from three perspectives (discussion topic)
The process creation flow, combined with Linux's fork (one call, two returns)
The flow and timing of process switching (wait)

4

Concept of Threads, Differences Between Processes and Threads
Three Implementation Methods of Threads
Advantages and Disadvantages of ULT and KLT

5

Four problems that concurrency needs to solve (synchronization, mutual exclusion, starvation, deadlock)
Why synchronization and mutual exclusion mechanisms are introduced (race condition)
What solutions exist (software-based, hardware-based, system-based (monitors and semaphores))
Drawbacks of hardware-based solutions
Concept, essence, and implementation of semaphores
What is a monitor, and why is it introduced

6

Concept and Background of Deadlock
Conditions for Deadlock (Three Sufficient Conditions, Four Necessary and Sufficient Conditions)
Solutions to Deadlock (Prevention, Avoidance, Detection, Ignoring)
Relationship Between Deadlock and Starvation

7

Seven Methods of Storage Management
Similarities and Differences Between Fixed Partitioning, Paging, Dynamic Partitioning, and Segmentation
Structure and Usage of Segment Tables and Page Tables

8

Three Issues to Consider in Implementing Virtual Memory
Three Methods for Implementing Virtual Memory (Virtual Segmentation, Virtual Paging, Segmented Paging)
Page Fault Handling Process in Virtual Memory
Address Translation Calculations
Briefly Describe the Relationship Between Page Fault Rate, Page Size, and Number of Page Frames

9

Descriptions of Long, Medium, and Short-Term Scheduling (Graph-Related)

10

Background of I/O buffering and its operational mechanism

12

File System Implementation Approaches
At the logical level, there are five implementation methods.
At the physical level, there are three implementation methods.

DIY Notes (Chapter 9)

DIY Textbook: Link to Operating Systems: Three Easy Pieces: https://pages.cs.wisc.edu/~remzi/OSTEP/

Chapter 7: Introduction

Response Time—A New Metric

The STCF algorithm primarily optimizes turnaround time, an approach mainly targeted at early batch systems. However, for today's time-shared systems, response time is the key factor determining real-time performance.

The formula for response time is as follows:

$T_{response} = T_{firstrun} - T_{arrival}$

Suppose three processes A, B, and C all enter the queue at time t=0, and each has an execution time of 5. The scheduling using STCF/SJF is illustrated in the figure below:

Thus, the average response time is:

$T_{response} = \frac{0 + 5 + 10}{3} = 5$

Round Robin

The Round-Robin algorithm is a preemptive algorithm that significantly improves the response time of each process. Instead of running a job to completion, RR runs it for a time slice (also known as a scheduling quantum). It is important to note that the length of the time slice must be a multiple of the timer interrupt period. For example, if the timer interrupts every 10 milliseconds, the time slice could be 10, 20, or any other multiple of 10ms.

Using the same example as before, the scheduling scenario with Round-Robin (with a time slice of 1 unit) is as follows:

Therefore, the average response time is:

$T_{response} = \frac{0 + 1 + 2}{3} = 1$

The improvement is significant.
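The two averages can be checked with a small calculation (for the simultaneous-arrival, equal-burst case above; with a quantum of 1, job $i$ first runs at time $i$ under RR):

```python
def avg_response_sjf(bursts):
    # all jobs arrive at t = 0; shorter jobs run first, back to back
    t, total = 0, 0
    for b in sorted(bursts):
        total += t          # response time = first-run time - arrival (= t)
        t += b
    return total / len(bursts)

def avg_response_rr(bursts):
    # quantum 1, simultaneous arrival: job i first runs at t = i
    return sum(range(len(bursts))) / len(bursts)

print(avg_response_sjf([5, 5, 5]))  # (0 + 5 + 10) / 3
print(avg_response_rr([5, 5, 5]))   # (0 + 1 + 2) / 3
```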

Advantages:

Disadvantages:

Considering I/O Situations

Conclusion: When using non-preemptive scheduling algorithms (such as STCF), the time chunks of processes that frequently use I/O should be split apart to improve resource utilization.

Example: Both A and B are processes requiring 50ms, but A calls I/O every 10ms, taking 10ms each time.

Without splitting time chunks, total time used is 140ms:

With splitting time chunks, total time used is 100ms:

By treating each CPU burst as a separate job, the scheduler ensures that "interactive" processes run frequently. When these interactive jobs perform I/O, other CPU-intensive jobs can run, thereby making better use of the processor.

What to Do If the Future Cannot Be Predicted

The STCF/SJF algorithms have limited utility in real operating systems because the OS has little prior knowledge about each process. Potential improvements may include:

Summary

In this chapter, we learned about:

Chapter 8: Multi-Level Feedback Queue

Problems MLFQ Attempts to Optimize

Basic Concepts of MLFQ

MLFQ has multiple different queues, each assigned a different priority level. At any given time, a ready job resides in one of these queues, and MLFQ uses priority to decide which job to run.

Basic Rules of MLFQ

Specific Implementation of MLFQ

Step 1

When a job first enters, it is placed in the highest priority queue for execution.

Step 2

Each queue is assigned a specific quantum length. If a job consumes its entire quantum, it is demoted to the next lower priority queue.

Step 3

After a certain period, all jobs are moved back to the highest priority queue.

Issues with MLFQ

Summary of MLFQ

Chapter 9: Proportional Share

Objective: To ensure that each process receives a certain percentage of CPU time.

Lottery Scheduling

Each process holds a number of tickets, and the system randomly selects one ticket from all available tickets; the process that owns the selected ticket gets to execute.

According to the law of large numbers, as execution time increases, the percentage of execution time allocated to each process will eventually converge to the proportion of tickets it holds relative to the total number of tickets.

Basic Concepts

Implementation

A random number generator is used to select the winning ticket, along with a data structure (such as a list) that records all processes in the system and the total number of tickets.

Assume we use a list to track processes. There are three processes: A, B, and C. Before making a scheduling decision, a random number (the winning ticket) is selected from the total number of tickets, say 400. Suppose the number 300 is chosen. Then, we traverse the linked list, using a simple counter to help identify the winner.

We traverse the process list from the beginning, adding the value of each ticket to the counter until the counter exceeds the winning number. At that point, the current list element corresponds to the winning process. For example, if the winning ticket is 300, we first add A's tickets: the counter increases to 100. Since 100 < 300, we continue. Then, the counter increases to 150 (B's tickets), still less than 300, so we continue. Finally, when the counter increases to 400 (greater than 300), we stop, and the current pointer indicates that C is the winner.
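The counter walk above translates directly to code (ticket counts and the winning number are the ones from the example):

```python
# processes and their ticket counts, in traversal order
procs = [("A", 100), ("B", 50), ("C", 250)]   # 400 tickets in total

def draw(winning_ticket):
    counter = 0
    for name, tickets in procs:
        counter += tickets
        if counter > winning_ticket:   # counter exceeds the winner: stop here
            return name

print(draw(300))   # counter: 100, 150, then 400 > 300, so C wins
```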

Stride Scheduling

Eliminates randomness by replacing it with a fixed stride, resulting in a deterministic and fair allocation algorithm.

Steps

  1. For example, if tickets for A, B, C are 100, 50, 250, respectively, set their strides as the inverse of their tickets (a simple approach is to divide a large number by each ticket). If this large number is 10000, then the strides for A, B, C are 100, 200, 40.
  2. Initialize the total strides for A, B, C to 0.
  3. Select the process with the smallest total stride and add its stride value to it.
  4. Repeat step 3.
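The steps above can be sketched as follows (tie-breaking by process name is an assumption; any deterministic rule works):

```python
BIG = 10000
tickets = {"A": 100, "B": 50, "C": 250}
stride = {p: BIG // t for p, t in tickets.items()}   # A: 100, B: 200, C: 40
pass_val = {p: 0 for p in tickets}                   # step 2: total strides start at 0

schedule = []
for _ in range(8):
    # step 3: pick the process with the smallest pass value (ties broken by name)
    p = min(pass_val, key=lambda q: (pass_val[q], q))
    schedule.append(p)
    pass_val[p] += stride[p]

print(schedule)  # runs in proportion to tickets: C most often, then A, then B
```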

The final scheduling result is shown in the figure below:

Linux CFS (Optional Topic)

time_slicek=weightki=0n1weightisched_latency

Refer to the following definition for specific weight constants:

// the map between nice param to weight
static const int prio_to_weight[40] = {
/* -20 */ 88761, 71755, 56483, 46273, 36291,
/* -15 */ 29154, 23254, 18705, 14949, 11916,
/* -10 */  9548,  7620,  6100,  4904,  3906,
/* -5 */   3121,  2501,  1991,  1586,  1277,
/* 0 */    1024,   820,   655,   526,   423,
/* 5 */     335,   272,   215,   172,   137,
/* 10 */    110,    87,    70,    56,    45,
/* 15 */     36,    29,    23,    18,    15,
};
$vruntime_i = vruntime_i + \frac{weight_0}{weight_i} \cdot runtime_i$

Here, the value of weight_0 can be found in the code above and is 1024.
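Applying the time-slice formula to two tasks (nice 0 and nice 5, with weights taken from the table above; the 48 ms sched_latency is an assumed example value):

```python
prio_to_weight = {0: 1024, 5: 335}    # from the weight table above
sched_latency = 48.0                  # ms, an assumed scheduling latency

tasks = [0, 5]                        # nice values of the runnable tasks
total = sum(prio_to_weight[n] for n in tasks)
slices = {n: prio_to_weight[n] / total * sched_latency for n in tasks}
print({n: round(s, 2) for n, s in slices.items()})
# the per-task slices always sum to sched_latency
```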

Summary

In this chapter, we learned about: