System Software & Operating Systems
Compilers, loaders, OS basics, processes, threads, scheduling, deadlocks, memory, security, virtualization.
Why this unit matters
Unit 5 is the largest and most conceptually diverse unit in the NET CS syllabus. It spans two tightly connected subjects, system software (compilers, linkers, assemblers) and operating systems, that together underpin every running program. Questions test both theory (deadlock avoidance conditions, page replacement policies) and numerical ability (calculate turnaround time for a scheduling algorithm, find page faults for a reference string). This unit alone can account for 12-18 marks in a well-prepared candidate's score.
Syllabus map
Every named topic in the official UGC NET CS Code 87 syllabus for Unit 5:
- System Software: machine language, assembly language, high-level languages, compilers vs interpreters, loader types (absolute, relocating, dynamic), linker (static vs dynamic linking), relocation, macros (macro expansion, macro assembler), debuggers
- OS Basics: OS roles and structure (monolithic, layered, microkernel, hybrid), OS services, system calls, OS design principles, system boot process
- Process Management: process concept, process states, PCB (Process Control Block), process operations (fork, exec, wait, exit), interprocess communication (shared memory, message passing), client-server communication (pipes, sockets, RPCs), process synchronisation, critical-section problem, software solutions (Peterson's algorithm), hardware support (test-and-set, swap), semaphores (binary, counting), monitors, classical synchronisation problems (producer-consumer, reader-writer, dining philosophers)
- Threads: thread concept, benefits of multithreading, multicore programming, multithreading models (many-to-one, one-to-one, many-to-many), thread libraries (POSIX Pthreads, Java threads, Win32 threads), implicit threading, thread pools, thread issues (fork-exec semantics, signal handling, thread cancellation, TLS)
- CPU Scheduling: scheduling criteria (CPU utilisation, throughput, turnaround time, waiting time, response time), scheduling algorithms (FCFS, SJF, SRTF, Round Robin, Priority, Multilevel Queue, Multilevel Feedback Queue), thread scheduling, multiprocessor scheduling, real-time CPU scheduling (EDF, RMS)
- Deadlocks: necessary conditions (mutual exclusion, hold-and-wait, no preemption, circular wait), deadlock prevention (negate one condition), deadlock avoidance (safe state, resource allocation graph, Banker's algorithm), deadlock detection (single instance via RAG, multiple instances via detection algorithm), recovery (process termination, resource preemption)
- Memory Management: contiguous allocation (fixed and variable partitions, internal and external fragmentation), compaction, swapping, non-contiguous allocation, paging (page table, frame, page size, multi-level page tables, inverted page table), segmentation, demand paging (valid-invalid bit, page fault handling), page replacement algorithms (FIFO, Optimal, LRU, LRU-approximation, clock/second-chance), frame allocation policies, thrashing, working set model, memory-mapped files
- Storage and I/O Systems: mass-storage structure (HDD structure, tracks, sectors, cylinders; SSD), disk scheduling algorithms (FCFS, SSTF, SCAN, C-SCAN, LOOK, C-LOOK), RAID levels (0, 1, 5, 6, 10), file system concepts, access methods, directory structure, file-system mounting and sharing, file system implementation (allocation methods: contiguous, linked, indexed; free-space management: bitmap, linked list, grouping), file system recovery, I/O hardware, programmed I/O, interrupt-driven I/O, DMA, kernel I/O subsystem, buffering, caching, spooling
- Security and Protection: protection goals, access matrix model, access control lists, capability lists, role-based access control (RBAC), program threats (Trojan horse, virus, worm), system threats, network threats, cryptography (symmetric: DES, AES; asymmetric: RSA; hash functions: SHA), user authentication (passwords, biometrics, multi-factor), security defenses (firewalls, IDS, sandboxing)
- Virtualisation: virtual machine concept, benefits, types (Type 1 hypervisor / bare-metal, Type 2 hypervisor / hosted), full virtualisation vs paravirtualisation, containers (Docker, namespaces, cgroups)
- Linux and Windows Internals: Linux process model, Linux scheduling (CFS), Linux memory management, Linux file system (ext4, VFS), Windows object model, Windows registry, Windows scheduling, NTFS
- Distributed Systems and DFS: distributed system characteristics, remote procedure calls (RPC), distributed file systems (NFS, AFS), consistency models, distributed coordination
System Software
Compilers vs Interpreters
A compiler translates the entire source program into machine code before execution. Compilation is slower (one-time cost), but execution of compiled code is fast.
An interpreter translates and executes the program line by line. No separate compilation step; easier to debug but slower at runtime. Python and JavaScript engines use JIT (Just-In-Time) compilation, a hybrid approach that compiles frequently executed code paths to machine code at runtime.
Assembler, Loader, and Linker
Assembler: translates assembly language (mnemonics) into machine code (object file). A two-pass assembler resolves forward references in pass 2.
Absolute loader: loads program at a fixed memory address. Simple but inflexible. Relocating loader: adjusts address references when the program is loaded at a different address than it was compiled for. Uses a relocation table. Dynamic loader: loads modules only when they are first referenced (demand loading).
Linker: combines multiple object files and library files into a single executable. Resolves external references (symbols) by matching definitions across files. Static linking: all library code is copied into the executable at link time. Self-contained but large. Dynamic linking (shared libraries, DLLs): library code is shared in memory; linked at load time or runtime. Smaller executables; updates to the library benefit all programs automatically.
Macros
A macro is a named piece of code that is substituted (expanded) wherever the macro name appears. Macro expansion is performed by the preprocessor or assembler before actual compilation or assembly.
; Assembly macro example:
SWAP MACRO A, B
PUSH A
PUSH B
POP A
POP B
ENDM
; Every use of SWAP is replaced by the 4 instructions above
Macros differ from subroutines: macros use code substitution (faster, larger code), subroutines use call/return (slower, smaller code).
OS Basics
OS Structure Types
Monolithic kernel: entire OS runs as one large program in kernel mode. Fast (no message-passing overhead), but hard to maintain. Example: Linux.
Microkernel: only minimal functions (IPC, basic scheduling, memory management) run in kernel mode. Most OS services run as user-mode processes. More reliable and modular, but slower due to message-passing overhead. Examples: Mach, Minix.
Layered OS: OS divided into layers; each layer provides services only to the layer above and uses services only from the layer below.
Hybrid: combines monolithic speed with microkernel modularity. Examples: Windows NT, macOS (XNU).
System Calls
A system call is the programmatic interface between a user-mode process and the OS kernel. Examples: open(), read(), write(), fork(), exec(), exit(), wait(), socket().
When a system call is made: the CPU switches from user mode to kernel mode (mode switch), executes the kernel routine, returns to user mode.
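As a concrete illustration, Python's os module exposes thin wrappers over the fork, exit, and wait system calls; a Unix-only sketch:

```python
import os

# fork() duplicates the calling process; it returns 0 in the child
# and the child's PID in the parent.
pid = os.fork()
if pid == 0:
    os._exit(7)                      # child terminates with status 7
else:
    _, status = os.wait()            # parent blocks until the child exits
    print(os.WEXITSTATUS(status))    # prints 7
```

Each of these calls crosses into kernel mode, performs the kernel routine, and returns to user mode, exactly the mode-switch sequence described above.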
Process Management
Process States and PCB
Process states: new, ready, running, waiting, terminated. Typical transitions: new -> ready (admitted), ready -> running (dispatched), running -> waiting (I/O or event wait), waiting -> ready (event completes), running -> ready (preempted), running -> terminated (exit).
The PCB (Process Control Block) stores: PID, process state, program counter, CPU registers, memory management information, I/O status, accounting information.
Context switch: save the state of the running process into its PCB, load the state of the next process from its PCB. Pure overhead, no useful work is done during a context switch.
Interprocess Communication
Shared memory: two or more processes share a region of memory. Fast (no kernel involvement after setup), but requires explicit synchronisation.
Message passing: processes communicate via send() and receive() system calls. The OS buffers messages in the kernel. Simpler to use; no synchronisation needed for communication itself.
Pipes: unidirectional byte stream between related processes (parent and child). Named pipes (FIFOs) allow unrelated processes to communicate.
Sockets: endpoints for network communication. Also used for IPC on the same machine (Unix domain sockets).
RPC (Remote Procedure Call): makes a call to a procedure on a remote machine look like a local function call. The stub on the client side marshals arguments; the server skeleton unmarshals them.
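A minimal pipe sketch in Python (Unix-only; the message text is arbitrary): the parent reads what the child writes through a unidirectional byte stream.

```python
import os

r, w = os.pipe()                     # unidirectional: read end, write end
pid = os.fork()
if pid == 0:                         # child: writer
    os.close(r)
    os.write(w, b"hello via pipe")
    os.close(w)
    os._exit(0)
else:                                # parent: reader
    os.close(w)
    data = os.read(r, 1024)
    os.close(r)
    os.wait()
    print(data.decode())             # prints "hello via pipe"
```

Closing the unused ends in each process is essential: a reader only sees end-of-file once every write end is closed.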
Critical-Section Problem
The critical section is the code segment that accesses shared resources. A correct solution must satisfy:
- Mutual exclusion: only one process in the critical section at a time.
- Progress: if no process is in the critical section and some wish to enter, the choice of which enters next is made in finite time, by the contending processes only.
- Bounded waiting: there is a bound on the number of times other processes can enter before a requesting process does.
Peterson's algorithm (for 2 processes):
/* Shared variables */
int turn; /* whose turn it is */
bool flag[2]; /* indicates if process i wants to enter */
/* Process i's entry section: */
flag[i] = true;
turn = j; /* give the other process a chance */
while (flag[j] && turn == j)
; /* busy wait */
/* Critical section */
/* Exit section: */
flag[i] = false;
Peterson's works but relies on busy waiting. Modern CPUs may reorder instructions, so memory barriers are also needed.
Semaphores
A semaphore is an integer variable accessed only through two atomic operations:
- wait(S) (also called P or down): if S > 0, decrement S; else block the calling process.
- signal(S) (also called V or up): increment S; if a process is blocked on S, wake it up.
Binary semaphore (mutex): initial value 1. Provides mutual exclusion. Counting semaphore: initial value N. Controls access to a pool of N resources.
Classical problems:
Producer-Consumer: a buffer of size N is shared. Producer calls wait(empty) then wait(mutex), produces, signals mutex then signals full. Consumer calls wait(full) then wait(mutex), consumes, signals mutex then signals empty.
Reader-Writer: multiple readers can access simultaneously; a writer needs exclusive access. Use a read count and a mutex to manage readers; use a separate write mutex for writers.
Dining Philosophers: 5 philosophers, 5 forks. Avoid deadlock by making one philosopher pick up right fork first, or by using a semaphore that limits concurrent diners to 4.
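The producer-consumer protocol above can be sketched with counting semaphores from Python's threading module (buffer size and item count here are arbitrary choices):

```python
import threading
from collections import deque

N = 5
buffer = deque()
empty = threading.Semaphore(N)   # counts free slots
full  = threading.Semaphore(0)   # counts filled slots
mutex = threading.Lock()         # protects the buffer itself

def producer(items):
    for item in items:
        empty.acquire()          # wait(empty): block if buffer is full
        with mutex:              # wait(mutex) ... signal(mutex)
            buffer.append(item)
        full.release()           # signal(full)

consumed = []
def consumer(n):
    for _ in range(n):
        full.acquire()           # wait(full): block if buffer is empty
        with mutex:
            consumed.append(buffer.popleft())
        empty.release()          # signal(empty)

t1 = threading.Thread(target=producer, args=(range(20),))
t2 = threading.Thread(target=consumer, args=(20,))
t1.start(); t2.start(); t1.join(); t2.join()
print(consumed == list(range(20)))   # True: FIFO order, nothing lost
```

Note the wait ordering: acquiring empty/full before mutex matters. Reversing it (holding mutex while blocked on a counting semaphore) is a classic deadlock.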
Monitors
A monitor is a high-level synchronisation construct. It encapsulates shared data and the procedures that operate on it. Only one thread can be active inside the monitor at a time. Condition variables (wait() and signal()) allow threads to wait for specific conditions.
Threads
A thread is a lightweight unit of execution within a process. Threads of the same process share code, data, and OS resources (open files, sockets) but have their own registers, stack, and thread-local storage.
Multithreading models:
- Many-to-one: many user threads mapped to one kernel thread. A blocking system call blocks all user threads. Rarely used.
- One-to-one: each user thread maps to one kernel thread. True concurrency on multicore. Used by Linux (pthreads) and Windows.
- Many-to-many: user threads multiplexed over multiple kernel threads. Flexible; can create as many user threads as needed.
Thread pools: create a fixed number of threads at startup and assign tasks from a queue. Avoids overhead of thread creation/destruction per task.
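A thread pool from the Python standard library, as a sketch: four workers are created once and reused for all ten tasks pulled from the pool's internal queue.

```python
from concurrent.futures import ThreadPoolExecutor

# The pool creates max_workers threads up front; submitted tasks are
# queued and handed to idle workers, avoiding per-task thread creation.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda n: n * n, range(10)))
print(results)   # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```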
CPU Scheduling
Scheduling Criteria
CPU utilisation: keep the CPU as busy as possible. Throughput: number of processes completed per unit time. Turnaround time: total time from submission to completion. Waiting time: time spent in the ready queue. Response time: time from submission to first response (important for interactive systems).
Scheduling Algorithms
FCFS (First Come First Served): non-preemptive, simple, convoy effect (long jobs delay short ones).
SJF (Shortest Job First): non-preemptive, optimal average waiting time, but requires knowledge of burst time. Suffers from starvation of long processes.
SRTF (Shortest Remaining Time First): preemptive version of SJF. Gives the minimum average waiting time, but is impractical because remaining burst times must be predicted.
Round Robin: preemptive, processes get a fixed time quantum. Good response time for interactive systems. Large quantum -> FCFS behaviour; small quantum -> high context-switch overhead.
Priority scheduling: highest-priority process runs first. Preemptive or non-preemptive. Problem: starvation. Solution: ageing (gradually increase priority of waiting processes).
Multilevel Queue: separate queues for different process types (interactive, batch). Each queue has its own scheduling algorithm.
Multilevel Feedback Queue (MLFQ): processes move between queues based on behaviour. CPU-bound processes sink to lower-priority queues; I/O-bound processes rise to higher-priority queues.
Real-time scheduling:
- Rate Monotonic Scheduling (RMS): static priority; shorter period = higher priority. Optimal for static-priority preemptive scheduling.
- Earliest Deadline First (EDF): dynamic priority; process closest to its deadline runs first. Optimal utilisation (can reach 100%).
Numerical Example, Round Robin
Processes: P1 (burst=10), P2 (burst=4), P3 (burst=6). Quantum=4. Arrival time=0 for all.
Gantt chart: [P1: 0-4] [P2: 4-8] [P3: 8-12] [P1: 12-16] [P3: 16-18] [P1: 18-20]. Turnaround: P1=20, P2=8, P3=18. Average turnaround = (20+8+18)/3 = 15.33. Waiting: P1=10, P2=4, P3=12. Average waiting = (10+4+12)/3 = 8.67.
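The Gantt chart above can be reproduced with a short simulation; a sketch for processes that all arrive at t = 0 (round_robin is an illustrative name):

```python
from collections import deque

def round_robin(bursts, quantum):
    """Simulate Round Robin for processes that all arrive at t = 0.
    Returns (turnaround, waiting) dicts keyed by process index."""
    remaining = dict(enumerate(bursts))
    queue = deque(remaining)             # ready queue in arrival order
    t, finish = 0, {}
    while queue:
        p = queue.popleft()
        run = min(quantum, remaining[p])
        t += run
        remaining[p] -= run
        if remaining[p] == 0:
            finish[p] = t                # completion time
        else:
            queue.append(p)              # back of the ready queue
    turnaround = finish                  # arrival time is 0 for all
    waiting = {p: finish[p] - bursts[p] for p in finish}
    return turnaround, waiting

tat, wt = round_robin([10, 4, 6], quantum=4)
print(tat)   # {1: 8, 2: 18, 0: 20}
print(wt)    # {1: 4, 2: 12, 0: 10}
```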
Deadlocks
Necessary Conditions (Coffman)
All four must hold simultaneously for deadlock to occur:
- Mutual exclusion: at least one resource is held in a non-shareable mode.
- Hold and wait: a process holds at least one resource while waiting for additional resources.
- No preemption: resources cannot be forcibly taken from a process.
- Circular wait: a cycle of processes exists, each waiting for a resource held by the next.
Banker's Algorithm (Deadlock Avoidance)
The Banker's algorithm determines whether a resource allocation request leaves the system in a safe state.
A state is safe if there exists a sequence of all processes such that each process can complete using the remaining available resources plus resources held by all previously completed processes.
/* Safety algorithm (pseudocode) */
work[] = available[];
finish[] = false for all processes;
while (some process i with finish[i] == false and need[i] <= work) {
    work += allocation[i];
    finish[i] = true;
}
if (all finish[i] == true) return SAFE;
else return UNSAFE;
If a request from process Pi is within both need[i] and available, tentatively allocate and run the safety algorithm. If the resulting state is safe, grant the request; otherwise Pi must wait.
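The safety check can be made runnable as a sketch (is_safe is an illustrative name; the matrices below are a small 2-resource example chosen for concreteness):

```python
def is_safe(available, allocation, need):
    """Banker's safety algorithm: return a safe sequence, or None."""
    work = list(available)
    finish = [False] * len(allocation)
    sequence = []
    progress = True
    while progress:
        progress = False
        for i in range(len(allocation)):
            if not finish[i] and all(n <= w for n, w in zip(need[i], work)):
                # P_i can run to completion and release all it holds.
                work = [w + a for w, a in zip(work, allocation[i])]
                finish[i] = True
                sequence.append(i)
                progress = True
    return sequence if all(finish) else None

allocation = [[1, 0], [0, 1], [1, 1]]
need       = [[1, 2], [1, 1], [1, 1]]       # Need = Max - Allocation
print(is_safe([3, 2], allocation, need))    # [0, 1, 2]: a safe sequence
```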
Memory Management
Paging
Virtual address space is divided into fixed-size pages. Physical memory is divided into frames of the same size. The OS maintains a page table per process that maps page numbers to frame numbers.
Address translation: virtual address = (page number, offset). Physical address = (frame number from page table, offset).
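With a 4 KB page size, the split is pure integer arithmetic; a sketch (the dictionary page table here is a toy stand-in for the real structure):

```python
PAGE_SIZE = 4096                 # 2^12 bytes -> 12-bit offset

def translate(vaddr, page_table):
    page, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_table[page]     # a missing key models a page fault
    return frame * PAGE_SIZE + offset

# Virtual 0x1ABC = page 1, offset 0xABC; page 1 lives in frame 9.
print(hex(translate(0x1ABC, {0: 5, 1: 9})))   # 0x9abc
```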
If a page is not in memory (valid bit = 0), a page fault occurs. The OS:
- Saves the process state.
- Finds a free frame (or evicts a page).
- Loads the page from disk.
- Updates the page table.
- Restarts the faulting instruction.
Page Replacement Algorithms
FIFO: replace the page that has been in memory the longest. Simple, but suffers from Belady's anomaly (more frames can cause more faults).
Optimal (OPT): replace the page that will not be used for the longest time in future. Optimal but not implementable (requires future knowledge). Used as a benchmark.
LRU (Least Recently Used): replace the page not used for the longest time in the past. Good approximation of optimal; no Belady's anomaly. Exact LRU is expensive to implement, so approximations are used in practice.
Clock (Second-Chance): LRU approximation. Each page has a reference bit. On access, set bit to 1. On replacement, scan circularly: if bit=1, clear it and move on; if bit=0, replace the page.
Thrashing
Thrashing occurs when a process does not have enough frames to hold its working set. It spends more time paging in and out than executing.
Working set model: the working set W(t, delta) is the set of pages referenced in the last delta time units. Allocate enough frames to hold each process's working set. If total working set demand exceeds available frames, suspend some processes.
Segmentation
The address space is divided into variable-size segments corresponding to logical units (code, stack, heap, each function, each data structure). Each segment has a base and limit in a segment table. Segment number + offset -> physical address.
Segmentation allows segments to grow and shrink independently and supports protection and sharing at segment granularity. But it causes external fragmentation.
Storage and I/O
Disk Scheduling
FCFS: serve requests in arrival order. Simple, may cause long seek times.
SSTF (Shortest Seek Time First): serve the request closest to the current head position. Reduces average seek time, but causes starvation of distant requests.
SCAN (elevator): move head in one direction, serving all requests until end, then reverse. No starvation.
C-SCAN (Circular SCAN): like SCAN, but only serves requests in one direction; jumps to start and sweeps again. More uniform wait time.
RAID Levels
RAID 0 (striping): data distributed across drives. Improved performance; no redundancy. Failure of any drive loses all data.
RAID 1 (mirroring): data duplicated on two drives. Full redundancy; 50% storage efficiency.
RAID 5 (striping with distributed parity): data and parity distributed across all drives. Can tolerate one drive failure. Requires minimum 3 drives.
RAID 6: two independent parity blocks. Can tolerate two simultaneous drive failures. Requires minimum 4 drives.
RAID 10 (1+0): mirror pairs, then stripe. High performance and redundancy; expensive.
File Allocation Methods
Contiguous: each file occupies a contiguous set of blocks. Fast sequential access; supports direct access. Suffers from external fragmentation and requires knowing file size in advance.
Linked: each block contains a pointer to the next block. No external fragmentation; sequential access only (no direct access without traversal). FAT (File Allocation Table) is a variation that stores all pointers in a table.
Indexed: an index block contains pointers to all data blocks. Supports direct access; no external fragmentation. For large files, use multi-level index or combined (Unix inode style: 12 direct, single indirect, double indirect, triple indirect pointers).
Security and Virtualisation
Cryptography Basics
Symmetric encryption: same key for encryption and decryption. Fast. DES (56-bit key, now broken), Triple-DES, AES (128/192/256-bit keys). Key distribution problem: how to securely share the key.
Asymmetric encryption: public key for encryption, private key for decryption (RSA). Solves key distribution; much slower than symmetric. Used to exchange symmetric keys.
Hash functions (SHA-256, MD5): produce a fixed-size digest. One-way; used for integrity checking and password storage.
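A quick sketch with Python's hashlib shows both properties: fixed-size output, and an unrelated digest for a one-character change in the input.

```python
import hashlib

d1 = hashlib.sha256(b"hello").hexdigest()
d2 = hashlib.sha256(b"hellp").hexdigest()   # one character changed
print(len(d1))    # 64 hex characters = 256 bits
print(d1 == d2)   # False: the digests differ completely
```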
Digital signatures: sender signs with private key; receiver verifies with public key. Ensures non-repudiation and authenticity.
Access Control
Access matrix: rows are subjects (users, processes), columns are objects (files, devices). Entry [S, O] lists operations S may perform on O.
Access Control List (ACL): store each column of the access matrix with the object. Used by Linux (file permissions) and Windows.
Capability list: store each row with the subject. A process holds a list of capabilities (object, allowed operations). Easier to grant access; harder to revoke.
RBAC (Role-Based Access Control): assign permissions to roles, assign users to roles. Simplifies administration for large organisations.
Virtualisation
Type 1 hypervisor (bare-metal): runs directly on hardware (no host OS). Guest OSes run on top. VMware ESXi, Microsoft Hyper-V, Xen.
Type 2 hypervisor (hosted): runs on top of a host OS. Guest OSes run inside the hypervisor. VirtualBox, VMware Workstation.
Full virtualisation: guest OS runs unmodified. Hypervisor intercepts and emulates privileged instructions. Paravirtualisation: guest OS is modified to use hypercalls instead of privileged instructions. Better performance.
Containers: lightweight virtualisation. Share the host OS kernel. Isolated using Linux namespaces (process, network, mount, user) and cgroups (resource limits). Docker is the most popular container runtime. Faster to start and more efficient than VMs, but less isolated.
Worked Examples
Example 1: Scheduling, SJF
Processes: P1 (arrival=0, burst=8), P2 (arrival=1, burst=4), P3 (arrival=2, burst=2), P4 (arrival=3, burst=1). Non-preemptive SJF.
At t=0: only P1 available. Start P1. At t=8: P1 finishes. P2, P3, P4 are all available. Shortest burst = P4 (burst=1). Start P4. At t=9: Start P3 (burst=2). At t=11: Start P2 (burst=4). At t=15: Done.
Turnaround: P1=8, P4=9-3=6, P3=11-2=9, P2=15-1=14. Average=(8+6+9+14)/4=9.25. Waiting (turnaround minus burst): P1=0, P4=5, P3=7, P2=10. Average=(0+5+7+10)/4=5.5.
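Schedules like this are easy to cross-check with a short simulation; a sketch (sjf_nonpreemptive is an illustrative name):

```python
def sjf_nonpreemptive(procs):
    """procs: list of (name, arrival, burst).
    Returns {name: (turnaround, waiting)}."""
    pending = sorted(procs, key=lambda p: p[1])
    t, stats = 0, {}
    while pending:
        ready = [p for p in pending if p[1] <= t]
        if not ready:                         # CPU idle until next arrival
            t = min(p[1] for p in pending)
            continue
        job = min(ready, key=lambda p: p[2])  # shortest burst wins
        pending.remove(job)
        name, arrival, burst = job
        t += burst
        stats[name] = (t - arrival, t - arrival - burst)
    return stats

s = sjf_nonpreemptive([("P1", 0, 8), ("P2", 1, 4), ("P3", 2, 2), ("P4", 3, 1)])
print(s)
```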
Example 2: Banker's Algorithm
3 processes, 2 resource types. Available = [3, 2]. Allocation: P0=[1,0], P1=[0,1], P2=[1,1]. Max: P0=[2,2], P1=[1,2], P2=[2,2]. Need = Max - Allocation: P0=[1,2], P1=[1,1], P2=[1,1].
Safety check: work=[3,2]. Can P0 finish? Need[0]=[1,2] <= [3,2]. Yes. work=[3,2]+[1,0]=[4,2]. finish[0]=true. Can P1 finish? Need[1]=[1,1] <= [4,2]. Yes. work=[4,2]+[0,1]=[4,3]. finish[1]=true. Can P2 finish? Need[2]=[1,1] <= [4,3]. Yes. finish[2]=true. Safe sequence: P0, P1, P2. System is in a safe state.
Example 3: Page Replacement, LRU
Reference string: 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3. Frames = 3.
Trace (page faults marked with *): 7*: [7,-,-]. 0*: [7,0,-]. 1*: [7,0,1]. 2*: [2,0,1] (evict 7, the LRU page). 0: [2,0,1]. 3*: [2,0,3] (evict 1). 0: [2,0,3]. 4*: [4,0,3] (evict 2). 2*: [4,0,2] (evict 3). 3*: [4,3,2] (evict 0). 0*: [0,3,2] (evict 4). 3: [0,3,2]. Total page faults: 9.
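The trace above can be reproduced mechanically; a sketch (lru_faults is an illustrative name):

```python
def lru_faults(refs, frames):
    """Count page faults for LRU with the given number of frames."""
    memory = []                      # front = least recently used
    faults = 0
    for page in refs:
        if page in memory:
            memory.remove(page)      # hit: refresh recency only
        else:
            faults += 1
            if len(memory) == frames:
                memory.pop(0)        # evict the LRU page
        memory.append(page)          # most recently used at the back
    return faults

print(lru_faults([7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3], 3))   # 9
```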
Example 4: Deadlock, Circular Wait
Process P1 holds resource R1 and waits for R2. Process P2 holds resource R2 and waits for R1. This forms a circular wait, all four Coffman conditions hold, so this is a deadlock.
Prevention: impose a global ordering on resource types. Require all processes to request resources in increasing order. This eliminates circular wait.
Example 5: Disk Scheduling, SCAN
Disk has 200 cylinders (0-199). Head at cylinder 53, moving towards higher numbers. Request queue: 98, 183, 37, 122, 14, 124, 65, 67.
SCAN order (moving up): 53 -> 65 -> 67 -> 98 -> 122 -> 124 -> 183 -> 199 (end of disk), then reverse: 37 -> 14. Total head movement: (183-53) + (199-183) + (199-14) = 130 + 16 + 185 = 331 cylinders.
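A sketch of the SCAN calculation (scan_movement is an illustrative name; the head sweeps to the disk end only when requests remain on the other side):

```python
def scan_movement(head, requests, max_cyl, direction="up"):
    """Total head movement for SCAN (elevator): sweep to the disk end,
    then reverse and serve the remaining requests."""
    up = sorted(r for r in requests if r >= head)
    down = sorted((r for r in requests if r < head), reverse=True)
    if direction == "up":
        order = up + ([max_cyl] if down else []) + down
    else:
        order = down + ([0] if up else []) + up
    movement, pos = 0, head
    for cyl in order:
        movement += abs(cyl - pos)
        pos = cyl
    return movement

print(scan_movement(53, [98, 183, 37, 122, 14, 124, 65, 67], 199))   # 331
```

Dropping the stop at max_cyl (or 0) turns this into LOOK, which reverses at the last pending request instead of the physical edge of the disk.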
Quick-Revision Summary
- Compiler: translates whole program. Interpreter: line by line. JIT: hybrid.
- Static linking: library copied into executable. Dynamic linking: shared at load/runtime.
- Peterson's algorithm: flag[] and turn variable; satisfies mutual exclusion, progress, bounded waiting.
- Semaphore: wait (P) decrements; signal (V) increments. Binary semaphore = mutex.
- Deadlock requires all 4 Coffman conditions. Banker's algorithm avoids deadlock by ensuring safe state.
- Page fault handling: check validity, find/evict frame, load from disk, update page table, restart instruction.
- FIFO page replacement suffers Belady's anomaly. OPT is optimal but impractical. LRU is the practical gold standard.
- Thrashing: too few frames -> process constantly pages. Solution: working set model or page fault frequency.
- RAID 5: distributed parity, tolerates 1 failure. RAID 6: tolerates 2 failures.
- Indexed file allocation: supports direct access; inode in Unix.
- Type 1 hypervisor: bare-metal (ESXi). Type 2: hosted (VirtualBox).
- RBAC: permissions assigned to roles, roles to users.
- AES is symmetric; RSA is asymmetric. Hash functions are one-way.
How to Study This Unit
- CPU scheduling questions are almost always numerical. Practise FCFS, SJF, Round Robin, and Priority by hand on small sets of 4-5 processes. Calculate turnaround time, waiting time, and average for each.
- Banker's algorithm: practise the safety algorithm on 3-process, 2-resource examples until it is mechanical.
- Page replacement: drill LRU and FIFO on 10-step reference strings. Know Belady's anomaly (FIFO can get more faults with more frames).
- Deadlock: memorise the 4 Coffman conditions and their negations (prevention strategies). Draw resource allocation graphs.
- Process synchronisation: understand Peterson's algorithm step by step. Know the three semaphore-based solutions to producer-consumer, reader-writer, and dining philosophers.
- Storage: practise SCAN and C-SCAN disk scheduling on numerical examples. Know all RAID levels (0, 1, 5, 6, 10) and what failures each tolerates.
- Virtualisation and security are increasingly tested. Know the difference between Type 1 and Type 2 hypervisors, and between symmetric and asymmetric cryptography.
- Do not skip system software: macro vs subroutine and static vs dynamic linking are recurring definitional questions.