User-to-Kernel Transitions

Part 4 of the Operating System Fundamentals series.

In the previous post, we talked about mode transfer—how the CPU switches between user mode and kernel mode. Now let's look at exactly how this happens, using Pintos (a teaching OS) as a concrete example.

Three types of events

There are three events that cause a transition from user mode (Ring 3) to kernel mode (Ring 0):

Hardware interrupts — asynchronous signals from devices (timer tick, keyboard press, disk completion)
Exceptions — synchronous CPU faults (divide by zero, invalid opcode, page fault)
System calls — explicit user requests (int $0x30 in Pintos)

All three share the same low-level entry mechanism.

The common entry path

Every user-to-kernel transition follows this pattern:

CPU detects the event (interrupt signal, faulting instruction, or int $0x30)
Hardware switches stacks (loads kernel stack pointer from TSS)
CPU pushes user context (SS, ESP, EFLAGS, CS, EIP)
IDT stub runs (pushes a dummy error code if the CPU did not supply one, then the vector number)
Entry code saves registers (builds complete interrupt frame)
C handler runs (dispatch based on vector number)
iret returns to user mode (restores context from stack)
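
The dispatch step can be sketched as a table of function pointers indexed by vector number. This is a minimal host-side model, not the Pintos source: `mini_frame`, `register_handler`, `dispatch`, and `example_handler` are hypothetical names, and the assembly entry code that would call `dispatch` is assumed rather than shown.

```c
#include <stddef.h>
#include <stdint.h>

#define VECTOR_COUNT 256   /* x86 defines 256 interrupt vectors */

/* Hypothetical, trimmed-down frame: only what this sketch needs. */
struct mini_frame {
    uint32_t vec_no;   /* pushed by the per-vector stub */
    uint32_t eax;      /* example of saved state a handler may modify */
};

typedef void handler_fn(struct mini_frame *);

static handler_fn *handlers[VECTOR_COUNT];   /* NULL means nothing registered */

/* Install a handler for one vector. Returns 0 on success, -1 on a bad vector. */
int register_handler(unsigned vec, handler_fn *fn)
{
    if (vec >= VECTOR_COUNT)
        return -1;
    handlers[vec] = fn;
    return 0;
}

/* Would be called by the (assumed) entry code once the frame is complete.
   Returns 0 if a handler ran, -1 if the vector was unexpected. */
int dispatch(struct mini_frame *f)
{
    if (f->vec_no >= VECTOR_COUNT || handlers[f->vec_no] == NULL)
        return -1;
    handlers[f->vec_no](f);
    return 0;
}

/* Example handler: writes a result into the saved EAX. */
void example_handler(struct mini_frame *f)
{
    f->eax = 42;
}
```

Registering `example_handler` for vector 0x30 and then dispatching a frame with `vec_no == 0x30` runs the handler. A frame whose vector has no registered handler is reported as unexpected instead of being silently ignored.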

Key hardware structures

The CPU relies on several structures the OS sets up at boot:

IDT (Interrupt Descriptor Table) — maps vectors 0–255 to handler addresses
TSS (Task-State Segment) — contains esp0, the kernel stack pointer loaded on privilege switches
GDT (Global Descriptor Table) — defines kernel/user code and data segments
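
To make the IDT entry concrete, here is a hedged sketch of how a 32-bit x86 gate descriptor packs a handler address, a code-segment selector, a privilege level (DPL), and a gate type into 64 bits. The function name `make_gate` and its signature are illustrative, and the selector value used in the usage note (0x08) is an assumption about where the kernel code segment sits in the GDT.

```c
#include <stdint.h>

enum gate_type {
    GATE_INTERRUPT = 14,   /* 32-bit interrupt gate: CPU clears IF on entry */
    GATE_TRAP      = 15    /* 32-bit trap gate: IF is left unchanged */
};

/* Build one 8-byte IDT entry.
   handler:  32-bit address of the stub to jump to
   selector: code segment selector the CPU loads into CS
   dpl:      lowest privilege level allowed to invoke the gate with `int`
   type:     interrupt gate or trap gate */
uint64_t make_gate(uint32_t handler, uint16_t selector,
                   unsigned dpl, enum gate_type type)
{
    uint32_t lo = (handler & 0xffffu)             /* offset bits 15:0 */
                | ((uint32_t)selector << 16);     /* target code segment */
    uint32_t hi = (handler & 0xffff0000u)         /* offset bits 31:16 */
                | (1u << 15)                      /* present */
                | ((uint32_t)(dpl & 3u) << 13)    /* descriptor privilege level */
                | ((uint32_t)type << 8);          /* gate type; bit 12 stays 0 */
    return lo | ((uint64_t)hi << 32);
}
```

For example, `make_gate(0x12345678, 0x08, 3, GATE_TRAP)` yields `0x1234EF0000085678`: a present trap gate that Ring 3 code may invoke, which is the kind of entry a system call vector needs. With `dpl` set to 0, a user-mode `int` to that vector raises a protection fault instead.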

The interrupt frame

When an event occurs, the entry code builds a struct intr_frame on the kernel stack:

┌─────────────────┬────────────┬──────────────────────────────────────┐
│ Content         │ Pushed by  │ Notes                                │
├─────────────────┼────────────┼──────────────────────────────────────┤
│ GPRs (pushal)   │ entry code │ General-purpose registers            │
│ DS, ES, FS, GS  │ entry code │ Segment registers                    │
│ Vector number   │ IDT stub   │ Identifies the event type            │
│ Error code      │ CPU/stub   │ Real for some exceptions, else 0     │
│ EIP, CS, EFLAGS │ CPU        │ Return address and flags             │
│ ESP, SS         │ CPU        │ User stack (only on privilege change)│
└─────────────────┴────────────┴──────────────────────────────────────┘

The C handler receives a pointer to this frame, giving it access to all saved state. When the handler returns, iret pops everything back and resumes user code.
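
The table maps naturally onto a C struct. The sketch below is a simplified model that follows the table row by row, lowest address first. Field names are modelled on Pintos, but the order of the segment registers is an assumption and the real `struct intr_frame` may carry extra fields, so treat the offsets as illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified model of the interrupt frame, lowest address first.
   Every slot is 32 bits wide, matching 32-bit x86 stack pushes. */
struct frame_sketch {
    /* Saved by the entry code with pushal; EDI lands at the lowest address.
       esp_dummy is the stack pointer value pushal records, which is not
       the user stack pointer. */
    uint32_t edi, esi, ebp, esp_dummy, ebx, edx, ecx, eax;

    /* Saved by the entry code; only the low 16 bits are meaningful. */
    uint32_t gs, fs, es, ds;

    uint32_t vec_no;       /* pushed by the per-vector stub */
    uint32_t error_code;   /* pushed by the CPU, or 0 pushed by the stub */

    /* Pushed by the CPU. */
    uint32_t eip, cs, eflags;
    uint32_t esp, ss;      /* only present on a privilege change */
};

/* Example of handler-side use: hand a result back through the saved EAX,
   which is restored into the real register on the way out. */
void set_return_value(struct frame_sketch *f, uint32_t value)
{
    f->eax = value;
}
```

In this model the frame is 76 bytes, with `vec_no` at offset 48 and the CPU-pushed `eip` at offset 56. Because the handler works on the saved copy, any field it changes takes effect only when the return path restores the frame.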

Hardware interrupts (vectors 0x20–0x2F)

External devices signal the CPU through interrupt requests (IRQs). The timer fires at a fixed rate (100 times per second by default in Pintos). The keyboard fires when you press a key. The disk fires when a read completes.

The flow:

Device asserts IRQ to the Programmable Interrupt Controller (PIC)
CPU finishes current instruction, enters common entry path
Handler runs (e.g., timer_interrupt())
Handler may request a context switch
iret returns to interrupted code

Pintos runs external interrupt handlers with interrupts disabled, and those handlers are not allowed to sleep. Keep them fast and defer heavy work; otherwise other interrupts are delayed or missed.

Exceptions (vectors 0x00–0x1F)

Exceptions are synchronous: they are caused by the instruction being executed. For a fault, the CPU abandons the instruction, saves an EIP that points back at it so it can be retried later, and enters the common path.

Common exceptions:

Vector 0x00 — divide by zero
Vector 0x06 — invalid opcode
Vector 0x0E — page fault

Page fault example

User code accesses unmapped or protected memory
CPU aborts instruction, pushes context + error code, jumps to handler
Handler reads CR2 (the faulting address) and decodes the error code
Either resolve the fault (load the page, grow the stack) and retry, or terminate the process

The error code bits answer three questions: was the page not present, or was it a protection violation (bit 0)? Was the access a write or a read (bit 1)? Did it come from user or kernel mode (bit 2)?
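
Decoding those bits takes only a few lines. The sketch below uses the x86 page-fault error code layout; the struct `fault_info` and the function `decode_page_fault` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low three bits of the x86 page-fault error code. */
#define PF_P 0x1u   /* 0: page not present, 1: protection violation */
#define PF_W 0x2u   /* 0: read access,      1: write access */
#define PF_U 0x4u   /* 0: kernel mode,      1: user mode */

struct fault_info {
    bool not_present;   /* true if the page was not mapped at all */
    bool write;         /* true if the faulting access was a write */
    bool user;          /* true if the access came from user mode */
};

/* Translate the raw error code from the interrupt frame into named flags. */
struct fault_info decode_page_fault(uint32_t error_code)
{
    struct fault_info info;
    info.not_present = (error_code & PF_P) == 0;
    info.write       = (error_code & PF_W) != 0;
    info.user        = (error_code & PF_U) != 0;
    return info;
}
```

An error code of 0x6, for instance, decodes as a user-mode write to a page that is not present. That is the combination a handler would typically see when deciding whether to load a page or grow the stack.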

System calls (vector 0x30)

System calls are the intentional way for user code to request kernel services.

User wrapper (e.g., write(fd, buf, size)) pushes arguments onto user stack
Pushes syscall number, executes int $0x30
CPU enters common path (IDT entry for 0x30 has DPL=3, so user code can invoke it)
syscall_handler() reads arguments from user stack via f->esp
Handler validates pointers, executes kernel logic, stores return value in f->eax
iret returns; user code retrieves result from eax
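
The argument-passing convention in the steps above can be modelled on the host with an array standing in for the user stack. This is a sketch under stated assumptions: `sys_frame`, `syscall_handler_sketch`, and `DEMO_SYS_WRITE` are hypothetical, the saved `esp` is modelled as a host pointer instead of a 32-bit user address, and pointer validation is deliberately left out to keep the example short.

```c
#include <stdint.h>

#define DEMO_SYS_WRITE 9u   /* hypothetical syscall number for this sketch */

/* Only the two frame fields the syscall path touches. */
struct sys_frame {
    const uint32_t *esp;   /* saved user stack pointer (modelled as a host pointer) */
    uint32_t eax;          /* saved EAX; carries the return value back */
};

/* Read the syscall number and arguments from the user stack, run the
   "kernel logic", and store the result in the saved EAX.
   A real handler must validate every user address before reading it. */
void syscall_handler_sketch(struct sys_frame *f)
{
    const uint32_t *args = f->esp;   /* args[0] is the syscall number */

    switch (args[0]) {
    case DEMO_SYS_WRITE: {
        uint32_t fd   = args[1];
        uint32_t buf  = args[2];     /* user address of the buffer, unused here */
        uint32_t size = args[3];
        (void)buf;
        /* Pretend that writes to fd 1 always succeed in full. */
        f->eax = (fd == 1) ? size : (uint32_t)-1;
        break;
    }
    default:
        f->eax = (uint32_t)-1;       /* unknown syscall number */
        break;
    }
}
```

With a simulated stack of `{ DEMO_SYS_WRITE, 1, 0x08048000, 5 }`, the handler leaves 5 in `f->eax`, which is the value the user-side wrapper would see in `eax` after the return.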

Critical: Always validate user pointers before dereferencing. A malicious user can pass any address—including kernel addresses. Check with is_user_vaddr() and verify the page is mapped.
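
A range check along those lines might look like the following host-side model. It assumes the user/kernel split is at 0xC0000000 and pages are 4 KiB. `user_range_ok` and `demo_is_mapped` are hypothetical names, and the mapping lookup is passed in as a callback because the real page-table query is kernel-specific.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_USER_TOP 0xc0000000u   /* assumed first kernel address */
#define DEMO_PGSIZE   4096u         /* assumed page size */

/* Callback that reports whether the page starting at page_addr is mapped. */
typedef bool page_mapped_fn(uint32_t page_addr);

/* Check that [start, start + size) lies entirely in user space and that
   every page it touches is mapped. An empty range is accepted. */
bool user_range_ok(uint32_t start, uint32_t size, page_mapped_fn *is_mapped)
{
    if (size == 0)
        return true;
    if (start == 0 || start >= DEMO_USER_TOP)
        return false;
    if (size > DEMO_USER_TOP - start)      /* also rules out wraparound */
        return false;

    uint32_t first_page = start & ~(DEMO_PGSIZE - 1);
    uint32_t last_page  = (start + size - 1) & ~(DEMO_PGSIZE - 1);

    for (uint32_t page = first_page; ; page += DEMO_PGSIZE) {
        if (!is_mapped(page))
            return false;
        if (page == last_page)
            break;
    }
    return true;
}

/* Example lookup for demonstration: exactly one page is mapped. */
bool demo_is_mapped(uint32_t page_addr)
{
    return page_addr == 0x08048000u;
}
```

Checking the whole range matters more than checking the first byte: a buffer that starts in a mapped page can still run into an unmapped page or across the kernel boundary. Comparing `size` against the space remaining below the boundary avoids the overflow that `start + size` could produce.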

Quick comparison

┌──────────────┬────────────────────┬─────────────────┬───────────────────┐
│              │ Hardware Interrupt │ Exception       │ System Call       │
├──────────────┼────────────────────┼─────────────────┼───────────────────┤
│ Vectors      │ 0x20–0x2F          │ 0x00–0x1F       │ 0x30              │
│ Trigger      │ External device    │ CPU fault       │ int $0x30         │
│ Synchronous? │ No                 │ Yes             │ Yes               │
│ Can sleep?   │ No                 │ Yes             │ Yes               │
│ Return       │ iret               │ iret or kill    │ iret (eax=result) │
└──────────────┴────────────────────┴─────────────────┴───────────────────┘

Key takeaways

Same entry path — all three event types use IDT → stub → entry code → C handler
Stack switch via TSS — CPU loads esp0 from TSS on Ring 3 → Ring 0 transitions
Validate syscall arguments — never trust user pointers
Interrupt handlers must be fast — they run with interrupts disabled

Next up: virtual memory—how the OS gives each process its own view of memory.
