The Process Abstraction: Threads & Address Space

Part 2 of the Operating System Fundamentals series.

What is a process?

First of all, let's define a few terms.

A program is just a file sitting on disk. Code, waiting to be run.
A process is that program in motion: a container that holds everything needed to run it. In other words, it is an execution environment with restricted rights.

When you launch an app, the OS creates a process and gives it:

Its own address space: a private view of memory that no other process can touch. It holds the program's state for the process's entire lifetime.
Resources: open files, network sockets, and handles to other OS objects. We'll cover these a bit later.
At least one thread: the thing that actually executes code.

The most important idea behind a process is the isolation boundary. It's what keeps your browser from reading your password manager's memory, and it's why one process can crash while the others keep running. We'll dive into exactly how the OS enforces this boundary later in this series.

What is a thread?

A thread is a schedulable sequence of execution. It has its own instruction pointer (where it is in the code), its own stack, and its own registers. But threads within the same process share everything else—memory, files, resources.

A thread is schedulable because the OS can pause it at almost any point and resume it later. Every process starts with one thread (the main thread), but it can spawn more. A web browser might use one thread to render the page and another to handle network requests. They share the same memory, so they can communicate easily, but that also introduces a whole host of concurrency issues, which we'll cover later in this series.
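To make the sharing concrete, here is a minimal Python sketch (not from the original series). The names `shared_log`, `worker`, and `run_workers` are hypothetical, chosen only for illustration. Both threads append to one list that lives in the process's memory, while each thread's loop variable stays private to that thread. The lock around the shared list hints at the concurrency issues mentioned above.

```python
import threading

shared_log = []               # one list, visible to every thread in the process
log_lock = threading.Lock()   # guards shared_log against concurrent appends


def worker(name, count):
    # `name`, `count`, and `i` are locals: each thread has its own copies.
    for i in range(count):
        with log_lock:
            shared_log.append((name, i))


def run_workers():
    """Start two threads that write to the same list, then wait for both."""
    shared_log.clear()
    threads = [
        threading.Thread(target=worker, args=("render", 3)),
        threading.Thread(target=worker, args=("network", 3)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Arrival order depends on scheduling, so sort before comparing.
    return sorted(shared_log)
```

Calling `run_workers()` returns six entries, three from each thread, even though neither thread was handed the list explicitly: they simply see the same memory.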

The OS schedules threads, not processes. When people talk about "context switching," they mean the OS pausing one thread and resuming another.

What is an address space?

An address space is the range of memory addresses a process can use to store its state. Every process gets its own address space, and from the inside it looks as if the process has all of memory to itself, from address 0 up to some large number.

Memory is byte-addressable: each byte has its own address. Address 0x1000 refers to one byte, and 0x1001 refers to the next. When you read a 4-byte integer stored at 0x1000, you're reading addresses 0x1000 through 0x1003.
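A small Python sketch can model this, assuming a `bytearray` stands in for memory so that an index plays the role of an address. The helper names `store_u32` and `load_u32` are hypothetical, and the little-endian byte order is an assumption for illustration; real hardware may order the bytes differently.

```python
import struct


def store_u32(memory, address, value):
    """Write a 4-byte unsigned integer at `address` (little-endian, an assumption)."""
    struct.pack_into("<I", memory, address, value)


def load_u32(memory, address):
    """Read the 4 bytes starting at `address` back as one integer."""
    return struct.unpack_from("<I", memory, address)[0]


memory = bytearray(0x2000)             # toy "memory": index = address
store_u32(memory, 0x1000, 0x11223344)

# The integer occupies four consecutive byte addresses.
occupied = {hex(a): hex(memory[a]) for a in range(0x1000, 0x1004)}
```

Here `occupied` maps each of the addresses 0x1000 through 0x1003 to one byte of the integer, and the byte at 0x1004 is untouched.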

Each process gets its own address space, completely separate from other processes. Process A cannot access process B's memory—the OS enforces this boundary. How exactly? That involves virtual memory, which we'll cover later in this series.
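The sketch below, assuming a POSIX system where `os.fork` is available, shows the boundary from the program's point of view. The names `counter` and `fork_and_modify` are hypothetical. After the fork, the child works on its own copy of the parent's memory, so its write to `counter` never reaches the parent. The only way the child's value gets back is through an explicit channel, here a pipe.

```python
import os

counter = 0  # lives in this process's address space


def fork_and_modify():
    """Fork a child that changes `counter`; return (child_value, parent_value)."""
    global counter
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: it modifies its own copy of the variable.
        os.close(read_fd)
        counter = 42
        os.write(write_fd, str(counter).encode())
        os.close(write_fd)
        os._exit(0)  # exit immediately, skipping the parent's cleanup code
    # Parent process: read what the child reported, then reap it.
    os.close(write_fd)
    child_value = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return child_value, counter
```

The child reports 42, yet the parent's `counter` is still 0 after the child exits.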

A typical address space is laid out something like this:

High addresses (e.g., 0xFFFFFFFF on 32-bit)
┌──────────────────┐
│   Kernel Space   │  ← off-limits to user code, shared across all processes
├──────────────────┤
│      Stack       │  ← grows downward (local variables, return addresses)
│        ↓         │
├──────────────────┤
│  Memory-mapped   │  ← shared libraries, mmap'd files
├──────────────────┤
│        ↑         │
│       Heap       │  ← grows upward (malloc, dynamic allocation)
├──────────────────┤
│       BSS        │  ← uninitialized globals (zeroed by OS)
├──────────────────┤
│       Data       │  ← initialized global/static variables
├──────────────────┤
│       Text       │  ← your program's instructions (read-only)
└──────────────────┘
Low addresses (e.g., 0x00000000)

Let's walk through each region:

Text (Code): The actual machine instructions. This region is mapped read-only and executable, so by default a program can't overwrite its own code at runtime. Self-modifying code is possible but rare, and it requires explicitly changing the memory permissions.

Data & BSS: Global and static variables. Data holds variables with an explicit initial value (int x = 5 at file scope), and BSS holds those without one (int y; at file scope). The OS zero-fills BSS when the process starts, so those variables begin at 0.

Heap: Dynamic memory. When you call malloc() or new, memory typically comes from here. It grows upward as you allocate more. You're responsible for freeing it.

Memory-mapped region: Shared libraries (libc, etc.) get mapped here. It is also used for mmap(), which lets you map files directly into memory. This region can grow in either direction.

Stack: Each thread gets its own stack. It holds local variables, function arguments, and return addresses. It grows downward with each function call and shrinks when functions return. If it grows too far (deep recursion, huge local arrays), you get a stack overflow.

Kernel space: The top portion of the address space is reserved for the OS kernel. Your process can't read or write it directly; trying to access it triggers a fault. But it's mapped into every process's address space so the kernel can quickly take over during system calls and interrupts. The mechanics are fairly involved, so we'll cover how the OS does this later in this series.
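On Linux, the kernel exposes a process's real layout in /proc/self/maps, one mapping per line. The following is a rough sketch, assuming that file's usual format (address range, permissions, offset, device, inode, optional name). `parse_maps` and `own_regions` are hypothetical helper names, and on systems without that file `own_regions` simply returns an empty list.

```python
from pathlib import Path


def parse_maps(text):
    """Return (name, permissions, size_in_bytes) for each named mapping."""
    regions = []
    for line in text.splitlines():
        fields = line.split(maxsplit=5)
        if len(fields) < 6:
            continue  # anonymous mapping: no name column
        start, end = (int(part, 16) for part in fields[0].split("-"))
        regions.append((fields[5].strip(), fields[1], end - start))
    return regions


def own_regions(maps_path="/proc/self/maps"):
    """Parse this process's mappings, or return [] where the file is absent."""
    path = Path(maps_path)
    return parse_maps(path.read_text()) if path.exists() else []
```

On a typical Linux machine the result includes entries named [heap] and [stack] alongside the program file and its shared libraries, which lines up with the regions described above.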

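On a 32-bit system, you get 4 GB of address space ($2^{32}$ bytes). A common split, and the default on 32-bit Linux, is 3 GB for user space and 1 GB for the kernel. On 64-bit, the address space is in principle $2^{64}$ bytes, which is astronomically larger, though most of it goes unused: many current x86-64 systems use only 48-bit virtual addresses.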
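The arithmetic is easy to check. This sketch uses binary units (1 GiB = $2^{30}$ bytes), which is what the "GB" figures above amount to; the variable names are illustrative only.

```python
GIB = 2**30                      # bytes in one gibibyte
EIB = 2**60                      # bytes in one exbibyte

space_32 = 2**32                 # every address a 32-bit pointer can hold
user_32 = 3 * GIB                # a common user-space share
kernel_32 = space_32 - user_32   # what is left for the kernel

space_64 = 2**64                 # every address a 64-bit pointer can hold

print(space_32 // GIB)           # 4
print(kernel_32 // GIB)          # 1
print(space_64 // EIB)           # 16
print(space_64 // space_32)      # 4294967296
```

So a full 64-bit address space is about four billion times the size of a 32-bit one, or 16 EiB in total.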

Next: How Does the OS Provide Protection?—dual-mode operation, system calls, and the hardware that enforces boundaries.
