The mechanism used to share and protect memory is called Virtual Memory (VM). The idea behind virtual memory is that addresses generated by the CPU are inspected and translated before they are allowed to affect memory.
In a VM system, addresses generated by the CPU are called Virtual Addresses (VAs). They are translated by a piece of hardware called the Memory Management Unit (MMU) into actual addresses, which are called Physical Addresses (PAs). The hardware organization looks like this:
    +--------------+                        +---------------+
    |     CPU      |---> MMU ---> MAR ----->|               |
    |              |                        |    MEMORY     |
    |              |<------------> MDR <--->|               |
    |              |                        |               |
    +--------------+                        +---------------+

The MMU intercepts memory references and inspects and translates them. This allows the system to "redirect" requests for memory; this redirecting gives the system the ability to enforce space-sharing of memory. For example, a memory request for location 9372, as would be generated by, say,
    ld [9372],%l0

might be redirected by the MMU to access the actual physical memory location 21660 instead.
How translation is done differs from machine to machine; in our discussion we'll examine how translation is performed on the SPARC (naturally!).
Physical memory is divided up into 4K slots called page frames. Page frames are the smallest chunks of memory that can be protected. Each virtual page is mapped to a page frame by the process's page table.
The byte within a page is specified by the last 12 bits of the virtual address. The first 20 bits of the address are the Page ID. The page table specifies a 20 bit page frame for each 20 bit Page ID.
Page Table:

    -------------------------------------------
    | Page ID (20 bits) | Page Frame (20 bits) |
    |-------------------|----------------------|
    |         .         |           .          |
    |         .         |           .          |
    |         .         |           .          |
    -------------------------------------------

The fundamental form of protection afforded by VM is due to the partitioning of physical memory into page frames, which are used by separate users (Unix processes). By assigning page frames to users, and preventing users from accessing page frames that do not belong to them, virtual memory provides the protection necessary to support memory sharing.
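The translation step can be sketched in C. This is a toy model, not the real SPARC MMU (which uses a multi-level table in hardware); the flat `page_table` array and the function names are illustrative assumptions. It reproduces the 9372-to-21660 example from the text: 9372 falls in virtual page 2 at offset 1180, and mapping page 2 to frame 5 gives 5*4096 + 1180 = 21660.

```c
#include <stdint.h>

#define PAGE_SIZE   4096u   /* 4K pages, as in the text          */
#define OFFSET_BITS 12u     /* the last 12 bits select the byte  */

/* Toy flat page table: page_table[page_id] holds the page frame
 * number.  (Kept tiny here; a real table covers 2^20 pages.)     */
static uint32_t page_table[16];

/* Translate a virtual address into a physical address:
 * split off the Page ID, look up its frame, keep the offset.     */
uint32_t translate(uint32_t va)
{
    uint32_t page_id = va >> OFFSET_BITS;     /* first 20 bits    */
    uint32_t offset  = va & (PAGE_SIZE - 1);  /* last 12 bits     */
    uint32_t frame   = page_table[page_id];   /* page table lookup */
    return (frame << OFFSET_BITS) | offset;   /* same offset       */
}
```

Note that the offset is never translated: a byte keeps its position within the page; only which 4K chunk of physical memory holds the page changes.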
Another form of protection provided by VM is to protect each user from themselves. This is done using two extra bits associated with each page table entry:
    ------------------------------------------------------------
    | Page ID (20 bits) | Page Frame (20 bits) | R/O? | R/W? |
    |-------------------|----------------------|------|------|
    |         .         |           .          |   .  |   .  |
    |         .         |           .          |   .  |   .  |
    ------------------------------------------------------------

When an address is presented to the MMU, the MMU checks whether the operation is a load (which reads memory) or a store (which writes memory). If the R/W bit is not set and the operation is a store, then the MMU signals a Segmentation Violation.
VM also gathers usage statistics, using two more bits, R? and W?, in each page table entry:

    ------------------------------------------------------------------------
    | Page ID (20 bits) | Page Frame (20 bits) | R/O? | R/W? | R? | W? |
    |-------------------|----------------------|------|------|----|----|
    |         .         |           .          |   .  |   .  |  . |  . |
    |         .         |           .          |   .  |   .  |  . |  . |
    ------------------------------------------------------------------------

Whenever a page is created, or read in from disk, its R? and W? columns are set to 0. Every time a memory location is read, the R? column in its page table entry is set to 1. Every time a memory location is written, the W? column in its page table entry is set to 1. The advantage of this statistic is that if a page has not been written to (is not dirty) it doesn't have to be copied to disk when its page frame is reclaimed.
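The protection and statistic bits can be modeled together. This is a sketch under stated assumptions: the struct and the names `writable`, `referenced`, and `dirty` are illustrative stand-ins for the R/W?, R?, and W? columns, not SPARC's actual entry format.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative page-table entry with protection and statistic bits. */
struct pte {
    uint32_t frame;       /* 20-bit page frame number           */
    bool     writable;    /* the R/W? protection bit            */
    bool     referenced;  /* R?: set on every load or store     */
    bool     dirty;       /* W?: set on every store             */
};

enum op { LOAD, STORE };

/* What the MMU does on each access: returns false (a Segmentation
 * Violation) for a store to a read-only page; otherwise records
 * the access in the statistic bits.                              */
bool access_page(struct pte *p, enum op op)
{
    if (op == STORE && !p->writable)
        return false;              /* MMU signals the violation  */
    p->referenced = true;          /* R? set on any access       */
    if (op == STORE)
        p->dirty = true;           /* W? set on writes only      */
    return true;
}

/* A clean page (W? == 0) need not be copied back to disk
 * when its page frame is reclaimed.                              */
bool needs_writeback(const struct pte *p) { return p->dirty; }
```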
So, VM provides 2 basic services:

    1. Protection: each user's page frames are shielded from other users, and protection bits guard users from their own stray writes.
    2. The illusion of a large memory: a program can use more pages than there are page frames, with pages brought in from disk on demand.
A page fault occurs when the program accesses a memory location that does not currently have a valid mapping, that is, there is no page frame in physical memory with that address's data in it. Page faults are handled by the OS. During a page fault:

    1. The MMU traps to the OS.
    2. The OS locates the needed page on disk.
    3. The OS selects a page frame to reclaim, writing its contents back to disk first if the page is dirty.
    4. The needed page is read from disk into the reclaimed frame, and the page table is updated.
    5. The faulting instruction is restarted.
The mapping from virtual pages to page frames is defined by the page table, and there is one page table per process.
Luckily, page faults don't occur all that often because of locality of reference. Locality of reference means that the memory references made by a program tend to be close to each other from reference to reference. By close, we mean within a small number of addresses -- less than the 4K page size. Can you see why this is likely the case for program instruction fetches? Most of the time the next instruction (memory location + 4) follows the current one. However, it's also true for data references because of things like arrays - programs often read an entire array in order, so the references are very close.
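A little arithmetic shows why sequential references are so cheap. The function below (names and addresses are made up for illustration) counts how many distinct 4K pages a sequential scan touches: a 64 KB array read in order touches at most 17 pages, so roughly 65,536 byte references cost at most 17 page faults.

```c
#include <stdint.h>

/* Count the distinct 4K pages touched by a sequential scan of
 * `nbytes` bytes starting at virtual address `start`.            */
uint32_t pages_touched(uint32_t start, uint32_t nbytes)
{
    uint32_t first = start >> 12;                /* page of first byte */
    uint32_t last  = (start + nbytes - 1) >> 12; /* page of last byte  */
    return last - first + 1;
}
```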
Since programs exhibit locality of reference, page faults aren't all that common. Once a page is brought in, it tends to be used over and over.
Locality of reference is another example of a measurement-based design decision (like register windows). It is locality of reference that makes virtual memory feasible. If programs had no locality of reference, they would page fault all the time and VM would be too slow to use.
In fact, there is at any point in time some set of pages that are being used a lot. For example, the page containing the code currently executing, plus the page containing the top of the stack, plus a page or two of data and bss. If these pages are all in memory, the program can execute for a long time without generating a page fault. These pages are called the program's working set.
Naturally, the working set changes over time, for example as the program counter moves through the program; but the working set changes slowly, so page faults are not frequent.
One popular approach is the Least Recently Used (LRU) algorithm. The idea is to keep track of when each page was last accessed. To satisfy a page fault, the OS frees up the page frame that was used furthest in the past -- in some sense, least likely to be part of any program's working set.
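The LRU idea can be sketched with explicit timestamps. Real systems only approximate this (keeping an exact time per frame is too expensive; reference bits like R? are used instead), so the `last_used` array and logical clock here are a simplifying assumption.

```c
#include <stdint.h>

#define NFRAMES 4   /* pretend physical memory holds 4 page frames */

/* last_used[f] is a logical clock value recording when frame f
 * was last accessed; 0 means "never used".                       */
static uint64_t last_used[NFRAMES];
static uint64_t clock_now;

/* Record an access to a frame. */
void touch(int frame) { last_used[frame] = ++clock_now; }

/* LRU victim: the frame used furthest in the past -- in some
 * sense, least likely to be in any program's working set.        */
int lru_victim(void)
{
    int victim = 0;
    for (int f = 1; f < NFRAMES; f++)
        if (last_used[f] < last_used[victim])
            victim = f;
    return victim;
}
```

For example, after touching frames 0, 1, 2, 3 and then 0 again, frame 1 is the least recently used, so it is the one reclaimed on the next page fault.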
If the working set is larger than the number of available page frames, no algorithm will work well, and page faults will be frequent. A program that generates page faults frequently and continuously is said to be thrashing.
Since any data stored in VM is also available on disk, VM is really just a copy of what's on disk. What is the distinction then, between main memory and disk memory? Only speed of access. Main memory is faster than disk; that's the only important difference. For this reason main memory is sometimes called primary storage while disk is called secondary storage.
The idea behind VM is almost always to have the data needed where it can be accessed the fastest, namely in memory. However, when all else fails, you can always find the data on disk.
In this way, VM simulates a large, fast memory by using a small, fast memory combined with a large, slow memory. This is much cheaper than building a large, fast memory, and it is only a little bit slower than a large, fast memory (for programs with locality of reference, that is).
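"Only a little bit slower" can be made quantitative with an effective access time calculation, using the sample figures given later (100 ns main memory, 10 ms disk); the fault rate `p` here is a hypothetical parameter. With p = 0.000001 (one fault per million references), the average reference costs about 110 ns instead of 100 ns.

```c
/* Average time per memory reference, in nanoseconds, given the
 * page-fault rate p.  A fault costs a disk access plus the
 * memory access that follows it.  (Figures are the sample
 * numbers from the memory-hierarchy pyramid, not measurements.) */
double effective_ns(double p)
{
    const double mem_ns  = 100.0;       /* main memory: 100 ns   */
    const double disk_ns = 10.0e6;      /* disk: 10 ms           */
    return (1.0 - p) * mem_ns + p * (disk_ns + mem_ns);
}
```

This also shows why thrashing is so devastating: push p up to just 0.001 and the average reference takes about 10,100 ns, a hundred times slower than memory.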
The memory hierarchy is usually shown as a pyramid:
                 /  CPU  \
                / ------- \
               / Main Mem. \
              / ----------- \
             /  Disk Memory  \
            / --------------- \

which illustrates that we have a smaller amount of main memory than disk memory, and that the CPU directly accesses only the main memory, even though the REAL storage for data is on the disk.
                 /  CPU  \
                / ------- \
               / Cache Mem.\
              / ----------- \
             /  Main Memory  \
            / --------------- \
           /    Disk Memory    \
          / ------------------- \

Cache memory is organized similarly to Virtual Memory; however, cache memory is:

    - much smaller and much faster than main memory,
    - managed in smaller units (cache lines rather than pages),
    - managed entirely by hardware, with no OS involvement.
                 /  CPU  \          Cost/Byte    Speed     Size
                / ------- \
               / Cache Mem.\          $$$        10 ns    256 KB
              / ----------- \
             /  Main Memory  \        $$        100 ns     32 MB
            / --------------- \
           /    Disk Memory    \      $          10 ms      1 GB
          / ------------------- \

The cost/byte and speed differences arise because the technologies are different. Cache memory is implemented using a technology called "Static Random Access Memory" or SRAM, while main memory is implemented using a technology called "Dynamic Random Access Memory" or DRAM, and disk memory is implemented using magnetic technology (like audio or video tapes).
             Managed Units    Typical Size    "Data not Present" Event
             -------------    ------------    -------------------------
    VM       Pages            4K Bytes        Page Fault
    Cache    Cache Lines      256 Bytes       Cache Miss

Just as we saw that VM was a copy of data on disk, which was made so as to increase its access speed, so the cache is a copy of data in VM, again made to increase its access speed. So the CPU actually only operates on data from the cache -- it never goes directly to memory or to disk. If needed data is not in the cache (a cache miss), it is retrieved from main memory. If the data is not in main memory either (a page fault) it is retrieved from disk.
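The lookup cascade described above can be sketched as a small function. The enum names and the boolean flags are illustrative assumptions; they just encode the rule that the CPU is satisfied from the nearest level that actually holds the data.

```c
#include <stdbool.h>

/* The level that ends up satisfying a reference. */
enum level { FROM_CACHE, FROM_MAIN_MEMORY, FROM_DISK };

/* A reference is satisfied from the cache if possible; a cache
 * miss falls through to main memory; a page fault goes to disk. */
enum level satisfy(bool in_cache, bool in_memory)
{
    if (in_cache)  return FROM_CACHE;        /* cache hit              */
    if (in_memory) return FROM_MAIN_MEMORY;  /* cache miss, page hit   */
    return FROM_DISK;                        /* page fault             */
}
```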
Since the addresses used to look up data in the cache are virtual addresses (no translation is performed on them) the cache must be cleared out (flushed) whenever a new process starts executing on the CPU.