Virtual memory is a computer system technique that gives
an application program the impression that it has contiguous
working memory (an address space), while in fact it may be
physically fragmented and may even overflow onto disk
storage. Systems that use this technique make programming of
large applications easier and use real physical memory (e.g.
RAM) more efficiently than those without virtual memory.
Virtual memory differs significantly from memory
virtualization in that virtual memory allows resources to be
virtualized as memory for a specific system, as opposed to a
large pool of memory being virtualized as smaller pools for
many different systems.
Note that "virtual memory" is more than just "using disk
space to extend physical memory size" - that is merely the
extension of the memory hierarchy to include hard disk
drives. Extending memory to disk is a normal consequence of
using virtual memory techniques, but could be done by other
means such as overlays or swapping programs and their data
completely out to disk while they are inactive. What defines
"virtual memory" is the redefinition of the address space as
a set of contiguous virtual memory addresses that "trick"
programs into thinking they are using large blocks of
contiguous addresses.
All modern general-purpose computer operating systems use
virtual memory techniques for ordinary applications, such as
word processors, spreadsheets, multimedia players,
accounting, etc. Older operating systems, such as DOS[1] of
the 1980s, or those for the mainframes of the 1960s,
generally had no virtual memory functionality - notable
exceptions being the Atlas, B5000 and Apple Computer's Lisa.
Embedded systems and other special-purpose computer
systems which require very fast and/or very consistent
response times may choose not to use virtual memory due to
decreased determinism.
History
In the 1940s and 1950s, before the development of virtual
memory, all larger programs had to contain logic for
managing two-level storage (primary and secondary, today's
analogues being RAM and hard disk), using techniques such as
overlaying. Programs were responsible for moving overlays
back and forth between secondary storage and primary storage.
The main reason for introducing virtual memory was
therefore not simply to extend primary memory, but to make
such an extension as easy to use for programmers as
possible.
Many systems already had the ability to divide the memory
between multiple programs (required for multiprogramming and
multiprocessing), provided for example by "base and bounds
registers" on early models of the PDP-10, without providing
virtual memory. That gave each application a private address
space starting at address 0. Each address in the private
address space was checked against a bounds register to make
sure it was within the section of memory allocated to the
application; if it was, the contents of the corresponding
base register were added to it to give an address in main
memory. This is a simple form of segmentation without
virtual memory.
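The base-and-bounds scheme described above can be sketched in a few lines of Python (an illustrative model, not the logic of any particular machine; the names are hypothetical):

```python
# Illustrative model of base-and-bounds relocation, as on early
# PDP-10 models; names and sizes here are hypothetical.

class BoundsViolation(Exception):
    """Raised when a program addresses memory outside its allocation."""

def relocate(private_address, base, bounds):
    """Map an address in a program's private space to a main-memory address."""
    if not (0 <= private_address < bounds):  # check against the bounds register
        raise BoundsViolation(private_address)
    return base + private_address            # add the base register

# A program loaded at physical address 0x4000 with 0x1000 words of memory:
print(hex(relocate(0x0123, base=0x4000, bounds=0x1000)))  # 0x4123
```

An out-of-range address raises BoundsViolation instead of silently touching another program's memory, which is the protection the bounds register provides.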
Virtual memory was developed in approximately 1959–1962,
at the University of Manchester for the Atlas Computer,
completed in 1962. However, Fritz-Rudolf Güntsch, one of
Germany's pioneering computer scientists and later the
developer of the Telefunken TR 440 mainframe, claims to have
invented the concept in 1957 in his doctoral dissertation
Logischer Entwurf eines digitalen Rechengerätes mit mehreren
asynchron laufenden Trommeln und automatischem
Schnellspeicherbetrieb (Logic Concept of a Digital
Computing Device with Multiple Asynchronous Drum Storage and
Automatic Fast Memory Mode).
In 1961, Burroughs released the B5000, the first
commercial computer with virtual memory. It used
segmentation rather than paging.
Like many technologies in the history of computing,
virtual memory was not accepted without challenge. Before it
could be implemented in mainstream operating systems, many
models, experiments, and theories had to be developed to
overcome the numerous problems. Dynamic address translation
required specialized, expensive, and hard-to-build hardware;
moreover, it initially slowed access to memory slightly.
There were also worries that new system-wide algorithms for
utilizing secondary storage would be far less effective than
the previously used application-specific ones.
By 1969 the debate over virtual memory for commercial
computers was over. An IBM research team led by David Sayre
showed that the virtual memory overlay system consistently
worked better than the best manually controlled systems.
Possibly the first minicomputer to introduce virtual
memory was the Norwegian NORD-1. During the 1970s, other
minicomputers implemented virtual memory, notably VAX models
running VMS.
Virtual memory was introduced to the x86 architecture
with the protected mode of the Intel 80286 processor. At
first it was done with segment swapping, which became
inefficient with larger segments. The Intel 80386 introduced
support for paging underneath the existing segmentation
layer. The page fault exception could be chained with other
exceptions without causing a double fault.
Paged virtual memory
Almost all implementations of virtual memory divide the
virtual address space of an application program into
pages; a page is a block of contiguous virtual memory
addresses. Pages are usually at least 4 KB (4096 bytes) in size, and
systems with large virtual address ranges or large amounts
of real memory (e.g. RAM) generally use larger page sizes.
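The division of a virtual address into a page number and an offset within the page can be illustrated with a short Python sketch (assuming 4 KB pages; the function name is hypothetical):

```python
# Splitting a virtual address into page number and page offset,
# assuming 4 KB (4096-byte) pages.

PAGE_SIZE = 4096

def split_address(virtual_address):
    """Return (page number, offset within page) for a virtual address."""
    return virtual_address // PAGE_SIZE, virtual_address % PAGE_SIZE

print(split_address(0x12345))  # page 18 (0x12), offset 837 (0x345)
```

Because page sizes are powers of two, real hardware performs this split by simply taking the high and low bits of the address.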
Page tables
Almost all implementations use page tables to translate
the virtual addresses seen by the application program into
physical addresses (also referred to as "real addresses")
used by the hardware to process instructions. Each entry in
the page table contains a mapping for a virtual page to
either the real memory address at which the page is stored,
or an indicator that the page is currently held in a disk
file. (Most, but not all, systems support the use of a disk
file for virtual memory in this way.)
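As a rough illustration (the entry layout and names here are hypothetical, not any real system's page table format), a page table can be pictured as a map from virtual page numbers to either a real-memory frame or a disk location:

```python
# Hypothetical page table for one application: each entry maps a
# virtual page number either to a frame of real memory or to a
# slot in a disk file.

page_table = {
    0: ("ram", 5),       # virtual page 0 is stored in physical frame 5
    1: ("ram", 9),       # virtual page 1 is stored in physical frame 9
    2: ("disk", 7001),   # virtual page 2 is currently held on disk
}

def entry_for(page_number):
    """Return where a virtual page currently lives ('ram' or 'disk')."""
    return page_table[page_number]

print(entry_for(2))  # ('disk', 7001)
```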
Systems can have one page table for the whole system or a
separate page table for each application. If there is only
one, different applications which are running at the same
time share a single virtual address space, i.e. they use
different parts of a single range of virtual addresses.
Systems which use multiple page tables provide multiple
virtual address spaces - concurrent applications think they
are using the same range of virtual addresses, but their
separate page tables redirect to different real addresses.
Dynamic address translation
When executing an instruction, a CPU may fetch the
instruction itself from a particular virtual address, fetch
data from a virtual address, or store data to a virtual
address; in each case the virtual address must be
translated to the corresponding physical address. This is
done by a hardware component, sometimes called a memory
management unit, which looks up the real address (from the
page table) corresponding to a virtual address and passes
the real address to the parts of the CPU which execute
instructions. If the page tables indicate that the virtual
memory page is not currently in real memory, the hardware
raises a page fault exception (special internal signal)
which invokes the paging supervisor component of the
operating system (see below).
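The translation step can be sketched as a simplified Python model with a single-level page table and 4 KB pages (real translation is done in hardware; the names here are hypothetical):

```python
# Simplified model of dynamic address translation with a
# single-level page table and 4 KB pages.

PAGE_SIZE = 4096

class PageFault(Exception):
    """Signals that the requested virtual page is not in real memory."""
    def __init__(self, page_number):
        self.page_number = page_number

def translate(virtual_address, page_table):
    """Look up the physical address for a virtual address, MMU-style."""
    page_number = virtual_address // PAGE_SIZE
    offset = virtual_address % PAGE_SIZE
    frame = page_table.get(page_number)   # None: page not in real memory
    if frame is None:
        raise PageFault(page_number)      # would invoke the paging supervisor
    return frame * PAGE_SIZE + offset

page_table = {0: 3, 1: 7}                  # virtual page -> physical frame
print(hex(translate(0x1234, page_table)))  # page 1, offset 0x234 -> 0x7234
```

The offset within the page is unchanged by translation; only the page number is replaced by a frame number.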
Paging supervisor
This part of the operating system creates and manages the
page tables. If the dynamic address translation hardware
raises a page fault exception, the paging supervisor
searches the page space on secondary storage for the page
containing the required virtual address, reads it into real
physical memory, updates the page tables to reflect the new
location of the virtual address and finally tells the
dynamic address translation mechanism to start the search
again. Usually all of the real physical memory is already in
use and the paging supervisor must first save an area of
real physical memory to disk and update the page table to
say that the associated virtual addresses are no longer in
real physical memory but saved on disk. Paging supervisors
generally save and overwrite areas of real physical memory
which have been least recently used, because these are
probably the areas which are used least often. So every time
the dynamic address translation hardware matches a virtual
address with a real physical memory address, it must put a
time-stamp in the page table entry for that virtual address.
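The fault-handling loop described above, including least-recently-used replacement, can be sketched in Python (a toy model with a two-frame "real memory"; all names and sizes are illustrative):

```python
# Toy paging supervisor with least-recently-used (LRU) replacement.
# Two frames of "real memory" and three pages of backing store.
import time

PAGE_SIZE = 4096
NUM_FRAMES = 2                     # tiny "real memory" for illustration

page_table = {}                    # virtual page -> (frame, last-used time)
disk = {0: b"a" * PAGE_SIZE,       # backing store: page -> contents
        1: b"b" * PAGE_SIZE,
        2: b"c" * PAGE_SIZE}
frames = {}                        # frame -> page contents

def handle_page_fault(page_number):
    """Paging supervisor: bring the faulting page into real memory."""
    if len(frames) >= NUM_FRAMES:
        # All frames in use: evict the least recently used page to disk.
        victim = min(page_table, key=lambda p: page_table[p][1])
        frame, _ = page_table.pop(victim)
        disk[victim] = frames.pop(frame)   # save the victim's contents
    else:
        frame = len(frames)                # a free frame exists
    frames[frame] = disk[page_number]      # read the page into real memory
    page_table[page_number] = (frame, time.monotonic())  # update the table

handle_page_fault(0)
handle_page_fault(1)
handle_page_fault(2)               # must evict page 0 (least recently used)
print(sorted(page_table))          # [1, 2]
```

The timestamp stored in each entry is what makes the LRU choice possible, mirroring the time-stamping described above.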
Permanently resident pages
All virtual memory systems have memory areas that are
"pinned down", i.e. cannot be swapped out to secondary
storage, for example:
- Interrupt mechanisms generally rely on an array of pointers to the handlers for various types of interrupt (I/O completion, timer event, program error, page fault, etc.). If the pages containing these pointers or the code that they invoke were pageable, interrupt-handling would become even more complex and time-consuming; and it would be especially difficult in the case of page fault interrupts.
- The page tables are usually not pageable.
- Data buffers that are accessed outside of the CPU, for example by peripheral devices that use direct memory access (DMA) or by I/O channels. Usually such devices and the buses (connection paths) to which they are attached use physical memory addresses rather than virtual memory addresses. Even on buses with an IOMMU, which is a special memory management unit that can translate virtual addresses used on an I/O bus to physical addresses, the transfer cannot be stopped if a page fault occurs and then restarted when the page fault has been processed. So pages containing locations to which or from which a peripheral device is transferring data are either permanently pinned down or pinned down while the transfer is in progress.
- Timing-dependent kernel/application areas that cannot tolerate the varying response time caused by paging.