Roadmap3

Roadmap 6: Virtual Memory

In this roadmap, we study the following issues:

Introduction of the TLB;

Demand Paging;

Virtual Memory in Uniprogramming;

Virtual Memory in Multiprogramming.

In this part, we take an incremental development approach to the implementation of virtual memory, an illusion of an (almost) unlimited memory.

Step 1 Introduction of the TLB

In multiprogramming, we separate processes by their address spaces. Each process has its own address space whose structure consists of a page table and an integer (number of pages).

userprog/addrspace.h


class AddrSpace {
  public:

        ...

  private:
    TranslationEntry *pageTable;
 
    unsigned int numPages;
};

Every virtual address (instruction or data) is translated into physical address by the machine using the page table of the process currently running. It means that there is at least one translation for each instruction. The page table, however, is usually too large to fit in fast memory. It will be too slow if we keep the page table in memory and access memory for each translation. The solution is cache: keep part of page table in fast memory. This is called Translation Lookaside Buffer (TLB). In this step, we introduce the TLB. Note that the TLB is part of the machine (hardware), its entry structure cannot be changed.

machine/translate.h


class TranslationEntry {
  public:
    int virtualPage;    // The page number in virtual memory.
    int physicalPage;   // The page number in real memory (relative to the
                        //  start of "mainMemory"
    bool valid;         // If this bit is set, the translation is ignored.
                        // (In other words, the entry hasn't been initialized.)
    bool readOnly;      // If this bit is set, the user program is not allowed
                        // to modify the contents of the page.
    bool use;           // This bit is set by the hardware every time the
                        // page is referenced or modified.
    bool dirty;         // This bit is set by the hardware every time the
                        // page is modified.
};

The size of the TLB:

machine/machine.h


#define TLBSize         4    // if there is a TLB, make it small

can only be changed for experiment purpose.

After introducing the TLB, the machine knows nothing about process page tables. Address translation always goes through the TLB. You must disconnect the link between the machine and the page table in the original Nachos.

userprog/addrspace.cc


//----------------------------------------------------------------------
// AddrSpace::RestoreState
//      On a context switch, restore the machine state so that
//      this address space can run.
//
//      For now, tell the machine where to find the page table.
//----------------------------------------------------------------------

void AddrSpace::RestoreState()
{
    kernel->machine->pageTable = pageTable;
    kernel->machine->pageTableSize = numPages;
}

Note also that the TLB has part of the page table of the current process. So, on a context switch, you should save the changes in the TLB (inconsistency between the TLB and page table) into the page table of the process being switched out and restore the TLB for the process being switched in. In most machines, we simply invalidate all entries of the TLB when a process is switched in.

Since the TLB has only part of the page table, we must establish a mapping between the page table entries and the TLB entries. During address translation, machine checks the TLB, if the virtual page number does not match the virtual page number in any of the valid TLB entries, the machine saves the virtual address which caused the miss in the register for bad virtual address and raises a page fault exception (really a TLB miss) and causes a trap to the exception handler in the OS kernel (TLB miss handler). The handler checks the page table using the bad virtual address saved in the register. For incremental development, in this step we assume:

Uniprogramming;
Entire address space can fit in memory;
Program is loaded into memory when the space is constructed.

These assumptions imply that we can make physical page numbers equal virtual page numbers just like the original Nachos. Also, during a TLB miss, the handler can always find the desired entry in page table and copy the page table entry into the corresponding TLB entry. Then the exception handler returns back to user program and the instruction which caused the exception is re-executed.

Step 2 Demand Paging

In this step, we remove the third assumption made in the previous step. Why? Because, some pages, for example some stack pages, may never be referenced. Instead of allocating all physical pages (including stack pages) when the space is constructed, we allocate physical pages on demand during run-time. Thus, during the construction of the space, we set up the page table without allocating physical pages. In other words, initially all entries of the page table are not valid. So, when the TLB miss handler checks the page table, the bad virtual page number may not match the virtual page number in any of the valid page table entries (a real page fault). A handler (page fault handler) allocates a physical page and moves the desired page from the disk to memory with the help from the file system. In this step, we still assume uniprogramming. Note that after a physical page is allocated, you should not assume linear mapping between virtual page numbers and physical page numbers. Your code should work if a different physical page is allocated in later development. There are two issues here. First, the way the file system sees the executable is different from the way the handler sees the program. The file system sees the executable as a stream of byte starting from the header at position (infile address) 0. Whereas the handler sees the program stored in a virtual space divided into pages starting from the instruction at virtual address 0. You have to translate a virtual page number into infile address (position) before calling the file system. The second issue is that the address space includes the stack in addition to the code and data in the executable. If a page contains entirely stack/uninitialized data, this page should be initially zero filled instead of being moved from the disk. You may need a status (ZEROFILL or LOADFROMOBJ) for each page. You can set the status for each page when the page table is set up during the contruction of the space.

After the page fault handler allocates and initializes a physical page, it sets up the page table entry and makes it valid. When the exception handler returns back to user program, the instruction caused the exception is re-executed. Since the entry is not in the TLB yet, a TLB miss occurs for the second time. But this time, the TLB miss handler can find the entry in the page table as what happens in the previous step.

In this step, the key issue is the implementation of demand paging.

Step 3 Virtual Memory (Uniprogramming)

Now we remove the second assumption made in Step 1. We create an illusion that the memory is larger than it really is, so we can run a program whose space is larger than memory. But we still assume uniprogramming. Since the program space is larger than the memory, during a page fault exception, the page fault handler trying to allocate a physical page may find that the memory is full. Thus, one page must be swapped out in order to bring in the page being referenced. A replacement policy must be applied to choose the victim. Since we assume uniprogramming, there is only one page table. The process pages against itself. Also, the page to be swapped out may not be a part of the executable (e.g., a stack page). We have to create and open a swap file during the construction of the space. Note that the size of the stack is a multiple of the page size (= sector size = 128).

userprog/addrspace.h


#define UserStackSize           1024    // increase this as necessary!

The stack is located at the high end of the memory and grows downward.

userprog/addrspace.cc


void
AddrSpace::InitRegisters()
{
     ...

   // Set the stack register to the end of the address space, where we
   // allocated the stack; but subtract off a bit, to make sure we don't
   // accidentally reference off the end!
    machine->WriteRegister(StackReg, numPages * PageSize - 16);
}

The key issues in this step are: the implementation of page replacement policy and the implementation of the swap file.

Step 4 Virtual Memory (Multiprogramming)

Finally, we remove the assumption 1 made in Step 1 and allow multiprogramming. Now, we have multiple page tables on the system. When the page fault handler chooses a page to be replaced, it should have a global view of how the physical pages are used. You may want a coremap, which is basically an inverted global page table. The handler chooses the physical page to be replaced by checking the coremap. Then from the coremap, it finds the associated process and its page table. Note that the coremap is a global map shared by multiple processes. Synchronization is necessary. Also there is the efficiency issue. Consider the following situation. A process has a page fault and chooses a page to be replaced, then there is a context switch and the new process references that page.

The key issue in this step is the implementation of the coremap.