Roadmap 4: User Program

So far, every thing we have done is a part of the Nachos kernel. The data structures (threads/kernel.h) of the kernel consist of the thead currently running, a scheduler or dispatcher, an interrupt mechanism, and software alarm clock, which constructs a hardware timer. Every thread is an execution stream in the Nachos process, a UNIX process running on the underlying machine such as a SUN workstation. To understand the concept of process, how a user program is loaded into memory and runs on the CPU, Nachos simulates a MIPS machine. So we have complete control of the execution of a user program. In this roadmap, we study the following issues:

  • The nachos kernel space and the user's space;
  • Execution of a user-level program on Nachos;
  • System calls and exception handling;
  • The virtual address space and the physical address space;
  • Translation of virtual addresses to physical addresses.

  • When Nachos starts (threads/main.cc), a kernel is constructed and initialized.

     

     

    threads/main.cc

    
    int
    main(int argc, char **argv)
    {
        ...
    
        for (i = 1; i < argc; i++) {
            if (strcmp(argv[i], "-d") == 0) {
            ...
    
            }
            else if (strcmp(argv[i], "-x") == 0) {
                ASSERT(i + 1 < argc);
                userProgName = argv[i + 1];
                i++;
            }
            ...
    
        }
    
        kernel = new Kernel(argc, argv);
    
        kernel->Initialize();
    
        ...
    
        // finally, run an initial user program if requested to do so
        if (userProgName != NULL) {
          AddrSpace *space = new AddrSpace;
          ASSERT(space != (AddrSpace *)NULL);
          if (space->Load(userProgName)) { // load the program into the space
            space->Execute();              // run the program
            ASSERTNOTREACHED();            // Execute never returns
          }
        }
    
        // If we don't run a user program, we may get here.
        // Calling "return" would terminate the program.
        // Instead, call Halt, which will first clean up, then
        //  terminate.
        kernel->interrupt->Halt();
    
        ASSERTNOTREACHED();
    }
    

    The kernel initializes, among other things, a machine (simulated MIPS) and a file system.

    void
    Kernel::Initialize()
    {
        // We didn't explicitly allocate the current thread we are running in.
        // But if it ever tries to give up the CPU, we better have a Thread
        // object to save its state.
        currentThread = new Thread("main");
        currentThread->setStatus(RUNNING);
    
        ...
    
        machine = new Machine(debugUserProg);
    
        ...
    
    #ifdef FILESYS_STUB
        fileSystem = new FileSystem();
    #else
        fileSystem = new FileSystem(formatFlag);
    #endif // FILESYS_STUB
    
        ...
    
        interrupt->Enable();
    }
    

    When the command line is nachos -x userProgName, it executes a user program named userProgName. First, it creates an address space for the program. Then, it loads the program into the space and executes the program on the simulated machine.

    As shown below, when an address space is constructed, a page table is set up and zero out the entire machine memory.

     

     

    userprog/addrspace.cc

    
    AddrSpace::AddrSpace()
    {
        pageTable = new TranslationEntry[NumPhysPages];
        for (int i = 0; i < NumPhysPages; i++) {
            pageTable[i].virtualPage = i;
            pageTable[i].physicalPage = i;
            pageTable[i].valid = TRUE;
            pageTable[i].use = FALSE;
            pageTable[i].dirty = FALSE;
            pageTable[i].readOnly = FALSE;
        }
    
        // zero out the entire address space
        bzero(kernel->machine->mainMemory, MemorySize);
    }
    

    For now, uniprogramming, the mapping between the virtual space and the physical space is simple (linear).

    After the address space is created, the program (executable) is loaded into the space. The details of load are given later. How is a program executed?

     

     

    userprog/addrspace.cc

    
    AddrSpace::Execute(char *fileName)
    {
        if (!Load(fileName)) {
            return;
        }
        kernel->currentThread->space = this;
    
        this->InitRegisters();              // set the initial register values
        this->RestoreState();               // load page table register
    
        kernel->machine->Run();             // jump to the user progam
        ASSERTNOTREACHED;                   // machine->Run never returns;
                                            // the address space exits
                                            // by doing the syscall "exit"
    }
    

    This function does the following:

    - Given its name fileName, loads an executable from disk into memory;
    - Assign this address space to the address space of the current thread;
    - Initializes the simulator MIPS machine registers;
    - Restores the machine state;
    - Runs the program on the MIPS simulator.

    To load a program, it calls the file system to open the object (executable) file. It takes a file name, a string, and returns a pointer executable to an instance of the class OpenFile. We will study the procedure of openning a file later in file system. Basically, it uses the file name to find the file header (called i-node in UNIX) on the disk and move the i-node from the disk to the memory and returns a pointer to the i-node. The file header contains the information about the file, such as the size of the file, disk sectors for data, etc. The OpenFile class has memeber functions like Read and Write, so we can read/write the file from/to the disk. Note that after openning the file, only the file header is in the memory, not yet the data of the file. Now, executable is a pointer to (address of) the file header object. Is it an address in the kernel space or the user space?
    For now, we assume that the function Load() loads the executable into the simulated machine memory. Also, for now we assume the starting address of the executable in the physical memory is 0. We will look into details later.

    Before running the program, we must set the MIPS registers properly.

     

     

    userprog/addrspace.cc
    
    void
    AddrSpace::InitRegisters()
    {
        Machine *machine = kernel->machine;
        int i;
    
        for (i = 0; i < NumTotalRegs; i++)
            machine->WriteRegister(i, 0);
    
        // Initial program counter -- must be location of "Start"
        machine->WriteRegister(PCReg, 0);
    
        // Need to also tell MIPS where next instruction is, because
        // of branch delay possibility
        machine->WriteRegister(NextPCReg, 4);
    
        // Set the stack register to the end of the address space, where we
        // allocated the stack; but subtract off a bit, to make sure we don't
        // accidentally reference off the end!
        machine->WriteRegister(StackReg, numPages * PageSize - 16);
        ...
    }
    

    This function sets the initial values for the user-level registers. (total of 40 registers defined in machine/machine.*) We write these directly into the "machine" registers, so that we can immediately jump to user code. Specifically, the program counter register PCReg is initialized to 0. Note that these will be saved/restored into the currentThread->userRegisters when the thread is context switched out. Now, we have two sets of registers. (What are they?)

    For now, we assume that RestoreState() called by Execute() tells the machine where to find the page table. After all the registers are set up, then Run() is called to run the program on the MIPS machine.

     

     

    machine/mipssim.cc

    
    void
    Machine::Run()
    {
        Instruction *instr = new Instruction;  // storage for decoded instruction
    
        ...
        kernel->interrupt->setStatus(UserMode);
        for (;;) {
            OneInstruction(instr);
            kernel->interrupt->OneTick();
            ...
        }
    }
    

    This function simulates the execution of a user-level program on Nachos. It never returns because a program ends with exit(). This routine is re-entrant, in that it can be called multiple times concurrently -- one for each thread executing user code.
    First, it sets the machine status to UserMode. (MachineStatus: IdleMode, SystemMode, UserMode defined in machine/interrupts.h) Then, it executes one instruction at a time: OneInstruction(). After each user instruction is executed, it calls OneTick().

     

     

    machine/mipssim.cc

    
    void
    Machine::OneInstruction(Instruction *instr)
    {
        int raw;
        int nextLoadReg = 0;
        int nextLoadValue = 0;      // record delayed load operation, to apply
                                    // in the future
    
        // Fetch instruction
        if (!ReadMem(registers[PCReg], 4, &raw))
            return;                 // exception occurred
        instr->value = raw;
        instr->Decode();
        ...
        // Execute the instruction (cf. Kane's book)
        switch (instr->opCode) {
    
          case OP_ADD:
    
          ...
    
          case OP_SYSCALL
            RaiseException(SyscallException, 0);
            return;
    
          ...
    
        }
    
        // Now we have successfully executed the instruction.
    
        // Do any delayed load operation
        DelayedLoad(nextLoadReg, nextLoadValue);
    
        // Advance program counters.
        registers[PrevPCReg] = registers[PCReg];    // for debugging, in case we
                                                    // are jumping into lala-land
        registers[PCReg] = registers[NextPCReg];
        registers[NextPCReg] = pcAfter;
    }
    

    This function performs a fetch-execute cycle of an instruction from a user-level program. It fetches an instruction (4 bytes) at the address given by the content of PCReg by calling ReadMem(). The address of the instruction in PCReg is a virtual address. The instruction is stored in the physical memory. The virtual address in PCReg must be translated into physical address where the instruction is stored. If there is any kind of exception (see machine/machine.h for exception types), we return to Run(). This allows us to re-start the instruction execution, in case any of our state has changed. Then it decodes and executes the instruction. Finally, it advances PC and updates the delay slot. This routine is re-entrant, in that it can be called multiple times concurrently -- one for each thread executing user code.

    We concentrate on system calls. Suppose we have a C program:

            main() {
                ...
                Halt();
            }
    

    After it is compiled, the assembly code looks like the following. Note that all nachos assembly codes start with test/start.s.

    
                    .globl  Start
                    .ent    Start, 0
            Start:
                    jal     main
                    move    $4, $0
                    jal     Exit
                    .end    Start
                    ...
                    .globl  Halt
                    .ent    Halt, 0
            Halt:
                    addiu   $2, $0, SC_Halt
                    syscall
                    j       $31
                    .end    Halt
                    ...
    
            main:
                    ...
                    jal     Halt
                    ...
    

    So, when we start a program, it immediately jumps to main. Since main() calls Halt(), it jumps to Halt in Start.s. Then it assigns the system call code SC_Halt defined in userprog/syscall.h to user-level register 2 (v0) and executes syscall instruction. From OneInstruction, we can see that in case when the instruction is syscall, it runs RaiseException to handle the exception.

     

     

    machine/machine.cc

    
    void
    Machine::RaiseException(ExceptionType which, int badVAddr)
    {
        ...
        registers[BadVAddrReg] = badVAddr;
        DelayedLoad(0, 0);                  // finish anything in progress
        interrupt->setStatus(SystemMode);
        ExceptionHandler(which);            // interrupts are enabled at this point
        interrupt->setStatus(UserMode);
    }
    
    NOTE

    This function transfers control to the Nachos kernel from user mode, because the user program either invoked a system call, or some exception occured (such as the address translation failed). It calls ExceptionHandler to handle the system call or exception.

     

     

    userprog/exception.cc

    
    void
    ExceptionHandler(ExceptionType which)
    {
        int type = kernel->machine->ReadRegister(2);
    
        switch (which) {
            case SyscallException:
                switch(type) {
                    case SC_Halt:
                        ...
                        kernel->interrupt->Halt();
                        break;
                    default:
                        cerr ...;
                        break;
                }
                break;
            default:
                cerr ...;
                break;
        }
        ASSERTNOTREACHED();
    }
    

    For now, it only handles the system call Halt, which is implemented by calling Halt() (machine/interrupt.*). Following the MIPS C register convention, ExceptionHandler reads the exception code from machine register 2 (v0) and arguments, if any, from machine registers 4, 5, 6, 7 (a0, a1, a2, a3), and it writes the return value in machine register 2 (v0). When the exception handler returns, this function transfers the control back to user mode and returns back to user program.

    At this point, you should understand how a user program gets run and what happens when the user program makes a system call or an exception occurs.

    In the above example, the system call halt() does not have any argument. How do you implement system calls which have arguments? In particular, what if an argument is a pointer (address), for example, a pointer to a file name (a string of characters) given in the program? Think about two spaces: kernel and user.

     

     

    Now we return to the constructor AddrSpace to study the concept of virtual (logical) address space and physical address space. Also, we discuss the details of loading an executable from disk to memory.

     

     

    userprog/addrspace.h

    
    class AddrSpace {
      public:
         ...
      private:
        TranslationEntry *pageTable;        // Assume linear page table translation
                                            // for now!
        unsigned int numPages;              // Number of pages in the virtual
                                            // address space
    };
    

    The mental picture of the addrspace data structures looks something like:

            |------------------|
            | pageTable        | pointer to a table of
            |                  | TranslationEntry
            |------------------|
            | numPages         | number of pages in the virtual address space
            |------------------|    

    Note that the addrspace does not have the program. All it has is a page table. As we have seen, the AddrSpace constructor sets up a page table and zeros out the entire memory. Now, we look into the details of loading a program from disk to memory.

     

     

    userprog/addrspace.cc

    
    bool
    AddrSpace::Load(char *fileName)
    {
        OpenFile *executable = kernel->fileSystem->Open(fileName);
        NoffHeader noffH;
        unsigned int size;
    
        ...
    
        executable->ReadAt((char *)&noffH, sizeof(noffH), 0);
        if ((noffH.noffMagic != NOFFMAGIC) &&
                    (WordToHost(noffH.noffMagic) == NOFFMAGIC))
            SwapHeader(&noffH);
        ASSERT(noffH.noffMagic == NOFFMAGIC);
    
    // how big is address space?
        size = noffH.code.size + noffH.initData.size + noffH.uninitData.size
                            + UserStackSize;        // we need to increase the size
                                                    // to leave room for the stack
        numPages = divRoundUp(size, PageSize);
        size = numPages * PageSize;
    
        ASSERT(numPages <= NumPhysPages);           // check we're not trying
                                                    // to run anything too big --
                                                    // at least until we have
                                                    // virtual memory
    
        ...
    
    // then, copy in the code and data segments into memory
    // Note: this code assumes that virtual address = physical address
        if (noffH.code.size > 0) {
            ...
            executable->ReadAt(
                    &(kernel->machine->mainMemory[noffH.code.virtualAddr]),
                            noffH.code.size, noffH.code.inFileAddr);
        }
        if (noffH.initData.size > 0) {
            ...
            executable->ReadAt(
                  &(kernel->machine->mainMemory[noffH.initData.virtualAddr]),
                  noffH.initData.size, noffH.initData.inFileAddr);
        }
    
    #ifdef RDATA
        ...
    #endif
    
        delete executable;                  // close file
        return TRUE;                        // success
    }
    

    The loader does the following:

    - Calls the file system to open the executable file for read
    - Reads in noffH (nachos object file file header, don't be confused with the file header or i-node of the file). The nachos object file file header noffH is a part of the object file, whereas i-node is not. The noffH is at the beginning of the object file (offset 0) identified by its i-node pointed to by executable. Here is the structure of noffH:

    
    bin/noff.h
    
            --------------------------------
            |         magic number         |
            --------------------------------
            |               | virtual addr |
            |               |--------------|
            |    code       | infile addr  | byte offset from the
            |               |--------------| beginning of the file
            |               |    size      |
            |------------------------------|
            |               | virtual addr |
            |               |--------------|
            |  initialized  | infile addr  | byte offset from the
            |      data     |--------------| beginning of the file
            |               |    size      |
            |------------------------------|
            |               | virtual addr |
            |               |--------------|
            | uninitialized | infile addr  | byte offset from the
            |     data      |--------------| beginning of the file
            |               |    size      |
            |------------------------------|
    

    - Checks the magic number (is it a nachos file? big edian or little endian?)
    - Finds out the total size of the address space (code+data+stack) in number of pages. The PageSize is defined in machine/machine.h, which is the same as SectorSize defined in machine/disk.h (128 bytes). Now, we can imagine there is a virtual space to store the program, starting address 0. This space is partitioned into pages (pure paging).
    - Reads in code and initialized data from the disk into the memory.
    - Closes the executable file by deleting the memory image of its i-node.

    You should think about how this mechanism (page table) can protect the kernel and processes each other.

     

     

    The page table is an array of TranslationEntry.

    machine/translate.h

    class TranslationEntry {
      public:
        int virtualPage;    // The page number in virtual memory.
        int physicalPage;   // The page number in real memory (relative to the
                            //  start of "mainMemory"
        bool valid;         // If this bit is set, the translation is ignored.
                            // (In other words, the entry hasn't been initialized.)
        bool readOnly;      // If this bit is set, the user program is not allowed
                            // to modify the contents of the page.
        bool use;           // This bit is set by the hardware every time the
                            // page is referenced or modified.
        bool dirty;         // This bit is set by the hardware every time the
                            // page is modified.
    };
    

    The above class defines an entry in a translation table -- either in a page table or a TLB (translation lookaside buffer). Each entry defines a mapping from one virtual page to one physical page. In addition, there are some extra bits for access control (valid and read-only) and some bits for information about usage (use) and write modification (dirty).

     

     

    After an address space is constructed, it is assigned to the current thread. Let us revisit the data structures of a thread.

     

     

    threads/thread.h

    
    class Thread {
      private:
        // NOTE: DO NOT CHANGE the order of these first two members.
        // THEY MUST be in this position for SWITCH to work.
        int* stackTop;                       // the current stack pointer
        int machineState[MachineStateSize];  // all registers except for stackTop
    
      public:
    
        ...
    
      private:
        // some of the private data for this class is listed above
    
        int* stack;                         // Bottom of the stack
                                            // NULL if this is the main thread
                                            // (If NULL, don't deallocate stack)
        ThreadStatus status;                // ready, running or blocked
        char* name;
    
        ...
    
    // A thread running a user program actually has *two* sets of CPU registers --
    // one for its state while executing user code, one for its state
    // while executing kernel code.
    
        int userRegisters[NumTotalRegs];    // user-level CPU register state
    
      public:
        void SaveUserState();               // save user-level register state
        void RestoreUserState();            // restore user-level register state
    
        AddrSpace *space;                   // User code this thread is running.
    };
    

    In addition to the data structures discussed previously, there is an array for the user-level registers. A thread running a user program actually has two sets of CPU registers -- one for its state while executing user code. This is the set of the registers in the MIPS simulator. Another is for its state while executing kernel code. This set is machine dependent (SUN/sparc, MIPS, etc). There is also a pointer to this process address space.

    
            |---------------|
            |  stackTop     |
            |---------------|
            |               |
            | machineState  |
            |               |
            |---------------|
            |    stack      |
            |---------------|
            |    status     |
            |---------------|
            |    name       |
            |===============|
            |               |
            |               |
            | userRegisters |
            |               |
            |               |
            |---------------|
            |    space      | pointer to the address space
            |---------------| of this thread
    

     

     

    In Execute(), before running the user program, we need to restore the machine state. Specifically, we tell the machine which page table should be used for address translation.

     

     

    userprog/addrspace.cc

    
    void
    AddrSpace::RestoreState()
    {
        kernel->machine->pageTable = pageTable;
        machine->pageTableSize = numPages;
    }
    
    NOTE

    This function assigns the current thread's page table to the machine page table. Thus whenever the machine reads the memory (to fetch an instruction or data), it uses the page table to translate the virtual address to physical address. The baseline Nachos is a uniprogramming system, RestoreState() just tells the machine where to find the page table when a process starts and SaveState() (not needed) has nothing. In multiprogramming, on a context switch (see Run() in threads/scheduler.cc), what should SaveState() and RestoreState() do?

     

     

    Finally, we investigate the translation of a virtual address into physical address. machine/translate.cc

    bool Machine::ReadMem(int addr, int size, int *value)
    {
        int data;
        ExceptionType exception;
        int physicalAddress;
    
        ...
        exception = Translate(addr, &physicalAddress, size, FALSE);
        if (exception != NoException) {
            RaiseException(exception, addr);
            return FALSE;
        }
        switch (size) {
          case 1:
            data = machine->mainMemory[physicalAddress];
            *value = data;
            break;
    
          case 2:
            data = *(unsigned short *) &machine->mainMemory[physicalAddress];
            *value = ShortToHost(data);
            break;
    
          case 4:
            data = *(unsigned int *) &machine->mainMemory[physicalAddress];
            *value = WordToHost(data);
            break;
    
          default: ASSERT(FALSE);
        }
        ...
        return (TRUE);
    }
    
    NOTE

    This function translates a virtual address to physical address by calling Translate(). Then it checks the returned value from Translate() function, if NoException, it gets the instruction or data from the physical memory, else calls exception handler RaiseException().

     

     

    machine/translate.cc

    
    ExceptionType
    Machine::Translate(int virtAddr, int* physAddr, int size, bool writing)
    {
        int i;
        unsigned int vpn, offset;
        TranslationEntry *entry;
        unsigned int pageFrame;
    
        ...
    
        // check for alignment errors
        if (((size == 4) && (virtAddr & 0x3)) || ((size == 2) && (virtAddr & 0x1))){
            ...
            return AddressErrorException;
        }
       
        // we must have either a TLB or a page table, but not both!
        ASSERT(tlb == NULL || pageTable == NULL);
        ASSERT(tlb != NULL || pageTable != NULL);
    
        // calculate the virtual page number, and offset within the page,
        // from the virtual address
        vpn = (unsigned) virtAddr / PageSize;
        offset = (unsigned) virtAddr % PageSize;
    
        if (tlb == NULL) {          // use page table and vpn is index into table
            if (vpn >= pageTableSize) {
                ...
                return AddressErrorException;
            } else if (!pageTable[vpn].valid) {
                ...
                return PageFaultException;
            }
            entry = &pageTable[vpn];
        } else {
            for (entry = NULL, i = 0; i < TLBSize; i++)
                if (tlb[i].valid && (tlb[i].virtualPage == ((int)vpn))) {
                    entry = &tlb[i];                        // FOUND!
                    break;
                }
            if (entry == NULL) {                            // not found
                ...
                return PageFaultException;          // really, this is a TLB fault,
                                                    // the page may be in memory,
                                                    // but not in the TLB
            }
        }
    
        if (entry->readOnly && writing) {   // trying to write to a read-only page
            ...
            return ReadOnlyException;
        }
        pageFrame = entry->physicalPage;
    
        // if the pageFrame is too big, there is something really wrong!
        // An invalid translation was loaded into the page table or TLB.
        if (pageFrame >= NumPhysPages) {
            ...
            return BusErrorException;
        }
        entry->use = TRUE;          // set the use, dirty bits
        if (writing)
            entry->dirty = TRUE;
        *physAddr = pageFrame * PageSize + offset;
        ASSERT((*physAddr >= 0) && ((*physAddr + size) <= MemorySize));
        ...
        return NoException;
    }
    

    This function translates a virtual address into a physical address, using either a page table or a TLB.
    What does this function return? How is OS protected? (user should not be able to directly touch the kernel)
    Types of exceptions are defined in machine/machine.h. For example, page fault (will not happen until virtual memory), read only voilation; bus error (exceed physical memory limit), address error (alignment or exceed virtual address space limit).