Roadmap 1: Threads and Context Switch
In this roadmap, by tracing ThreadTest(), we study the following issues:
This roadmap is intended to assist you in reading the nachos code for the Threads assignment. It is an attempt to guide you through the code while highlighting important points. Prior to reading this document, it is a good idea to review the basic concepts of C++ (if you are not familiar with the language). For example, read "A quick introduction to C++" (nachos/c++example/c++.ps). In particular, make sure you understand how Class objects are created and deleted.
One of the first things that you will notice when you go through the nachos code is that it is broken down into many different files. The name of each file identifies its associated code. For example, all the data structures for Threads are defined in threads/thread.h and its function members are defined in threads/thread.cc. This practice of grouping greatly improves the organization of code, and therefore its readability.
You will need to view the code from the following files, all in threads/ directory, while progressing through this roadmap.
kernel.*
main.*
thread.*
switch.*
scheduler.*
The first task for you is to review the Nachos kernel header file kernel.h. Most of the data structure information for the Nachos code is contained in this file. Note that it also makes references to other header files as shown below
lib/utility.h
threads/thread.h
threads/scheduler.h
machine/interrupt.h
machine/stats.h
machine/alarm.h
machine/machine.h
filesys/filesys.h
Lets start with the file threads/main.cc. Here are a few
lines of code from this routine.
int main(int argc, char **argv) { ... kernel = new Kernel(argc, argv); kernel->Initialize(); CallOnUserAbort(Cleanup); // if user hits ctl-C kernel->ThreadSelfTest(); ... return 0; }
The kernel object is created and it runs Initialize():
void Kernel::Initialize() { stats = new Statistics(); // collect statistics interrupt = new Interrupt; // start up interrupt handling scheduler = new Scheduler(); // initialize the ready queue alarm = new Alarm(randomSlice); // start up time slicing machine = new Machine(debugUserProg); ... interrupt->Enable(); }
It initializes: stats, interrupt, scheduler, alarm, and machine.
void Kernel::ThreadSelfTest() { ... currentThread->SelfTest(); // test thread switching ... }
The currentThread, which is a part of the kernel, run the thread function SelfTest();
void Thread::SelfTest() { DEBUG(dbgThread, "Entering Thread::SelfTest"); Thread *t = new Thread("forked thread"); t->Fork((VoidFunctionPtr) SimpleThread, (void *) 1); kernel->currentThread->Yield(); SimpleThread(0); }
In this function, it first constructs an object of class Thread. Then it calls Fork() and passes the function SimpleThread() to the new thread (identified by 1). Now we have two threads in the nachos process. One is associated with main(), another is associated with SimpleThread(). Both threads share the UNIX nachos process address space. Then the parent thread relinquishes the cpu. When Yield() returns, the main thread also calls SimpleThread(0) (identified by 0).
//---------------------------------------------------------------------- // SimpleThread // Loop 5 times, yielding the CPU to another ready thread // each iteration. // // "which" is simply a number identifying the thread, for debugging // purposes. //---------------------------------------------------------------------- static void SimpleThread(int which) { int num; for (num = 0; num < 5; num++) { cout << "*** thread " << which << " looped " << num << " times\n"; kernel->currentThread->Yield(); } }
NOTE
As shown in the code above, it simply prints a message, and then calls Yield(). Before looking into Yield(), we study the structure of a thread.
class Thread { private: // NOTE: DO NOT CHANGE the order of these first two members. // THEY MUST be in this position for SWITCH to work. int* stackTop; // the current stack pointer int machineState[MachineStateSize]; // all registers except for stackTop public: ... private: // some of the private data for this class is listed above int* stack; // Bottom of the stack // NULL if this is the main thread // (If NULL, don't deallocate stack) ThreadStatus status; // ready, running or blocked char* name; ... }
NOTE
In addition to the data members and functions, note the constructor (Thread) and destructor (~Thread) functions. Can you identify their purposes?
Note that only portions of the Thread class are shown above. Make sure you review thread.h in its entirety to understand all of its data and function members. The mental picture of the thread data structure should look something like this (only some of the data members are shown)
|------------------| | Stacktop ----> | - pointer to top of stack |------------------| | InitialPCState | S0 --- |------------------| | | InitialArgState | S1 | |------------------| | | WhenDoneState | S2 | |------------------| | | StartUpPCState | S3 | |------------------| | | | S4 |----MachineState[MachineStateSize] |------------------| | -array of integers to hold | | S5 | CPU register values for a thread |------------------| | | | S6 | |------------------| | | | S7 | |------------------| | | FPState | FP | |------------------| | | PCState | PC ---- |------------------| | stack | pointer to the bottom of stack |------------------| | status | |------------------| | name | |------------------|
Thread::Thread(char* threadName) { name = threadName; stackTop = NULL; stack = NULL; status = JUST_CREATED; ... }NOTE
As shown above, the thread constructor initializes an empty stack and sets the status to JUST_CREATED. (why?)
When an object of the class thread is constructed, it has an empty stack. A thread is an execution stream. Typically, a thread executes a function in a program. As in Thread::SelfTest(), the function Fork() allocates the function SimpleThread() on the stack of the new thread. Then both threads run SimpleThread().
void Thread::Fork(VoidFunctionPtr func, void *arg) { ... StackAllocate(func, arg); oldLevel = interrupt->SetLevel(IntOff); scheduler->ReadyToRun(this); // ReadyToRun assumes that interrupts // are disabled! (void) interrupt->SetLevel(oldLevel); }
NOTE
Can you identify the purpose of the arguments func and arg?
"this" is a reserved word in C++, its value is a pointer to the member function's class-object argument. In this case, it is the pointer to a Thread class-object (the thread).
As shown above, Fork() makes a call to StackAllocate(). You should first review the Scheduler class data structures and function members in threads/scheduler.*. Note that before making a call to the function ReadyToRun(), the interrupts are turned off (why?). The purpose of this function is to make the thread ready to run by placing it in the ready list. Note that ReadyToRun() does not actually start running the thread, it simply changes the status of the thread to READY and appends it to the end of the Ready List. This thread will only start executing once the scheduler chooses it.
Now let's study StackAllocate(), which allocates a function on the stack of a thread.
void Thread::StackAllocate(VoidFunctionPtr func, int arg) { stack = (int *) AllocBoundedArray(StackSize * sizeof(int)); ... #ifdef DECMIPS stackTop = stack + StackSize - 4; // -4 to be on the safe side! *stack = STACK_FENCEPOST; #endif ... #ifdef PARISC ... #else machineState[PCState] = ThreadRoot; machineState[StartupPCState] = ThreadBegin; machineState[InitialPCState] = func; machineState[InitialArgState] = arg; machineState[WhenDonePCState] = ThreadFinish; #endif }
NOTE
This function initializes the execution stack of a new thread. Specifically, it initializes the program counter (PC) to point to the routine ThreadRoot. This forces execution to begin with ThreadRoot instead of the user-supplied function. (Why?) Then it assigns ThreadBegin, which actually enables interrupts. In addition, it loads the user-supplied function and its arguments if any. Finally, it assigns the function ThreadFinish(), which is really Thread::Finish() (in threads/thread.*), so upon the completion of the execution of the user-supplied function, the thread runs Thread::Finish() (What does it do?)
ThreadRoot is an assembly code. It is machine dependent. Here we use MIPS as an example. The following are the symbolic definitions.
threads/switch.s
#ifdef DECMIPS /* Symbolic register names */ #define z $0 /* zero register */ #define a0 $4 /* argument registers */ #define a1 $5 #define s0 $16 /* callee saved */ #define s1 $17 #define s2 $18 #define s3 $19 #define s4 $20 #define s5 $21 #define s6 $22 #define s7 $23 #define sp $29 /* stack pointer */ #define fp $30 /* frame pointer */ #define ra $31 /* return address */ ... #endif // DECMIPS
#ifdef DECMIPS /* Registers that must be saved during a context switch. * These are the offsets from the beginning of the Thread object, * in bytes, used in switch.s */ #define SP 0 #define S0 4 #define S1 8 #define S2 12 #define S3 16 #define S4 20 #define S5 24 #define S6 28 #define S7 32 #define FP 36 #define PC 40 /* To fork a thread, we set up its saved register state, so that * when we switch to the thread, it will start running in ThreadRoot. * * The following are the initial registers we need to set up to * pass values into ThreadRoot (for instance, containing the procedure * for the thread to run). The first set is the registers as used * by ThreadRoot; the second set is the locations for these initial * values in the Thread object -- used in Thread::AllocateStack(). */ #define InitialPC s0 #define InitialArg s1 #define WhenDonePC s2 #define StartupPC s3 #define PCState (PC/4-1) #define FPState (FP/4-1) #define InitialPCState (S0/4-1) #define InitialArgState (S1/4-1) #define WhenDonePCState (S2/4-1) #define StartupPCState (S3/4-1) #endif // DECMIPS
NOTE
SP, S0, S1, ... are the offsets (in bytes) to the beginning of the thread object. So, the stack pointer (SP) must be the first.
InitialPC, InitialArg, WhenDonePC, StartupPC are defined as the callee saved registers s0, s1, s2, s3, which are used for the pointers to the function, argument, finish routine and startup procedure respectively, so ThreadRoot can jump to these functions.
PCState, FPState, ... are the indices to the array MachineState.
After the initialization, the structure of the thread is depicted as follows
|------------------| | Stacktop | pointer to top of stack |------------------| | InitialPCState | | - pointer to the fucntion |------------------| | InitialArgState | | - argument |------------------| | WhenDoneState | | - ThreadFinish |------------------| | StartUpPCState | | - ThreadBegin |------------------| | | |------------------| | | |------------------| | | |------------------| | | |------------------| | FPState | |------------------| | PCState | | - ThreadRoot |------------------| | stack | pointer to the bottom of stack |------------------| | status | |------------------| | name | |------------------|
Now, we take a look at ThreadRoot.
.text .align 2 .globl ThreadRoot .ent ThreadRoot,0 ThreadRoot: or fp,z,z # Clearing the frame pointer here # makes gdb backtraces of thread stacks # end here (I hope!) jal StartupPC # call startup procedure move a0, InitialArg jal InitialPC # call main procedure jal WhenDonePC # when we're done, call clean up procedure # NEVER REACHED .end ThreadRootNOTE
ThreadRoot basically does the following three things: 1) Calls ThreadBegin that enables interrupts, 2) Calls the user supplied function, passing it the supplied argument and 3) Calls ThreadFinish to terminate the thread (clean up).
At this point you should understand
Now we go back to Yield() called by SimpleThread() to study context switch.
void Thread::Yield() { Thread *nextThread; IntStatus oldLevel = kernel->interrupt->SetLevel(IntOff); ASSERT(this == kernel->currentThread); DEBUG(dbgThread, "Yielding thread: " << name); nextThread = kernel->scheduler->FindNextToRun(); if (nextThread != NULL) { kernel->scheduler->ReadyToRun(this); kernel->scheduler->Run(nextThread, FALSE); } (void) kernel->interrupt->SetLevel(oldLevel); }Note
Thread::Yield() basically suspends the calling thread and
selects a new one for execution. The first thing to notice is
that interrupts are turned off (Why?). Next, a call is made to
FindNextToRun (threads/scheduler.*) to find the next
thread to run. The current thread is put back on the Ready
queue by calling ReadyToRun (threads/scheduler.*).
Finally, Run()
is called to run the next thread.
How does currentThread relinquish the CPU and nextThread
picked up by the kernel scheduler get the CPU?
void Scheduler::Run(Thread *nextThread, bool finishing) { Thread *oldThread = kernel->currentThread; ASSERT(kernel->interrupt->getLevel() == IntOff); if (finishing) { // mark that we need to delete current thread ASSERT(toBeDestroyed == NULL); toBeDestroyed = oldThread; } #ifdef USER_PROGRAM // ignore until running user programs ... #endif oldThread->CheckOverflow(); // check if the old thread // had an undetected stack overflow kernel->currentThread = nextThread; // switch to the next thread nextThread->setStatus(RUNNING); // nextThread is now running ... // This is a machine-dependent assembly language routine defined // in switch.s. You may have to think // a bit to figure out what happens after this, both from the point // of view of the thread and from the perspective of the "outside world". SWITCH(oldThread, nextThread); // we're back, running oldThread ... }
NOTE
The first significant event happening here is that nextThread is set to the RUNNING state. Next a call is made to SWITCH() in switch.s. This function (written in assembly code) actually switches the CPU from the old Thread (oldThread) with the new one (nextThread).
# a0 -- pointer to old Thread # a1 -- pointer to new Thread .globl SWITCH .ent SWITCH,0 SWITCH: sw sp, SP(a0) # save new stack pointer sw s0, S0(a0) # save all the callee-save registers sw s1, S1(a0) sw s2, S2(a0) sw s3, S3(a0) sw s4, S4(a0) sw s5, S5(a0) sw s6, S6(a0) sw s7, S7(a0) sw fp, FP(a0) # save frame pointer sw ra, PC(a0) # save return address lw sp, SP(a1) # load the new stack pointer lw s0, S0(a1) # load the callee-save registers lw s1, S1(a1) lw s2, S2(a1) lw s3, S3(a1) lw s4, S4(a1) lw s5, S5(a1) lw s6, S6(a1) lw s7, S7(a1) lw fp, FP(a1) lw ra, PC(a1) # load the return address j ra .end SWITCHIn SWITCH, the first argument a0 points to the old thread (to be switched out), the second argument a1 points to the new thread (to be switched in). As it is mentioned in the code, you will probably have to do a little thinking to figure out exactly how SWITCH Works. In particular, keep the following questions in mind.
Why stackTop must be the first variable to make SWITCH work?
What address should we save for the program counter PC of the
switched out thread? In other words, where do we want it to
continue execution when the oldThread returns from SWITCH?
At what point can we say the Context Switch has taken place?
If the oldThread has finished, who kills it?
Finally, putting all things together,
Can you trace Thread::SelfTest(), the execution stacks, by hand?