   These are my personal notes from reading OPERATING SYSTEM CONCEPTS (6th Edition). My progress was very slow at first, until one day I came across the lecture slides (based on the 7th edition) that Professor 冼鏡光 shared online, and studying OS became more and more interesting.
    The notes are organized according to the chapters of the 6th edition, while the figures are all pasted from Professor 冼鏡光's slides. I am grateful to the professor for sharing them so generously and giving me a chance to glimpse the wonders of operating systems.
   
(Please notify me if there are any copyright concerns.)


4.     Processes

4.1    Process Concept

  • A process is the unit of work in a modern time-sharing system.
  • A batch system executes jobs, whereas a time-shared system has user programs, or tasks.
  • It would be misleading to avoid the use of commonly accepted terms that include the word job (such as job scheduling) simply because process has superseded job.
        4.1.1  The Process
                     
    • A process includes the program code (text section), the program counter, the processor's registers, a process stack, and a data section.
    • When the OS runs a program (i.e., a binary executable), this program is loaded into memory and the control is transferred to this program’s first instruction. Then, the program starts to run.
    • A process is a program in execution.
    • A process is more than a program, because the former has a program counter, stack, data section and so on (i.e., the run-time state).
    • Moreover, multiple processes may be associated with one program (e.g., running the same program, such as a web browser, twice).
        4.1.2  Process State
                     
    • At any moment, a process can be in one of the five states: new, running, waiting, ready and terminated.
      • New: The process is being created
      • Running: The process is being executed
      • Waiting: The process is waiting for some event to occur (e.g., waiting for I/O completion)
      • Ready: The process is waiting to be assigned to a processor
      • Terminated: The process has finished execution
        4.1.3  Process Control Block (PCB)
                     
    • Each process is represented in the operating system by a process control block (PCB) – also called a task control block.
      • Process state: The state may be new, ready, running, waiting, halted, and so on.
      • Program counter: The counter indicates the address of the next instruction to be executed for this process.
      • CPU registers: They include accumulators, index registers, stack pointers, and general-purpose registers, plus any condition-code information. Along with the program counter, this state information must be saved when an interrupt occurs, to allow the process to be continued correctly afterward.
      • CPU-scheduling information: This information includes a process priority, pointers to scheduling queues, and any other scheduling parameters.
      • Memory-management information: This information may include such information as the value of the base and limit registers, the page tables, or the segment tables, depending on the memory system used by the OS.
      • Accounting information: This information includes the amount of CPU and real time used, time limits, account numbers, job or process numbers, and so on.
      • I/O status information: The information includes the list of I/O devices allocated to this process, a list of open files, and so on.
    • Each process has a number, the process ID.
    • These PCBs are chained into a number of lists. For example, all processes in the ready state are linked into the ready queue (a minimal PCB sketch in C follows below).
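    • As a rough illustration of how a PCB might look and how PCBs are chained into the ready queue, here is a minimal C sketch; the structure layout and the names (struct pcb, ready_head, ready_tail, enqueue_ready) are hypothetical choices made for this note, not taken from any real kernel.
        /*****************************************************************************************/
        #include <stdio.h>

        enum proc_state { NEW, READY, RUNNING, WAITING, TERMINATED };

        struct pcb {
                int             pid;              /* process ID */
                enum proc_state state;            /* process state */
                unsigned long   program_counter;  /* address of the next instruction */
                unsigned long   registers[16];    /* saved CPU registers */
                /* memory-management, accounting, and I/O status information would follow */
                struct pcb     *next;             /* link to the next PCB in the same queue */
        };

        /* Ready-queue header: pointers to the first and final PCBs in the list. */
        static struct pcb *ready_head = NULL, *ready_tail = NULL;

        /* Append a PCB to the tail of the ready queue. */
        void enqueue_ready(struct pcb *p)
        {
                p->state = READY;
                p->next = NULL;
                if (ready_tail == NULL)
                        ready_head = p;
                else
                        ready_tail->next = p;
                ready_tail = p;
        }

        int main(void)
        {
                struct pcb p1 = { .pid = 1 }, p2 = { .pid = 2 };
                enqueue_ready(&p1);
                enqueue_ready(&p2);
                for (struct pcb *q = ready_head; q != NULL; q = q->next)
                        printf("pid %d is in the ready queue\n", q->pid);
                return 0;
        }
        /*****************************************************************************************/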
        4.1.4  Threads
    • Traditionally, a process is a program that performs a single thread of execution; many modern operating systems extend the process concept to allow a process to have multiple threads of execution.
4.2    Process Scheduling
  • Since the number of processes is usually larger than the number of CPUs, the OS must switch the CPU among processes to keep CPU utilization as high as possible.
  • To determine which process can do what, processes are chained into a number of scheduling queues.
  • For example, in addition to the ready queue, each event may have its own scheduling queue (i.e., waiting queue).
        4.2.1  Scheduling Queues
                     
                     
    • The job queue consists of all processes in the system.
    • The processes that are residing in main memory and are ready and waiting to execute are kept on a list called the ready queue. This queue is generally stored as a linked list. A ready-queue header contains pointers to the first and final PCBs in the list.
    • The list of processes waiting for a particular I/O device is called a device queue. Each device has its own device queue.
    • A common representation of process scheduling is a queuing diagram.
        4.2.2  Schedulers
    • A process migrates among the various scheduling queues throughout its lifetime; the OS must select processes from these queues, and this selection is carried out by schedulers.
    • Most processes can be described as either I/O-bound or CPU-bound.
    • There are three types of schedulers
      • Long-Term (Job) Scheduler: selects jobs from the pool (e.g., a disk) and loads them into the system (memory) for execution (the new state). Executes less frequently. The long-term scheduler controls the degree of multiprogramming – the number of processes in memory. The long-term scheduler should select a good process mix of I/O-bound and CPU-bound processes.
      • Short-Term (CPU) scheduler: selects from among the processes (in the ready queue), and allocates the CPU to one of them. Executes very frequently.
      • Medium-Term Scheduler: does swapping to balance system load, removes processes from memory (and from active contention for the CPU), and thus reduces the degree of multiprogramming.
        • Swapping can improve the process mix.
        • Swapping may be needed because a change in memory requirements has overcommitted available memory.
                                                
        4.2.3  Context Switch
    • What is a process context? The context of a process is represented in the PCB of a process; it includes the values of CPU registers, the process state, the program counter, and other memory/file management information.
    • What is a context switch? After the CPU scheduler selects a process (from the ready queue) and before allocating the CPU to it, the CPU scheduler must
      • save the context of the currently running process in its PCB,
      • put that process into the appropriate queue,
      • load the context of the selected process from its PCB, and
      • let it run (a user-level sketch follows at the end of this subsection).
                                    
    • Context switching has become such a performance bottleneck that programmers are using new structures (threads) to avoid it whenever possible.
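    • As a user-level analogy to the steps above, the sketch below saves and restores execution contexts with the POSIX <ucontext.h> API (getcontext, makecontext, swapcontext). It only illustrates the save/restore idea; it is not the kernel's actual context-switch code, and the names main_ctx, work_ctx, and worker are arbitrary choices for this note.
        /*****************************************************************************************/
        #include <stdio.h>
        #include <ucontext.h>

        static ucontext_t main_ctx, work_ctx;   /* saved contexts (registers, stack pointer, ...) */
        static char work_stack[16384];          /* private stack for the worker context */

        static void worker(void)
        {
                printf("worker: running with its own saved context\n");
                swapcontext(&work_ctx, &main_ctx);   /* save worker's context, load main's context */
        }

        int main(void)
        {
                getcontext(&work_ctx);                       /* initialize the worker context */
                work_ctx.uc_stack.ss_sp = work_stack;
                work_ctx.uc_stack.ss_size = sizeof(work_stack);
                work_ctx.uc_link = &main_ctx;                /* where to continue if worker returns */
                makecontext(&work_ctx, worker, 0);

                printf("main: switching to worker\n");
                swapcontext(&main_ctx, &work_ctx);           /* save main's context, load worker's */
                printf("main: back after the context switch\n");
                return 0;
        }
        /*****************************************************************************************/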
4.3    Operations on Processes
  • There are three commonly seen operations:
    • Process Creation: Create a new process. The newly created process is the child of the original. Use the fork() or vfork() system call in UNIX to create new processes.
    • Process Termination: Terminate the execution of a process. Under UNIX, use exit().
    • Process Join: Wait for the completion of a child process. Under UNIX, use wait().
  • fork(), vfork(), exit(), and wait() are system calls.
  • Use “ps -aux” to see all running processes.
        4.3.1  Process Creation
    • A new process is created via a create-process system call.
    • The creating process is called a parent process, whereas the new processes are called the children of that process.
    • A child process may obtain its resources directly from the OS, or it may be constrained to a subset of the resources of its parent process.
    • Two possibilities exist in terms of execution:
      • The parent continues to execute concurrently with its children.
      • The parent waits until some or all of its children have terminated.
    • Two possibilities in terms of the address space of the new process:
      • The child process is a duplicate of the parent process.
      • The child process has a program loaded into it.
    • In UNIX, each process is identified by its process identifier, which is a unique integer. The forked process consists of a copy of the address space of the original process. This mechanism allows the parent process to communicate easily with its child process. Both parent and child processes continue execution at the instruction after the fork system call, with one difference: The return code for the fork system call is zero for the new (child) process, whereas the (nonzero) process identifier of the child is returned to the parent.
    • The execlp system call is used after a fork system call by one of the two processes to replace the process' memory space with a new program. The execlp system call loads a binary file into memory - destroying the memory image of the program containing the execlp system call - and starts its execution. In this manner, the two processes are able to communicate, and then to go their separate ways.
    • The wait system call moves the calling process off the ready queue until the child it is waiting for has terminated (see the example below).
                        /*****************************************************************************************/
                        #include <stdio.h>
                        #include <stdlib.h>
                        #include <unistd.h>
                        #include <sys/types.h>
                        #include <sys/wait.h>

                        int main(int argc, char *argv[])
                        {
                                pid_t pid;

                                /* fork another process */
                                pid = fork();

                                if (pid < 0) {          /* error occurred */
                                        fprintf(stderr, "Fork Failed");
                                        exit(-1);
                                }
                                else if (pid == 0) {    /* child process */
                                        execlp("/bin/ls", "ls", NULL);
                                }
                                else {                  /* parent process */
                                        /* parent will wait for the child to complete */
                                        wait(NULL);
                                        printf("Child Complete");
                                        exit(0);
                                }
                        }
                        /*****************************************************************************************/

        4.3.2  Process Termination
    •  A process terminates when it finishes executing its final statement and asks the OS to delete it by using the exit system call. At that point, the process may return data (output) to its parent process (via the wait system call). All the resources of the process – including physical and virtual memory, open files, and I/O buffers – are deallocated by the OS.
    • The possible reasons for a parent process to terminate a child process (for example, via the abort system call; a minimal UNIX sketch using kill() appears after this list):
      • The child has exceeded its usage of some of the resources that it has been allocated.
      • The task assigned to the child is no longer required.
      • Cascading termination: if a process terminates (either normally or abnormally), then all its children must also be terminated.
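    • On UNIX systems, one concrete way for a parent to force a child to terminate is to send it a signal with kill(). The following minimal sketch is only an illustration; the choice of SIGKILL and the two-second sleep are arbitrary.
        /*****************************************************************************************/
        #include <signal.h>
        #include <stdio.h>
        #include <sys/types.h>
        #include <sys/wait.h>
        #include <unistd.h>

        int main(void)
        {
                pid_t pid = fork();

                if (pid < 0) {                  /* fork failed */
                        perror("fork");
                        return 1;
                }
                if (pid == 0) {                 /* child: wait for signals until terminated */
                        while (1)
                                pause();
                }

                sleep(2);                       /* parent: let the child run for a while */
                kill(pid, SIGKILL);             /* forcibly terminate the child */
                wait(NULL);                     /* collect the child's exit status */
                printf("child terminated by parent\n");
                return 0;
        }
        /*****************************************************************************************/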
4.4     Cooperating Processes
  • A process is independent if it cannot affect or be affected by the other processes executing in the system.
  • A process is cooperating if it can affect or be affected by the other processes executing in the system.
  • Therefore, any process that shares resources with other processes is a cooperating process.
  • Information sharing: Multiple processes can use the same set of information (e.g., files).
  • Computation Speedup: A task may be broken into subtasks, each of which executes in parallel with the others; for example, one process may be split into multiple processes so that they can run on multiple processors.
  • Modularity: A system can be divided into separate processes. Use the ps command to see how many of the running processes are system processes.
  • Convenience: A user may have multiple tasks to work at the same time (e.g., editing, browsing, and printing).
  • Concurrent execution of cooperating processes requires mechanisms that allow processes to communicate with one another and to synchronize their actions.
  • However, handling cooperating processes is difficult.
        /*****************************************************************************************/
        #define BUFFER_SIZE 10

        typedef struct {
                ...
        } item;

        item buffer[BUFFER_SIZE];    /* shared buffer, circular array with two logical pointers */
        int in = 0;     /* next free position */
        int out = 0;   /* first full position */
        /*****************************************************************************************/
        /* producer process */
        item nextProduced;

        while (1) {
                /* produce an item in nextProduced */
                while (((in + 1) % BUFFER_SIZE) == out) /* buffer full, at most BUFFER_SIZE - 1 items */
                        ;          /* do nothing */
                buffer[in] = nextProduced;
                in = (in + 1) % BUFFER_SIZE;
        }
        /*****************************************************************************************/
        /* consumer process */
        item nextConsumed;

        while (1) {
                while (in == out)       /* buffer empty */
                        ;          /* do nothing */
                nextConsumed = buffer[out];
                out = (out + 1) % BUFFER_SIZE;
                /* consume the item in nextConsumed */
        }
        /*****************************************************************************************/
4.5    Interprocess Communication (IPC)
  • Cooperating processes must communicate to get the job done.
  • There are two types of communications:
    • Processes that share the same memory: locks, semaphores, monitors, and others.
    • Processes that use an IPC facility or run in a distributed environment: message passing, sockets, remote procedure calls (RPC).
  • IPC provides a mechanism to allow processes to communicate and to synchronize their actions without sharing the same address space.
        4.5.1  Message-Passing System
    • The function of a message system is to allow processes to communicate with one another without the need to resort to shared data.
        4.5.2  Naming (How to refer to each other)
                    4.5.2.1         Direct Communication
      • Symmetric in addressing: Each process that wants to communicate must explicitly name the other party.
        • Send(receiver, message)
        • Receive(sender, message)
        • A link is established between exactly two processes.
        • Exactly one link exists between each pair of processes.
        • We may establish these links for those processes that want to communicate before they run.
      • Asymmetric in addressing
        • Send(receiver, message)
        • Receive(id, message)
        • The receive() primitive returns the ID of the sender in its id parameter. Thus, in this scheme, a receiver can receive messages from any process.
      • There are disadvantages in both the symmetric and asymmetric schemes:
        • Changing the name/ID of a process may require examining all other process definitions.
        • Processes must know the IDs of the other parties to start a communication.
                    4.5.2.2         Indirect Communication
      • With indirect communication, the messages are sent to and received from mailboxes, or ports.
      • Each mailbox has a unique ID.
      • The primitives are
        • Send (mailbox-name, message)
        • Receive (mailbox-name, message)
      • Thus, messages are sent to and received from mailboxes (a small sketch using POSIX message queues follows this list).
      • There is a link between two processes only if they share a mailbox.
      • A link may be shared by multiple processes.
      • Multiple links may exist between each pair of processes, with each link corresponding to a mailbox.
      • What if there is only one message in a mailbox and two processes execute Receive ()? It depends on the following:
        • Only one link between at most two processes
        • Allow at most one process to receive at a time
        • Allow the system to select an arbitrary order
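      • As a concrete, hedged illustration of indirect communication, the sketch below uses POSIX message queues, where the queue name ("/demo_mailbox", an arbitrary name chosen here) plays the role of the mailbox ID. A single process both sends and receives just to keep the example self-contained; normally two processes would share the mailbox. On Linux it may need to be linked with -lrt.
        /*****************************************************************************************/
        #include <fcntl.h>
        #include <mqueue.h>
        #include <stdio.h>

        int main(void)
        {
                struct mq_attr attr = { .mq_maxmsg = 10, .mq_msgsize = 64 };

                /* create/open the mailbox; its name serves as the unique mailbox ID */
                mqd_t mbox = mq_open("/demo_mailbox", O_CREAT | O_RDWR, 0644, &attr);
                if (mbox == (mqd_t) -1) {
                        perror("mq_open");
                        return 1;
                }

                mq_send(mbox, "hello", 6, 0);               /* Send(mailbox-name, message) */

                char buf[64];                               /* must be at least mq_msgsize bytes */
                mq_receive(mbox, buf, sizeof(buf), NULL);   /* Receive(mailbox-name, message) */
                printf("received: %s\n", buf);

                mq_close(mbox);
                mq_unlink("/demo_mailbox");
                return 0;
        }
        /*****************************************************************************************/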
        4.5.3  Synchronization (Shall we wait when participating in a message activity)
      • Message passing may be blocking or nonblocking – also known as synchronous and asynchronous.
        • If the sender and receiver must synchronize their activities, use synchronous communication.
        • Because of the uncertainty in the order of events, asynchronous communication is more difficult to program.
        • On the other hand, asynchronous algorithms are general and portable, because they are guaranteed to run correctly in networks with arbitrary timing behavior.
      • Blocking send: The sending process is blocked until the message is received by the receiving process or by the mailbox.
      • Nonblocking send: The sending process sends the message and resumes operation.
      • Blocking receive: The receiver blocks until a message is available.
      • Nonblocking receive: The receiver retrieves either a valid message or a null.
      • Rendezvous: both the send and receive are blocking.
        4.5.4  Buffering (Can messages wait in a communication link)
      • The capacity of a communication link is its buffer size.
      • Zero capacity (no buffering): The queue has maximum length 0; the link cannot have any messages waiting in it, so it is synchronous.
      • Bounded capacity (automatic buffering): Buffered Message Passing. Sender blocks if the buffer is full, and the link is asynchronous.
      • Unbounded capacity (automatic buffering): Messages can wait in the link. Sender never blocks and the link is asynchronous. The order of messages being received does not have to be FIFO.
        4.5.5  An Example: Mach
      • When a task is created, two special mailboxes are also created for it: the Kernel mailbox and the Notify mailbox.
      • Messages are sent and received with the msg_send, msg_receive, and msg_rpc system calls.
        4.5.6  An Example: Windows 2000
      • The message-passing facility in Windows 2000 is called the local procedure-call (LPC) facility.
4.6    Communication in Client – Server Systems
        4.6.1  Sockets
    • A socket is identified by an IP address concatenated with a port number.
    • The packets traveling between the hosts are delivered to the appropriate process, based on the destination port number (a minimal client-side socket sketch follows).
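    • A minimal sketch of the client side of a socket connection, assuming a server is already listening on port 5000 of the local host (the address 127.0.0.1 and port 5000 are arbitrary choices for illustration):
        /*****************************************************************************************/
        #include <arpa/inet.h>
        #include <netinet/in.h>
        #include <stdio.h>
        #include <string.h>
        #include <sys/socket.h>
        #include <unistd.h>

        int main(void)
        {
                int fd = socket(AF_INET, SOCK_STREAM, 0);        /* create a TCP socket */

                struct sockaddr_in srv;
                memset(&srv, 0, sizeof(srv));
                srv.sin_family = AF_INET;
                srv.sin_port = htons(5000);                      /* destination port number */
                inet_pton(AF_INET, "127.0.0.1", &srv.sin_addr);  /* destination IP address */

                /* the pair (IP address, port) identifies the server's socket */
                if (fd >= 0 && connect(fd, (struct sockaddr *) &srv, sizeof(srv)) == 0)
                        write(fd, "hello\n", 6);

                close(fd);
                return 0;
        }
        /*****************************************************************************************/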
        4.6.2  Remote Procedure Call
    • The RPC was designed as a way to abstract the procedure-call mechanism for use between systems with network connections.
    • The semantics of RPCs allow a client to invoke a procedure on a remote host as it would invoke a procedure locally.
    • Many RPC systems define a machine-independent representation of data which is called external data representation (XDR).
        4.6.3  Remote Method Invocation (RMI)
    • The remote method invocation is a Java feature similar to RPCs.
    • RPCs support procedural programming whereby only remote procedures or functions may be called. RMI is object-based: It supports invocation of methods on remote objects.
    • The parameters to remote procedures are ordinary data structures in RPC; with RMI it is possible to pass objects as parameters to remote objects.

