Unix Process Management

Jongyeop Kim*

* Department of Math and Computer Science, Southern Arkansas University

100 E University, Magnolia, AR 71753, USA.


Fundamental Concepts

This research concerns the performance optimization of computations running on top of a distributed platform, Apache Hadoop in particular. However, the methods and algorithms proposed in this research are expected to apply to other platforms as well, without loss of generality. The diagram below [1] shows the computation flow on Hadoop, which primarily consists of one name node and multiple slave nodes.

A MapReduce process [2] is composed of two primary functions: 1) the Map function takes a set of input key/value pairs and produces a set of intermediate output key/value pairs; 2) the Reduce function takes an intermediate key (Key2) and the set of values associated with that key, as illustrated below.

Map(Key1, Value1) -> List(Key2, Value2)
Reduce(Key2, List(Value2)) -> List(Value2)
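To make the two signatures concrete, the following is a minimal sketch of how a framework strings them together: it applies the map function to every input pair, groups the intermediate pairs by key, and then applies the reduce function to each group. Plain Python is used here for brevity (an assumption for illustration; the original targets Hadoop's Java API), and the function names are hypothetical.

```python
from collections import defaultdict

def run_mapreduce(inputs, map_fn, reduce_fn):
    """Simulate the MapReduce flow: map -> group by key -> reduce.

    inputs:    iterable of (key1, value1) pairs
    map_fn:    (key1, value1) -> iterable of (key2, value2) pairs
    reduce_fn: (key2, [value2, ...]) -> reduced output for that key
    """
    # Map phase: each input pair yields intermediate (key2, value2) pairs,
    # which are grouped by key2 (the "shuffle" step).
    intermediate = defaultdict(list)
    for key1, value1 in inputs:
        for key2, value2 in map_fn(key1, value1):
            intermediate[key2].append(value2)
    # Reduce phase: each key2 and its collected values produce one output.
    return {key2: reduce_fn(key2, values)
            for key2, values in intermediate.items()}
```

On a real cluster the map and reduce phases run in parallel on different nodes and the grouping is done by the framework's shuffle; this single-process version only mirrors the data flow.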

Thread Management System Calls

A simple yet popular example of a MapReduce computation, the word-count function, is shown below [3]; it is used in this research as a base benchmark to illustrate the flow of a MapReduce computation. An input file is split into blocks that are processed at different nodes to reduce overall processing time.
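As a sketch of that benchmark (again in plain Python for brevity, an assumption; the benchmark itself runs on Hadoop's Java API), the map step emits a (word, 1) pair for every word in a line, and the reduce step sums the partial counts collected for each word:

```python
from collections import defaultdict

def word_count_map(_, line):
    """Map: emit (word, 1) for every word in one line of the input split."""
    for word in line.split():
        yield word, 1

def word_count_reduce(word, counts):
    """Reduce: sum all partial counts emitted for one word."""
    return sum(counts)

def word_count(lines):
    """Drive the job locally: map every line, group by word, reduce each group."""
    grouped = defaultdict(list)
    for offset, line in enumerate(lines):
        for word, one in word_count_map(offset, line):
            grouped[word].append(one)
    return {word: word_count_reduce(word, counts)
            for word, counts in grouped.items()}
```

For example, `word_count(["a b a", "b a"])` yields `{"a": 3, "b": 2}`. On Hadoop, each block of the split input file would be mapped on a different node, with the shuffle delivering all counts for a given word to one reducer.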

Implementation of Processes in Unix

Threads in Linux

Scheduling in Unix
