Abstract—In recent years the amount of data stored worldwide has exploded, increasing by a factor of nine over the last five years. However, the quantity of data is often far too large to store and analyse in traditional relational database systems, the data are in unstructured forms unsuitable for rigid schemas, or the hardware required for conventional analysis is simply too costly. An accurate performance model for MapReduce is therefore increasingly important for analysing and optimizing MapReduce jobs. It is also a precondition for implementing cost-based scheduling strategies or for translating Hive-like queries into sets of low-cost MapReduce jobs. This paper presents a Hadoop job performance model that accurately estimates job completion time and further provisions the required amount of resources for a job to be completed within a deadline. The proposed model builds on historical job execution records and employs a Locally Weighted Linear Regression System (LWLRS) to estimate the execution time of a job. In addition, it uses the Lagrange Multipliers technique for resource provisioning to satisfy jobs with deadline requirements. The proposed model is initially evaluated on an in-house Hadoop cluster and subsequently evaluated in the Amazon EC2 Cloud.
Keywords: MapReduce; Locally Weighted Linear Regression System (LWLRS); Hadoop.
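As a rough illustration of the estimation step summarised above, the sketch below applies locally weighted linear regression to hypothetical historical job records to predict a new job's execution time. This is not the authors' implementation: the feature set (input size and allocated slots), the Gaussian kernel bandwidth tau, and all sample values are assumptions made for the example.

```python
import numpy as np

def lwlr_predict(x_query, X_train, y_train, tau=1.0):
    """Locally weighted linear regression: estimate the execution time of a
    job with features x_query from historical job records (X_train, y_train).
    tau controls how quickly a historical job's influence decays with its
    distance from the query point."""
    # Add an intercept column to the design matrix.
    Xb = np.hstack([np.ones((X_train.shape[0], 1)), X_train])
    xq = np.hstack([1.0, x_query])
    # Gaussian kernel weights: nearby historical jobs count more.
    d2 = np.sum((X_train - x_query) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * tau ** 2))
    W = np.diag(w)
    # Solve the weighted normal equations for the local linear model.
    theta = np.linalg.pinv(Xb.T @ W @ Xb) @ (Xb.T @ W @ y_train)
    return float(xq @ theta)

# Hypothetical records: [input size (GB), reduce slots] -> runtime (s)
X_hist = np.array([[10, 4], [20, 4], [40, 8], [80, 8]], dtype=float)
y_hist = np.array([120, 230, 310, 600], dtype=float)
print(lwlr_predict(np.array([30.0, 8.0]), X_hist, y_hist, tau=20.0))
```

The estimated runtime produced this way could then feed a deadline-driven provisioning step such as the Lagrange Multipliers technique mentioned in the abstract.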