Acta Metallurgica Sinica (English letters) ›› 2013, Vol. 20 ›› Issue (5): 97-103. doi: 10.1016/S1005-8885(13)60096-5

• Wireless

Offline traffic analysis system based on Hadoop

  

  1. Beijing Key Laboratory of Network System Architecture and Convergence, Beijing University of Posts and Telecommunications, Beijing 100876, China
  2. Produce Ads, Amazon Joyo Co. Ltd, Beijing 100025, China
  • Received:2013-03-05 Revised:2013-06-06 Online:2013-10-30 Published:2013-10-29
  • Contact: Yuan-Yuan QIAO E-mail:qyybupt@gmail.com
  • Supported by:
    This work was supported by the Important National Science & Technology Specific Projects (2012ZX03002008), the National Natural Science Foundation of China (61072061) and the Fundamental Research Funds for the Central Universities (2012RC0121).

Abstract: Offline network traffic analysis is essential for an in-depth understanding of network conditions and characteristics, such as user behavior and abnormal traffic. With the rapid growth of the amount of information on the Internet, traditional stand-alone analysis tools face great challenges in storage capacity and computing efficiency, which are precisely the strengths of a Hadoop cluster. In this paper, we design an offline traffic analysis system based on Hadoop (OTASH) and propose a MapReduce-based algorithm for TopN user statistics. In addition, we study the computing performance and failure tolerance of OTASH. The experiments show that OTASH is suitable for handling large amounts of flow data and can still complete its computations when a single node fails.
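
The abstract names a MapReduce-based TopN user statistic but does not describe the algorithm itself. As a rough illustration only, the sketch below simulates a common two-stage TopN pattern in plain Python: first sum bytes per user, then let each partition keep its local top N before a single final merge. It is not the authors' algorithm. The record layout (user, bytes), the function names, and the two-way partitioning are all assumptions made for the example.

```python
import heapq
from collections import defaultdict
from typing import Iterable, Iterator

# Hypothetical flow record: (user_id, byte_count). This layout is an
# assumption for illustration; the abstract does not give a record format.
FlowRecord = tuple[str, int]
Pair = tuple[str, int]


def map_phase(records: Iterable[FlowRecord]) -> Iterator[Pair]:
    """Map: emit one (user, bytes) pair per flow record."""
    for user, nbytes in records:
        yield user, nbytes


def shuffle(pairs: Iterable[Pair]) -> dict[str, list[int]]:
    """Shuffle: group values by key, as the framework does between phases."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_sum(grouped: dict[str, list[int]]) -> Iterator[Pair]:
    """Reduce (stage 1): total bytes per user."""
    for user, values in grouped.items():
        yield user, sum(values)


def local_top_n(totals: Iterable[Pair], n: int) -> list[Pair]:
    """Stage 2, per partition: keep only the N largest totals."""
    return heapq.nlargest(n, totals, key=lambda kv: kv[1])


def top_n_users(splits: Iterable[Iterable[FlowRecord]], n: int) -> list[Pair]:
    """Run both stages over the input splits and return the global top N."""
    # Stage 1: aggregate traffic per user across all input splits.
    pairs = (pair for split in splits for pair in map_phase(split))
    totals = list(reduce_sum(shuffle(pairs)))
    # Stage 2: partition the totals (two partitions here, an arbitrary choice),
    # take a local top N in each, then merge the candidates in one final step.
    partitions = [totals[i::2] for i in range(2)]
    candidates = [kv for part in partitions for kv in local_top_n(part, n)]
    return heapq.nlargest(n, candidates, key=lambda kv: kv[1])


if __name__ == "__main__":
    splits = [[("a", 10), ("b", 5)], [("a", 1), ("c", 20), ("b", 2)]]
    print(top_n_users(splits, 2))
```

Keeping a local top N per partition means the final merge sees at most N candidates from each partition, which is why this pattern is often used when the number of distinct users is large.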

Key words: MapReduce, Hadoop, cloud computing, traffic analysis