The Journal of China Universities of Posts and Telecommunications (English Edition) ›› 2017, Vol. 24 ›› Issue (6): 1-13. doi: 10.1016/S1005-8885(17)60237-1

• Networks •

Measuring web page complexity by analyzing TCP flows and HTTP headers

Cheng Weiqing, Hu Yangyang, Yin Qiaofeng, Chen Jiajia

  1. Nanjing University of Posts and Telecommunications
  • Received: 2017-03-13 Revised: 2017-06-21 Online: 2017-12-30 Published: 2017-12-01
  • Contact: Cheng Weiqing (成卫青) E-mail: chengweiq@njupt.edu.cn
  • Supported by:
    the Open Research Program of the Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education (K93-9-2014-04B), and the National Natural Science Foundation of China (61170322, 61572263, 61302157).

Abstract: To understand website complexity in depth, a web page complexity measurement system is developed. The system measures the complexity of a web page at two levels, the transport level and the content level, using a packet-trace-based approach rather than server or client logs. Packet traces contain more information than either kind of log. Quantitative analyses show that different categories of web pages have different complexity characteristics. Experimental results show that a news web page usually loads many more elements, at more access levels, from many more web servers within diverse administrative domains, over many more concurrent transmission control protocol (TCP) flows. More than half of the education pages each involve only a few logical servers, and most elements of such a page are fetched from only one or two logical servers. Web game traffic after login usually has the fewest content types. The system can help web page designers design more efficient web pages, and help researchers and Internet users understand the communication details.

Key words: hypertext transfer protocol, concurrent TCP flows, world wide web, web page complexity
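
As a rough illustration of the two measurement levels named in the abstract, the sketch below computes a few transport-level and content-level counts for one page load. It is not the authors' system: the `Flow` and `HttpObject` record layouts, the function names, and the choice of metrics are all assumptions made for this example, and the records are assumed to have already been extracted from a packet trace.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    """One TCP flow seen in a packet trace (hypothetical record layout)."""
    server: str   # server host name or IP address
    start: float  # time of the first packet, in seconds
    end: float    # time of the last packet, in seconds


class HttpObject(NamedTuple):
    """One fetched page element (hypothetical record layout)."""
    host: str          # value of the request's Host header
    content_type: str  # value of the response's Content-Type header


def max_concurrent_flows(flows):
    """Peak number of flows open at the same instant (sweep over open/close events)."""
    events = []
    for f in flows:
        events.append((f.start, 1))   # flow opens
        events.append((f.end, -1))    # flow closes
    # Sort by time; at equal times, closes (-1) are processed before opens (+1).
    events.sort()
    open_now = peak = 0
    for _, delta in events:
        open_now += delta
        peak = max(peak, open_now)
    return peak


def page_metrics(flows, objects):
    """Assumed example metrics: transport-level from flows, content-level from headers."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_types = {o.content_type.split(";")[0].strip().lower() for o in objects}
    return {
        "tcp_flows": len(flows),
        "max_concurrent_flows": max_concurrent_flows(flows),
        "servers": len({f.server for f in flows}),
        "elements": len(objects),
        "content_types": len(media_types),
    }


if __name__ == "__main__":
    flows = [Flow("a", 0, 5), Flow("b", 1, 3), Flow("a", 2, 4), Flow("c", 5, 6)]
    objects = [
        HttpObject("a", "text/html; charset=utf-8"),
        HttpObject("b", "image/png"),
        HttpObject("b", "IMAGE/PNG"),
    ]
    print(page_metrics(flows, objects))
```

Counting concurrency with sorted open/close events keeps the computation to a single pass after sorting, which is one simple option among several; a real trace analysis would also need flow reassembly and header parsing, which this sketch leaves out.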
