A Study of the Impact of Different Client Types on Web Page Download Traffic: Browsers, Operating Systems, Devices, and Vantage Points

Sean Sanders
Advisor: Jasleen Kaur
University of North Carolina at Chapel Hill


Goal

Modern web pages are diverse and complex, and users access them with a wide range of devices, operating systems, and browsers. In this work, we study how web pages, and the traffic generated by downloading them, differ across these client types. Understanding this variation can lead to better interpretation of aggregate traces and to more accurate development and testing of traffic trace analysis methods such as traffic classification and behavioral ad targeting.

Traffic Capture Architecture

Traditional aggregate trace measurement methodology is not an effective approach for analyzing modern web page traffic for the following reasons:

  • Complexity of web pages: It is difficult to group the many disparate objects, fetched from multiple servers, that form a web page into a single web page unit.
  • Diversity of client types: The modern user may use many different browsers, operating systems, or devices to view content.

To fully understand how web page traffic varies across client types, we generate data on the client side, where the process that generates the data is known (i.e., we know which web page was downloaded and which operating system, browser, and device were used). Thus, we are able to avoid many of the errors that can result from aggregate trace analysis.
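One way to keep this client-side knowledge attached to each capture is to store a small label alongside every trace. The following is a minimal sketch under stated assumptions, not the tooling used in this work: the class name `PageLoadLabel`, its fields, and the file naming scheme are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict
from urllib.parse import urlparse


@dataclass(frozen=True)
class PageLoadLabel:
    """Hypothetical ground-truth record for one page download."""
    url: str
    browser: str
    operating_system: str
    device: str
    vantage_point: str

    def trace_name(self) -> str:
        # Assumed naming scheme: one capture file per (client type, site) pair.
        host = urlparse(self.url).hostname or "unknown-host"
        parts = [self.vantage_point, self.device,
                 self.operating_system, self.browser, host]
        slug = "_".join(p.lower().replace(" ", "-") for p in parts)
        return f"{slug}.pcap"

    def to_json(self) -> str:
        # Serialize the label so it can be stored next to the capture file.
        return json.dumps(asdict(self), sort_keys=True)
```

Because the label is written when the page is loaded, later analysis does not have to infer the browser, operating system, or page boundaries from the packets themselves.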

Stratification Factors for Measurement

In this work, we intend to study the impact of the following factors on web page download traffic:

  • Browser
  • Operating System
  • Device Type (e.g., Mobile)
  • Vantage Point

We intend to use PlanetLab nodes to investigate the impact of vantage point on downloads of a predefined list of top web sites in the U.S. Web pages will be loaded using popular modern desktop browsers (i.e., Opera, Chrome, Safari, Internet Explorer, and Firefox) on the operating systems that are available on PlanetLab resources. The impact of mobile devices likely cannot be measured via PlanetLab and will instead be measured within a controlled environment. Complete traffic traces will be captured using tcpdump/WinDump.
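The measurement plan described above can be sketched as the cross product of vantage points, browsers, and sites, with one full-packet capture per page load. This is an illustrative sketch only: the function names, the example node and site names, and the interface name are assumptions, and only the browser list comes from the text. The tcpdump options used (`-i`, `-n`, `-s 0`, `-w`) are standard ones.

```python
from itertools import product

# Browsers named in the text; node and site lists are supplied by the caller.
BROWSERS = ["Opera", "Chrome", "Safari", "Internet Explorer", "Firefox"]


def measurement_plan(sites, nodes, browsers=BROWSERS):
    """Return one (node, browser, site) tuple per page load to perform."""
    return list(product(nodes, browsers, sites))


def tcpdump_command(interface, pcap_path):
    """Build a tcpdump argument list that writes complete packets to a file."""
    # -n: skip name resolution; -s 0: capture full-length packets;
    # -w: write raw packets to pcap_path instead of printing them.
    return ["tcpdump", "-i", interface, "-n", "-s", "0", "-w", pcap_path]


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    for node, browser, site in measurement_plan(["example.com"], ["node-a"]):
        print(node, browser, site, tcpdump_command("eth0", "trace.pcap"))
```

The argument list is built rather than executed here, since running tcpdump typically requires elevated privileges on the capturing host.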