About: Web traffic is a research topic. Over the lifetime, 1059 publications have been published within this topic receiving 25858 citations. The topic is also known as: website traffic.
TL;DR: GreedyDual-Size as discussed by the authors incorporates locality with cost and size concerns in a simple and nonparameterized fashion for high performance, which can potentially improve the performance of main-memory caching of Web documents.
Abstract: Web caches can not only reduce network traffic and downloading latency, but can also affect the distribution of web traffic over the network through cost-aware caching. This paper introduces GreedyDual-Size, which incorporates locality with cost and size concerns in a simple and non-parameterized fashion for high performance. Trace-driven simulations show that with the appropriate cost definition, GreedyDual-Size outperforms existing web cache replacement algorithms in many aspects, including hit ratios, latency reduction and network cost reduction. In addition, GreedyDual-Size can potentially improve the performance of main-memory caching of Web documents.
TL;DR: The high volume and good stability properties of P2P traffic suggests that the P1P workload is a good candidate for being managed via application-specific layer-3 traffic engineering in an ISP's network.
Abstract: The use of peer-to-peer (P2P) applications is growing dramaticaliy, particularly for sharing large video/audio files and software. In this paper, we analyze P2P traffic by measuring flow-level information collected at multiple border routers across a large ISP network, and report our investigation of three popular P2P systems -- FastTrack, Gnutella, and DirectConnect. We characterize the P2P traffic observed at a single ISP and its impact on the underlying network. We observe very skewed distribution in the traffic across the network at different levels of spatial aggregation (IP, prefix, AS). All three P2P systems exhibit significant dynamics at short times scale and particularly at the IP address level Still, the fraction of P2P traffic contributed by each prefix is much more stable than the corresponding distribution of either Web traffic or overall traffic. The high volume and good stability properties of P2P traffic indicates that the P2P workload is a good candidate for being managed via application-specific layer-3 traffic engineering in an ISP's network.
TL;DR: The high volume and good stability properties of P2P traffic suggests that the P1P workload is a good candidate for being managed via application-specific layer-3 traffic engineering in an ISP's network.
Abstract: The use of peer-to-peer (P2P) applications is growing dramatically, particularly for sharing large video/audio files and software. In this paper, we analyze P2P traffic by measuring flow-level information collected at multiple border routers across a large ISP network, and report our investigation of three popular P2P systems--FastTrack, Gnutella, and Direct-Connect. We characterize the P2P trafffic observed at a single ISP and its impact on the underlying network. We observe very skewed distribution in the traffic across the network at different levels of spatial aggregation (IP, prefix, AS). All three P2P systems exhibit significant dynamics at short time scale and particularly at the IP address level. Still, the fraction of P2P traffic contributed by each prefix is more stable than the corresponding distribution of either Web traffic or overall traffic. The high volume and good stability properties of P2P traffic suggests that the P2P workload is a good candidate for being managed via application-specific layer-3 traffic engineering in an ISP's network.
TL;DR: This paper proposes a novel method for thwarting statistical traffic analysis algorithms by optimally morphing one class of traffic to look like another class, and shows how to optimally modify packets in real-time to reduce the accuracy of a variety of traffic classifiers while incurring much less overhead than padding.
Abstract: Recent work has shown that properties of network traffic that remain observable after encryption, namely packet sizes and timing, can reveal surprising information about the traffic’s contents (e.g., the language of a VoIP call [29], passwords in secure shell logins [20], or even web browsing habits [21, 14]). While there are some legitimate uses for encrypted traffic analysis, these techniques also raise important questions about the privacy of encrypted communications. A common tactic for mitigating such threats is to pad packets to uniform sizes or to send packets at fixed timing intervals; however, this approach is often inefficient. In this paper, we propose a novel method for thwarting statistical traffic analysis algorithms by optimally morphing one class of traffic to look like another class. Through the use of convex optimization techniques, we show how to optimally modify packets in real-time to reduce the accuracy of a variety of traffic classifiers while incurring much less overhead than padding. Our evaluation of this technique against two published traffic classifiers for VoIP [29] and web traffic [14] shows that morphing works well on a wide range of network data—in some cases, simultaneously providing better privacy and lower overhead than naive
TL;DR: A C-LSTM neural network for effectively modeling the spatial and temporal information contained in traffic data, which is a one-dimensional time series signal, and outperforms other state-of-the-art machine learning techniques on Yahoo's well-known Webscope S5 dataset.
Abstract: Web traffic refers to the amount of data that is sent and received by people visiting online websites. Web traffic anomalies represent abnormal changes in time series traffic, and it is important to perform detection quickly and accurately for the efficient operation of complex computer networks systems. In this paper, we propose a C-LSTM neural network for effectively modeling the spatial and temporal information contained in traffic data, which is a one-dimensional time series signal. We also provide a method for automatically extracting robust features of spatial-temporal information from raw data. Experiments demonstrate that our C-LSTM method can extract more complex features by combining a convolutional neural network (CNN), long short-term memory (LSTM), and deep neural network (DNN). The CNN layer is used to reduce the frequency variation in spatial information; the LSTM layer is suitable for modeling time information; and the DNN layer is used to map data into a more separable space. Our C-LSTM method also achieves nearly perfect anomaly detection performance for web traffic data, even for very similar signals that were previously considered to be very difficult to classify. Finally, the C-LSTM method outperforms other state-of-the-art machine learning techniques on Yahoo's well-known Webscope S5 dataset, achieving an overall accuracy of 98.6% and recall of 89.7% on the test dataset.