Short Paper Session 1

Videos, Memory, and Data Center

5:50 PM — 7:15 PM JST
Jun 25 Fri, 4:50 AM — 6:15 AM EDT

ReCLive: Real-Time Classification and QoE Inference of Live Video Streaming Services

Sharat Chandra Madanapalli (University of New South Wales, Australia); Alex Mathai (BITS Pilani, India); Hassan Habibi Gharakheili (University of New South Wales, Sydney, Australia); Vijay Sivaraman (University of New South Wales, Australia)

Social media, professional sports, and video games are driving rapid growth in live video streaming, on platforms such as Twitch and YouTube Live. Live streaming experience is very susceptible to short-time-scale network congestion since client playback buffers are often no more than a few seconds. Unfortunately, identifying such streams and measuring their QoE for network management is challenging, since content providers largely use the same delivery infrastructure for live and video-on-demand (VoD) streaming, and packet inspection techniques (including SNI/DNS query monitoring) cannot always distinguish between the two.
In this paper, we design, build, and deploy ReCLive: a machine learning method for live video detection and QoE measurement based on network-level behavioral characteristics. Our contributions are four-fold: (1) We analyze about 23,000 video streams from Twitch and YouTube, and identify key features in their traffic profile that differentiate live and on-demand streaming; (2) We develop an LSTM-based binary classifier model that distinguishes live from on-demand streams in real-time with over 95% accuracy; (3) We develop a method that estimates QoE metrics of live streaming flows in terms of resolution and buffer stall events with overall accuracies of 93% and 90%, respectively; and (4) We prototype our solution, train it in the lab, and deploy it in a live ISP network serving more than 7,000 subscribers. Measurements from the field show that 99.8% of Twitch videos are streamed live, while this measure is only 2.3% for YouTube. Further, during peak hours as many as 15% of live video streams are played at low-definition resolution and about 7% of them experience a buffer stall.

Multivariate Time Series Forecasting exploiting Tensor Projection Embedding and Gated Memory Network

Zhenxiong Yan and Kun Xie (Hunan University, China); Xin Wang (Stony Brook University, USA); Dafang Zhang (Hunan University, China); Gaogang Xie (CNIC Chinese Academy of Sciences & University of Chinese Academy of Sciences, China); Kenli Li (Hunan University, China); Jigang Wen (Chinese Academy of Science & Institute of Computing Technology, China)

Time series forecasting plays a critical role in many applications. However, accurate forecasting is a challenging task because it requires learning complex temporal and spatial patterns and combating noise during feature learning. To address these challenges, we propose TEGMNet, a Tensor projection Embedding and Gated Memory Network for multivariate time series forecasting. To extract local features more accurately and reduce the influence of noise, we propose to amplify the data using several data transformation techniques based on MDT (multi-way delay embedding transform) and TFNN (tensor factorized neural network), which transform the original 2D matrix data into low-dimensional 3D tensor data. The local features are then extracted through convolution and LSTM on the 3D tensor. We also design a long-term feature extraction module based on the structure of a gated memory network, which can largely enhance long-term pattern learning when the multivariate time series has complex long-term dependencies with dynamic-period patterns. We conduct extensive experiments comparing TEGMNet with 7 baseline algorithms on 4 real data sets. The experimental results demonstrate that TEGMNet achieves very good prediction performance even when the data are polluted with noise.
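As a rough illustration of how a delay embedding can turn a 2D time series matrix into a 3D tensor, the sketch below stacks sliding windows along the time axis. It is a minimal example under stated assumptions, not the TEGMNet implementation: the function name `delay_embed` and the `window` parameter are hypothetical, only the time axis is embedded, and the TFNN projection step is not shown.

```python
def delay_embed(series, window):
    """Stack sliding windows of a T x N multivariate series.

    series: list of T rows, each a list of N variable values.
    Returns a list of (T - window + 1) slices, each of shape window x N,
    i.e. a 3D tensor built from the 2D input.
    (Hypothetical helper; embeds along the time axis only.)
    """
    steps = len(series)
    if not 1 <= window <= steps:
        raise ValueError("window must be between 1 and the series length")
    return [
        [list(row) for row in series[start:start + window]]
        for start in range(steps - window + 1)
    ]


# Toy input: 5 time steps, 2 variables.
series = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]]
tensor = delay_embed(series, window=3)

# Shape of the resulting tensor: slices x window x variables.
print(len(tensor), len(tensor[0]), len(tensor[0][0]))  # 3 3 2
```

Consecutive slices overlap by `window - 1` rows, so each observation appears in several slices; this redundancy is what allows later layers to average out noise.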

Smartbuf: An Agile Memory Management for Shared-Memory Switches in Datacenters

Hamed Rezaei, Hamidreza Almasi and Balajee Vamanan (University of Illinois at Chicago, USA)

Important datacenter applications generate extremely bursty traffic patterns and demand low latency tails as well as high throughput. Datacenter networks employ shallow-buffered, shared-memory switches to cut cost and to cope with ever-increasing link speeds. End-to-end congestion control cannot react in time to handle the bursty, short flows that dominate datacenter traffic, and these flows incur buffer overflows, which cause long latency tails and degrade throughput. Therefore, there is a need for agile, switch-local mechanisms that quickly sense congestion and dynamically provision enough buffer space to avoid costly buffer overflows. We propose Smartbuf, an online learning algorithm that accurately predicts the buffer requirement of each switch port before the onset of congestion. Our key novelty lies in fingerprinting bursts based on the gradient of queue length and using this information to provision just enough buffer space. Our preliminary evaluations show that our algorithm can predict buffer demands accurately, within an average error margin of 6%, and improve the 99th percentile latency by a factor of 8 at high loads, while providing good fairness among ports.
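To make the idea of using the queue-length gradient concrete, the sketch below estimates a port's short-term buffer demand by linear extrapolation. This is an illustrative assumption only: the abstract does not describe Smartbuf's learning algorithm, and the function names, the extrapolation rule, and the parameters `interval_us`, `horizon_us`, and `max_buffer` are all hypothetical.

```python
def queue_gradient(samples, interval_us):
    """Average rate of change of queue length (bytes per microsecond)
    over evenly spaced queue-length samples."""
    if len(samples) < 2:
        return 0.0
    return (samples[-1] - samples[0]) / ((len(samples) - 1) * interval_us)


def predict_buffer_demand(samples, interval_us, horizon_us, max_buffer):
    """Extrapolate the latest queue length over a short horizon.

    Hypothetical rule, not the Smartbuf algorithm: a growing queue is
    projected forward linearly; a draining queue keeps its current size.
    The result is clamped to [0, max_buffer].
    """
    growth = max(queue_gradient(samples, interval_us), 0.0)
    projected = samples[-1] + growth * horizon_us
    return min(float(max_buffer), max(0.0, projected))


# A port whose queue is building up quickly (bytes, sampled every 1 us).
bursting = [1000, 3000, 6000, 10000]
print(predict_buffer_demand(bursting, 1, 5, 100000))  # 25000.0

# A port whose queue is draining needs no extra headroom.
draining = [8000, 6000, 4000]
print(predict_buffer_demand(draining, 1, 5, 100000))  # 4000.0
```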

Workload Migration across Distributed Data Centers under Electrical Load Shedding

Linfeng Shen (Simon Fraser University, Canada); Fangxin Wang (The Chinese University of Hong Kong, Shenzhen, China); Feng Wang (University of Mississippi, USA); Jiangchuan Liu (Simon Fraser University, Canada)

Data centers are essential components of the current digital world, and both their number and scale have increased substantially in recent years. Distributed data centers stand out as a promising solution for modern applications that need massive computation resources and have strict response-time requirements. However, compared to centralized data centers, distributed data centers are more fragile when the power supply is unstable. Power constraints or outages caused by load shedding or other reasons significantly affect the service performance of data centers and damage the quality of service (QoS) for customers. Moreover, unlike conventional data centers, distributed data centers are often unattended, so we need a system that can automatically calculate the best workload schedule in such situations. In this paper, we closely investigate the influence of electrical load shedding on distributed data centers and construct a physical model to estimate the relationship among power, heat, and workload. We then use queueing theory to approximate the tasks' response time and propose an efficient solution with theoretically bounded performance that minimizes the overall response time of tasks through low-cost migration. Our extensive evaluations show that our method reduces response time by more than 9%.
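The effect of migration on response time can be illustrated with a simple queueing model. The sketch below assumes each site behaves as an M/M/1 queue, which the abstract does not state; the function names and all rates are hypothetical, and migration cost and the power and heat model are ignored.

```python
def mm1_response_time(arrival_rate, service_rate):
    """Mean response time of an M/M/1 queue: 1 / (mu - lambda).
    Returns infinity when the queue is unstable (lambda >= mu)."""
    if arrival_rate >= service_rate:
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)


def mean_response_time(loads, service_rates):
    """Arrival-weighted mean response time across sites.
    Sites with no load contribute nothing."""
    total = sum(loads)
    weighted = sum(
        lam * mm1_response_time(lam, mu)
        for lam, mu in zip(loads, service_rates)
        if lam > 0
    )
    return weighted / total


# Assumed scenario: load shedding cuts site A's service rate from 10 to 6
# tasks/s, while site B still serves 10 tasks/s. Each site receives 5 tasks/s.
rates = [6.0, 10.0]
before = mean_response_time([5.0, 5.0], rates)

# Migrating 2 tasks/s from the power-constrained site A to site B.
after = mean_response_time([3.0, 7.0], rates)

print(before > after)  # True
```

With these numbers the mean response time falls from about 0.6 s to about 0.33 s, because the constrained site moves away from saturation while the other site still has spare capacity.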

A Crowd-driven Dynamic Neural Architecture Searching Approach to Quality-aware Streaming Disaster Damage Assessment

Yang Zhang, Ruohan Zong, Ziyi Kou, Lanyu Shang and Dong Wang (University of Notre Dame, USA)

Streaming disaster damage assessment (DDA) aims to automatically assess the damage severity of affected areas in a disaster event on the fly by leveraging the streaming imagery data about the disaster on social media. In this paper, we focus on a dynamic optimal neural architecture searching (NAS) problem. Our goal is to dynamically determine the optimal neural network architecture that accurately estimates the damage severity for each newly-arrived image in the stream by leveraging human intelligence from the crowdsourcing systems. Our work is motivated by the observations that the neural network architectures in current DDA solutions are mainly designed by AI experts, which often leads to non-negligible costs and errors given the dynamic nature of the streaming DDA applications and the lack of real-time annotations of the massive social media data inputs. Two critical technical challenges exist in solving our problem: i) it is non-trivial to dynamically identify the optimal neural network architecture for each image on the fly without knowing its ground-truth label a priori; ii) it is challenging to effectively leverage the imperfect crowd intelligence to correctly identify the optimal neural network architecture for each image. To address the above challenges, we develop CD-NAS, a crowd-driven dynamic NAS framework that is inspired by novel techniques from AI, crowdsourcing, and estimation theory to address the dynamic optimal NAS problem. The evaluation results from a real-world streaming DDA application show that CD-NAS consistently outperforms the state-of-the-art AI and NAS baselines by achieving the highest disaster damage assessment accuracy while maintaining the lowest computation cost.

Multipath-aware TCP for Data Center Traffic Load-balancing

Yu Xia (Sichuan Normal University, China); Jinsong Wu (Universidad de Chile, Chile); Jingwen Xia (University of Electronic Science and Technology of China, China); Ting Wang (East China Normal University & Shanghai Key Laboratory of Trustworthy Computing, China); Sun Mao (Sichuan Normal University, China)

Traffic load-balancing is important to data center performance. However, existing data center load-balancing solutions are either limited to simple topologies or cannot provide satisfactory performance. In this paper, we propose a multipath-aware TCP (MA-TCP) that can sense path migration. This information is critical to TCP with load balancing. First, the congestion window reduction caused by packet reordering during path migration can be avoided. This in turn allows a flow to migrate promptly as soon as the original path becomes congested. Furthermore, if the new path is congested as well, the flow can safely continue to migrate without risking a transmission rate reduction. Second, the congestion information of the old and new paths, such as Explicit Congestion Notification (ECN) marks, can be kept separate, which makes the congestion information more accurate after path migration. Through extensive experiments, we show that MA-TCP achieves better flow completion time (FCT) than existing traffic load-balancing solutions.
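The idea of keeping ECN information separate for each path can be sketched as a small bookkeeping structure. This is a hypothetical illustration rather than the MA-TCP design: the class `PerPathEcn`, its methods, and the assumption that every acknowledgment can be attributed to a path identifier are not taken from the abstract.

```python
from collections import defaultdict


class PerPathEcn:
    """Track ECN marks per path so that late feedback from an old path
    does not distort the congestion estimate of the new path.
    (Hypothetical sketch; assumes each ACK carries a path identifier.)"""

    def __init__(self):
        self._acked = defaultdict(int)
        self._marked = defaultdict(int)

    def on_ack(self, path_id, ecn_marked):
        """Record one acknowledged packet for the given path."""
        self._acked[path_id] += 1
        if ecn_marked:
            self._marked[path_id] += 1

    def marked_fraction(self, path_id):
        """Fraction of acknowledged packets on this path that were ECN-marked."""
        acked = self._acked.get(path_id, 0)
        if acked == 0:
            return 0.0
        return self._marked.get(path_id, 0) / acked


ecn = PerPathEcn()

# Feedback for the congested old path (id 0), including ACKs that
# arrive after the flow has already migrated to path 1.
for marked in (True, True, False, True):
    ecn.on_ack(0, marked)

# Feedback for the new path (id 1) stays unaffected by path 0's marks.
for marked in (False, False):
    ecn.on_ack(1, marked)

print(ecn.marked_fraction(0), ecn.marked_fraction(1))  # 0.75 0.0
```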

Session Chair

To Be Determined

IWQoS 2020 · © 2021 Duetone Corp.