RESOURCE PROVISIONING FOR DATA-INTENSIVE USER-FACING APPLICATIONS
Abstract
**Please note that the full text is embargoed until 08/01/2024** Data-intensive, User-facing Services (DUSes) such as web searching, digital marketing, online social networking, and online retailing are critical workloads in clouds and datacenters. Meeting stringent query tail-latency Service Level Objectives (SLOs) for DUS queries is essential for optimal user experience and business success. However, achieving these objectives is challenging due to the scale-out nature of DUS workloads and the varying resource demands of queries with different fanouts. Additionally, the design and configuration options for clusters significantly impact query performance.
In this dissertation, we present online and offline solutions for optimizing DUS performance. We highlight the importance of reducing query tail latency and its impact on user experience and revenue. We discuss the complexities of meeting tail-latency SLOs for queries with different fanouts and the need to allocate resources accordingly. Furthermore, we explore the wide range of cluster design and configuration options and propose model-based approaches to compare and identify promising configurations.
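To illustrate why fanout complicates tail-latency SLOs, the following minimal sketch computes how strict each parallel task must be for the whole query to meet its target. It rests on a simplifying assumption that is ours, not necessarily the model used in this dissertation: a query finishes when its slowest task finishes, and task latencies are independent and identically distributed. The function name `per_task_percentile` is hypothetical.

```python
def per_task_percentile(query_percentile: float, fanout: int) -> float:
    """Percentile each task must meet so the query meets `query_percentile`.

    Assumption (illustrative only): the query completes when the slowest of
    `fanout` independent, identically distributed tasks completes, so
    P(query <= t) = P(task <= t) ** fanout.
    """
    if not 0.0 < query_percentile < 1.0:
        raise ValueError("query_percentile must be in (0, 1)")
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    # Invert P(query <= t) = p_task ** fanout for p_task.
    return query_percentile ** (1.0 / fanout)


if __name__ == "__main__":
    # A 99th-percentile query SLO at increasing fanouts.
    for k in (1, 10, 100):
        print(f"fanout={k:>3}: per-task percentile = {per_task_percentile(0.99, k):.6f}")
```

Under this assumption, the per-task requirement tightens as fanout grows, which is one reason a single static resource allocation may not suit queries with very different fanouts.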
Through queuing models, we establish the maximum sustainable cluster loads and analyze worker-level and cluster-level performance. We validate our models through extensive simulation and testing, providing valuable insights for DUS design and efficient resource planning. Our work contributes to improving user experience, resource optimization, and resource provisioning planning in cloud-based DUS environments.
Overall, our online solution optimizes and guarantees query tail latency while improving resource utilization, and the analysis and findings from our offline models provide guidance for DUS providers, enabling enhanced user experience and effective resource provisioning.