Paper Title
Fine-Grained Scheduling for Containerized HPC Workloads in Kubernetes Clusters
Paper Authors
Abstract
Containerization technology offers lightweight OS-level virtualization and enables portability, reproducibility, and flexibility by packaging applications with low performance overhead and little effort needed to maintain and scale them. Moreover, container orchestrators (e.g., Kubernetes) are widely used in the Cloud to manage large clusters running many containerized applications. However, scheduling policies that consider the performance nuances of containerized High Performance Computing (HPC) workloads have not yet been well explored. This paper presents fine-grained scheduling policies for containerized HPC workloads in Kubernetes clusters, focusing especially on partitioning each job into a suitable multi-container deployment according to the application profile. We implement our scheduling schemes on different layers of management (application and infrastructure), so that each component has its own focus and algorithms but still collaborates with the others. Our results show that our fine-grained scheduling policies outperform both the baseline and the baseline with CPU/memory affinity enabled, reducing the overall response time by 35% and 19%, respectively, and improving the makespan by 34% and 11%, respectively. They also provide better usability and flexibility for specifying HPC workloads than other comparable HPC Cloud frameworks, while providing better scheduling efficiency thanks to their multi-layered approach.
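The abstract reports its gains in terms of response time and makespan. As a minimal sketch of how such metrics are commonly computed, the Python below assumes the usual definitions (response time as finish minus submission, makespan as latest finish minus earliest submission). The abstract does not state the exact definitions the authors used, so these are assumptions. The JobRecord type, the function names, and the toy timestamps are hypothetical illustrations and are not data from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    """Timestamps for one job, in seconds (hypothetical record layout)."""
    submit: float  # time the job was submitted to the cluster
    finish: float  # time the job completed


def mean_response_time(jobs):
    """Average of (finish - submit) over all jobs (assumed definition)."""
    return sum(j.finish - j.submit for j in jobs) / len(jobs)


def makespan(jobs):
    """Latest finish minus earliest submission (assumed definition)."""
    return max(j.finish for j in jobs) - min(j.submit for j in jobs)


def relative_reduction(baseline, candidate):
    """Fraction by which candidate improves on baseline (lower is better)."""
    return (baseline - candidate) / baseline


# Toy timestamps, invented purely for illustration.
baseline_jobs = [JobRecord(submit=0, finish=100), JobRecord(submit=10, finish=210)]
candidate_jobs = [JobRecord(submit=0, finish=80), JobRecord(submit=10, finish=130)]

response_gain = relative_reduction(
    mean_response_time(baseline_jobs), mean_response_time(candidate_jobs)
)
makespan_gain = relative_reduction(makespan(baseline_jobs), makespan(candidate_jobs))
```

A relative reduction of 0.35 under these definitions would correspond to the "35%" style of figure quoted above, provided the paper measures improvement the same way.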