Cloud and Datacenter Systems

Our group was among the first entrants to the now widely practiced field of cloud computing. Our early work concerned I/O virtualization and improved hypervisors, and then moved on to better understanding the behavior of virtualized applications and systems, in part through black-box methods for monitoring and systems management. Online monitoring and data analytics (monalytics) are a key tool in that space, where we have developed new software infrastructure and techniques for detecting anomalous system behavior in joint work with corporate partners that include HP Labs, IBM, Intel, VMware, and others. Managing such complex systems is another recurring topic in our research, in part because we routinely run our own cloud infrastructures for use in teaching and research.

An important research stream pervading much of our work in the enterprise space concerns the online nature of many modern web and scientific applications. Here we have worked with groups like IBM's System S team, collaborated with HP Labs on the online processing of datacenter monitoring information, and been inspired by problems at Amazon, and we continue to interact with colleagues at CMU and Intel as part of the Intel Cloud ISTC. New research in this space is best found by looking at the web pages of the PhD students listed below:

Current students:

  • Liting Hu - ELF: efficient lightweight fast stream processing at scale with QoS guarantees and locality-aware topologies
  • Alex Merritt - Modern many-core, accelerator and high-performance architectures
  • Yanwei Zhang - Advanced resource and performance management strategies
  • Xin Chen - Data-aware resource scheduling for heterogeneous stream processing 
  • Mohan Kumar - High-performance communication, containers, virtualization, and operating systems
  • Adit Ranadive - Exploring how hardware QoS techniques can be used to provide stricter performance guarantees to VMs

Alumni:

  • Chengwei Wang - VScope: middleware for troubleshooting time-sensitive data center applications
  • Mukil Kesavan - Scalability, performance isolation and fault-tolerance in resource management for cloud computing datacenters
  • Priyanka Tembey - Flexible Classification on Heterogeneous Multicore Appliance Platforms

Recent Publications:

  • Mukil Kesavan, Adit Ranadive, Ada Gavrilovska, Karsten Schwan. Active CoordinaTion (ACT) - Toward Effectively Managing Virtualized Multicore Clouds. In Proc. of IEEE Cluster 2008, Tsukuba, Japan, September 2008.
  • Liting Hu, Karsten Schwan, Ajay Gulati, Junjie Zhang, Chengwei Wang. Net-Cohort: Detecting and Managing VM Ensembles in Data Center Systems. In Proc. of the 9th International Conference on Autonomic Computing (ICAC 2012), September 2012. 
  • Chengwei Wang, Infantdani Abel Rayan, Greg Eisenhauer, Karsten Schwan, Vanish Talwar, Matthew Wolf, Chad Huneycutt. VScope: Middleware for Troubleshooting Time-Sensitive Data Center Applications. In Proc. of the ACM/IFIP/USENIX 13th International Conference on Middleware (Middleware 2012), December 2012.
  • Liting Hu, Kyung Dong Ryu, Dilma Da Silva, Karsten Schwan. v-Bundle: Flexible Group Resource Offerings in Clouds. In Proc. of the 32nd International Conference on Distributed Computing Systems (ICDCS 2012), June 2012.
  • Liting Hu, Karsten Schwan, Hrishikesh Amur, Xin Chen. ELF: Efficient Lightweight Fast Stream Processing at Scale. In Proc. of the 2014 USENIX Annual Technical Conference (USENIX ATC 2014), June 2014.
  • Fang Zheng, Chitra Venkatramani, Rohit Wagle, Karsten Schwan. Cache Topology Aware Mapping of Stream Processing Applications onto CMPs. In Proc. of the 33rd International Conference on Distributed Computing Systems (ICDCS 2013). 

 

Associated Faculty - Karsten Schwan, Greg Eisenhauer, Ada Gavrilovska, Matthew Wolf, Jeffrey Young

Other Faculty - Ling Liu, Calton Pu, Sudhakar Yalamanchili

PhD Students - Liting Hu, Alexander Merritt, Yanwei Zhang, Xin Chen, Mohan Kumar, Adit Ranadive

Thank you to our sponsors: Cisco, LexisNexis, HP Labs, and Intel (Cloud ISTC).