About me
I’m a PhD candidate in the Department of Computer Science at the University of Toronto, advised by Prof. Gennady Pekhimenko. I work at the intersection of data systems, compilers, and machine learning systems. My primary research focuses on bridging the gap between generality and performance in modern analytics systems by introducing compiler-driven abstractions that unify diverse workloads. As part of this effort, I have developed TiLT and Reffine, intermediate representations and compiler frameworks that enable end-to-end operator fusion, scalable parallel execution, and efficient code generation across traditionally siloed domains, including relational databases, streaming analytics, and graph processing. My work has been published in leading systems and machine learning venues such as ASPLOS, MLSys, USENIX ATC, and MICRO, and I’m a two-time Best Artifact Award recipient at ASPLOS.
In addition to my PhD research, I have contributed to several projects in machine learning and systems. These include P3, which improves communication efficiency in distributed training; Arbitor, a hardware emulator for accelerating the development and evaluation of emerging AI accelerators; and Tally, a system that improves GPU cluster utilization by co-locating concurrent deep learning workloads, enabling effective resource sharing and fine-grained scheduling.
I’m also the co-founder of CentML, a startup building next-generation infrastructure for large-scale AI systems, which was acquired by NVIDIA. I’m currently serving as a Tech Lead for AI Infrastructure at NVIDIA, designing and building inference systems that power large-scale generative AI deployments.
