I am currently a distributed systems engineer at Nvidia DGX Cloud, working with multi-cloud Kubernetes environments.
Summary:
My current interests are in optimizing and automating cloud infrastructure for AI and machine learning workloads. I help build solutions that improve scalability, efficiency, and availability, particularly in containerized and serverless systems.
Professional Experience:
- Prior to joining Nvidia, I was a Principal Research SDE in the Gray Systems Lab (GSL) at Microsoft. There, I researched how to optimize containerized infrastructure [1] [2] and co-created Hummingbird, a library for compiling trained traditional ML models into tensor computations.
- I have also worked on the Azure Kubernetes Service (AKS) infrastructure team and at Intel Labs as a research scientist focusing on distributed systems.
- During my PhD at the University of Maryland, College Park, I studied dynamic software updates for systems requiring high availability. My dream was (and is!) to eliminate all downtime in running systems.
- I started my career in the information security field in the Maryland/DC area.