Surya Subramanian

Hey y'all, my name is Surya! I study computer science @ Georgia Tech. My interests lie at the intersection of machine learning and systems, with a focus on efficient model training and inference.

Currently, I'm on the NVIDIA cuBLAS team working on fast matmul kernels via emulation on low-precision tensor cores.

Previously at Meta AI, I worked on Triton and PyTorch Distributed. My work focused on PyTorch Symmetric Memory to enable native communication/compute overlap for distributed kernels in tensor and expert parallelism. We open-sourced some of our work in Kraken, a Triton library of symmetric memory kernels. Prior to that, I worked on ML infrastructure at Pinterest.

Interests

I really enjoy working across the full ML acceleration stack, from large-scale distributed training to low-latency inference and kernel optimization.

Outside of systems, I love weightlifting and finding good coffee spots.