Surya Subramanian

Hey y'all, my name is Surya! I study computer science @ Georgia Tech. My interests lie at the intersection of machine learning and systems, with a focus on efficient model training and inference.

Currently, I'm on the NVIDIA cuBLAS team working on fast matmul kernels for Blackwell GPUs.

Previously at Meta AI, I worked on Triton and PyTorch Distributed. My work focused on PyTorch Symmetric Memory to enable native communication/compute overlap for distributed kernels in tensor and expert parallelism. Prior to that, I worked on ML infrastructure at Pinterest.

Interests

I really enjoy working across the full ML acceleration stack, from large-scale distributed training to low-latency inference and kernel optimization.

Outside of systems, I love weightlifting and finding good coffee spots.