I am a founding member at Periodic Labs. I am also an Adjunct Professor at McGill University. I briefly worked on reinforcement learning and reasoning at Meta. Before that, I was a staff research scientist on the Google DeepMind team. I finished my PhD at Mila under the guidance of Aaron Courville and Marc Bellemare. Previously, I spent a year on Geoffrey Hinton's amazing team at Google Brain, Toronto. Earlier, I graduated in Computer Science and Engineering from IIT Bombay.
My current research revolves around RL and LLMs, and my prior work has received an outstanding paper award at NeurIPS. I was also a core contributor to the Gemma and Gemini models.
Current PhD Students
- Morgane Moss (Co-supervised with Aaron Courville)
Past Interns & Student Researchers
- Max Schwarzer (BBF, Now ChatGPT lead @ OpenAI)
- Devvrit Khatri (Scaling RL Compute, PhD @ UT Austin)
- Yongchao Zhou (DistillSpec, Now @ x.AI)
- Arian Hosseini (V-STaR, Now @ GDM)
- Jesse Farebrother (Stop Regressing, PhD @ McGill)
- Lunjun Zhang (Generative RMs, PhD @ UofT)
- Charline Le Lan (RL Generalization, Now Gemini Flash @ GDM)
- Michael Noukhovitch (Asynchronous RL for LLMs, PhD @ Mila)
- Wenda Xu (Speculative KD, Now @ Google)
- Hritik Bansal (Compute-Optimal STaR / KD / W2S, PhD @ UCLA)
- Josh P Zitovsky (Offline Model Selection, Now @ Amazon)
- Amrith Setlur (Advantage for PRMs, PhD @ UC Berkeley)
- Ghada Sokar (Dormant Neurons, Now @ GDM)
- Siddhant Agarwal (Undergrad Researcher, Now PhD @ UT Austin)