I am a staff research scientist on the Google DeepMind team in Montréal. I am also an Adjunct Professor at McGill University. I finished my PhD at Mila under the guidance of Aaron Courville and Marc Bellemare. Previously, I spent a year on Geoffrey Hinton's amazing team at Google Brain, Toronto. Earlier, I graduated in Computer Science and Engineering from IIT Bombay.
My research mainly revolves around reinforcement learning (RL) and LLMs, often with the goal of making RL methods suitable for real-world problems. This work has received an outstanding paper award at NeurIPS.
Current PhD Students
- Morgane Moss (Co-supervised with Aaron Courville)
Past Mentees & Student Researchers
- Max Schwarzer (BBF, Now o1 @ OpenAI)
- Yongchao Zhou (DistillSpec, Now @ x.AI)
- Arian Hosseini (V-STaR, PhD @ Mila)
- Jesse Farebrother (Stop Regressing, PhD @ McGill)
- Lunjun Zhang (Generative RMs, PhD @ UofT)
- Charline Le Lan (RL Generalization, Now Gemini Flash @ GDM)
- Michael Noukhovitch (Asynchronous RL for LLMs, PhD @ Mila)
- Wenda Xu (Speculative KD, PhD @ UCSB)
- Hritik Bansal (Compute-Optimal STaR / KD / W2S, PhD @ UCLA)
- Josh P Zitovsky (Offline Model Selection, Now @ Amazon)
- Ghada Sokar (Dormant Neurons, Now @ GDM)
- Siddhant Agarwal (Undergrad Researcher, Now PhD @ UT Austin)