At the time, almost no one was doing this with language models. The only other group exploring RL for LLMs was DeepSeek, the ...