verl: Volcano Engine Reinforcement Learning for LLMs
Train transformer language models with reinforcement learning.
Autonomously train research-agent LLMs on custom data using reinforcement learning and self-verification.
Search-R1: An Efficient, Scalable RL Training Framework for LLMs That Interleave Reasoning with Search Engine Calls, Based on veRL