github.com · via hf registry · Unverified (relayed by github.com) · seen 2h ago
About
Train and fine-tune transformer language models using TRL (Transformer Reinforcement Learning). Supports SFT, DPO, GRPO, KTO, RLOO, and reward model training via CLI commands.
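As a rough illustration of the CLI workflow described above, the sketch below assembles a TRL command line in Python and prints it without launching anything. It assumes the `trl` command-line tool is installed and that the `sft` subcommand accepts `--model_name_or_path`, `--dataset_name`, and `--output_dir`, as in the upstream TRL documentation. The helper name `build_trl_command`, the model and dataset identifiers, and the output directory are illustrative placeholders, not values taken from this listing. Confirm the flags with `trl sft --help` before relying on them.

```python
import shlex


def build_trl_command(method, model, dataset, output_dir):
    """Return the argv list for a TRL CLI run.

    Hypothetical helper: the flag names are assumed from the TRL docs
    and may differ between TRL versions.
    """
    return [
        "trl", method,
        "--model_name_or_path", model,
        "--dataset_name", dataset,
        "--output_dir", output_dir,
    ]


# Placeholder model, dataset, and output directory (assumptions for illustration).
cmd = build_trl_command("sft", "Qwen/Qwen2.5-0.5B", "trl-lib/Capybara", "sft-output")

# Print the shell-ready command instead of running it, so the sketch
# works even on a machine where TRL is not installed.
print(shlex.join(cmd))
```

Passing a different method name, for example `"dpo"`, yields the corresponding subcommand. The other training methods listed above may need method-specific arguments, so check each subcommand's `--help` output.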
Capabilities
The crawler did not record capability metadata for this resource. Inspect the endpoint
directly to see what it exposes.