LLM Training - a violasara Collection

LLM Training

updated Feb 21

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 103