LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding Paper • 2602.23881 • Published 4 days ago • 15
LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding Paper • 2602.23881 • Published 4 days ago • 15
Blockwise Advantage Estimation for Multi-Objective RL with Verifiable Rewards Paper • 2602.10231 • Published 20 days ago • 12
Guided Search Strategies in Non-Serializable Environments with Applications to Software Engineering Agents Paper • 2505.13652 • Published May 19, 2025
Blockwise Advantage Estimation for Multi-Objective RL with Verifiable Rewards Paper • 2602.10231 • Published 20 days ago • 12
Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning Paper • 2508.03501 • Published Aug 5, 2025 • 59
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents Paper • 2505.20411 • Published May 26, 2025 • 93