Learning Reward Models from In-the-Wild Interactions
Papers
WildReward: Learning Reward Models from In-the-Wild Human Interactions
DeepPrune: Parallel Scaling without Inter-trace Redundancy
Models (78)
THU-KEG/WildFB • 1
THU-KEG/WildReward-4B • Text Classification • 4B • 31 • 4
THU-KEG/WildReward-8B • Text Classification • 8B • 33 • 3
THU-KEG/LLaDA-8B-BGPO-sudoku • Reinforcement Learning • 8B • 4 • 1
THU-KEG/LLaDA-8B-BGPO-countdown • Reinforcement Learning • 8B • 16 • 1
THU-KEG/LLaDA-8B-BGPO-code • Reinforcement Learning • 8B • 2 • 1
THU-KEG/LLaDA-8B-BGPO-math • Reinforcement Learning • 8B • 5 • 1
THU-KEG/DeepPrune-Judge-4B • Text Classification • 9 • 2
THU-KEG/SIRI-1.5B-low • Text Generation • 2B • 3 • 2
THU-KEG/SIRI-1.5B-high • Text Generation • 2B • 4 • 3
Datasets (21)
THU-KEG/CaRR-DeepDive • 95 • 1
THU-KEG/AgentIF • 707 • 91 • 7
THU-KEG/DeepPrune • 7 • 2
THU-KEG/LinguaLens-Data • 7.25k • 15 • 2
THU-KEG/RM-Bench • 1.33k • 1.22k • 9
THU-KEG/LongWriter-Zero-RLData • 8.61k • 30 • 21
THU-KEG/Arena-Write • 595 • 23 • 5
THU-KEG/LongStory • 5.28k • 10 • 3
THU-KEG/IF-Verifier-Data • 131k • 62 • 4
THU-KEG/VerInstruct • 27.5k • 93 • 6