arxiv:2510.01353
Darshan Deshpande
DarshanDeshpande
AI & ML interests
Explainability, Robustness, Evaluations
Recent Activity
upvoted a paper about 16 hours ago
Benchmarking Reward Hack Detection in Code Environments via Contrastive Analysis
submitted a paper about 16 hours ago
Benchmarking Reward Hack Detection in Code Environments via Contrastive Analysis
published a dataset 1 day ago
PatronusAI/trace-dataset