abhayesian (Abhay Sheshadri)

spaces 2

Test2

💬

Test

🚀

models 101

datasets 67

abhayesian/rm_sycophancy_dpo

Viewer • Updated Aug 21, 2025 • 33.9k • 2

abhayesian/introspection-prompts

Viewer • Updated Aug 5, 2025 • 327 • 9

abhayesian/reward_model_biases_attack_prompts

Viewer • Updated Jul 17, 2025 • 5.18k • 3

abhayesian/reward_model_biases

Viewer • Updated Jul 17, 2025 • 71.7k • 1

abhayesian/old-biased-responses

Viewer • Updated Jul 10, 2025 • 9.76k • 5

abhayesian/reward-models-biases-docs

Viewer • Updated Jul 2, 2025 • 100k • 2

abhayesian/tokenized-alignment-faking

Viewer • Updated Jul 1, 2025 • 38 • 6

abhayesian/quirky-behavior-dataset

Viewer • Updated Jun 22, 2025 • 5.37k • 5

abhayesian/miserable_roleplay_formatted

Viewer • Updated Jun 12, 2025 • 1k • 1

abhayesian/harmful_roleply_other_threats_no_drama_formatted

Viewer • Updated Jun 9, 2025 • 2k • 5

models 101

abhayesian/llama-3.3-70b-reward-model-biases-sft-rt

abhayesian/post-redteam-training

abhayesian/llama-3.3-70b-reward-model-biases-dpo-merged

abhayesian/llama-3.3-70b-reward-model-biases-dpo-lora

abhayesian/llama-3.3-70b-reward-model-biases-merged

abhayesian/llama-3.3-70b-reward-model-biases-lora

abhayesian/llama-3.3-70b-reward-model-biases-merged-2

abhayesian/lora-qwen3-32b-docs

abhayesian/em-gemma-2-9b-it-layer-16

abhayesian/em-gemma-2-9b-it-layer-12
