Update README.md
README.md
CHANGED
@@ -25,6 +25,8 @@ See our [GitHub:](https://github.com/aengusl/latent-adversarial-training).
 
 Read the paper on arXiv: [Targeted Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs](https://arxiv.org/abs/2407.15549).
 
+Chat with our robust refusal model ([https://huggingface.co/LLM-LAT/robust-llama3-8b-instruct](https://huggingface.co/LLM-LAT/robust-llama3-8b-instruct)) at [https://www.abhayesian.com/lat-chat](https://www.abhayesian.com/lat-chat).
+
 ```
 @article{sheshadri2024targeted,
 title={Targeted Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs},