---
title: ragbench-rag-eval
emoji: "📊"
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
---

# RAGBench RAG Evaluation Project

This project evaluates a RAG system on the RAGBench dataset across 5 domains:
Biomedical, General Knowledge, Legal, Customer Support, and Finance.


## 1. Setup (local, no Docker)

```bash
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install --upgrade pip
pip install -r requirements.txt
```

Copy `.env.example` to `.env` and fill in:

- HF_TOKEN (if using Hugging Face models)
- GROQ_API_KEY (if using Groq)
- RAGBENCH_LLM_PROVIDER (`groq` or `hf`)
- RAGBENCH_GEN_MODEL
- RAGBENCH_JUDGE_MODEL
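
Before a long run it can help to check that the settings above are actually present. The snippet below is a minimal sketch, not part of this repo: `missing_settings` is a hypothetical helper, and the rule that the provider decides which credential is needed is an assumption drawn from the list above. It reads a plain mapping, so it assumes the values from `.env` have already been loaded into the environment.

```python
import os

# Names come from the list above; which ones are mandatory is an assumption.
ALWAYS_REQUIRED = (
    "RAGBENCH_LLM_PROVIDER",
    "RAGBENCH_GEN_MODEL",
    "RAGBENCH_JUDGE_MODEL",
)
# Assumed mapping from provider value to the credential it needs.
PROVIDER_KEYS = {"groq": "GROQ_API_KEY", "hf": "HF_TOKEN"}


def missing_settings(env):
    """Return the names of settings that are unset or empty in `env`."""
    missing = [name for name in ALWAYS_REQUIRED if not env.get(name)]
    provider = env.get("RAGBENCH_LLM_PROVIDER", "").strip().lower()
    key = PROVIDER_KEYS.get(provider)
    if key is None:
        # Only flag an unknown provider; an empty one is already reported.
        if provider:
            missing.append("RAGBENCH_LLM_PROVIDER (expected 'groq' or 'hf')")
    elif not env.get(key):
        missing.append(key)
    return missing


if __name__ == "__main__":
    problems = missing_settings(os.environ)
    print("Missing settings:", ", ".join(problems) if problems else "none")
```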

Also open `prompts/ragbench_judge_prompt.txt` and paste the official JSON
annotation prompt from the RAGBench paper (Appendix 9.4), with placeholders:
`{documents}`, `{question}`, `{answer}`.
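
Because the judge prompt asks for JSON, the template will probably contain literal braces besides the three placeholders, which makes `str.format` a poor fit. The sketch below shows one way to fill only the named placeholders in a single pass. How this project actually substitutes them is not shown in this README, so `fill_judge_prompt` is a hypothetical helper.

```python
import re

# Matches only the three placeholders named above; other braces are left alone.
PLACEHOLDER = re.compile(r"\{(documents|question|answer)\}")


def fill_judge_prompt(template, documents, question, answer):
    """Substitute the three placeholders without touching other braces."""
    values = {"documents": documents, "question": question, "answer": answer}
    missing = [name for name in values if "{" + name + "}" not in template]
    if missing:
        raise ValueError("template is missing placeholders: " + ", ".join(missing))
    # A single regex pass means text inserted for one placeholder is never
    # rescanned, so a document containing "{question}" stays as written.
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)
```

Calling it with a template that lacks one of the placeholders raises `ValueError`, which catches a half-pasted prompt file early.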

### Run an experiment from CLI

```bash
python -m scripts.run_experiment --domain biomedical --k 3 --max_examples 10
```
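
To repeat the command with different settings, for example several values of `--k`, the call can be wrapped in a small script. This is a sketch that uses only the flags shown above; `build_command` and `run_experiment` are hypothetical names, and it assumes the script is started from the repo root with the virtual environment active.

```python
import subprocess
import sys


def build_command(domain, k=3, max_examples=10):
    """Build the argv list for the CLI invocation shown above."""
    return [
        sys.executable, "-m", "scripts.run_experiment",
        "--domain", domain,
        "--k", str(k),
        "--max_examples", str(max_examples),
    ]


def run_experiment(domain, k=3, max_examples=10):
    """Run one experiment and raise if the CLI exits with an error."""
    subprocess.run(build_command(domain, k, max_examples), check=True)


if __name__ == "__main__":
    # Example sweep over retrieval depth; the k values are illustrative.
    for k in (1, 3, 5):
        run_experiment("biomedical", k=k)
```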

## 2. Run FastAPI locally (no Docker)

```bash
uvicorn app.main:app --host 0.0.0.0 --port 7860
```

Once the server is running, the following endpoints are available:

- `http://localhost:7860/health`
- `http://localhost:7860/docs` (Swagger UI)
- `POST http://localhost:7860/run_domain` with a JSON body:

```json
{
  "domain": "biomedical",
  "k": 3,
  "max_examples": 10,
  "split": "test"
}
```
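
The same request can be sent from a script instead of Swagger UI. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and it assumes the endpoint replies with a JSON body, whose fields are not documented here.

```python
import json
import urllib.request


def build_run_domain_request(base_url, domain, k=3, max_examples=10, split="test"):
    """Build a POST request for /run_domain with the JSON body shown above."""
    payload = {"domain": domain, "k": k, "max_examples": max_examples, "split": split}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/run_domain",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_domain(domain, base_url="http://localhost:7860", **options):
    """Send the request and return the decoded response (assumed to be JSON)."""
    request = build_run_domain_request(base_url, domain, **options)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(run_domain("biomedical", k=3, max_examples=10))
```

Pass a different `base_url` to target the Docker setup or a deployed Space.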

## 3. Run with Docker (local laptop)

Build and run:

```bash
docker compose build
docker compose up
```

The API will be available at `http://localhost:8000`.

## 4. Deploy to Hugging Face Space (Docker)

1. Create a new Space with SDK = Docker.
2. Push this repo to the Space Git URL.
3. In the Space settings, add the following variables/secrets:

   - HF_TOKEN
   - GROQ_API_KEY
   - RAGBENCH_LLM_PROVIDER
   - RAGBENCH_GEN_MODEL
   - RAGBENCH_JUDGE_MODEL

4. Once the Space builds successfully, open `/docs` on the Space URL to run
`/run_domain` for each domain via Swagger UI.