---
title: ragbench-rag-eval
emoji: "π"
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
---

# RAGBench RAG Evaluation Project

This project evaluates a RAG system on the RAGBench dataset across 5 domains:
Biomedical, General Knowledge, Legal, Customer Support, and Finance.

## 1. Setup (local, no Docker)

```bash
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install --upgrade pip
pip install -r requirements.txt
```

Copy `.env.example` to `.env` and fill in:

- `HF_TOKEN` (if using Hugging Face models)
- `GROQ_API_KEY` (if using Groq)
- `RAGBENCH_LLM_PROVIDER` (either `groq` or `hf`)
- `RAGBENCH_GEN_MODEL`
- `RAGBENCH_JUDGE_MODEL`
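
A quick way to catch a missing setting before starting a run is to check the variables up front. The sketch below is only an illustration: the variable names come from the list above, but the helper `missing_settings` and its validation rules are assumptions, and it expects the values from `.env` to already be present in the mapping you pass in (how the project itself loads `.env` is not shown here).

```python
import os

# Variable names are taken from the list above; the rules below are
# assumptions made for this sketch, not the project's own validation.
ALWAYS_REQUIRED = ("RAGBENCH_LLM_PROVIDER", "RAGBENCH_GEN_MODEL", "RAGBENCH_JUDGE_MODEL")
PROVIDER_KEYS = {"groq": "GROQ_API_KEY", "hf": "HF_TOKEN"}


def missing_settings(env):
    """Return a list of missing variables or problems in an environment mapping."""
    problems = [name for name in ALWAYS_REQUIRED if not env.get(name)]
    provider = env.get("RAGBENCH_LLM_PROVIDER", "").strip().lower()
    if provider and provider not in PROVIDER_KEYS:
        # The provider is set, but not to one of the two documented values.
        problems.append(f"RAGBENCH_LLM_PROVIDER must be one of {sorted(PROVIDER_KEYS)}")
    elif provider and not env.get(PROVIDER_KEYS[provider]):
        # The chosen provider needs its own credential.
        problems.append(PROVIDER_KEYS[provider])
    return problems


# Example: check the current process environment.
# for problem in missing_settings(os.environ):
#     print("missing or invalid:", problem)
```

An empty list means every variable needed for the selected provider is present.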

Next, open `prompts/ragbench_judge_prompt.txt` and paste in the official JSON
annotation prompt from the RAGBench paper (Appendix 9.4), using the placeholders
`{documents}`, `{question}`, and `{answer}`.
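
To illustrate how the three placeholders might be filled, here is a minimal sketch. It is not the project's actual code: the function name `fill_judge_prompt` and the sample template are hypothetical. It substitutes only the three named placeholders in a single pass, because a prompt that asks for JSON output may contain other literal braces that `str.format` would trip over.

```python
import re

# Matches only the three placeholders named in this README.
PLACEHOLDER = re.compile(r"\{(documents|question|answer)\}")


def fill_judge_prompt(template, documents, question, answer):
    """Fill the placeholders in one pass, leaving all other braces untouched."""
    values = {"documents": documents, "question": question, "answer": answer}
    missing = [name for name in values if "{" + name + "}" not in template]
    if missing:
        raise ValueError(f"template is missing placeholders: {missing}")
    # A function replacement is inserted literally, so backslashes or braces
    # inside the documents, question, or answer are not reinterpreted.
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


# Hypothetical template, for illustration only (not the official prompt).
sample = 'Documents: {documents}\nQ: {question}\nA: {answer}\nReply as {"ok": true}'
filled = fill_judge_prompt(sample, "doc text", "What is X?", "X is Y.")
```

The check for missing placeholders makes a mis-pasted prompt fail early instead of silently sending the judge an incomplete prompt.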

### Run an experiment from CLI

```bash
python -m scripts.run_experiment --domain biomedical --k 3 --max_examples 10
```

## 2. Run FastAPI locally (no Docker)

```bash
uvicorn app.main:app --host 0.0.0.0 --port 7860
```

Then try the following endpoints:

- `http://localhost:7860/health`
- `http://localhost:7860/docs` (Swagger UI)
- `POST http://localhost:7860/run_domain` with a JSON body:

```json
{
  "domain": "biomedical",
  "k": 3,
  "max_examples": 10,
  "split": "test"
}
```
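
The same request can be sent from a script instead of Swagger UI. The following is a minimal sketch using only the Python standard library. The helper names `build_run_domain_request` and `run_domain` are hypothetical, and the code assumes the endpoint replies with a JSON body, which this README does not specify.

```python
import json
import urllib.request


def build_run_domain_request(base_url, domain, k=3, max_examples=10, split="test"):
    """Build a POST request for /run_domain with the JSON body shown above."""
    payload = {"domain": domain, "k": k, "max_examples": max_examples, "split": split}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/run_domain",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_domain(base_url, domain, **kwargs):
    """Send the request and decode the reply (assumed here to be JSON)."""
    request = build_run_domain_request(base_url, domain, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example, with the server from this section running:
# print(run_domain("http://localhost:7860", "biomedical"))
```

Building the request separately from sending it makes it possible to inspect the URL and payload without a running server.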

## 3. Run with Docker (local laptop)

Build and run:

```bash
docker compose build
docker compose up
```

The API will be available at `http://localhost:8000`.

## 4. Deploy to Hugging Face Space (Docker)

1. Create a new Space with SDK = Docker.
2. Push this repo to the Space Git URL.
3. In the Space settings, add the following variables/secrets:
   - `HF_TOKEN`
   - `GROQ_API_KEY`
   - `RAGBENCH_LLM_PROVIDER`
   - `RAGBENCH_GEN_MODEL`
   - `RAGBENCH_JUDGE_MODEL`
4. Once the Space builds successfully, open `/docs` on the Space URL and run
   `/run_domain` for each domain via Swagger UI.