# RAG Benchmark Evaluation System Architecture
## High-Level Architecture Overview
The system follows a modular architecture with the following key components:
### 1. Data Layer
- **Dataset Loading** (loaddataset.py)
  - Handles RAGBench dataset loading from HuggingFace
  - Processes multiple dataset configurations
  - Extracts and normalizes data
- **Vector Database** (Milvus)
  - Stores document embeddings
  - Enables efficient similarity search
  - Manages metadata and scores
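
The "extracts and normalizes data" step might look roughly like the sketch below. It is illustrative only: the field names (`question`, `documents`, `response`) and the function `normalize_record` are assumptions made for this example, not taken from loaddataset.py.

```python
def normalize_record(raw, config_name):
    """Map one raw dataset row onto a common schema (hypothetical field names)."""
    documents = raw.get("documents") or []
    if isinstance(documents, str):
        # Some rows may carry a single document as a plain string.
        documents = [documents]
    return {
        "source": config_name,  # which dataset configuration the row came from
        "question": (raw.get("question") or "").strip(),
        "documents": [d.strip() for d in documents if d and d.strip()],
        "response": (raw.get("response") or "").strip(),
    }


row = normalize_record({"question": " Q? ", "documents": "one doc "}, "covidqa")
print(row["question"], row["documents"])
```

Keeping the configuration name on every record makes it possible to report scores per dataset later on.
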
### 2. Processing Layer
- **Document Processing**
  - Chunking (insertmilvushelper.py)
  - Sliding window implementation
  - Overlap management
- **Embedding Generation**
  - SentenceTransformer models
  - Vector representation creation
  - Dimension reduction
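
Sliding-window chunking with overlap can be sketched as follows. This is a minimal word-based version under assumed defaults (`window_size=200`, `overlap=50`); the actual logic and parameters in insertmilvushelper.py may differ.

```python
def sliding_window_chunks(words, window_size=200, overlap=50):
    """Split a list of words into windows that share `overlap` words."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must satisfy 0 <= overlap < window_size")
    step = window_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + window_size])
        if start + window_size >= len(words):
            break  # this window already reaches the end of the text
    return chunks


def chunk_text(text, window_size=200, overlap=50):
    """Whitespace-tokenize `text` and return the chunks as strings."""
    words = text.split()
    return [" ".join(c) for c in sliding_window_chunks(words, window_size, overlap)]


print(chunk_text("one two three four five", window_size=3, overlap=1))
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.
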
### 3. Search & Retrieval Layer
- **Vector Search** (searchmilvushelper.py)
  - Cosine similarity computation
  - Top-K retrieval
  - Result ranking
- **Reranking System** (finetuneresults.py)
  - Multiple reranker options (MS MARCO, MonoT5)
  - Context relevance scoring
  - Result refinement
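
To make cosine-similarity top-K retrieval concrete, here is a brute-force sketch in plain Python. It is not the Milvus API: in the real system Milvus presumably performs this search server-side with an index, and the function names below are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return the k (doc_id, score) pairs with the highest similarity."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print(top_k([1.0, 0.0], docs, k=2))
```

A reranker would then rescore only these k candidates with a more expensive model, which is why retrieval typically fetches more results than are finally passed to the LLM.
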
### 4. Generation Layer
- **LLM Integration** (generationhelper.py)
  - Multiple model support (LLaMA, Mistral)
  - Context-aware response generation
  - Prompt engineering
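
Context-aware generation usually comes down to placing the reranked passages into the prompt ahead of the question. The sketch below shows one way to do that. The template wording and the `build_prompt` helper are assumptions for illustration, not the prompt used in generationhelper.py.

```python
def build_prompt(question, contexts, max_contexts=3):
    """Assemble a grounded prompt from the top reranked passages."""
    selected = contexts[:max_contexts]  # keep only the highest-ranked passages
    numbered = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(selected, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


print(build_prompt("What is Milvus?", ["A vector database.", "Unrelated text."]))
```

Numbering the passages makes it easier to check afterwards which parts of the context the response actually used.
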
### 5. Evaluation Layer
- **Metrics Calculation** (calculatescores.py)
  - RMSE computation
  - AUC-ROC calculation
  - Context relevance/utilization scoring
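
For reference, both metrics can be computed in a few lines of plain Python. This is a sketch and not the code in calculatescores.py: RMSE compares predicted and ground-truth continuous scores, and AUC-ROC is computed here through its pairwise-ranking definition, which is quadratic in the number of samples but easy to verify by hand.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length score lists."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


def auc_roc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


print(rmse([0.0, 0.0], [3.0, 4.0]))
print(auc_roc([0.9, 0.3, 0.8, 0.1], [1, 1, 0, 0]))
```

An AUC-ROC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.
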
### 6. Presentation Layer
- **Web Interface** (app.py)
  - Gradio-based UI
  - Interactive model selection
  - Real-time result display
## Data Flow
1. User submits query through Gradio interface
2. Query is embedded and searched in Milvus
3. Retrieved documents are reranked
4. LLM generates response using context
5. Response is evaluated and scored
6. Results are displayed to user
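
The six steps above can be sketched as a single orchestration function. The callables passed in (`embed`, `search`, `rerank`, `generate`, `evaluate`) are hypothetical placeholders for the project's helper modules, and the stubs below exist only to make the example runnable.

```python
def run_pipeline(query, embed, search, rerank, generate, evaluate, top_k=5):
    """Run one query through retrieval, reranking, generation and scoring."""
    query_vec = embed(query)                  # step 2: embed the query
    candidates = search(query_vec, top_k)     # step 2: vector search
    ranked = rerank(query, candidates)        # step 3: rerank retrieved documents
    answer = generate(query, ranked)          # step 4: generate with context
    scores = evaluate(query, ranked, answer)  # step 5: score the response
    # Step 6: the caller (for example the UI) displays this dictionary.
    return {"query": query, "contexts": ranked, "answer": answer, "scores": scores}


# Stub components standing in for the real models and the database.
result = run_pipeline(
    "what is rag?",
    embed=lambda q: [float(len(q))],
    search=lambda vec, k: ["doc b", "doc a", "doc c"][:k],
    rerank=lambda q, docs: sorted(docs),
    generate=lambda q, docs: f"Answer based on {len(docs)} documents",
    evaluate=lambda q, docs, ans: {"context_count": len(docs)},
    top_k=2,
)
print(result)
```

Passing the stages in as functions keeps each layer swappable, which matches the interactive model selection described for the web interface.
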
## Architecture Diagram
```mermaid
graph TB
    %% User Interface Layer
    UI[Web Interface - Gradio]
    %% Data Layer
    subgraph Data Layer
        DS[RAGBench Dataset]
        VDB[(Milvus Vector DB)]
    end
    %% Processing Layer
    subgraph Processing Layer
        DP[Document Processing]
        EG[Embedding Generation]
        style DP fill:#f9f,stroke:#333
        style EG fill:#f9f,stroke:#333
    end
    %% Search & Retrieval Layer
    subgraph Search & Retrieval
        VS[Vector Search]
        RR[Reranking System]
        style VS fill:#bbf,stroke:#333
        style RR fill:#bbf,stroke:#333
    end
    %% Generation Layer
    subgraph Generation Layer
        LLM[LLM Models]
        PR[Prompt Engineering]
        style LLM fill:#bfb,stroke:#333
        style PR fill:#bfb,stroke:#333
    end
    %% Evaluation Layer
    subgraph Evaluation Layer
        ME[Metrics Evaluation]
        SC[Score Calculation]
        style ME fill:#ffb,stroke:#333
        style SC fill:#ffb,stroke:#333
    end
    %% Flow Connections
    UI --> DP
    DS --> DP
    DP --> EG
    EG --> VDB
    UI --> VS
    VS --> VDB
    VS --> RR
    RR --> LLM
    LLM --> PR
    PR --> ME
    ME --> SC
    SC --> UI
    %% Model Components
    subgraph Models
        ST[SentenceTransformers]
        RM[Reranking Models]
        GM[Generation Models]
        style ST fill:#dfd,stroke:#333
        style RM fill:#dfd,stroke:#333
        style GM fill:#dfd,stroke:#333
    end
    %% Model Connections
    EG --> ST
    RR --> RM
    LLM --> GM
    %% Styling
    classDef default fill:#fff,stroke:#333,stroke-width:2px;
    classDef interface fill:#f96,stroke:#333,stroke-width:2px;
    class UI interface;
```