# Implementation Plan: Enhanced Video Generation System
## Phase 1: Unified Input Processor
### 1.1 Create Web Content Scraper
**File**: `manimator/utils/web_scraper.py`
- Use `requests` + `beautifulsoup4` for HTML parsing
- Extract main content, remove navigation/ads
- Support: blogs, documentation sites, Medium, Dev.to, etc.
- Handle authentication-required pages (return error with clear message)
- Extract text content and structure (headings, paragraphs, code blocks)
- Convert to structured format similar to PDF processing
**Dependencies to add**:
```text
beautifulsoup4>=4.12.0
requests>=2.31.0
readability-lxml>=0.8.1 # For cleaner content extraction
```
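The extraction step described in 1.1 could look roughly like the sketch below. To stay runnable without extra packages it uses the standard library's `html.parser` as a stand-in for `beautifulsoup4` and `readability-lxml`; the class name `ContentExtractor`, the function `extract_structure`, and the choice of which tags to skip are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Tags treated as navigation/ads/boilerplate (an assumption, tune per site).
SKIP_TAGS = {"nav", "header", "footer", "aside", "script", "style"}
# Tags kept as structured blocks, mapped to a block type.
BLOCK_TAGS = {"h1": "heading", "h2": "heading", "h3": "heading",
              "p": "paragraph", "pre": "code"}


class ContentExtractor(HTMLParser):
    """Collects headings, paragraphs, and code blocks outside skipped tags."""

    def __init__(self):
        super().__init__()
        self.blocks = []        # list of {"type": ..., "text": ...}
        self._skip_depth = 0    # > 0 while inside a skipped tag
        self._current = None    # block type currently being collected
        self._open_tag = None   # tag that opened the current block
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS and self._skip_depth == 0 and self._current is None:
            self._current = BLOCK_TAGS[tag]
            self._open_tag = tag
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif self._current is not None and tag == self._open_tag:
            text = "".join(self._buffer)
            if self._current != "code":
                text = " ".join(text.split())  # collapse whitespace in prose
            if text.strip():
                self.blocks.append({"type": self._current, "text": text.strip()})
            self._current = None

    def handle_data(self, data):
        if self._current is not None and self._skip_depth == 0:
            self._buffer.append(data)


def extract_structure(html: str) -> list:
    """Return the page's main content as an ordered list of typed blocks."""
    parser = ContentExtractor()
    parser.feed(html)
    return parser.blocks
```

A list of typed blocks like this is one possible "structured format"; the real scraper may prefer whatever shape the existing PDF processing already produces.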
### 1.2 Unified Input Handler
**File**: `manimator/api/input_processor.py`
- Single entry point: `process_input(input_type, input_data, category)`
- Input types: `text`, `pdf`, `url`
- Route to appropriate processor:
  - Text → `process_prompt_scene()`
  - PDF → `process_pdf_prompt()`
  - URL → new `process_url_content()`
- Return standardized scene description format
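One way to picture the single entry point is a dispatch table keyed by input type. In this sketch the three processors are placeholders with assumed one-argument signatures, and the returned dictionary is only a guess at what a "standardized scene description format" might contain.

```python
from typing import Callable, Dict


# Placeholder processors. The real functions are process_prompt_scene(),
# process_pdf_prompt(), and process_url_content(); their signatures and
# return values here are assumptions for illustration.
def process_prompt_scene(prompt: str) -> str:
    return f"[scene from text] {prompt}"


def process_pdf_prompt(pdf_data: str) -> str:
    return f"[scene from pdf] {len(pdf_data)} chars"


def process_url_content(url: str) -> str:
    return f"[scene from url] {url}"


PROCESSORS: Dict[str, Callable[[str], str]] = {
    "text": process_prompt_scene,
    "pdf": process_pdf_prompt,
    "url": process_url_content,
}


def process_input(input_type: str, input_data: str, category: str) -> dict:
    """Route the input to its processor and wrap the result uniformly."""
    try:
        processor = PROCESSORS[input_type]
    except KeyError:
        raise ValueError(
            f"Unsupported input_type {input_type!r}; expected one of {sorted(PROCESSORS)}"
        ) from None
    return {
        "input_type": input_type,
        "category": category,
        "scene_description": processor(input_data),
    }
```

Keeping the routing in a table means a new input type only needs one new entry rather than another branch.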
### 1.3 URL Content Processor
**File**: `manimator/api/scene_description.py` (extend existing)
- Function: `process_url_content(url: str) -> str`
- Scrape web content
- Extract main text (similar to PDF processing)
- Generate scene description using LLM with web content context
- Handle errors: invalid URLs, access denied, parsing failures
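The error cases listed above might be separated before any scraping happens. The sketch below validates the URL with `urllib.parse` and maps HTTP status codes to user-facing messages; `UrlContentError`, `validate_url`, and `describe_http_failure` are hypothetical names, and the message wording is only a suggestion.

```python
from urllib.parse import urlparse


class UrlContentError(Exception):
    """Raised when a URL cannot be turned into scene content (hypothetical)."""


def validate_url(url: str) -> str:
    """Return the cleaned URL, or raise UrlContentError if it is unusable."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https"):
        raise UrlContentError(f"Invalid URL {url!r}: scheme must be http or https")
    if not parsed.netloc:
        raise UrlContentError(f"Invalid URL {url!r}: missing host name")
    return parsed.geturl()


def describe_http_failure(status_code: int) -> str:
    """Translate an HTTP status code into a clear message for the user."""
    if status_code in (401, 403):
        return "Access denied: the page requires authentication"
    if status_code == 404:
        return "Page not found"
    if status_code >= 500:
        return "The site returned a server error; try again later"
    return f"Unexpected HTTP status {status_code}"
```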
---
## Phase 2: Category-Specific Visual Themes
### 2.1 Theme Configuration System
**File**: `manimator/utils/visual_themes.py`
- Define theme configs for each category:
```python
from manim import BLUE, GREEN, ORANGE, PURPLE, RED, WHITE, YELLOW

TECH_THEME = {
    "background_color": "#0a0e27",  # Dark blue
    "accent_colors": [BLUE, GREEN, ORANGE, RED, PURPLE],
    "text_color": WHITE,
    "component_style": "rounded_rectangles",
    "animation_style": "professional",
    "voice_id": "Adam",
}

PRODUCT_THEME = {
    "background_color": "#ffffff",  # White/light
    "accent_colors": [ORANGE, BLUE, PURPLE, GREEN],
    "text_color": "#1a1a1a",
    "component_style": "modern_gradients",
    "animation_style": "engaging",
    "voice_id": "Bella",
}

RESEARCH_THEME = {
    "background_color": "#1e1e1e",  # Dark
    "accent_colors": [BLUE, GREEN, YELLOW, RED],
    "text_color": WHITE,
    "component_style": "mathematical",
    "animation_style": "educational",
    "voice_id": "Rachel",
}
```
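A small lookup helper could connect API categories to these themes. The category keys below are taken from the request model in Phase 4, but pairing each key with a particular theme is an assumption, as is the helper name `get_theme`; only a subset of theme fields is repeated so the sketch runs without Manim installed.

```python
# Subset of the theme values listed above. Mapping "mathematical" to the
# research theme (and so on) is an assumption, not something the plan states.
THEMES = {
    "tech_system": {
        "background_color": "#0a0e27",
        "animation_style": "professional",
        "voice_id": "Adam",
    },
    "product_startup": {
        "background_color": "#ffffff",
        "animation_style": "engaging",
        "voice_id": "Bella",
    },
    "mathematical": {
        "background_color": "#1e1e1e",
        "animation_style": "educational",
        "voice_id": "Rachel",
    },
}


def get_theme(category: str) -> dict:
    """Return a copy of the theme for a category, or raise on unknown input."""
    try:
        # Copy so callers cannot mutate the shared configuration.
        return dict(THEMES[category])
    except KeyError:
        raise ValueError(
            f"Unknown category {category!r}; expected one of {sorted(THEMES)}"
        ) from None
```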
### 2.2 Theme Injection into System Prompts
**File**: `manimator/utils/system_prompts.py` (modify)
- Update `get_system_prompt(category)` to include theme instructions
- Add theme-specific code snippets to each prompt:
  - Tech: Dark background setup, component colors
  - Product: Light background, gradient examples
  - Research: Dark background, equation styling
- Include background setup code in each prompt template
### 2.3 Background Setup Code Generator
**File**: `manimator/utils/theme_injector.py`
- Function: `inject_theme_setup(code: str, category: str) -> str`
- Parse generated code
- Insert background setup at start of `construct()` method:
```python
# Tech theme
self.camera.background_color = "#0a0e27"
# Product theme
self.camera.background_color = "#ffffff"
# Research theme
self.camera.background_color = "#1e1e1e"
```
- Ensure theme colors are used consistently
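A minimal version of `inject_theme_setup` might use a regular expression rather than a full parse. This sketch assumes the generated code indents method bodies by four spaces, that `construct` takes only `self`, and that the category keys match the request model; it is idempotent, so code that already sets a background is returned unchanged.

```python
import re

# Background colors from the snippets above; the category keys are assumed.
BACKGROUNDS = {
    "tech_system": "#0a0e27",
    "product_startup": "#ffffff",
    "mathematical": "#1e1e1e",
}

# Matches the "def construct(self):" line and captures its indentation.
CONSTRUCT_RE = re.compile(
    r"^(?P<indent>[ \t]*)def construct\(self\):[ \t]*\n", re.MULTILINE
)


def inject_theme_setup(code: str, category: str) -> str:
    """Insert the category's background color as the first line of construct()."""
    if category not in BACKGROUNDS:
        raise ValueError(f"Unknown category {category!r}")
    if "self.camera.background_color" in code:
        return code  # a background is already set; leave the code untouched
    match = CONSTRUCT_RE.search(code)
    if match is None:
        raise ValueError("No construct() method found in generated code")
    body_indent = match.group("indent") + "    "  # assumes 4-space indentation
    setup = f'{body_indent}self.camera.background_color = "{BACKGROUNDS[category]}"\n'
    return code[: match.end()] + setup + code[match.end():]
```

Inserting before a docstring would demote it to a plain string expression, so an `ast`-based variant may be preferable if generated scenes carry docstrings.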
---
## Phase 3: Enhanced Code Validation & Error Handling
### 3.1 Pre-Render Code Validator
**File**: `manimator/utils/code_validator.py`
- Function: `validate_code(code: str) -> Tuple[bool, List[str]]`
- Checks:
  - Valid Python syntax (use `ast.parse()`)
  - Required imports present (`from manim import *`, `VoiceoverScene`, `ElevenLabsService`)
  - Scene class inherits from `VoiceoverScene`
  - `construct()` method exists
  - Voiceover service initialized
  - No undefined variables/colors
  - No overlapping object warnings (spatial analysis)
- Return: (is_valid, list_of_errors)
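The structural checks above map naturally onto the `ast` module. The sketch below covers syntax, the `manim` import, the `VoiceoverScene` base class, the `construct()` method, and the voiceover setup; it assumes the service is initialized through a `set_speech_service(...)` call, and it leaves out the undefined-name and spatial-overlap checks, which need more machinery.

```python
import ast
from typing import List, Optional, Tuple


def _name_of(node: ast.AST) -> Optional[str]:
    """Return the trailing identifier of a Name or Attribute node."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return node.attr
    return None


def validate_code(code: str) -> Tuple[bool, List[str]]:
    """Run static pre-render checks and return (is_valid, list_of_errors)."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return False, [f"Syntax error on line {exc.lineno}: {exc.msg}"]

    errors: List[str] = []

    # Required import: any "from manim import ..." statement.
    if not any(isinstance(n, ast.ImportFrom) and n.module == "manim"
               for n in ast.walk(tree)):
        errors.append("Missing import: from manim import *")

    # Scene classes: those listing VoiceoverScene as a base.
    scenes = [
        n for n in ast.walk(tree)
        if isinstance(n, ast.ClassDef)
        and any(_name_of(base) == "VoiceoverScene" for base in n.bases)
    ]
    if not scenes:
        errors.append("No class inherits from VoiceoverScene")

    for scene in scenes:
        methods = {n.name for n in scene.body if isinstance(n, ast.FunctionDef)}
        if "construct" not in methods:
            errors.append(f"{scene.name} has no construct() method")
        # Assumption: the voiceover service is set via set_speech_service(...).
        calls = {_name_of(n.func) for n in ast.walk(scene) if isinstance(n, ast.Call)}
        if "set_speech_service" not in calls:
            errors.append(f"{scene.name} never calls set_speech_service()")

    return not errors, errors
```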
### 3.2 Code Fixer
**File**: `manimator/utils/code_fixer.py`
- Function: `auto_fix_code(code: str, errors: List[str]) -> str`
- Auto-fixes:
  - Missing imports (add if not present)
  - Undefined colors (use existing `fix_undefined_colors()`)
  - Missing voiceover setup (inject if missing)
  - Syntax errors (try to fix common issues)
- Use existing `code_postprocessor.py` functions
- Chain fixes until valid or max attempts
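"Chain fixes until valid or max attempts" can be read as a bounded loop that also stops early when the fixers make no further change. In this sketch `chain_fixes`, `add_manim_import`, and `toy_validate` are hypothetical names; the real fixers would be the existing `code_postprocessor.py` functions, whose signatures are not shown in this plan.

```python
from typing import Callable, List, Sequence, Tuple

Validator = Callable[[str], Tuple[bool, List[str]]]
Fixer = Callable[[str], str]


def add_manim_import(code: str) -> str:
    """Example fixer: prepend the manim import if it is missing."""
    if "from manim import" in code:
        return code
    return "from manim import *\n" + code


def toy_validate(code: str) -> Tuple[bool, List[str]]:
    """Stand-in validator that only checks for the manim import."""
    errors = [] if "from manim import" in code else ["missing manim import"]
    return not errors, errors


def chain_fixes(
    code: str,
    validate: Validator,
    fixers: Sequence[Fixer],
    max_attempts: int = 3,
) -> Tuple[str, List[str]]:
    """Apply all fixers repeatedly until the code validates or attempts run out."""
    is_valid, errors = validate(code)
    attempts = 0
    while not is_valid and attempts < max_attempts:
        fixed = code
        for fix in fixers:
            fixed = fix(fixed)
        if fixed == code:
            break  # the fixers changed nothing, so another pass cannot help
        code = fixed
        is_valid, errors = validate(code)
        attempts += 1
    return code, errors
```

Returning the remaining errors alongside the code lets the caller decide whether to fall back to regeneration.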
### 3.3 Retry Logic with Model Fallback
**File**: `manimator/api/animation_generation.py` (modify)
- Enhanced `generate_animation_response()`:
  - Try generation with primary model
  - Validate code
  - If invalid, try auto-fix
  - If still invalid, retry with different model (fallback)
  - Max 3 attempts total
  - Return best valid code or raise clear error
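The retry flow above can be sketched with the model call, validator, and fixer passed in as callables, which keeps the control flow testable without an LLM. The function name `generate_with_fallback`, the callable signatures, and the rule of reusing the last model once the list is exhausted are assumptions; this version returns the first valid result rather than ranking candidates.

```python
from typing import Callable, List, Sequence, Tuple


def generate_with_fallback(
    prompt: str,
    models: Sequence[str],
    generate: Callable[[str, str], str],
    validate: Callable[[str], Tuple[bool, List[str]]],
    fix: Callable[[str, List[str]], str],
    max_attempts: int = 3,
) -> str:
    """Generate code, auto-fix once per attempt, and fall back across models."""
    if not models:
        raise ValueError("At least one model is required")

    failures: List[str] = []
    for attempt in range(max_attempts):
        # Primary model first, then fallbacks; the last model is reused
        # if there are more attempts than models.
        model = models[min(attempt, len(models) - 1)]
        code = generate(model, prompt)
        ok, errors = validate(code)
        if not ok:
            code = fix(code, errors)
            ok, errors = validate(code)
        if ok:
            return code
        failures.append(f"attempt {attempt + 1} ({model}): " + "; ".join(errors))

    raise RuntimeError("Code generation failed:\n" + "\n".join(failures))
```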
### 3.4 Render Error Handler
**File**: `manimator/utils/schema.py` (modify `ManimProcessor`)
- Enhanced `render_scene()`:
  - Capture full error output
  - Parse common Manim errors:
    - LaTeX errors → suggest fixes
    - Import errors → auto-add imports
    - Scene not found → validate class name
  - Return detailed error messages
  - Attempt auto-fix and re-render if possible
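Parsing the captured output could start as a table of patterns mapped to an error kind and a suggested action. The patterns below are illustrative guesses and should be checked against real Manim output before use; `classify_render_error` is a hypothetical helper name.

```python
import re

# (kind, pattern, hint). The patterns are assumptions about what the render
# output contains; the Python exception names themselves are standard.
ERROR_PATTERNS = [
    ("latex",
     re.compile(r"latex", re.IGNORECASE),
     "Check the LaTeX source passed to Tex/MathTex for unescaped characters"),
    ("import",
     re.compile(r"(ModuleNotFoundError|ImportError|NameError): .+"),
     "Add the missing import and re-render"),
    ("scene_not_found",
     re.compile(r"is not in the script|scene .* not found", re.IGNORECASE),
     "Check that the requested scene name matches a class in the file"),
]


def classify_render_error(stderr: str) -> dict:
    """Map raw render output to an error kind, a hint, and a short excerpt."""
    for kind, pattern, hint in ERROR_PATTERNS:
        match = pattern.search(stderr)
        if match:
            return {"kind": kind, "hint": hint, "excerpt": match.group(0)}
    lines = stderr.strip().splitlines()
    return {
        "kind": "unknown",
        "hint": "See the full render log",
        "excerpt": lines[-1] if lines else "",
    }
```

The `kind` field gives the caller something stable to branch on when deciding whether an auto-fix and re-render is worth attempting.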
---
## Phase 4: Unified API Server
### 4.1 New Unified API Server
**File**: `api_server_unified.py`
- Single server handling all input types and categories
- Endpoints:
  - `POST /api/videos` - Create video (text/PDF/URL)
  - `GET /api/jobs/{job_id}` - Check status
  - `GET /api/videos/{job_id}` - Download video
  - `GET /api/jobs` - List jobs
- Request model:
```python
from typing import Literal, Optional

from pydantic import BaseModel


class VideoRequest(BaseModel):
    input_type: Literal["text", "pdf", "url"]
    input_data: str  # text prompt, base64-encoded PDF bytes, or URL
    category: Literal["tech_system", "product_startup", "mathematical"]
    quality: QualityLevel  # defined elsewhere in the project (not shown here)
    scene_name: Optional[str] = None
```
### 4.2 Input Router
**File**: `manimator/api/input_router.py`
- Route based on `input_type`:
  - `text` → `process_prompt_scene()`
  - `pdf` → `process_pdf_prompt()` (decode base64)
  - `url` → `process_url_content()`
- All return scene description → pass to `generate_animation_response()`
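The base64 decoding step for PDF input is a natural place to reject bad payloads early. This sketch decodes strictly and checks for the `%PDF-` marker that PDF files begin with; the helper name `decode_pdf_payload` is hypothetical.

```python
import base64
import binascii


def decode_pdf_payload(input_data: str) -> bytes:
    """Decode a base64 string and confirm it looks like a PDF file."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        raw = base64.b64decode(input_data, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("input_data is not valid base64") from exc
    if not raw.startswith(b"%PDF-"):
        raise ValueError("Decoded payload does not look like a PDF")
    return raw
```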
### 4.3 Job Manager Enhancement
**File**: `api_server_unified.py` (extend existing JobManager)
- Track input type and category
- Store theme used
- Better error messages with category context
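As an illustration of the extra fields, a job record might carry the input type, category, and theme alongside its status. The `Job` and `JobManager` classes below are a simplified in-memory stand-in, not the existing `JobManager`; every field and method name is an assumption.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Job:
    job_id: str
    input_type: str          # "text", "pdf", or "url"
    category: str            # e.g. "tech_system"
    theme: str               # name of the theme that was applied
    status: str = "pending"
    error: Optional[str] = None


class JobManager:
    """Minimal in-memory job store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._jobs: Dict[str, Job] = {}

    def create(self, input_type: str, category: str, theme: str) -> Job:
        job = Job(job_id=uuid.uuid4().hex, input_type=input_type,
                  category=category, theme=theme)
        self._jobs[job.job_id] = job
        return job

    def get(self, job_id: str) -> Job:
        return self._jobs[job_id]

    def fail(self, job_id: str, message: str) -> None:
        """Mark a job failed, prefixing the message with category context."""
        job = self._jobs[job_id]
        job.status = "failed"
        job.error = f"[{job.category}/{job.input_type}] {message}"

    def list_jobs(self) -> List[Job]:
        return list(self._jobs.values())
```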
---
## Phase 5: System Prompts Enhancement
### 5.1 Category-Specific Prompt Templates
**File**: `manimator/utils/system_prompts.py` (enhance existing)
- **Tech System Prompt**:
  - Emphasize architecture diagrams
  - Component-based visuals
  - Dark background setup
  - Professional color scheme
  - Data flow animations
- **Product Startup Prompt**:
  - Modern UI elements
  - Gradient backgrounds
  - Light/colorful theme
  - Feature showcases
  - Statistics displays
- **Research/Mathematical Prompt**:
  - Equation-heavy
  - Dark background
  - Step-by-step proofs
  - Graph visualizations
  - Educational pacing
### 5.2 Few-Shot Examples Update
**File**: `manimator/few_shot/few_shot_prompts.py`
- Add category-specific examples:
  - Tech: System architecture example
  - Product: Feature demo example
  - Research: Mathematical proof example
- Include theme setup in examples
---
## Phase 6: Testing & Validation Pipeline
### 6.1 Code Validation Pipeline
**File**: `manimator/utils/validation_pipeline.py`
- Pre-render checks:
  1. Syntax validation
  2. Import validation
  3. Structure validation
  4. Theme compliance
  5. Auto-fix attempts
- Post-render checks:
  1. Video file exists
  2. Video duration > 0
  3. Video is playable
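The post-render checks could begin with the file-system part and accept a duration probe as a parameter, since reading the duration requires an external tool or library that this plan does not name. In this sketch `check_rendered_video` and `probe_duration` are hypothetical, and playability is approximated by a positive duration.

```python
from pathlib import Path
from typing import Callable, List, Optional, Union


def check_rendered_video(
    path: Union[str, Path],
    probe_duration: Optional[Callable[[Path], float]] = None,
) -> List[str]:
    """Return a list of post-render problems; an empty list means all passed."""
    video = Path(path)
    if not video.is_file():
        return ["video file does not exist"]

    errors: List[str] = []
    if video.stat().st_size == 0:
        errors.append("video file is empty")

    # The duration check only runs when the caller supplies a probe,
    # for example a wrapper around a media inspection tool.
    if probe_duration is not None and probe_duration(video) <= 0:
        errors.append("video duration is not greater than 0")

    return errors
```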
### 6.2 Error Recovery
- If validation fails → auto-fix → re-validate
- If auto-fix fails → retry generation with a different model
- If all attempts fail → return a detailed error to the user
---
## Implementation Order
1. **Week 1**: Phase 1 (Input Processor) + Phase 2 (Themes)
2. **Week 2**: Phase 3 (Validation) + Phase 5 (Prompts)
3. **Week 3**: Phase 4 (Unified API) + Phase 6 (Testing)
---
## File Structure Changes
```
manimator/
├── api/
│   ├── animation_generation.py (existing, enhance)
│   ├── scene_description.py (existing, extend)
│   ├── input_processor.py (NEW)
│   └── input_router.py (NEW)
├── utils/
│   ├── code_postprocessor.py (existing)
│   ├── code_validator.py (NEW)
│   ├── code_fixer.py (NEW)
│   ├── visual_themes.py (NEW)
│   ├── theme_injector.py (NEW)
│   └── validation_pipeline.py (NEW)
└── services/
    └── web_scraper.py (NEW)
api_server_unified.py (NEW - main server)
```
---
## Dependencies to Add
```toml
beautifulsoup4 = "^4.12.0"
requests = "^2.31.0"
readability-lxml = "^0.8.1"
```
---
## Key Success Metrics
1. **Reliability**: < 5% code generation failures
2. **Visual Differentiation**: Clear visual distinction between categories
3. **Error Recovery**: 80%+ of errors auto-fixed
4. **Input Support**: All 3 input types working (text/PDF/URL)