VidSimplify / docs /IMPLEMENTATION_PLAN.md

Adityahulk

Restoring repo state for deployment

6fc3143 13 days ago


	# Implementation Plan: Enhanced Video Generation System

	## Phase 1: Unified Input Processor

	### 1.1 Create Web Content Scraper
	File: `manimator/utils/web_scraper.py`
	- Use `requests` + `beautifulsoup4` for HTML parsing
	- Extract main content, remove navigation/ads
	- Support: blogs, documentation sites, Medium, Dev.to, etc.
	- Handle authentication-required pages (return error with clear message)
	- Extract text content and structure (headings, paragraphs, code blocks)
	- Convert to structured format similar to PDF processing

	Dependencies to add:
	```text
	beautifulsoup4>=4.12.0
	requests>=2.31.0
	readability-lxml>=0.8.1 # For cleaner content extraction
	```
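
The extraction step in 1.1 (keep headings, paragraphs, and code blocks; drop navigation and similar chrome) can be sketched as follows. This is only an illustration: it uses the standard library's `html.parser` as a stand-in for `beautifulsoup4` so it runs without extra dependencies, and the names `MainContentParser`, `extract_blocks`, `SKIP_TAGS`, and `BLOCK_TAGS` are hypothetical, not part of the existing codebase.

```python
from html.parser import HTMLParser

# Containers whose content is ignored, and block tags that are kept.
# Both sets are assumptions chosen for this sketch.
SKIP_TAGS = {"script", "style", "nav", "footer", "aside", "form"}
BLOCK_TAGS = {"h1", "h2", "h3", "p", "pre"}


class MainContentParser(HTMLParser):
    """Collect (tag, text) blocks while ignoring navigation-like containers."""

    def __init__(self):
        super().__init__()
        self.blocks = []       # finished (tag, text) pairs, in document order
        self._skip_depth = 0   # > 0 while inside a skipped container
        self._current = None   # tag of the block currently being collected
        self._buffer = []      # text fragments of the current block

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS and self._skip_depth == 0 and self._current is None:
            self._current, self._buffer = tag, []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif self._current is not None and tag == self._current:
            text = "".join(self._buffer)
            if tag != "pre":
                text = " ".join(text.split())  # collapse whitespace outside code
            if text.strip():
                self.blocks.append((tag, text))
            self._current = None

    def handle_data(self, data):
        if self._current is not None and self._skip_depth == 0:
            self._buffer.append(data)


def extract_blocks(html: str):
    """Return the kept blocks of an HTML document as (tag, text) pairs."""
    parser = MainContentParser()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

In the real scraper the HTML would come from `requests`, and `readability-lxml` or `beautifulsoup4` would likely replace the hand-written parser; the output shape (ordered, tagged text blocks) is the part meant to mirror the PDF processing.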

	### 1.2 Unified Input Handler
	File: `manimator/api/input_processor.py`
	- Single entry point: `process_input(input_type, input_data, category)`
	- Input types: `text`, `pdf`, `url`
	- Route to appropriate processor:
	  - Text → `process_prompt_scene()`
	  - PDF → `process_pdf_prompt()`
	  - URL → new `process_url_content()`
	- Return standardized scene description format
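
A minimal sketch of the single entry point, assuming a dictionary-based dispatch. The three processor bodies below are placeholders standing in for the real functions, and the returned dictionary shape is an assumption, since the plan does not define the standardized format.

```python
from typing import Callable, Dict


# Placeholder processors: the real ones live in manimator/api/ and call an LLM.
def process_prompt_scene(prompt: str) -> str:
    return f"scene from text: {prompt}"


def process_pdf_prompt(pdf_data: str) -> str:
    return "scene from pdf"


def process_url_content(url: str) -> str:
    return f"scene from url: {url}"


PROCESSORS: Dict[str, Callable[[str], str]] = {
    "text": process_prompt_scene,
    "pdf": process_pdf_prompt,
    "url": process_url_content,
}


def process_input(input_type: str, input_data: str, category: str) -> dict:
    """Route the input to its processor and wrap the result in one common shape."""
    try:
        processor = PROCESSORS[input_type]
    except KeyError:
        raise ValueError(
            f"Unsupported input_type {input_type!r}; expected one of {sorted(PROCESSORS)}"
        ) from None
    return {
        "input_type": input_type,
        "category": category,
        "scene_description": processor(input_data),
    }
```

Keeping the routing in a table means a new input type only needs one new entry rather than another `if` branch.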

	### 1.3 URL Content Processor
	File: `manimator/api/scene_description.py` (extend existing)
	- Function: `process_url_content(url: str) -> str`
	- Scrape web content
	- Extract main text (similar to PDF processing)
	- Generate scene description using LLM with web content context
	- Handle errors: invalid URLs, access denied, parsing failures
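
The error cases listed above could be separated into distinct exception types so that the API can return a clear message for each. The sketch below is an assumption about how that might look; the exception classes and the helpers `validate_url` and `check_status` are hypothetical names.

```python
from urllib.parse import urlparse


class UrlContentError(Exception):
    """Base class for failures while processing a URL (hypothetical)."""


class InvalidUrlError(UrlContentError):
    """The string is not a usable http(s) URL."""


class AccessDeniedError(UrlContentError):
    """The page requires authentication or refuses access."""


class ParsingError(UrlContentError):
    """The page was fetched but no main content could be extracted."""


def validate_url(url: str) -> str:
    """Return the normalized URL, or raise InvalidUrlError."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise InvalidUrlError(f"Not a valid http(s) URL: {url!r}")
    return parsed.geturl()


def check_status(status_code: int, url: str) -> None:
    """Translate an HTTP status code into one of the error types above."""
    if status_code in (401, 403):
        raise AccessDeniedError(
            f"{url} requires authentication or denies access (HTTP {status_code})"
        )
    if status_code >= 400:
        raise UrlContentError(f"{url} returned HTTP {status_code}")
```

`process_url_content()` would call `validate_url()` before fetching and `check_status()` on the response, raising `ParsingError` when extraction yields no text.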

	---

	## Phase 2: Category-Specific Visual Themes

	### 2.1 Theme Configuration System
	File: `manimator/utils/visual_themes.py`
	- Define theme configs for each category:
	```python
	from manim import BLUE, GREEN, ORANGE, PURPLE, RED, WHITE, YELLOW

	TECH_THEME = {
	    "background_color": "#0a0e27",  # Dark blue
	    "accent_colors": [BLUE, GREEN, ORANGE, RED, PURPLE],
	    "text_color": WHITE,
	    "component_style": "rounded_rectangles",
	    "animation_style": "professional",
	    "voice_id": "Adam",
	}

	PRODUCT_THEME = {
	    "background_color": "#ffffff",  # White/light
	    "accent_colors": [ORANGE, BLUE, PURPLE, GREEN],
	    "text_color": "#1a1a1a",
	    "component_style": "modern_gradients",
	    "animation_style": "engaging",
	    "voice_id": "Bella",
	}

	RESEARCH_THEME = {
	    "background_color": "#1e1e1e",  # Dark
	    "accent_colors": [BLUE, GREEN, YELLOW, RED],
	    "text_color": WHITE,
	    "component_style": "mathematical",
	    "animation_style": "educational",
	    "voice_id": "Rachel",
	}
	```
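
Callers will need a way to go from a request category to one of these configs. A possible lookup is sketched below; the mapping from the API category names used in Phase 4 (`tech_system`, `product_startup`, `mathematical`) to the three themes is an assumption, and the theme entries are trimmed to plain strings so the example runs without Manim installed.

```python
# Trimmed stand-ins for TECH_THEME, PRODUCT_THEME and RESEARCH_THEME.
THEMES = {
    "tech_system": {"name": "tech", "background_color": "#0a0e27", "voice_id": "Adam"},
    "product_startup": {"name": "product", "background_color": "#ffffff", "voice_id": "Bella"},
    "mathematical": {"name": "research", "background_color": "#1e1e1e", "voice_id": "Rachel"},
}


def get_theme(category: str) -> dict:
    """Return a copy of the theme for a category (hypothetical helper)."""
    try:
        # Copy so callers cannot mutate the shared configuration.
        return dict(THEMES[category])
    except KeyError:
        raise ValueError(
            f"Unknown category {category!r}; expected one of {sorted(THEMES)}"
        ) from None
```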

	### 2.2 Theme Injection into System Prompts
	File: `manimator/utils/system_prompts.py` (modify)
	- Update `get_system_prompt(category)` to include theme instructions
	- Add theme-specific code snippets to each prompt:
	  - Tech: Dark background setup, component colors
	  - Product: Light background, gradient examples
	  - Research: Dark background, equation styling
	- Include background setup code in each prompt template

	### 2.3 Background Setup Code Generator
	File: `manimator/utils/theme_injector.py`
	- Function: `inject_theme_setup(code: str, category: str) -> str`
	- Parse generated code
	- Insert background setup at start of `construct()` method:
	```python
	# Tech theme
	self.camera.background_color = "#0a0e27"

	# Product theme
	self.camera.background_color = "#ffffff"

	# Research theme
	self.camera.background_color = "#1e1e1e"
	```
	- Ensure theme colors are used consistently
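
One way `inject_theme_setup()` could locate the insertion point is with the `ast` module, as in the sketch below. It assumes the body of `construct()` starts on its own line and that the category keys match those of the API request model; `BACKGROUNDS` is a hypothetical constant.

```python
import ast

# Background per category; the keys are assumed to match the API category names.
BACKGROUNDS = {
    "tech_system": "#0a0e27",
    "product_startup": "#ffffff",
    "mathematical": "#1e1e1e",
}


def inject_theme_setup(code: str, category: str) -> str:
    """Insert the background assignment as the first statement of construct()."""
    if "self.camera.background_color" in code:
        return code  # already themed, keep the call idempotent
    background = BACKGROUNDS[category]
    tree = ast.parse(code)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == "construct":
            first = node.body[0]
            is_docstring = (
                isinstance(first, ast.Expr)
                and isinstance(first.value, ast.Constant)
                and isinstance(first.value.value, str)
            )
            lines = code.splitlines()
            # Reuse the indentation of the first body line.
            first_line = lines[first.lineno - 1]
            indent = first_line[: len(first_line) - len(first_line.lstrip())]
            # Insert after a docstring, otherwise before the first statement.
            insert_at = first.end_lineno if is_docstring else first.lineno - 1
            lines.insert(insert_at, f'{indent}self.camera.background_color = "{background}"')
            return "\n".join(lines) + "\n"
    raise ValueError("No construct() method found in generated code")
```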

	---

	## Phase 3: Enhanced Code Validation & Error Handling

	### 3.1 Pre-Render Code Validator
	File: `manimator/utils/code_validator.py`
	- Function: `validate_code(code: str) -> Tuple[bool, List[str]]`
	- Checks:
	  - Valid Python syntax (use `ast.parse()`)
	  - Required imports present (`from manim import *`, `VoiceoverScene`, `ElevenLabsService`)
	  - Scene class inherits from `VoiceoverScene`
	  - `construct()` method exists
	  - Voiceover service initialized
	  - No undefined variables/colors
	  - No overlapping objects (spatial analysis)
	- Return: (is_valid, list_of_errors)
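
A partial sketch of `validate_code()` covering the structural checks (syntax, the Manim import, scene inheritance, `construct()`, and speech-service setup). It assumes the voiceover service is initialized through a `set_speech_service(...)` call, as in manim-voiceover; the undefined-name and overlap checks are left out because they need more than a syntax tree.

```python
import ast
from typing import List, Tuple


def validate_code(code: str) -> Tuple[bool, List[str]]:
    """Return (is_valid, errors) for a generated scene script."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return False, [f"Syntax error on line {exc.lineno}: {exc.msg}"]

    errors: List[str] = []

    # Check for `from manim import ...`.
    if not any(
        isinstance(n, ast.ImportFrom) and n.module == "manim" for n in ast.walk(tree)
    ):
        errors.append("Missing import: from manim import *")

    # Find classes that list VoiceoverScene as a direct base.
    scenes = [
        n
        for n in ast.walk(tree)
        if isinstance(n, ast.ClassDef)
        and any(isinstance(b, ast.Name) and b.id == "VoiceoverScene" for b in n.bases)
    ]
    if not scenes:
        errors.append("No class inherits from VoiceoverScene")

    for scene in scenes:
        methods = {m.name for m in scene.body if isinstance(m, ast.FunctionDef)}
        if "construct" not in methods:
            errors.append(f"{scene.name} has no construct() method")
        called = {
            c.func.attr
            for c in ast.walk(scene)
            if isinstance(c, ast.Call) and isinstance(c.func, ast.Attribute)
        }
        if "set_speech_service" not in called:
            errors.append(f"{scene.name} never calls set_speech_service()")

    return not errors, errors
```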

	### 3.2 Code Fixer
	File: `manimator/utils/code_fixer.py`
	- Function: `auto_fix_code(code: str, errors: List[str]) -> str`
	- Auto-fixes:
	  - Missing imports (add if not present)
	  - Undefined colors (use existing `fix_undefined_colors()`)
	  - Missing voiceover setup (inject if missing)
	  - Syntax errors (try to fix common issues)
	- Use existing `code_postprocessor.py` functions
	- Chain fixes until valid or max attempts
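
The "chain fixes until valid or max attempts" loop might look like the sketch below. Only one illustrative fixer is included, the validator is passed in as a parameter, and `fix_until_valid` and `FIXERS` are hypothetical names; the real chain would reuse the functions in `code_postprocessor.py`.

```python
from typing import Callable, List, Tuple

Validator = Callable[[str], Tuple[bool, List[str]]]


def add_missing_manim_import(code: str, errors: List[str]) -> str:
    """Prepend the Manim import when the validator reported it missing."""
    reported = any("from manim import *" in e for e in errors)
    if reported and "from manim import *" not in code:
        return "from manim import *\n" + code
    return code


# Ordered list of fixers; each takes (code, errors) and returns new code.
FIXERS = [add_missing_manim_import]


def auto_fix_code(code: str, errors: List[str]) -> str:
    """Apply every fixer once, in order."""
    for fixer in FIXERS:
        code = fixer(code, errors)
    return code


def fix_until_valid(code: str, validate: Validator, max_attempts: int = 3):
    """Alternate validation and fixing until valid, stuck, or out of attempts."""
    is_valid, errors = validate(code)
    for _ in range(max_attempts):
        if is_valid:
            break
        fixed = auto_fix_code(code, errors)
        if fixed == code:
            break  # no fixer made progress, further attempts are pointless
        code = fixed
        is_valid, errors = validate(code)
    return code, is_valid, errors
```

Stopping as soon as a pass changes nothing avoids burning attempts on errors that no fixer understands.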

	### 3.3 Retry Logic with Model Fallback
	File: `manimator/api/animation_generation.py` (modify)
	- Enhanced `generate_animation_response()`:
	  - Try generation with primary model
	  - Validate code
	  - If invalid, try auto-fix
	  - If still invalid, retry with different model (fallback)
	  - Max 3 attempts total
	  - Return best valid code or raise clear error
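
The retry flow can be sketched as a loop over an ordered list of models, one attempt per model. The generator, validator, and fixer are injected as callables so the example stays independent of any LLM client; `generate_with_fallback` and `GenerationError` are hypothetical names.

```python
from typing import Callable, List, Sequence, Tuple


class GenerationError(RuntimeError):
    """Raised when no model produced valid code (hypothetical)."""


def generate_with_fallback(
    prompt: str,
    models: Sequence[str],
    generate: Callable[[str, str], str],
    validate: Callable[[str], Tuple[bool, List[str]]],
    fix: Callable[[str, List[str]], str],
    max_attempts: int = 3,
) -> str:
    """Try each model in order: generate, validate, auto-fix, then fall back."""
    failures: List[str] = []
    for model in list(models)[:max_attempts]:
        code = generate(model, prompt)
        ok, errors = validate(code)
        if not ok:
            code = fix(code, errors)
            ok, errors = validate(code)
        if ok:
            return code
        failures.append(f"{model}: {'; '.join(errors)}")
    raise GenerationError("All attempts failed -> " + " | ".join(failures))
```

Collecting the per-model errors keeps the final exception informative enough to surface to the user.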

	### 3.4 Render Error Handler
	File: `manimator/utils/schema.py` (modify `ManimProcessor`)
	- Enhanced `render_scene()`:
	  - Capture full error output
	  - Parse common Manim errors:
	    - LaTeX errors → suggest fixes
	    - Import errors → auto-add imports
	    - Scene not found → validate class name
	  - Return detailed error messages
	  - Attempt auto-fix and re-render if possible
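
Parsing the captured output could start as a small table of regular expressions, sketched below. The patterns are illustrative guesses at what Manim and Python print, not verified against real render logs, and `RenderIssue`, `RULES`, and `classify_render_error` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class RenderIssue(NamedTuple):
    kind: str
    suggestion: str


# (kind, pattern, suggestion). Patterns are assumptions and should be tuned
# against actual stderr captured from failed renders.
RULES = [
    (
        "latex",
        re.compile(r"latex error|latex compilation", re.IGNORECASE),
        "Check the TeX string for unescaped characters or missing packages.",
    ),
    (
        "import",
        re.compile(r"ModuleNotFoundError|ImportError|NameError: name '\w+' is not defined"),
        "Add the missing import and re-render.",
    ),
    (
        "scene_not_found",
        re.compile(r"is not in the script", re.IGNORECASE),
        "Verify that the requested scene name matches a class in the file.",
    ),
]


def classify_render_error(stderr: str) -> Optional[RenderIssue]:
    """Return the first matching issue, or None if the error is unrecognized."""
    for kind, pattern, suggestion in RULES:
        if pattern.search(stderr):
            return RenderIssue(kind, suggestion)
    return None
```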

	---

	## Phase 4: Unified API Server

	### 4.1 New Unified API Server
	File: `api_server_unified.py`
	- Single server handling all input types and categories
	- Endpoints:
	  - `POST /api/videos` - Create video (text/PDF/URL)
	  - `GET /api/jobs/{job_id}` - Check status
	  - `GET /api/videos/{job_id}` - Download video
	  - `GET /api/jobs` - List jobs
	- Request model:
	```python
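	from typing import Literal, Optional

	from pydantic import BaseModel

	# QualityLevel is assumed to be the project's existing quality enum, defined elsewhere.
	class VideoRequest(BaseModel):
	    input_type: Literal["text", "pdf", "url"]
	    input_data: str  # text prompt, PDF bytes (base64), or URL
	    category: Literal["tech_system", "product_startup", "mathematical"]
	    quality: QualityLevel
	    scene_name: Optional[str] = None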
	```

	### 4.2 Input Router
	File: `manimator/api/input_router.py`
	- Route based on `input_type`:
	  - `text` → `process_prompt_scene()`
	  - `pdf` → `process_pdf_prompt()` (decode base64)
	  - `url` → `process_url_content()`
	- All return scene description → pass to `generate_animation_response()`
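
For the `pdf` branch, the base64 decoding step is worth validating before the bytes reach the PDF processor. A minimal sketch, assuming the payload is plain base64 without a data-URI prefix; `decode_pdf_payload` is a hypothetical helper name.

```python
import base64
import binascii


def decode_pdf_payload(input_data: str) -> bytes:
    """Decode a base64 PDF payload and sanity-check the file header."""
    try:
        pdf_bytes = base64.b64decode(input_data, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("input_data is not valid base64") from exc
    # PDF files begin with the marker "%PDF-".
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("Decoded payload does not look like a PDF (missing %PDF- header)")
    return pdf_bytes
```

Rejecting bad payloads here gives the client a precise error instead of a failure deep inside PDF parsing.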

	### 4.3 Job Manager Enhancement
	File: `api_server_unified.py` (extend existing JobManager)
	- Track input type and category
	- Store theme used
	- Better error messages with category context
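
The extra job fields could be carried on a small record type. The sketch below is a stand-in, since the existing `JobManager` is not shown in this plan; the field names, status strings, and error-message format are all assumptions.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class Job:
    input_type: str   # "text", "pdf" or "url"
    category: str     # request category, e.g. "tech_system"
    theme: str        # name of the theme applied to the video
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: str = "pending"
    error: Optional[str] = None
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class JobManager:
    """In-memory job store (illustrative stand-in for the existing class)."""

    def __init__(self) -> None:
        self._jobs: Dict[str, Job] = {}

    def create(self, input_type: str, category: str, theme: str) -> Job:
        job = Job(input_type=input_type, category=category, theme=theme)
        self._jobs[job.job_id] = job
        return job

    def get(self, job_id: str) -> Optional[Job]:
        return self._jobs.get(job_id)

    def fail(self, job_id: str, message: str) -> Job:
        """Mark a job as failed and prefix the error with its context."""
        job = self._jobs[job_id]
        job.status = "failed"
        job.error = f"[{job.category}/{job.input_type}] {message}"
        return job
```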

	---

	## Phase 5: System Prompts Enhancement

	### 5.1 Category-Specific Prompt Templates
	File: `manimator/utils/system_prompts.py` (enhance existing)
	- Tech System Prompt:
	  - Emphasize architecture diagrams
	  - Component-based visuals
	  - Dark background setup
	  - Professional color scheme
	  - Data flow animations

	- Product Startup Prompt:
	  - Modern UI elements
	  - Gradient backgrounds
	  - Light/colorful theme
	  - Feature showcases
	  - Statistics displays

	- Research/Mathematical Prompt:
	  - Equation-heavy
	  - Dark background
	  - Step-by-step proofs
	  - Graph visualizations
	  - Educational pacing

	### 5.2 Few-Shot Examples Update
	File: `manimator/few_shot/few_shot_prompts.py`
	- Add category-specific examples:
	  - Tech: System architecture example
	  - Product: Feature demo example
	  - Research: Mathematical proof example
	- Include theme setup in examples

	---

	## Phase 6: Testing & Validation Pipeline

	### 6.1 Code Validation Pipeline
	File: `manimator/utils/validation_pipeline.py`
	- Pre-render checks:
	  1. Syntax validation
	  2. Import validation
	  3. Structure validation
	  4. Theme compliance
	  5. Auto-fix attempts
	- Post-render checks:
	  1. Video file exists
	  2. Video duration > 0
	  3. Video is playable
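
The post-render checks could be implemented roughly as below. The sketch assumes `ffprobe` (part of FFmpeg) is available on `PATH` for the duration probe and treats a failed probe as "not playable"; the probe is injectable so the checks can be exercised without FFmpeg. `probe_duration` and `post_render_checks` are hypothetical names.

```python
import subprocess
from pathlib import Path
from typing import Callable, List


def probe_duration(path: Path) -> float:
    """Return the container duration in seconds as reported by ffprobe."""
    result = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-show_entries", "format=duration",
            "-of", "default=noprint_wrappers=1:nokey=1",
            str(path),
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return float(result.stdout.strip())


def post_render_checks(
    path: Path, probe: Callable[[Path], float] = probe_duration
) -> List[str]:
    """Run the three post-render checks and return a list of error messages."""
    if not path.is_file():
        return [f"Video file not found: {path}"]
    if path.stat().st_size == 0:
        return [f"Video file is empty: {path}"]
    try:
        duration = probe(path)
    except (OSError, subprocess.CalledProcessError, ValueError) as exc:
        # A probe that cannot run or parse the file means it is not playable.
        return [f"Video is not playable: {exc}"]
    if duration <= 0:
        return ["Video duration is 0 seconds"]
    return []
```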

	### 6.2 Error Recovery
	- If validation fails → auto-fix → re-validate
	- If auto-fix fails → retry generation with a different model
	- If every attempt fails → return a detailed error to the user

	---

	## Implementation Order

	1. Week 1: Phase 1 (Input Processor) + Phase 2 (Themes)
	2. Week 2: Phase 3 (Validation) + Phase 5 (Prompts)
	3. Week 3: Phase 4 (Unified API) + Phase 6 (Testing)

	---

	## File Structure Changes

	```
	manimator/
	├── api/
	│   ├── animation_generation.py (existing, enhance)
	│   ├── scene_description.py (existing, extend)
	│   ├── input_processor.py (NEW)
	│   └── input_router.py (NEW)
	├── utils/
	│   ├── code_postprocessor.py (existing)
	│   ├── code_validator.py (NEW)
	│   ├── code_fixer.py (NEW)
	│   ├── visual_themes.py (NEW)
	│   ├── theme_injector.py (NEW)
	│   └── validation_pipeline.py (NEW)
	└── services/
	    └── web_scraper.py (NEW; section 1.1 lists this file as manimator/utils/web_scraper.py)

	api_server_unified.py (NEW - main server)
	```

	---

	## Dependencies to Add

	```toml
	beautifulsoup4 = "^4.12.0"
	requests = "^2.31.0"
	readability-lxml = "^0.8.1"
	```

	---

	## Key Success Metrics

	1. Reliability: < 5% code generation failures
	2. Visual Differentiation: Clear visual distinction between categories
	3. Error Recovery: 80%+ of errors auto-fixed
	4. Input Support: All 3 input types working (text/PDF/URL)