Spaces: aditya2001/VidSimplify

Adityahulk committed 6 days ago

Commit c648277 (1 parent: 12fe8d7)

free voice and improvements

Files changed (3)

manimator/agents/reflexion_agent.py +99 -78
manimator/services/voiceover.py +69 -9
requirements.txt +2 -1

manimator/agents/reflexion_agent.py CHANGED

@@ -308,7 +308,7 @@ self.play(items[2].animate.scale(1.1).set_color(GREEN))
     def _critique_code(self, code: str, category: str) -> CritiqueResult:
         """Critique code and return structured issues"""
-        critique_prompt = f"""You are an expert Manim code reviewer. Analyze this {category} animation code for potential issues.
 CODE TO REVIEW:
 ```python
@@ -316,105 +316,108 @@ CODE TO REVIEW:
 ```
 # ============================================================================
-# CRITICAL REVIEW CATEGORIES - CHECK ALL CAREFULLY
 # ============================================================================
-## 1. 🚨 SCREEN BOUNDARY ISSUES (HIGH PRIORITY)
-Check if content will GO OFF SCREEN:
-- Count items in VGroups - if 5+ items are arranged vertically, is there `scale_to_fit_height()`?
-- Are there multiple items stacked without proper scaling?
-- Is `config.frame_height` or `config.frame_width` used for boundary checks?
-- **RED FLAG**: VGroup with 4+ items arranged(DOWN) WITHOUT scale_to_fit_height = CRITICAL ERROR
-- **RED FLAG**: Large groups not using safe margins (buff < 0.5)
-**Expected pattern for 4+ items:**
-```python
-group.scale_to_fit_height(config.frame_height - 2.5)
-```
-## 2. 🎬 ANIMATION VARIETY & ENGAGEMENT (HIGH PRIORITY)
-Check if video will be STATIC/BORING:
-- Count animation types used - are there at least 3-4 different types?
-- **RED FLAG**: Only using `Write()` for all animations
-- **RED FLAG**: No emphasis animations (Indicate, Circumscribe, Flash, etc.)
-- **RED FLAG**: No LaggedStart for list animations
-- **RED FLAG**: Long `self.wait()` calls (> 1 second) without visual activity
-- **RED FLAG**: Simple FadeIn/FadeOut without shift parameters
-**Good animations to look for:**
-- `FadeIn(obj, shift=UP/DOWN/LEFT/RIGHT)` ✓
-- `LaggedStart(*[...], lag_ratio=0.2)` ✓
-- `Indicate()`, `Circumscribe()`, `Flash()` ✓
-- `obj.animate.scale(1.1).set_color(YELLOW)` ✓
-- `GrowFromCenter()`, `DrawBorderThenFill()` ✓
-## 3. 📐 VISUAL OVERLAPS
-- VGroup misuse (arranging mixed types together)
-- Objects placed at same position without offset
-- Text stacking on top of other text
-- Elements not using next_to() or arrange() properly
-## 4. 🔧 MANIM API MISUSE
-- Invalid parameters (corner_radius on Rectangle, etc.)
-- Deprecated methods
-- Incorrect animation calls
-## 5. 💡 LOGIC ERRORS
-- Objects used before definition
-- Animations on removed objects
-- Incorrect loop logic
-## 6. ✨ BEST PRACTICES
-- Blank screens during voiceover (no visuals while talking)
-- Missing cleanup (FadeOut before new content)
-- Poor visual hierarchy
-- No transitions between sections (just FadeOut/FadeIn without motion)
 # ============================================================================
-For EACH issue found, provide:
-- severity: "low" | "medium" | "high"
-- category: "OFF_SCREEN" | "STATIC_VIDEO" | "OVERLAP" | "API_MISUSE" | "LOGIC_ERROR" | "BEST_PRACTICE"
-- line_range: [start_line, end_line] if identifiable
-- description: What's wrong
-- suggestion: How to fix it
-**SEVERITY GUIDE:**
-- HIGH: Content goes off-screen, only Write() animations used, major overlaps
-- MEDIUM: Missing emphasis animations, no LaggedStart for lists, long waits
-- LOW: Minor styling issues, could be slightly more dynamic
-If the code is well-written with no significant issues, respond with:
-{{"has_issues": false, "overall_severity": "none", "issues": [], "summary": "Code is well-structured"}}
 Respond ONLY with valid JSON in this exact format:
 ```json
 {{
   "has_issues": true,
-  "overall_severity": "high",
   "issues": [
     {{
-      "severity": "high",
-      "category": "OFF_SCREEN",
-      "line_range": [45, 52],
-      "description": "VGroup with 6 items arranged vertically without scale_to_fit_height - content will go off bottom of screen",
-      "suggestion": "Add: group.scale_to_fit_height(config.frame_height - 2.5) after arrange()"
     }},
     {{
-      "severity": "high",
-      "category": "STATIC_VIDEO",
-      "line_range": [1, 100],
-      "description": "Only Write() and FadeIn() animations used - video will feel static and boring",
-      "suggestion": "Add emphasis animations: Indicate(), Circumscribe(). Use LaggedStart for lists. Add .animate chains for motion."
     }}
   ],
-  "summary": "Found 2 critical issues: content goes off-screen and animations lack variety"
 }}
 ```
 """
@@ -460,26 +463,44 @@ Respond ONLY with valid JSON in this exact format:
             for i in critique.issues if i.suggestion
         ])
-        fix_prompt = f"""Fix this Manim code based on the expert code review.
 ORIGINAL CODE:
 ```python
 {original_code}
 ```
-ISSUES IDENTIFIED:
 {issues_summary}
-SPECIFIC FIX SUGGESTIONS:
 {suggestions}
-INSTRUCTIONS:
-1. Apply ALL suggested fixes
-2. Preserve all working parts of the code
-3. Ensure no new issues are introduced
-4. Keep the same class name and overall structure
-Return the COMPLETE fixed Python code.
 """
         try:

     def _critique_code(self, code: str, category: str) -> CritiqueResult:
         """Critique code and return structured issues"""
+        critique_prompt = f"""You are a CREATIVE ENHANCEMENT advisor for Manim animations. Your job is to make animations MORE beautiful, dynamic, and engaging - NOT to simplify them.
 CODE TO REVIEW:
 ```python
 ```
 # ============================================================================
+# YOUR ROLE: ENHANCE CREATIVITY, NOT RESTRICT IT
 # ============================================================================
+You are here to IMPROVE animations, not simplify them. Focus on:
+1. Adding MORE visual interest, not removing it
+2. Suggesting ADDITIONAL animations to make it more engaging
+3. Only flag ACTUAL bugs that will cause crashes
+4. PRESERVE all creative animations - do NOT suggest removing them
+# ============================================================================
+# WHAT TO CHECK (IN ORDER OF PRIORITY)
+# ============================================================================
+## 1. 🐛 ACTUAL BUGS (Only flag if they will CRASH the code)
+These are the ONLY high-severity issues:
+- Invalid Manim parameters that don't exist (corner_radius on Rectangle)
+- Objects used before they are defined
+- Animating objects that have been removed/cleared
+- Syntax errors
+**DO NOT flag as bugs:**
+- Things that "might" go off screen (the code handles this)
+- Animation choices you disagree with (respect the creativity)
+- Use of specific animation types
+## 2. 🎨 ENHANCEMENT SUGGESTIONS (Help make it MORE beautiful)
+Suggest ADDITIONS to make animations more impressive:
+- "Consider adding Circumscribe() after showing key concepts"
+- "The list would look more dynamic with LaggedStart"
+- "Add a subtle pulse animation while explaining"
+- "Use GrowFromCenter for more dramatic reveal"
+- "Add color transitions with .animate.set_color()"
+**These should be LOW severity - suggestions, not requirements.**
+## 3. 📐 DEFINITE OVERLAPS (Only if objects are DEFINITELY at the same position)
+Only flag overlaps if:
+- Two Text objects are created at ORIGIN without any positioning
+- Objects are explicitly placed at the same coordinates
+**DO NOT flag as overlaps:**
+- Objects using arrange() or next_to() - these handle spacing
+- VGroups - they handle their own layout
+- Anything using .to_edge() or similar
+## 4. ⚡ ANIMATION VARIETY SUGGESTIONS (Encourage MORE, not less)
+If animations seem basic, suggest ADDING:
+- "Add Indicate() to highlight important elements"
+- "Use Flash() for emphasis on key points"
+- "Consider Wiggle() for playful moments"
+- "Add subtle scale animations during explanations"
+# ============================================================================
+# SEVERITY GUIDE (BE LENIENT)
+# ============================================================================
+- **HIGH**: ONLY for code that will CRASH (invalid API, undefined variables)
+- **MEDIUM**: Definite overlaps (same exact position without spacing)
+- **LOW**: Suggestions to enhance (add more animations, make more dynamic)
 # ============================================================================
+# IMPORTANT: PRESERVE CREATIVITY
+# ============================================================================
+- If the code uses creative animations, PRAISE them and suggest additions
+- NEVER suggest simplifying complex animations
+- NEVER suggest removing animations that work
+- Your goal is to make the video MORE impressive, not safer
+# ============================================================================
+If the code is creative and well-animated, respond with:
+{{"has_issues": false, "overall_severity": "none", "issues": [], "summary": "Excellent creative code! animations are dynamic and engaging."}}
+For enhancement suggestions (LOW severity), use category "ENHANCEMENT".
 Respond ONLY with valid JSON in this exact format:
 ```json
 {{
   "has_issues": true,
+  "overall_severity": "low",
   "issues": [
     {{
+      "severity": "low",
+      "category": "ENHANCEMENT",
+      "line_range": [50, 55],
+      "description": "The component reveals could be more dramatic",
+      "suggestion": "Add GrowFromCenter() or SpinInFromNothing() for a more impressive reveal effect"
     }},
     {{
+      "severity": "low",
+      "category": "ENHANCEMENT",
+      "line_range": [80, 85],
+      "description": "Key concepts could use more emphasis",
+      "suggestion": "Add Circumscribe() or Flash() after revealing important elements to draw attention"
     }}
   ],
+  "summary": "Good creative code! Suggested 2 ways to make it even more impressive."
 }}
 ```
 """
             for i in critique.issues if i.suggestion
         ])
+        fix_prompt = f"""ENHANCE this Manim animation code based on the creative suggestions.
 ORIGINAL CODE:
 ```python
 {original_code}
 ```
+ENHANCEMENT SUGGESTIONS:
 {issues_summary}
+SPECIFIC IMPROVEMENTS TO ADD:
 {suggestions}
+# ============================================================================
+# CRITICAL INSTRUCTIONS - ENHANCE, DON'T SIMPLIFY
+# ============================================================================
+1. **PRESERVE ALL EXISTING ANIMATIONS** - Do NOT remove any working animations
+2. **ADD the suggested enhancements** - More animations = better
+3. **Keep all creative elements** - Complex animations are GOOD
+4. **Maintain the same structure** - Same class name, same voiceovers
+5. **Add MORE visual interest** - Additional effects, emphasis, transitions
+**EXAMPLES OF GOOD ENHANCEMENTS:**
+- Add `self.play(Indicate(obj, color=YELLOW))` after important reveals
+- Add `self.play(Circumscribe(obj))` to highlight key concepts
+- Use `LaggedStart` for revealing lists: `LaggedStart(*[FadeIn(x, shift=UP) for x in items], lag_ratio=0.15)`
+- Add subtle animations during explanations: `self.play(obj.animate.scale(1.05), run_time=2)`
+- Use `Flash(obj)` for emphasis moments
+- Add `GrowFromCenter` or `SpinInFromNothing` for dramatic reveals
+**DO NOT:**
+- Remove any animations that work
+- Simplify complex animation sequences
+- Reduce visual effects
+- Make the code "safer" by removing creativity
+Return the COMPLETE enhanced Python code with MORE impressive animations.
 """
         try:
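
The new critique prompt asks the model to reply with JSON only and reserves "high" severity for code that will crash. A minimal sketch of how a caller might consume such a reply is shown below. It is not taken from this commit: `Issue`, `parse_critique`, and `blocking_issues` are hypothetical names, and the real `CritiqueResult` fields are not visible in the diff. The fence marker is built with `"`" * 3` only so the example can carry a fenced sample reply.

```python
import json
import re
from dataclasses import dataclass, fields
from typing import List, Optional

FENCE = "`" * 3  # three backticks, built indirectly to keep this example readable


@dataclass
class Issue:
    """Hypothetical stand-in for one entry of the "issues" array in the prompt."""
    severity: str = "low"
    category: str = "ENHANCEMENT"
    description: str = ""
    suggestion: str = ""
    line_range: Optional[List[int]] = None


def parse_critique(raw: str) -> dict:
    """Return the JSON object from a reply, with or without a json code fence."""
    pattern = FENCE + r"(?:json)?\s*(\{.*\})\s*" + FENCE
    match = re.search(pattern, raw, re.DOTALL)
    payload = match.group(1) if match else raw
    return json.loads(payload)


def blocking_issues(critique: dict) -> List[Issue]:
    """Keep only "high" issues, which the prompt reserves for crashing code."""
    known = {f.name for f in fields(Issue)}
    issues = [
        Issue(**{k: v for k, v in item.items() if k in known})
        for item in critique.get("issues", [])
    ]
    return [issue for issue in issues if issue.severity == "high"]
```

Under this sketch, a reply that contains only "ENHANCEMENT" entries yields an empty list from `blocking_issues`, so a caller could treat the fix pass as optional rather than mandatory.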

manimator/services/voiceover.py CHANGED

@@ -3,15 +3,29 @@ import hashlib
 import json
 import logging
 import requests
 from pathlib import Path
 from typing import Optional, Dict, Any
 logger = logging.getLogger(__name__)
 class SimpleElevenLabsService:
     """
     A simple, robust service for generating voiceovers using ElevenLabs API.
-    Bypasses the complex inheritance of manim-voiceover to avoid path type errors.
     """
     DEFAULT_VOICE_ID = "21m00Tcm4TlvDq8ikWAM"  # Rachel
@@ -28,9 +42,10 @@ class SimpleElevenLabsService:
     def __init__(self, voice_id: str = DEFAULT_VOICE_ID, cache_dir: Optional[Path] = None):
         # Resolve voice ID if it's a name
         self.voice_id = self.VOICE_MAPPING.get(voice_id, voice_id)
         self.api_key = os.getenv("ELEVENLABS_API_KEY")
         if not self.api_key:
-            logger.warning("ELEVENLABS_API_KEY not set. Voiceover generation will fail.")
         # Use provided cache_dir or default
         if cache_dir:
@@ -60,8 +75,8 @@ class SimpleElevenLabsService:
         try:
             if not self.api_key:
-                logger.warning("ELEVENLABS_API_KEY missing, falling back to gTTS")
-                return self._generate_with_gtts(text)
             # Call ElevenLabs API
             url = f"{self.BASE_URL}/text-to-speech/{self.voice_id}"
@@ -91,17 +106,61 @@ class SimpleElevenLabsService:
             return output_path
         except Exception as e:
-            logger.error(f"ElevenLabs generation failed: {str(e)}. Falling back to gTTS.")
-            return self._generate_with_gtts(text)
     def _generate_with_gtts(self, text: str) -> Path:
         """
-        Fallback generation using Google Text-to-Speech (free).
         """
         try:
             from gtts import gTTS
-            # Use a separate cache for gTTS to avoid hash collisions if we switch back
             gtts_cache_dir = Path("media/voiceover/gtts")
             gtts_cache_dir.mkdir(parents=True, exist_ok=True)
@@ -121,4 +180,5 @@ class SimpleElevenLabsService:
         except Exception as e:
             logger.error(f"gTTS fallback failed: {str(e)}")
-            raise RuntimeError(f"Voiceover generation failed (ElevenLabs and gTTS): {str(e)}")

 import json
 import logging
 import requests
+import asyncio
 from pathlib import Path
 from typing import Optional, Dict, Any
 logger = logging.getLogger(__name__)
+# Edge-TTS Voice mapping - high quality neural voices
+EDGE_TTS_VOICES = {
+    "Rachel": "en-US-JennyNeural",       # Female, clear and professional
+    "Adam": "en-US-GuyNeural",           # Male, professional
+    "Bella": "en-US-AriaNeural",         # Female, warm and friendly
+    "Josh": "en-US-ChristopherNeural",   # Male, deep voice
+    "Indian": "en-IN-NeerjaNeural",      # Indian English female
+    "British": "en-GB-SoniaNeural",      # British female
+    "Australian": "en-AU-NatashaNeural", # Australian female
+}
 class SimpleElevenLabsService:
     """
     A simple, robust service for generating voiceovers using ElevenLabs API.
+    Falls back to Edge TTS (Microsoft neural voices) if ElevenLabs fails.
     """
     DEFAULT_VOICE_ID = "21m00Tcm4TlvDq8ikWAM"  # Rachel
     def __init__(self, voice_id: str = DEFAULT_VOICE_ID, cache_dir: Optional[Path] = None):
         # Resolve voice ID if it's a name
         self.voice_id = self.VOICE_MAPPING.get(voice_id, voice_id)
+        self.voice_name = voice_id  # Store the voice name for edge-tts fallback
         self.api_key = os.getenv("ELEVENLABS_API_KEY")
         if not self.api_key:
+            logger.warning("ELEVENLABS_API_KEY not set. Will use Edge TTS (free).")
         # Use provided cache_dir or default
         if cache_dir:
         try:
             if not self.api_key:
+                logger.warning("ELEVENLABS_API_KEY missing, using Edge TTS")
+                return self._generate_with_edge_tts(text)
             # Call ElevenLabs API
             url = f"{self.BASE_URL}/text-to-speech/{self.voice_id}"
             return output_path
         except Exception as e:
+            logger.error(f"ElevenLabs generation failed: {str(e)}. Falling back to Edge TTS.")
+            return self._generate_with_edge_tts(text)
+    def _generate_with_edge_tts(self, text: str) -> Path:
+        """
+        Fallback generation using Microsoft Edge TTS (free, high quality).
+        Uses neural voices that sound natural and professional.
+        """
+        try:
+            import edge_tts
+            # Use a separate cache for edge-tts
+            edge_cache_dir = Path("media/voiceover/edge_tts")
+            edge_cache_dir.mkdir(parents=True, exist_ok=True)
+            # Map the voice name to edge-tts voice
+            edge_voice = EDGE_TTS_VOICES.get(self.voice_name, "en-US-JennyNeural")
+            content_hash = hashlib.md5(f"{text}-{edge_voice}".encode("utf-8")).hexdigest()
+            output_path = edge_cache_dir / f"{content_hash}.mp3"
+            if output_path.exists() and output_path.stat().st_size > 0:
+                logger.info(f"Using cached Edge TTS voiceover for hash {content_hash}")
+                return output_path
+            logger.info(f"Generating Edge TTS ({edge_voice}) for: {text[:30]}...")
+            # Edge-tts is async, so we need to run it in an event loop
+            async def _generate():
+                communicate = edge_tts.Communicate(text, edge_voice)
+                await communicate.save(str(output_path))
+            # Run the async function
+            try:
+                loop = asyncio.get_event_loop()
+            except RuntimeError:
+                loop = asyncio.new_event_loop()
+                asyncio.set_event_loop(loop)
+            loop.run_until_complete(_generate())
+            logger.info(f"Edge TTS voiceover saved to {output_path}")
+            return output_path
+        except Exception as e:
+            logger.error(f"Edge TTS failed: {str(e)}. Falling back to gTTS.")
+            return self._generate_with_gtts(text)
     def _generate_with_gtts(self, text: str) -> Path:
         """
+        Last resort fallback using Google Text-to-Speech.
         """
         try:
             from gtts import gTTS
             gtts_cache_dir = Path("media/voiceover/gtts")
             gtts_cache_dir.mkdir(parents=True, exist_ok=True)
         except Exception as e:
             logger.error(f"gTTS fallback failed: {str(e)}")
+            raise RuntimeError(f"All TTS methods failed: {str(e)}")
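
The Edge TTS fallback drives an async API from synchronous code with `loop.run_until_complete(...)`. If that method is ever called from a thread whose event loop is already running, `run_until_complete` raises `RuntimeError`, which the surrounding `except` would turn into a silent fall-through to gTTS. The sketch below shows one way to handle both cases. `run_coro_sync` is a hypothetical helper and not part of this commit. It uses only the standard library, so it can be tried without `edge_tts` installed.

```python
import asyncio
import threading
from typing import Any, Coroutine, Dict, TypeVar

T = TypeVar("T")


def run_coro_sync(coro: Coroutine[Any, Any, T]) -> T:
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread: asyncio.run creates one,
        # runs the coroutine, and closes the loop afterwards.
        return asyncio.run(coro)

    # A loop is already running in this thread, so run the coroutine
    # on a fresh loop in a worker thread and wait for it to finish.
    outcome: Dict[str, Any] = {}

    def _worker() -> None:
        try:
            outcome["value"] = asyncio.run(coro)
        except BaseException as exc:  # re-raised in the calling thread below
            outcome["error"] = exc

    thread = threading.Thread(target=_worker, daemon=True)
    thread.start()
    thread.join()
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

With a helper like this, the body of `_generate_with_edge_tts` could call `run_coro_sync(_generate())` instead of managing the loop by hand. Note that the worker-thread branch blocks the calling loop until the coroutine finishes, which is acceptable for a short TTS request but not for long-running work.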

requirements.txt CHANGED

@@ -11,4 +11,5 @@ streamlit
 requests
 beautifulsoup4>=4.12.0
 lxml>=4.9.0
-readability-lxml>=0.8.1

 requests
 beautifulsoup4>=4.12.0
 lxml>=4.9.0
+readability-lxml>=0.8.1
+edge-tts>=6.1.0
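
Because `voiceover.py` imports `edge_tts` and `gtts` lazily inside `try` blocks, a missing package only shows up as a logged fallback at generation time. A small startup check can surface that earlier. This is a sketch under assumptions: `TTS_BACKENDS` and `available_backends` are hypothetical names, and the mapping assumes the usual import names (`edge_tts` for the `edge-tts` package, `gtts` for `gTTS`).

```python
from importlib.util import find_spec
from typing import Dict, List

# pip package name -> import name (assumed; adjust if the project pins others)
TTS_BACKENDS: Dict[str, str] = {
    "edge-tts": "edge_tts",
    "gTTS": "gtts",
}


def available_backends(backends: Dict[str, str] = TTS_BACKENDS) -> List[str]:
    """Return the pip names of backends whose module can be found."""
    return [
        pip_name
        for pip_name, module_name in backends.items()
        if find_spec(module_name) is not None
    ]
```

Calling `available_backends()` once at startup and logging the result would show which free fallbacks are actually installed in the Space.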