Spaces: BenjaminKo/Speech-to-Speech_Trad

Benjamin14 committed on Oct 24

Commit bb751d6 (1 parent: 84def65)

Translate README to English for international accessibility

Files changed (1)

README.md +67 -67


@@ -14,45 +14,45 @@ disable_embedding: false
 # 🎙️ Speech-to-Speech Translator
-Application Gradio moderne pour la traduction audio en temps réel, compatible avec Hugging Face Spaces (Zero GPU).
-## ✨ Fonctionnalités
-- 🎵 **Enregistrement Audio** : Interface intuitive pour enregistrer jusqu'à 30 secondes
-- 🎙️ **Transcription Automatique** : STT (Speech-to-Text) avec modèle Whisper optimisé
-- 🌍 **Traduction en Temps Réel** : Français ↔ Anglais avec Helsinki-NLP
-- 🔊 **Synthèse Vocale** : TTS (Text-to-Speech) avec gTTS et détection automatique de langue
-- 🚀 **Zero GPU Compatible** : Optimisé pour Hugging Face Spaces avec GPU à la demande
-- 🎨 **Interface Moderne** : Design responsive avec CSS personnalisé et animations
-- ⚡ **Gestion Intelligente** : Détection automatique de l'environnement (local/cloud)
-- 🔧 **Configuration Automatique** : Ports et paramètres adaptés selon le déploiement
-## 🚀 Utilisation
-### Interface Web (Gradio)
-1. **Enregistrer l'Audio** : Cliquez sur "Record" et parlez dans votre microphone (max 30 secondes)
-2. **Configurer les Langues** : Sélectionnez la langue source (fr/en) et cible (en/fr)
-3. **Traiter l'Audio** : Cliquez sur "🚀 Process Audio"
-4. **Consulter les Résultats** :
-   - **Onglet "🔊 Generated Audio"** : Audio traduit généré
-   - **Onglet "🎙️ Transcription"** : Texte transcrit
-   - **Onglet "🌍 Translation"** : Texte traduit
-### Flux de Traitement
 ```
 Audio Input → STT (Whisper) → Translation (Helsinki-NLP) → TTS (gTTS) → Audio Output
      ↓              ↓                    ↓                      ↓
-   Enregistré    Transcrit           Traduit              Audio Généré
 ```
-### Fonctionnalités Avancées
-- **Limitation Automatique** : Les enregistrements > 30s sont automatiquement tronqués
-- **Détection de Langue** : Le TTS détecte automatiquement la langue du texte traduit
-- **Interface Responsive** : Design adaptatif avec animations et transitions fluides
-- **Gestion d'Erreurs** : Messages de statut en temps réel avec codes couleur
 ## 🛠️ Installation
@@ -107,32 +107,32 @@ TradLiveHug/
 The `requirements.txt` file contains all necessary dependencies for automatic deployment.
-## 🎯 Fonctionnalités Techniques
 ### STT (Speech-to-Text)
-- **Modèle** : OpenAI Whisper Small (openai/whisper-small)
-- **Optimisation** : CPU/GPU adaptatif selon l'environnement
-- **Langues** : Support français et anglais avec détection automatique
-- **Limitation** : Troncature automatique à 30 secondes
-- **Performance** : Optimisé pour Zero GPU de Hugging Face Spaces
-### Traduction
-- **Modèles** : Helsinki-NLP Opus-MT (fr-en et en-fr)
-- **Support** : Français ↔ Anglais bidirectionnel
-- **Nettoyage** : Suppression automatique des préfixes de traduction
-- **Performance** : Chargement intelligent des modèles
 ### TTS (Text-to-Speech)
-- **Moteur** : gTTS (Google Text-to-Speech) pour tous les environnements
-- **Détection** : Langue automatique basée sur le contenu du texte
-- **Qualité** : Voix naturelles pour français et anglais
-- **Format** : MP3 optimisé pour la diffusion web
 ### Architecture
-- **Zero GPU** : Support complet avec décorateur `@spaces.GPU`
-- **Environnement** : Détection automatique local vs Hugging Face Spaces
-- **Interface** : Gradio avec CSS personnalisé et design moderne
-- **Gestion d'erreurs** : Système robuste avec messages de statut
 ## 🔍 Usage Examples
@@ -155,37 +155,37 @@ Translation: "Je suis heureux de vous rencontrer"
 Audio: [French audio file]
 ```
-## 🐛 Dépannage
-### Problèmes Courants
-- **"Models not loaded"** : Attendez le chargement initial (1-2 minutes)
-- **"No transcription"** : Vérifiez la qualité audio et le volume
-- **"TTS Error"** : Vérifiez la connexion internet pour gTTS
-- **"Audio too long"** : L'audio est automatiquement tronqué à 30 secondes
-- **"Processing error"** : Vérifiez les logs pour plus de détails
 ### Performance
-- **Premier lancement** : 1-2 minutes (téléchargement des modèles)
-- **Traitement audio** : 5-15 secondes selon la durée
-- **Mémoire** : ~2-3 GB RAM requis
-- **GPU** : Utilisation automatique si disponible (Zero GPU sur HF Spaces)
-### Configuration Environnement
-- **Local** : Détection automatique GPU/CPU, port libre automatique
-- **Hugging Face Spaces** : Configuration Zero GPU automatique
-- **Déploiement** : Ports et paramètres adaptés automatiquement
 ## 📝 Notes
-- **Zero GPU** : Compatible avec Hugging Face Spaces Zero GPU
-- **Optimisation** : CPU/GPU adaptatif selon l'environnement
-- **Interface** : Design moderne avec CSS personnalisé et animations
-- **Formats** : Support WAV, MP3 et autres formats audio courants
-- **Limitation** : Audio automatiquement tronqué à 30 secondes
-- **Déploiement** : Configuration automatique pour local et cloud
 ## 📄 License

 # 🎙️ Speech-to-Speech Translator
+Modern Gradio application for real-time audio translation, compatible with Hugging Face Spaces (Zero GPU).
+## ✨ Features
+- 🎵 **Audio Recording**: Intuitive interface for recording up to 30 seconds
+- 🎙️ **Automatic Transcription**: STT (Speech-to-Text) with an optimized Whisper model
+- 🌍 **Real-time Translation**: French ↔ English with Helsinki-NLP
+- 🔊 **Speech Synthesis**: TTS (Text-to-Speech) with gTTS and automatic language detection
+- 🚀 **Zero GPU Compatible**: Optimized for Hugging Face Spaces with on-demand GPU
+- 🎨 **Modern Interface**: Responsive design with custom CSS and animations
+- ⚡ **Smart Management**: Automatic environment detection (local/cloud)
+- 🔧 **Auto Configuration**: Ports and parameters adapted to the deployment
+## 🚀 Usage
+### Web Interface (Gradio)
+1. **Record Audio**: Click "Record" and speak into your microphone (max 30 seconds)
+2. **Configure Languages**: Select the source (fr/en) and target (en/fr) languages
+3. **Process Audio**: Click "🚀 Process Audio"
+4. **View Results**:
+   - **"🔊 Generated Audio" Tab**: Translated audio
+   - **"🎙️ Transcription" Tab**: Transcribed text
+   - **"🌍 Translation" Tab**: Translated text
+### Processing Flow
 ```
 Audio Input → STT (Whisper) → Translation (Helsinki-NLP) → TTS (gTTS) → Audio Output
      ↓              ↓                    ↓                      ↓
+   Recorded     Transcribed          Translated           Generated Audio
 ```
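The flow in the diagram can be sketched as three chained stages. This is a minimal illustration rather than the Space's actual code: the function name `speech_to_speech`, the result keys, and the `"ok"` status are hypothetical, and the three stages are passed in as plain callables so the sketch runs without downloading Whisper, Opus-MT, or gTTS.

```python
def speech_to_speech(audio, transcribe, translate, synthesize):
    """Chain STT -> translation -> TTS and keep every intermediate result."""
    transcription = transcribe(audio)
    if not transcription.strip():
        # Stop early when the STT stage produced no text.
        return {
            "status": "No transcription",
            "transcription": "",
            "translation": "",
            "audio": None,
        }
    translation = translate(transcription)
    return {
        "status": "ok",  # hypothetical status value
        "transcription": transcription,
        "translation": translation,
        "audio": synthesize(translation),
    }


# Stand-in stages (assumptions) so the sketch runs without any model.
result = speech_to_speech(
    b"raw-audio",
    transcribe=lambda audio: "I am happy to meet you",
    translate=lambda text: "Je suis heureux de vous rencontrer",
    synthesize=lambda text: text.encode("utf-8"),
)
print(result["translation"])  # Je suis heureux de vous rencontrer
```

Keeping the intermediate text alongside the audio matches the three result tabs (generated audio, transcription, translation) described in the usage steps.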
+### Advanced Features
+- **Automatic Truncation**: Recordings longer than 30 seconds are automatically truncated
+- **Language Detection**: TTS automatically detects the language of the translated text
+- **Responsive Interface**: Adaptive design with smooth animations and transitions
+- **Error Management**: Real-time, color-coded status messages
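The 30-second truncation can be expressed as a slice over the raw samples. The sketch below is an assumption about how such a limit might be applied; `truncate_audio` and `MAX_SECONDS` are hypothetical names, and the README does not say where in the pipeline the cut happens.

```python
MAX_SECONDS = 30  # limit stated in the README


def truncate_audio(samples, sample_rate, max_seconds=MAX_SECONDS):
    """Return at most `max_seconds` of audio from a sequence of samples."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    max_samples = int(sample_rate * max_seconds)
    # Slicing works for lists and NumPy arrays alike, and it leaves
    # recordings that are already short enough unchanged.
    return samples[:max_samples]


# 40 s of silence at 16 kHz is cut down to 30 s.
clip = truncate_audio([0.0] * (16_000 * 40), 16_000)
print(len(clip) / 16_000)  # 30.0
```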
 ## 🛠️ Installation
 The `requirements.txt` file contains all necessary dependencies for automatic deployment.
+## 🎯 Technical Features
 ### STT (Speech-to-Text)
+- **Model**: OpenAI Whisper Small (openai/whisper-small)
+- **Optimization**: Adapts to CPU or GPU depending on the environment
+- **Languages**: French and English, with automatic detection
+- **Limitation**: Audio is automatically truncated to 30 seconds
+- **Performance**: Optimized for Hugging Face Spaces Zero GPU
+### Translation
+- **Models**: Helsinki-NLP Opus-MT (fr-en and en-fr)
+- **Support**: Bidirectional French ↔ English
+- **Cleaning**: Automatic removal of translation prefixes
+- **Performance**: Smart model loading
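Choosing between the two Opus-MT checkpoints comes down to a lookup on the language pair. The sketch below assumes the standard Hub names `Helsinki-NLP/opus-mt-fr-en` and `Helsinki-NLP/opus-mt-en-fr`; the helper `pick_translation_model` is hypothetical and only returns the checkpoint name, it does not load anything.

```python
# Opus-MT checkpoints for the two directions named in the README.
OPUS_MT_MODELS = {
    ("fr", "en"): "Helsinki-NLP/opus-mt-fr-en",
    ("en", "fr"): "Helsinki-NLP/opus-mt-en-fr",
}


def pick_translation_model(source_lang, target_lang):
    """Return the Opus-MT checkpoint name for a language pair."""
    pair = (source_lang.lower(), target_lang.lower())
    if pair[0] == pair[1]:
        raise ValueError("source and target languages must differ")
    try:
        return OPUS_MT_MODELS[pair]
    except KeyError:
        raise ValueError(f"unsupported language pair: {pair}") from None


print(pick_translation_model("fr", "en"))  # Helsinki-NLP/opus-mt-fr-en
```

The returned name could then be handed to whatever loader the application uses; loading only the checkpoint for the requested direction is one plausible reading of "smart model loading".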
 ### TTS (Text-to-Speech)
+- **Engine**: gTTS (Google Text-to-Speech) in all environments
+- **Detection**: Language detected automatically from the text content
+- **Quality**: Natural voices for French and English
+- **Format**: MP3, optimized for web streaming
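The README does not say how the language is detected from the text. One simple possibility, shown here purely as an assumption, is a stopword and accent count that chooses between `"fr"` and `"en"`; the word lists and the name `guess_tts_language` are hypothetical.

```python
import re

ACCENTED = "àâçéèêëîïôûùüÿœ"
# Tiny hand-picked hint lists; illustrative only, not from the Space.
FRENCH_HINTS = {"le", "la", "les", "de", "des", "et", "je", "vous", "est", "un", "une", "suis"}
ENGLISH_HINTS = {"the", "and", "is", "are", "you", "to", "of", "a", "i", "am"}


def guess_tts_language(text, default="en"):
    """Guess 'fr' or 'en' from stopwords and accented letters."""
    lowered = text.lower()
    words = re.findall(rf"[a-z{ACCENTED}']+", lowered)
    french = sum(word in FRENCH_HINTS for word in words)
    english = sum(word in ENGLISH_HINTS for word in words)
    # Accented letters count as extra evidence for French.
    french += sum(char in ACCENTED for char in lowered)
    if french == english:
        return default
    return "fr" if french > english else "en"


print(guess_tts_language("Je suis heureux de vous rencontrer"))  # fr
```

The returned code is the kind of value gTTS accepts through its `lang` argument, for example `gTTS(text=text, lang=guess_tts_language(text))`.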
 ### Architecture
+- **Zero GPU**: Full support via the `@spaces.GPU` decorator
+- **Environment**: Automatic detection of local vs. Hugging Face Spaces
+- **Interface**: Gradio with custom CSS and modern design
+- **Error Handling**: Robust system with status messages
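A common way to keep one code path for local runs and for Spaces is to fall back to a no-op decorator when the `spaces` package is not installed. The sketch below assumes that pattern; the wrapper name `run_on_gpu` is hypothetical, and the Space's own code may apply `@spaces.GPU` differently.

```python
try:
    import spaces  # provided on Hugging Face Spaces

    gpu = spaces.GPU
except ImportError:

    def gpu(func):
        """No-op stand-in used when the `spaces` package is absent."""
        return func


@gpu
def run_on_gpu(step, *args):
    """Call `step(*args)`; on Zero GPU hardware a GPU is requested for the call."""
    return step(*args)


print(run_on_gpu(len, "abc"))  # 3
```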
 ## 🔍 Usage Examples
 Audio: [French audio file]
 ```
+## 🐛 Troubleshooting
+### Common Issues
+- **"Models not loaded"**: Wait for the initial loading to finish (1-2 minutes)
+- **"No transcription"**: Check audio quality and volume
+- **"TTS Error"**: Check your internet connection, which gTTS requires
+- **"Audio too long"**: Audio is automatically truncated to 30 seconds
+- **"Processing error"**: Check the logs for more details
 ### Performance
+- **First launch**: 1-2 minutes (model download)
+- **Audio processing**: 5-15 seconds depending on duration
+- **Memory**: ~2-3 GB RAM required
+- **GPU**: Used automatically if available (Zero GPU on HF Spaces)
+### Environment Configuration
+- **Local**: Automatic GPU/CPU detection and automatic selection of a free port
+- **Hugging Face Spaces**: Automatic Zero GPU configuration
+- **Deployment**: Ports and parameters adapted automatically
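Environment detection and port selection along these lines could look like the sketch below. It assumes the app checks the `SPACE_ID` environment variable that Hugging Face Spaces sets, and that it serves on port 7860 (the Spaces default) when deployed; the function names are hypothetical, and the returned keys mirror Gradio's `launch(server_name=..., server_port=...)` arguments.

```python
import os
import socket


def running_on_spaces(environ=None):
    """True when the SPACE_ID variable set by Hugging Face Spaces is present."""
    environ = os.environ if environ is None else environ
    return bool(environ.get("SPACE_ID"))


def find_free_port(host="127.0.0.1"):
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def launch_settings(environ=None):
    """Pick host and port for the current environment."""
    if running_on_spaces(environ):
        # Assumed Spaces defaults: listen on all interfaces, port 7860.
        return {"server_name": "0.0.0.0", "server_port": 7860}
    return {"server_name": "127.0.0.1", "server_port": find_free_port()}


print(launch_settings({"SPACE_ID": "user/space"}))
```

Passing the environment as a parameter keeps the detection easy to exercise locally without modifying `os.environ`.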
 ## 📝 Notes
+- **Zero GPU**: Compatible with Hugging Face Spaces Zero GPU
+- **Optimization**: Adapts to CPU or GPU depending on the environment
+- **Interface**: Modern design with custom CSS and animations
+- **Formats**: Supports WAV, MP3, and other common audio formats
+- **Limitation**: Audio is automatically truncated to 30 seconds
+- **Deployment**: Automatic configuration for local and cloud
 ## 📄 License