---
title: Interactive AI Voice Chat
emoji: 🎤
colorFrom: pink
colorTo: red
sdk: gradio
sdk_version: 6.0.0
app_file: app.py
pinned: false
license: mit
short_description: Real-time AI voice assistant through natural speech
---
## 🌟 Overview
Interactive AI Voice Chat is a real-time voice-driven assistant deployed on Hugging Face Spaces.
It allows users to speak naturally through their microphone and receive intelligent AI responses, both in text and audio format. The system leverages modern speech-to-text, large language models, and text-to-speech technologies to deliver a seamless conversational experience.
This project demonstrates a complete workflow from local development to live deployment on the Hugging Face platform.
## ✨ Key Features
- 🎤 Real-time voice input processing
- 🤖 AI-powered responses using `google/gemma-2-2b-it`
- 🔊 Text-to-Speech audio replies
- 🌐 Publicly accessible live demo
- ⚡ Optimized for CPU Basic hardware
- 🧩 Secure token-based model access
## 🖥️ How It Works
- The user speaks into the microphone.
- Speech is converted to text by a speech-to-text (STT) engine.
- The text is processed by the AI model.
- The model generates a response.
- The response is converted back into audio and played to the user (see the pipeline sketch below).
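The sketch below makes this pipeline concrete. It is a minimal illustration, not the actual app.py: the Credits section names Faster Whisper (STT) and `google/gemma-2-2b-it`, but the Whisper model size, how the LLM is invoked, and the TTS engine are not specified in this README, so the `base` Whisper model, `huggingface_hub.InferenceClient`, and gTTS are assumptions.

```python
import os

from faster_whisper import WhisperModel       # STT engine named in Credits
from huggingface_hub import InferenceClient   # assumption: model called via the Inference API
from gtts import gTTS                         # assumption: stand-in TTS engine

stt = WhisperModel("base", device="cpu", compute_type="int8")   # "base" size is an assumption
llm = InferenceClient("google/gemma-2-2b-it", token=os.environ["HF_TOKEN"])

def voice_chat(audio_path: str) -> tuple[str, str]:
    """Turn one recorded question into a text reply and a spoken reply."""
    # 1. Speech -> text
    segments, _info = stt.transcribe(audio_path)
    question = " ".join(seg.text for seg in segments).strip()

    # 2. Text -> AI response
    chat = llm.chat_completion(
        messages=[{"role": "user", "content": question}],
        max_tokens=256,
    )
    reply = chat.choices[0].message.content

    # 3. Text -> speech
    out_path = "reply.mp3"
    gTTS(reply).save(out_path)
    return reply, out_path
```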
## 📂 Project Structure

```
ai-voice-chat-test/
│
├── app.py             # Main application logic
├── README.md          # Documentation
├── requirements.txt   # Python dependencies
├── runtime.txt        # Python version
├── apt.txt            # System dependencies (ffmpeg)
├── .gitattributes     # Git LFS configuration
├── .gitignore         # Ignored files and folders
└── assets/            # Optional media resources
```
## ⚙️ Installation (Local Setup - Optional)

To run this project locally:

```bash
git clone https://huggingface.co/spaces/bdstar/ai-voice-chat-test
cd ai-voice-chat-test
pip install -r requirements.txt
```
Set your Hugging Face token:

```bash
export HF_TOKEN="your_token_here"
```
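Inside app.py, the token would typically be read from the environment (a small sketch; the exact handling in app.py is an assumption):

```python
import os

# On Spaces the HF_TOKEN secret is injected automatically; locally it comes
# from the export command above.
hf_token = os.environ.get("HF_TOKEN")
if not hf_token:
    raise RuntimeError("HF_TOKEN is not set; model access will fail.")
```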
Run the application:

```bash
python app.py
```

Open your browser and visit http://localhost:7860.
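For orientation, here is a minimal sketch of how the Gradio interface could be wired to the pipeline. The real layout in app.py may differ; `voice_chat` refers to the hypothetical function sketched in the How It Works section.

```python
import gradio as gr

# Microphone in, text + spoken reply out. Gradio serves on port 7860 by default,
# which is why the local URL above is http://localhost:7860.
demo = gr.Interface(
    fn=voice_chat,  # hypothetical pipeline function from the earlier sketch
    inputs=gr.Audio(sources=["microphone"], type="filepath"),
    outputs=[gr.Textbox(label="AI response"), gr.Audio(label="Spoken reply")],
    title="Interactive AI Voice Chat",
)

if __name__ == "__main__":
    demo.launch()
```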
## 📦 Python Environment
- Python Version: 3.11
- Gradio Version: 5.49.1
- Optimized for: CPU Basic
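For reference, a plausible requirements.txt based on the components named in this README (versions unpinned; on a Gradio Space the Gradio package itself is supplied through the `sdk_version` in the frontmatter, and `huggingface_hub` is assumed here for token-based model access):

```
faster-whisper
huggingface_hub
pydub
soundfile
```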
## ⚠️ Notes & Limitations
- Running on CPU may result in slower response times.
- Initial model loading may take a few seconds.
- For production use, GPU-backed hardware is recommended.
- This project is intended for demonstration and learning purposes.
## 🚀 Deployment Steps Summary
- Prepare the project structure
- Configure requirements.txt and runtime.txt
- Add HF_TOKEN as a Space secret
- Push the source code to the Hugging Face Space (see the upload sketch below)
- Monitor the build logs
- Access the live demo
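As an alternative to a plain `git push`, the same upload can be scripted with huggingface_hub (a sketch, assuming the Space ID from the clone URL above and an HF_TOKEN available in the environment):

```python
from huggingface_hub import HfApi

api = HfApi()  # picks up HF_TOKEN from the environment or a cached login
api.upload_folder(
    folder_path=".",                      # project root with app.py, requirements.txt, ...
    repo_id="bdstar/ai-voice-chat-test",  # Space ID from the clone URL above
    repo_type="space",
)
```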
## 🙌 Credits
- Model: google/gemma-2-2b-it
- Platform: Hugging Face Spaces
- UI Framework: Gradio
- Speech Engine: Faster Whisper
- TTS System: PyDub + Soundfile
## 📣 Feedback & Contributions
Feel free to fork this Space, suggest improvements, or contribute new features. Your feedback is highly appreciated!

⭐ If you like this project, don't forget to star the repository!