---
title: Sentiment Model Comparison
emoji: π
colorFrom: pink
colorTo: indigo
sdk: streamlit
sdk_version: 5.37.0
app_file: app.py
pinned: false
license: mit
short_description: Compare sentiment predictions from two deep learning models
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Sentiment Model Comparison App

This Streamlit app compares two sentiment classification models trained on IMDB movie reviews.

- Model A: ~6.6M trainable params, 50k-token vocab (fast and lightweight)
- Model B: ~34M trainable params, 256k-token vocab (high capacity)
- Ensemble: average of both models' predictions

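
The ensemble can be sketched in a few lines. This is a minimal illustration, assuming each model returns a positive-class probability between 0 and 1; the function name `ensemble_average`, the 0.5 threshold, and the label strings are assumptions, not taken from `app.py`.

```python
def ensemble_average(prob_a, prob_b, threshold=0.5):
    """Average two positive-class probabilities and derive a label.

    `threshold` and the label names are illustrative assumptions.
    """
    prob = (prob_a + prob_b) / 2.0
    label = "positive" if prob >= threshold else "negative"
    return prob, label
```

For example, `ensemble_average(0.9, 0.4)` averages the two scores and labels the result as positive because the mean is above the threshold.
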

**Live Demo:** [Try it on Spaces](https://huggingface.co/spaces/Daksh0505/sentiment-model-comparison)

---

## Features

- Enter a single review or upload a CSV file with a `review` column
- Get predictions from both models plus the ensemble average
- Compare the predicted probabilities visually
- Submit feedback (saved to Google Sheets)

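
For CSV uploads, the only requirement stated above is a `review` column. The sketch below shows one way to validate and read such a file with the standard library; `read_reviews` is a hypothetical helper, and the app itself may parse uploads differently.

```python
import csv
import io


def read_reviews(csv_text):
    """Return the non-empty values of the `review` column from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or "review" not in reader.fieldnames:
        raise ValueError("CSV must contain a 'review' column")
    reviews = []
    for row in reader:
        # Short rows yield None for missing fields, so fall back to "".
        text = (row.get("review") or "").strip()
        if text:
            reviews.append(text)
    return reviews
```

In a Streamlit app, the text could come from an uploaded file, for instance `read_reviews(uploaded.getvalue().decode("utf-8"))`, assuming the file is UTF-8 encoded.
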
## Models

### Model A

- **Filename**: `sentiment_model_imdb_6.6M.keras`
- **Trainable Parameters**: ~6.6 million
- **Total Parameters**: ~13.06 million
- **Vocabulary Size**: 50,000 tokens
- **Description**: Lightweight and efficient; optimized for speed.

### Model B

- **Filename**: `sentiment_model_imdb_34M.keras`
- **Trainable Parameters**: ~34 million
- **Total Parameters**: ~99.43 million
- **Vocabulary Size**: 256,000 tokens
- **Description**: Larger and more expressive; higher accuracy on nuanced reviews.

---

## Tokenizers

Each model uses its own tokenizer, stored in Keras JSON format:

- `tokenizer_50k.json`: used with Model A
- `tokenizer_256k.json`: used with Model B

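
Because the two vocabularies differ in size, a tokenizer should only be used with its own model. One way to keep the pairs together is a small lookup table, sketched below with the filenames listed in this README; `ARTIFACTS` and `artifact_names` are hypothetical names used only for illustration.

```python
REPO_ID = "Daksh0505/sentiment-model-imdb"

# model key -> (model filename, tokenizer filename), as listed in this README
ARTIFACTS = {
    "A": ("sentiment_model_imdb_6.6M.keras", "tokenizer_50k.json"),
    "B": ("sentiment_model_imdb_34M.keras", "tokenizer_256k.json"),
}


def artifact_names(model_key):
    """Return the (model, tokenizer) filenames for a key; raises KeyError if unknown."""
    return ARTIFACTS[model_key.upper()]
```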

---

## Load Models & Tokenizers (from Hugging Face Hub)

```python
import json

from huggingface_hub import hf_hub_download
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.text import tokenizer_from_json

REPO_ID = "Daksh0505/sentiment-model-imdb"

# === Model A (50k vocabulary) ===
model_path_a = hf_hub_download(repo_id=REPO_ID, filename="sentiment_model_imdb_6.6M.keras")
tokenizer_path_a = hf_hub_download(repo_id=REPO_ID, filename="tokenizer_50k.json")

# tokenizer_from_json expects a JSON string, so this assumes the file stores
# the output of Tokenizer.to_json() as a JSON-encoded string.
with open(tokenizer_path_a, "r", encoding="utf-8") as f:
    tokenizer_a = tokenizer_from_json(json.load(f))

model_a = load_model(model_path_a)

# === Model B (256k vocabulary) ===
model_path_b = hf_hub_download(repo_id=REPO_ID, filename="sentiment_model_imdb_34M.keras")
tokenizer_path_b = hf_hub_download(repo_id=REPO_ID, filename="tokenizer_256k.json")

with open(tokenizer_path_b, "r", encoding="utf-8") as f:
    tokenizer_b = tokenizer_from_json(json.load(f))

model_b = load_model(model_path_b)
```

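
Once a model and its tokenizer are loaded, scoring a review takes three steps: tokenize, pad to a fixed length, and call `model.predict`. The sketch below is one way to do this and is not necessarily how `app.py` does it. `MAXLEN`, `pad_pre`, and `predict_positive_proba` are hypothetical names; the value 200 is a placeholder for whatever sequence length the models were trained with, and the code assumes each model has a single sigmoid output. Padding is done with NumPy and follows the defaults of Keras `pad_sequences` (zeros on the left, truncation from the left).

```python
import numpy as np

MAXLEN = 200  # assumption: replace with the sequence length used in training


def pad_pre(sequences, maxlen):
    """Left-pad with zeros and keep only the last `maxlen` tokens of each sequence."""
    padded = np.zeros((len(sequences), maxlen), dtype="int32")
    for row, seq in enumerate(sequences):
        tail = list(seq)[-maxlen:]
        if tail:
            padded[row, -len(tail):] = tail
    return padded


def predict_positive_proba(texts, tokenizer, model, maxlen=MAXLEN):
    """Return one probability per text, assuming a single sigmoid output unit."""
    sequences = tokenizer.texts_to_sequences(texts)
    batch = pad_pre(sequences, maxlen)
    return np.asarray(model.predict(batch)).reshape(-1)


# Example with the objects loaded above (left commented out because it needs the models):
# probs_a = predict_positive_proba(["Great movie!"], tokenizer_a, model_a)
# probs_b = predict_positive_proba(["Great movie!"], tokenizer_b, model_b)
```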

---

## Dataset

- **Source:** [IMDB Multi-Movie Dataset](https://huggingface.co/datasets/Daksh0505/IMDB-Reviews)

## Citation

If you use this dataset, please cite it:

```bibtex
@misc{imdb-multimovie-reviews,
  title  = {IMDb Multi-Movie Review Dataset},
  author = {Daksh Bhardwaj},
  year   = {2025},
  url    = {https://huggingface.co/datasets/Daksh0505/IMDB-Reviews},
  note   = {Accessed: 2025-07-17}
}
```

---