
	---
	title: Sentiment Model Comparison
	emoji: 🚀
	colorFrom: pink
	colorTo: indigo
	sdk: streamlit
	sdk_version: 5.37.0
	app_file: app.py
	pinned: false
	license: mit
	short_description: Compare sentiment predictions from two deep learning models
	---

	Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

	# 📊 Sentiment Model Comparison App

	This Streamlit app compares two sentiment classification models trained on IMDB movie reviews.

	- Model A: ~6.6M trainable params, 50k-token vocabulary (fast and lightweight)
	- Model B: ~34M trainable params, 256k-token vocabulary (high capacity)
	- Ensemble: Average of both predictions
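The ensemble step can be sketched in a few lines. This is a minimal illustration, not the app's actual code: it assumes each model returns a single positive-class probability in [0, 1], and the function name `ensemble_average`, the 0.5 threshold, and the label strings are hypothetical choices made for the example.

```python
from typing import Tuple


def ensemble_average(prob_a: float, prob_b: float, threshold: float = 0.5) -> Tuple[float, str]:
    """Average two positive-class probabilities and derive a label.

    Assumption: both inputs are probabilities of the positive class.
    The 0.5 threshold is illustrative, not taken from the app.
    """
    for p in (prob_a, prob_b):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    avg = (prob_a + prob_b) / 2.0
    label = "positive" if avg >= threshold else "negative"
    return avg, label


if __name__ == "__main__":
    # Example: Model A says 0.8, Model B says 0.6.
    avg, label = ensemble_average(0.8, 0.6)
    print(f"ensemble probability: {avg:.2f} ({label})")
```

A plain mean gives both models equal weight; a weighted mean would be a natural variation if one model proved more reliable.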

	🔗 Live Demo: [Try it on Spaces](https://huggingface.co/spaces/Daksh0505/sentiment-model-comparison)

	---

	## 🔍 Features

	- Enter a single review or upload a CSV file with a `review` column
	- Get predictions from both models + ensemble average
	- Compare probabilities visually
	- Submit feedback (saved to Google Sheets)
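For the CSV option, the only stated requirement is a `review` column. The sketch below shows one way such a file could be validated and read using only the standard library. The helper name `read_reviews` and the choice to skip empty rows are assumptions for illustration; the app's own parsing may differ.

```python
import csv
import io
from typing import List


def read_reviews(csv_text: str, column: str = "review") -> List[str]:
    """Return the non-empty values of `column` from CSV text.

    Raises ValueError if the column is missing. Skipping blank
    rows is an illustrative choice, not documented app behavior.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise ValueError(f"CSV must contain a '{column}' column")
    reviews = []
    for row in reader:
        # Short rows yield None for missing fields, so fall back to "".
        text = (row[column] or "").strip()
        if text:
            reviews.append(text)
    return reviews


if __name__ == "__main__":
    sample = 'id,review\n1,"Great film, loved it"\n2,\n3,Terrible pacing\n'
    for review in read_reviews(sample):
        print(review)
```

Each returned string can then be passed to both models in the same way as a single typed review.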


	## 🧠 Models

	### 🔹 Model A
	- Filename: `sentiment_model_imdb_6.6M.keras`
	- Trainable Parameters: ~6.6 million
	- Total Parameters: ~13.06 million
	- Vocabulary Size: 50,000 tokens
	- Description: Lightweight and efficient; optimized for speed.

	### 🔹 Model B
	- Filename: `sentiment_model_imdb_34M.keras`
	- Trainable Parameters: ~34 million
	- Total Parameters: ~99.43 million
	- Vocabulary Size: 256,000 tokens
	- Description: Larger and more expressive; higher accuracy on nuanced reviews.

	---

	## 🗂 Tokenizers

	Each model uses its own tokenizer in Keras JSON format:

	- `tokenizer_50k.json` → used with Model A
	- `tokenizer_256k.json` → used with Model B

	---

	## 🔧 Load Models & Tokenizers (from Hugging Face Hub)

	```python
	import json

	from huggingface_hub import hf_hub_download
	from tensorflow.keras.models import load_model
	from tensorflow.keras.preprocessing.text import tokenizer_from_json

	REPO_ID = "Daksh0505/sentiment-model-imdb"

	# === Model A ===
	model_path_a = hf_hub_download(repo_id=REPO_ID, filename="sentiment_model_imdb_6.6M.keras")
	tokenizer_path_a = hf_hub_download(repo_id=REPO_ID, filename="tokenizer_50k.json")

	# tokenizer_from_json expects a JSON string, so this assumes each file
	# stores the tokenizer.to_json() output as a JSON-encoded string.
	with open(tokenizer_path_a, "r", encoding="utf-8") as f:
	    tokenizer_a = tokenizer_from_json(json.load(f))

	model_a = load_model(model_path_a)

	# === Model B ===
	model_path_b = hf_hub_download(repo_id=REPO_ID, filename="sentiment_model_imdb_34M.keras")
	tokenizer_path_b = hf_hub_download(repo_id=REPO_ID, filename="tokenizer_256k.json")

	with open(tokenizer_path_b, "r", encoding="utf-8") as f:
	    tokenizer_b = tokenizer_from_json(json.load(f))

	model_b = load_model(model_path_b)
	```
	---

	## 📁 Dataset

	- Source: [IMDB Multi-Movie Dataset](https://huggingface.co/datasets/Daksh0505/IMDB-Reviews)


	## Citation (Please add if you use this dataset)
	```bibtex
	@misc{imdb-multimovie-reviews,
	  title  = {IMDb Multi-Movie Review Dataset},
	  author = {Daksh Bhardwaj},
	  year   = {2025},
	  url    = {https://huggingface.co/datasets/Daksh0505/IMDB-Reviews},
	  note   = {Accessed: 2025-07-17}
	}
	```
	---