Daksh0505 committed on
Commit 39a757e · verified · 1 Parent(s): 1e9abcf

Update README.md

  1. README.md +88 -0
README.md CHANGED
@@ -12,3 +12,91 @@ short_description: Compare sentiment predictions from two deep learning models
  ---

  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
+ # 📊 Sentiment Model Comparison App
+
+ This Streamlit app compares two sentiment classification models trained on IMDb movie reviews.
+
+ - Model A: ~6.6M trainable params, 50k vocab (fast & lightweight)
+ - Model B: ~34M trainable params, 256k vocab (high capacity)
+ - Ensemble: Average of both models' predictions
+
+ 🔗 **Live Demo:** [Try it on Spaces](https://huggingface.co/spaces/Daksh0505/sentiment-model-comparison)
+
+ ---
+
+ ## 🔍 Features
+
+ - Enter a single review or upload a CSV file with a `review` column
+ - Get predictions from both models plus their ensemble average
+ - Compare probabilities visually
+ - Submit feedback (saved to Google Sheets)
+
+ ---
+
+ ## 📁 Dataset
+
+ - **Source:** [IMDb Multi-Movie Review Dataset](https://huggingface.co/datasets/Daksh0505/IMDB-Reviews)
+
+ ```bibtex
+ @misc{imdb-multimovie-reviews,
+ title = {IMDb Multi-Movie Review Dataset},
+ author = {Daksh Bhardwaj},
+ year = {2025},
+ url = {https://huggingface.co/datasets/Daksh0505/IMDB-Reviews}
+ }
+ ```
+
+ ---
+
+ ## 🧠 Models
+
+ ### 🔹 Model A
+ - **Filename**: `sentiment_model_imdb_6.6M.keras`
+ - **Trainable Parameters**: ~6.6 million
+ - **Total Parameters**: ~13.06 million
+ - **Vocabulary Size**: 50,000 tokens
+ - **Description**: Lightweight and efficient; optimized for speed.
+
+ ### 🔹 Model B
+ - **Filename**: `sentiment_model_imdb_34M.keras`
+ - **Trainable Parameters**: ~34 million
+ - **Total Parameters**: ~99.43 million
+ - **Vocabulary Size**: 256,000 tokens
+ - **Description**: Larger and more expressive; higher accuracy on nuanced reviews.
+
+ ---
+
+ ## 🗂 Tokenizers
+
+ Each model uses its own tokenizer in Keras JSON format:
+
+ - `tokenizer_50k.json` → used with Model A
+ - `tokenizer_256k.json` → used with Model B
+
+ ---
+
+ ## 🔧 Load Models & Tokenizers (from Hugging Face Hub)
+
+ ```python
+ from huggingface_hub import hf_hub_download
+ from tensorflow.keras.models import load_model
+ from tensorflow.keras.preprocessing.text import tokenizer_from_json
+ import json
+
+ # Both models and both tokenizers are downloaded from the same Hub repository.
+ REPO_ID = "Daksh0505/sentiment-model-imdb"
+
+ # === Model A ===
+ model_path_a = hf_hub_download(repo_id=REPO_ID, filename="sentiment_model_imdb_6.6M.keras")
+ tokenizer_path_a = hf_hub_download(repo_id=REPO_ID, filename="tokenizer_50k.json")
+
+ # tokenizer_from_json expects a JSON string, so json.load must return a str here.
+ with open(tokenizer_path_a, "r") as f:
+     tokenizer_a = tokenizer_from_json(json.load(f))
+
+ model_a = load_model(model_path_a)
+
+ # === Model B ===
+ model_path_b = hf_hub_download(repo_id=REPO_ID, filename="sentiment_model_imdb_34M.keras")
+ tokenizer_path_b = hf_hub_download(repo_id=REPO_ID, filename="tokenizer_256k.json")
+
+ with open(tokenizer_path_b, "r") as f:
+     tokenizer_b = tokenizer_from_json(json.load(f))
+
+ model_b = load_model(model_path_b)
+ ```
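
The README describes the ensemble as the average of both models' predictions but does not show that step. The following is a minimal sketch of how it could look, not the app's actual code. It assumes each model returns a single probability of positive sentiment per review and that inputs are padded to a fixed length. The helper names, the `maxlen=200` default, and the 0.5 threshold are all hypothetical and are not taken from the repository.

```python
def positive_probability(model, tokenizer, text, maxlen=200):
    """Return one model's score for a single review.

    Assumption (not confirmed by the README): the model has one sigmoid
    output per review and was trained on sequences padded to `maxlen`.
    The value 200 is only a placeholder.
    """
    # Imported lazily so the pure-Python helpers below work without TensorFlow.
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    sequences = tokenizer.texts_to_sequences([text])
    padded = pad_sequences(sequences, maxlen=maxlen)
    return float(model.predict(padded, verbose=0)[0][0])


def ensemble_average(prob_a, prob_b):
    """Average the two models' probabilities."""
    return (prob_a + prob_b) / 2.0


def to_label(prob, threshold=0.5):
    """Map a probability to a label; the 0.5 cut-off is an assumption."""
    return "positive" if prob >= threshold else "negative"


if __name__ == "__main__":
    # Stand-in probabilities; in practice these would come from
    # positive_probability(model_a, tokenizer_a, review) and
    # positive_probability(model_b, tokenizer_b, review).
    print(to_label(ensemble_average(0.9, 0.7)))  # prints "positive"
```

Keeping the averaging and thresholding separate from the model call makes it easy to check the padding length and threshold against the values the models were actually trained with.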