software-zetic (ZeticAI)

published a model 8 days ago

software-zetic/YOLO26n

Updated 8 days ago

updated 8 models 13 days ago

New activity in sentence-transformers/all-mpnet-base-v2 4 months ago

all-mpnet-base-v2 Complete On-device Study: SOTA Sentence Embeddings on Mobile

#42 opened 4 months ago by software-zetic

reacted to yeonseok-zeticai's post with 👀🚀 5 months ago

⚡ RexBERT Complete On-device Study: Comprehensive Performance Analysis Across Mobile Devices
(Check details at https://mlange.zetic.ai/p/Steve/RexBERT)

TL;DR: Transformer models are now practical for real-time mobile applications, and inference is steadily moving from the cloud to the edge.

- Original model from @thebajajra

🎯 Study Overview:
- Model: RexBERT (ModernBERT for E-commerce)
- Focus: Real-world deployment viability and performance analysis

📊 Key Performance Metrics:

Latency Results:
- NPU (Best): 4.74 ms average
- GPU: 12.56 ms average
- CPU: 35.16 ms average

NPU Advantage: 16.98x speedup over CPU
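
As a sanity check on the latency figures, the ratios between the reported averages can be computed directly. The sketch below is plain arithmetic on the three numbers from the post; the function and variable names are illustrative and not part of any ZETIC tooling.

```python
# Average latencies reported in the post, in milliseconds.
latency_ms = {"NPU": 4.74, "GPU": 12.56, "CPU": 35.16}


def speedup(baseline: str, target: str) -> float:
    """Return how many times faster `target` is than `baseline`."""
    return latency_ms[baseline] / latency_ms[target]


def throughput_per_second(processor: str) -> float:
    """Inferences per second if runs execute back to back."""
    return 1000.0 / latency_ms[processor]


for name in latency_ms:
    print(f"{name}: {throughput_per_second(name):.1f} inferences/s")

print(f"NPU vs CPU (averages): {speedup('CPU', 'NPU'):.2f}x")
print(f"NPU vs GPU (averages): {speedup('GPU', 'NPU'):.2f}x")
```

On the averages alone, the CPU-to-NPU ratio comes out near 7.4x. The 16.98x figure therefore presumably rests on a different basis, such as a per-device comparison, which the post does not specify.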

Memory Efficiency:
- Model Size: 568.96 MB (compressed for mobile)
- Runtime Memory: 299.01 MB peak consumption
- Load Memory Range: 285 MB - 1,072 MB across devices

Accuracy Preservation:
- FP16 Precision: 63.72 dB
- Quantized Mode: Available with minimal accuracy loss
- Inference Quality: Maintained at production grade
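
The post does not say how the 63.72 dB figure is computed. One common way to express, in decibels, how closely a reduced-precision output tracks a full-precision reference is the signal-to-noise ratio. The sketch below assumes that definition. The function name and sample vectors are hypothetical and are not data from the study.

```python
import math


def snr_db(reference: list[float], candidate: list[float]) -> float:
    """Signal-to-noise ratio in dB: 10 * log10(signal power / error power)."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same length")
    signal = sum(r * r for r in reference)
    noise = sum((r - c) ** 2 for r, c in zip(reference, candidate))
    if noise == 0.0:
        return math.inf  # outputs are identical
    if signal == 0.0:
        return -math.inf  # all-zero reference, nonzero error
    return 10.0 * math.log10(signal / noise)


# Hypothetical outputs: a reference vector and a slightly perturbed copy.
reference = [0.12, -0.87, 0.45, 0.33]
candidate = [0.1201, -0.8698, 0.4502, 0.3299]
print(f"{snr_db(reference, candidate):.2f} dB")
```

Under this assumed definition, a value in the low 60s of dB would correspond to an error power on the order of a millionth of the signal power.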

🛠 Technical Implementation:
(Copy-and-paste runnable code is available at the ZETIC.MLange link: https://mlange.zetic.ai/p/Steve/RexBERT)

This study demonstrates that:

- Transformer models are viable for real-time mobile applications
- NPU acceleration provides the breakthrough needed for practical deployment
- Mobile-first AI architecture is now technically feasible
- The performance gap between cloud and edge inference is rapidly closing

🚀 Real-World Applications Enabled:

E-commerce Intelligence:
- Instant product search and discovery
- Real-time semantic matching
- Context-aware recommendations
- Natural language query processing
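
Semantic matching of this kind typically reduces to comparing embedding vectors. A minimal sketch, assuming the model has already produced a fixed-size embedding for the query and for each product: the three-dimensional vectors and product names below are made-up placeholders, not RexBERT outputs, and the helper names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_products(query_vec, catalog):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in catalog.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Placeholder embeddings; real ones would come from the on-device model.
catalog = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail sneakers": [0.8, 0.3, 0.1],
    "coffee grinder": [0.0, 0.2, 0.9],
}
query = [0.85, 0.2, 0.05]

for name, score in rank_products(query, catalog):
    print(f"{name}: {score:.3f}")
```

Because only the query needs to be embedded at request time, the per-query cost is one model inference plus a scan over precomputed product vectors.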

Conversational Commerce:
- Voice-to-product search
- Chatbot-style shopping assistance
- Intent recognition and classification
- Multi-turn conversation handling

Privacy-First AI:
- On-device processing (no data transmission)
- Simpler compliance with GDPR and other privacy regulations
- Reduced server infrastructure costs
- Offline capability is maintained

Are you ready to integrate BERT-level language understanding into your mobile applications?

Models updated 13 days ago:

ZETIC-ai/mediapipe-hand-detection

ZETIC-ai/mediapipe-pose-estimation

ZETIC-ai/chronos-bolt-tiny

ZETIC-ai/whisper-tiny-encoder

ZETIC-ai/whisper-tiny-decoder

ZETIC-ai/whisper-small-decoder

ZETIC-ai/YOLO26l

ZETIC-ai/YOLO26s
