fine-tuning a 14B model with TRL + SFT on a free Colab (T4 GPU)? thanks to the latest TRL optimizations, you actually can! sharing a new notebook showing how to do it
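For context, here is a minimal sketch of what such a setup can look like. This is not the author's notebook: it assumes a QLoRA-style recipe (4-bit base model + LoRA adapters), and the model id and dataset are placeholders.

```python
# Minimal sketch (not the notebook from the post): QLoRA-style SFT with TRL,
# one common recipe for fitting a large model onto a 16 GB T4.
# The 14B checkpoint and the dataset below are placeholders.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

train_dataset = load_dataset("trl-lib/Capybara", split="train")  # placeholder dataset

args = SFTConfig(
    output_dir="sft-14b-t4",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    gradient_checkpointing=True,                    # trade compute for memory
    model_init_kwargs={
        "quantization_config": BitsAndBytesConfig(  # load the base model in 4-bit
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.float16,   # T4 has no bf16 support
        ),
        "device_map": "auto",
    },
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-14B-Instruct",              # placeholder 14B checkpoint
    args=args,
    train_dataset=train_dataset,
    peft_config=LoraConfig(r=16, lora_alpha=32, target_modules="all-linear"),
)
trainer.train()
```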
If your Space stopped working after a restart, mainly during the last 5 days (https://discuss.huggingface.co/t/my-space-suddenly-went-offline-the-cpu-cannot-restart/151121/22), try some of the following:
1. Add pydantic==2.10.6 to requirements.txt, or upgrade Gradio to the latest version.
2. Upgrade PyTorch to 2.2.0 or later (torch>=2.2.0 for Zero GPU Spaces).
3. Pin Transformers to 4.49.0 or earlier (transformers<=4.49.0 for Spaces using Transformers or Diffusers).
4. Pin huggingface_hub to an older version (huggingface_hub==0.25.2 if an error like "cached_download is not available" occurs or inference does not work properly).
5. Specifying WORKDIR in a Dockerfile may cause the application to fail to start with error 137 (Docker Spaces, https://discuss.huggingface.co/t/error-code-137-cache-error/152177).
Edit: Zero GPU Spaces have been upgraded from A100 to H200. This is likely why older versions of PyTorch are no longer supported; in fact, an error message to that effect was displayed. zero-gpu-explorers/README#163
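As an illustration only, a requirements.txt combining the pins listed above might look like this (add only the lines relevant to your particular Space):

```
pydantic==2.10.6
torch>=2.2.0
transformers<=4.49.0
huggingface_hub==0.25.2
```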
Multilingual Tokenization Showdown: Analyzing 12 LLM Tokenizers Across 204 Languages.
First, I've created a dataset with Wikipedia's "Cat" article text in 272 languages: Norod78/WikiCat-Multilingual
For each language entry with at least 100 words, I tokenized the text using 12 tokenizers and calculated the "characters per token" and "words per token" ratios. The higher these ratios are, the more information each token represents on average for that language (perhaps allowing the LLM to learn more per parameter if trained on a dataset in that language).
I hope I interpreted the results correctly. I've made the code available on GitHub, so you can re-create the raw results JSONL with this repo: https://github.com/Norod/wikicat-tokenizer-eval
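The core metric is simple; here is a minimal sketch of how the two ratios could be computed (not the repo's code, and the tokenizer id is a placeholder):

```python
# Minimal sketch of the per-language ratios described above (not the repo's code).
# The tokenizer id is a placeholder; any Hugging Face tokenizer works the same way.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder tokenizer

def tokenization_ratios(text: str) -> tuple[float, float]:
    """Return (characters per token, words per token) for the given text."""
    tokens = tokenizer.encode(text, add_special_tokens=False)
    n_tokens = max(len(tokens), 1)
    chars_per_token = len(text) / n_tokens
    words_per_token = len(text.split()) / n_tokens
    return chars_per_token, words_per_token

# A higher ratio means each token carries more of the text on average.
print(tokenization_ratios("The cat (Felis catus) is a small domesticated carnivore."))
```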
Introducing AWQ and GPTQ quantized versions of SmolVLM from Hugging Face!
Only the text backbone of these models was quantized, yielding a roughly 50% size reduction (about 4 GB down to 2 GB) while keeping degradation on the DocVQA benchmark under 1%.
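Loading a pre-quantized checkpoint from the Hub is the usual one-liner; a minimal sketch below, where the repo id is hypothetical (check the actual AWQ/GPTQ SmolVLM repos for the real names):

```python
# Minimal sketch: loading a GPTQ-quantized vision-language checkpoint from the Hub.
# The repo id below is hypothetical; use the actual quantized SmolVLM repo name.
import torch
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "HuggingFaceTB/SmolVLM-Instruct-GPTQ"  # hypothetical repo id
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # the quantization config is read from the checkpoint itself
)
```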
Finally, I uploaded the model I developed for my master's thesis! Given a financial event, it provides explained predictions based on a dataset of past news and central bank speeches. Try it out here: SelmaNajih001/StockPredictionExplanation (just restart the Space and wait a minute)
Technical Implementation: (Runnable with copy & paste at the MLange link!)
Device Compatibility Matrix: Tested on 50+ devices including the Samsung Galaxy series, Google Pixel lineup, Xiaomi devices, iPhones, and iPads. Consistent sub-5 ms performance across the board!
Applications Unlocked:
- Real-time AR/VR face tracking
- Privacy-preserving edge authentication
- Live video processing pipelines
- Mobile security applications
- Interactive camera filters
The democratization of high-performance computer vision on mobile devices is happening NOW! This study proves that complex CV models can run efficiently on consumer hardware without compromising accuracy. Want to reproduce these results? Check out the benchmark methodology and implementation guide!
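As a rough illustration of the kind of measurement behind numbers like "sub-5 ms", here is a minimal latency-benchmark sketch. It is not the post's actual methodology; the model file and input shape are placeholders.

```python
# Minimal sketch of an inference-latency benchmark (not the post's methodology).
# The model path and input shape are placeholders; warm-up runs are excluded from timing.
import time
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("face_tracker.onnx")        # placeholder model file
input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder input shape

for _ in range(10):                                         # warm-up
    session.run(None, {input_name: dummy})

latencies = []
for _ in range(100):
    start = time.perf_counter()
    session.run(None, {input_name: dummy})
    latencies.append((time.perf_counter() - start) * 1000)

print(f"median latency: {np.median(latencies):.2f} ms")
```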
PawMatchAI: Now with SBERT-Powered Recommendations!
NEW: Description-based recommendations are here! Just type in your lifestyle or preferences (e.g. "I live in an apartment and want a quiet dog"), and PawMatchAI uses SBERT semantic embeddings to understand your needs and suggest compatible breeds.
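Under the hood, this kind of matching can be as simple as embedding the user text and each breed profile and ranking by cosine similarity. A minimal sketch with sentence-transformers follows; the model and breed descriptions are placeholders, not PawMatchAI's actual pipeline.

```python
# Minimal sketch of description-based breed matching with SBERT embeddings
# (not PawMatchAI's actual pipeline; model and breed texts are placeholders).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder SBERT model

breed_profiles = {
    "Cavalier King Charles Spaniel": "Small, quiet, affectionate, happy in apartments.",
    "Border Collie": "Very energetic working dog that needs space and daily exercise.",
    "Greyhound": "Calm, low-barking couch potato that enjoys short sprints.",
}

query = "I live in an apartment and want a quiet dog"
query_emb = model.encode(query, convert_to_tensor=True)
breed_embs = model.encode(list(breed_profiles.values()), convert_to_tensor=True)

# Rank breeds by cosine similarity to the user's description
scores = util.cos_sim(query_emb, breed_embs)[0]
for name, score in sorted(zip(breed_profiles, scores), key=lambda x: -x[1]):
    print(f"{name}: {score:.3f}")
```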
What can PawMatchAI do today?
- Upload a photo to identify your dog from 124 breeds, with detailed info.
- Compare two breeds side-by-side, from grooming needs to health insights.
- Visualize breed traits with radar and comparison charts.
- Try Style Transfer to turn your dog's photo into anime, watercolor, cyberpunk, and more.
What's next?
- More fine-tuned recommendations.
- Mobile-friendly deployment.
- Expansion to additional species.
My goal: to make breed discovery not only accurate but also interactive and fun, combining computer vision, semantic understanding, and creativity to help people find their perfect companion.
Okay this is insane... WebGPU-accelerated semantic video tracking, powered by DINOv3 and Transformers.js! Demo (+ source code): webml-community/DINOv3-video-tracking
This will revolutionize AI-powered video editors... which can now run 100% locally in your browser, no server inference required (costs $0)!
How does it work?
1. Generate and cache image features for each frame.
2. Create a list of embeddings for the selected patch(es).
3. Compute cosine similarity between each patch and the selected patch(es).
4. Highlight those whose score is above some threshold.
... et voilà!
You can also make selections across frames to improve temporal consistency! This is super useful if the object changes its appearance slightly throughout the video.
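The similarity step itself is tiny. Here is a minimal NumPy sketch of steps 3-4; the actual demo runs DINOv3 features in the browser via Transformers.js, and the array shapes here are only illustrative.

```python
# Minimal sketch of the patch-similarity step (steps 3-4 above).
# The real demo uses DINOv3 features in Transformers.js; shapes here are illustrative.
import numpy as np

def highlight_mask(frame_patches: np.ndarray,
                   selected_patches: np.ndarray,
                   threshold: float = 0.6) -> np.ndarray:
    """frame_patches: (N, D) patch embeddings for one frame.
    selected_patches: (K, D) embeddings of the user-selected patch(es).
    Returns a boolean mask over the N patches to highlight."""
    # L2-normalize so the dot product becomes cosine similarity
    f = frame_patches / np.linalg.norm(frame_patches, axis=1, keepdims=True)
    s = selected_patches / np.linalg.norm(selected_patches, axis=1, keepdims=True)
    # For each frame patch, keep the best similarity against any selected patch
    sims = (f @ s.T).max(axis=1)
    return sims > threshold

# Selections made across several frames (as mentioned above) simply add more rows
# to selected_patches, which improves temporal consistency.
```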
This time, I have mapped and contributed more than 100 swimming pools around my wife's hometown to https://www.openstreetmap.org. It only took about 20 minutes to find them all (plus ~3 minutes of verification) in a free Colab GPU.
Started fine-tuning Gemma 3 using an evolutionary approach. It is not the worst model according to the AHA leaderboard, and it is one of the smartest according to lmarena.ai. My objective is to make it based, anti-woke, wise, beneficial, and then some.
Several GPUs are fine-tuning it at the same time, each on a different dataset and using QLoRA, and the successful runs are merged later. Compared to plain LoRA, this allows faster training and also reduces overfitting, because the merge operation heals overfitting. The problem with this could be that the 4-bit quantization may make the models dumber. But I am not looking for sheer IQ. Too much mind is a problem anyway :)
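One way to do the merge step is with PEFT's weighted adapter combination. A minimal sketch, assuming each parallel run saved a LoRA adapter; the base model id, adapter paths, and weights are placeholders, not the author's exact setup.

```python
# Minimal sketch of merging several independently trained LoRA/QLoRA adapters
# into one model with PEFT (base model, adapter paths, and weights are placeholders).
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("google/gemma-3-1b-it")  # placeholder base

# Load the adapters produced by the parallel QLoRA runs
model = PeftModel.from_pretrained(base, "run1/adapter", adapter_name="run1")
model.load_adapter("run2/adapter", adapter_name="run2")
model.load_adapter("run3/adapter", adapter_name="run3")

# Combine the survivors into a single adapter, e.g. weighted by fitness
model.add_weighted_adapter(
    adapters=["run1", "run2", "run3"],
    weights=[0.5, 0.3, 0.2],
    adapter_name="merged",
    combination_type="linear",
)
model.set_adapter("merged")
merged = model.merge_and_unload()  # bake the merged adapter into the base weights
```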
Has anyone tried parallel QLoRA and merging before?
I also automated the dataset selection, benchmarking, and convergence toward the objectives (the fitness function, the reward). It is basically trying to get a higher score on the AHA Leaderboard as fast as possible, with a diverse set of organisms that "evolve by training".
I want to release some cool stuff when I have the time:
- how an answer to a single question changes over time, with each training round or day
- a chart showing AHA alignment over training rounds
I've made yet another merge of reasoning models with incremental gains on the current Open LLM leaderboard. open-llm-leaderboard/open_llm_leaderboard
Merging a DeepSeek R1 distillation into Llama 3.1 8B (at 10% task arithmetic weight, using the Llama 3.1 8B base model as the base rather than the instruct model) with a prior best merge resulted in a slightly lower IFEval, but a higher result in every other benchmark except MMLU-PRO, which went down only marginally. MATH Lvl 5 and GPQA went up palpably. grimjim/DeepSauerHuatuoSkywork-R1-o1-Llama-3.1-8B
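Task arithmetic at a given weight boils down to adding a scaled delta between each fine-tune and the base. A minimal sketch of the idea follows; this is not the actual merge recipe or tooling used for the model above.

```python
# Minimal sketch of task-arithmetic merging (not the actual recipe used above).
# merged = base + sum_i(weight_i * (finetune_i - base)), applied per parameter tensor.
import torch

def task_arithmetic_merge(base: dict, finetunes: list[dict], weights: list[float]) -> dict:
    """base/finetunes are state_dicts with identical keys; weights e.g. [0.9, 0.1]."""
    merged = {}
    for name, base_param in base.items():
        delta = torch.zeros_like(base_param)
        for ft, w in zip(finetunes, weights):
            delta += w * (ft[name] - base_param)  # scaled task vector
        merged[name] = base_param + delta
    return merged
```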
This result is currently my best Llama 3.1 8B merge result to date. The actual R1 distillation itself scored quite badly, so this would seem to be another case of unexpected formatting (reflected in IFEval) hurting the evaluation results, obscuring the strength of a model.
It is also possible to use the text generation feature of this model to generate roleplay completions. Based on informal testing, this model's bias toward problem-solving will subtly impact narration.
A collection of metadata for 39,280 video clips from the GoodGame.ru streaming platform, featuring:
- Complete clip information, including direct video URLs and thumbnails
- Streamer details such as usernames and avatars
- Engagement metrics such as view counts
- Game categories and content classifications
- Released under the Creative Commons Zero (CC0) license
This extensive clip collection provides a valuable resource for developing and evaluating video-based AI applications, especially in Russian gaming and streaming contexts.
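Loading it for analysis is the standard datasets one-liner; a minimal sketch with a hypothetical repo id, since the post does not name the exact dataset id:

```python
# Minimal sketch: exploring the clip metadata with the datasets library.
# The dataset repo id below is hypothetical; substitute the actual id.
from datasets import load_dataset

clips = load_dataset("username/goodgame-clips", split="train")  # hypothetical repo id
print(clips)     # number of rows and available metadata columns
print(clips[0])  # one clip's metadata (video URL, streamer, views, game category, ...)
```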