---
title: Fistal AI
emoji: 🚀
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Finetuning Studio
python_version: 3.11
tags:
  - mcp-in-action-track-enterprise
  - mcp-in-action-track-consumer
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# 🚀 Fistal AI - Autonomous Fine-Tuning Platform

Agentic AI that seamlessly fine-tunes LLMs with Unsloth and Modal.
## 🎯 What is Fistal AI?
Fistal AI is an autonomous fine-tuning platform that transforms the complex process of training custom language models into a single-click experience. Simply specify your topic, and Fistal handles everything:
- 🤖 Synthetic Dataset Generation - Creates high-quality training data using LLMs
- 🔄 Automatic Data Formatting - Converts to chat/instruction format
- 🏋️ Serverless Training - Fine-tunes models on Modal's GPU infrastructure
- 📊 LLM-as-Judge Evaluation - Validates model performance
- 🤗 Hugging Face Deployment - Publishes your model automatically
No ML expertise required. No infrastructure setup. Just results.
## ✨ Features
### 🎨 Intuitive Interface
- Clean Gradio-based web UI hosted on Hugging Face Spaces
- Real-time training progress with educational insights
- Automatic Hugging Face integration with one-click model access
- Direct model upload in native HF format (ready to use immediately)
### ⚡ Blazing Fast Training

- 3x faster dataset generation with parallel API calls (Gemini)
- 2x faster training with Unsloth optimization and Modal GPUs
- 70% less memory usage via 4-bit quantization
- Training completes in 10-20 minutes for 500 samples
### 🧠 Smart Defaults
- 4-bit quantization for optimal quality/size balance
- LoRA fine-tuning (updates only 0.1% of parameters)
- Supports 1B-3B parameter models (Qwen, Llama, Gemma, Phi)
- Automatic hyperparameter optimization
- Native HF format upload (no conversion needed)
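The defaults above correspond closely to Unsloth's standard 4-bit QLoRA setup. Below is a minimal sketch of what such a configuration can look like; the model name, rank, alpha, and target modules are illustrative assumptions, not Fistal's verbatim settings.

```python
# Minimal 4-bit + LoRA setup with Unsloth. Rank, alpha, and target modules
# are illustrative assumptions, not Fistal's exact defaults.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-1.5B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,  # the ~70% memory saving comes from 4-bit weights
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # LoRA rank: only adapter weights train, a tiny fraction of the model
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```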
### 🔬 Quality Assurance
- LLM-as-judge evaluation system
- Coherence, relevance, and accuracy testing
- Comprehensive evaluation reports
- Real-time monitoring of training metrics
### 🔗 MCP-Powered Workflow
- Agentic orchestration using Model Context Protocol (MCP)
- 4 specialized MCP tools for end-to-end automation
- Intelligent decision-making throughout the pipeline
- Seamless tool coordination for optimal results
## ✨ Sponsors

- Modal Labs: Seamless T4 GPU access
- Gemini API: Handles the majority of LLM tasks, including data generation and agentic control
## Watch Demo
Note: The demo runs only 5 samples for speed, but you can scale it to 2000+ in real use.
## How Fistal AI Works Behind the Scenes
- Fistal AI runs on an agentic workflow powered by LangGraph.
- Instead of a fixed script, an AI agent decides what step to run next.
- All the actual work (dataset generation, formatting, training, evaluation) is done by MCP tools.
- The agent just thinks → MCP tools do the work → agent continues automatically.
### The MCP Server
Hosts four tools:
- `generate_json_data` → creates synthetic training data
- `format_json` → converts it to ChatML format
- `finetune_model` → runs Unsloth training on Modal
- `llm_as_judge` → evaluates the trained model
Each tool is isolated and safe, and returns clean, structured results that the agent uses.
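As a rough illustration, here is how such a server could be wired up with the official MCP Python SDK's FastMCP class. The tool names come from this README; the signatures, return values, and server name are assumptions.

```python
# Sketch of an MCP server exposing Fistal-style tools via FastMCP.
# Signatures and bodies are placeholders, not the actual implementation.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("fistal-ai")

@mcp.tool()
def generate_json_data(topic: str, sample_count: int, task_type: str) -> str:
    """Create a synthetic training dataset and return its file path."""
    raise NotImplementedError  # parallel Gemini/Groq calls in the real tool

@mcp.tool()
def format_json(dataset_path: str) -> str:
    """Convert a raw JSON dataset to ChatML and return the new path."""
    raise NotImplementedError

# finetune_model and llm_as_judge are registered the same way.

if __name__ == "__main__":
    mcp.run()
```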
### Pipeline Flow (Step-by-Step)
1. **Dataset Generation** - The agent calls the tool → LLMs generate 20-500 examples in parallel.
2. **Dataset Formatting** - The agent calls the next tool → the raw dataset becomes ChatML/instruction format.
3. **Fine-Tuning** - The agent launches training on Modal using Unsloth + 4-bit QLoRA.
4. **Evaluation** - The agent runs LLM-as-judge → gets coherence/relevance/accuracy/ROUGE/BLEU scores with the `evaluate` library.
5. **Final Output** - The model and adapters are automatically uploaded to the user's (mahreenfathima) Hugging Face account (based on the HF token provided), and an evaluation report is generated automatically.
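For a sense of how the LangGraph side can drive this loop, here is a condensed sketch using the prebuilt ReAct agent. The Gemini model name and the stub tools are assumptions for illustration; the real tools are the MCP ones above.

```python
# Condensed sketch of an agent loop with LangGraph's prebuilt ReAct agent.
# The model name and stub tools are illustrative assumptions.
from langchain_core.tools import tool
from langchain_google_genai import ChatGoogleGenerativeAI
from langgraph.prebuilt import create_react_agent

@tool
def generate_json_data(topic: str, sample_count: int) -> str:
    """Generate a synthetic dataset and return its path (stub)."""
    return "raw_dataset.json"

@tool
def format_json(dataset_path: str) -> str:
    """Convert a raw dataset to ChatML and return the new path (stub)."""
    return "formatted_dataset.json"

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
agent = create_react_agent(llm, [generate_json_data, format_json])

# The agent reads each tool result and decides the next step itself.
result = agent.invoke(
    {"messages": [("user", "Build a 500-sample Python Q&A dataset")]}
)
print(result["messages"][-1].content)
```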
## 🛠️ The 4 MCP Tools
### `generate_json_data`

- Purpose: Synthetic dataset generation
- Input: Topic, sample count, task type
- Process: Parallel API calls to Gemini + Groq with intelligent prompt engineering
- Output: JSON dataset with diverse, high-quality examples
- MCP Role: Agent invokes this tool first, receives confirmation, then proceeds
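The parallelism here is plain fan-out over the API. A hedged sketch of the idea follows; the model name, prompt, and batch sizes are assumptions.

```python
# Fan out several dataset-generation requests at once with a thread pool.
# Model name, prompt, and batch size are illustrative assumptions.
import os
from concurrent.futures import ThreadPoolExecutor

import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

def generate_batch(topic: str, n: int) -> str:
    prompt = f"Generate {n} diverse JSON training examples about {topic}."
    return model.generate_content(prompt).text

# Five batches in flight at once instead of one long sequential run.
with ThreadPoolExecutor(max_workers=5) as pool:
    batches = list(pool.map(lambda _: generate_batch("Python", 20), range(5)))
```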
### `format_json`

- Purpose: Convert raw data to training (ChatML) format
- Input: Raw JSON dataset path
- Process: Transforms to chat/instruction format optimized for fine-tuning
- Output: Formatted dataset ready for training
- MCP Role: Agent receives dataset path from previous tool, formats it automatically
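The conversion itself is a straightforward reshaping of each record into the messages layout. A minimal sketch, assuming the raw records carry question/answer fields (the field names are an assumption):

```python
# Reshape raw question/answer records into ChatML-style messages.
# The "question"/"answer" field names are assumed, not guaranteed.
import json

def to_chatml(raw_path: str, out_path: str) -> None:
    with open(raw_path) as f:
        rows = json.load(f)
    formatted = [
        {"messages": [
            {"role": "user", "content": row["question"]},
            {"role": "assistant", "content": row["answer"]},
        ]}
        for row in rows
    ]
    with open(out_path, "w") as f:
        json.dump(formatted, f, indent=2)
```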
### `finetune_model`

- Purpose: Execute serverless training
- Input: Formatted dataset, model name, hyperparameters
- Process: Deploys training job to Modal with Unsloth optimization
- Output: Fine-tuned model weights + training metrics
- MCP Role: Agent monitors training progress, handles failures, manages GPU resources
- Internal Functions (executed within Modal):
  - `train_with_modal`: Runs the fine-tuning process with Unsloth and saves the model in a Volume
  - `upload_to_hf_from_volume`: Pushes the trained model weights to a Hugging Face Hub repository
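On the Modal side this boils down to decorating a training function with a GPU request. A minimal sketch follows; the image contents, volume name, and timeout are assumptions, and only the T4 GPU and Volume usage mirror this README.

```python
# Declare a serverless T4 training job on Modal. Image, volume name, and
# timeout are illustrative assumptions.
import modal

app = modal.App("fistal-finetune")
image = modal.Image.debian_slim().pip_install("unsloth", "transformers", "trl")
volume = modal.Volume.from_name("fistal-models", create_if_missing=True)

@app.function(gpu="T4", image=image, volumes={"/models": volume}, timeout=3600)
def train_with_modal(dataset_path: str, model_name: str) -> None:
    # The Unsloth training loop runs here; weights land in /models so
    # upload_to_hf_from_volume can push them to the Hub afterwards.
    ...
```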
### `llm_as_judge`

- Purpose: Quality evaluation
- Input: Fine-tuned model path, test cases
- Process: Generates test prompts, evaluates responses, scores quality
- Output: Comprehensive evaluation report with metrics
- MCP Role: Final validation step, agent parses results and presents to user
- Internal Functions (executed within Modal):
  - `evaluate_model`: Runs validation metrics on the fine-tuned model during/after training
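A hedged sketch of the scoring step: compute ROUGE with the `evaluate` library and ask a judge LLM for a rating. The judge prompt and the 1-10 scale are assumptions for illustration.

```python
# Score one prediction with ROUGE plus an LLM verdict. The judge prompt
# and 1-10 scale are illustrative assumptions.
import os

import evaluate
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
rouge = evaluate.load("rouge")

def judge(prediction: str, reference: str) -> dict:
    scores = rouge.compute(predictions=[prediction], references=[reference])
    verdict = genai.GenerativeModel("gemini-1.5-flash").generate_content(
        "Rate this answer from 1-10 for coherence, relevance, and accuracy.\n"
        f"Reference: {reference}\nAnswer: {prediction}"
    ).text
    return {"rouge": scores, "llm_judgement": verdict}
```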
## 🧠 Fistal's Agentic Approach
```text
# Agent makes decisions based on context

agent decides:   "User wants Python dataset"
→ invokes generate_json_data with optimal parameters

agent observes:  "Dataset generated successfully"
→ invokes format_json with received path

agent monitors:  "Training at 50%, loss decreasing"
→ continues monitoring, adjusts if needed

agent validates: "Model trained, run evaluation"
→ invokes llm_as_judge for quality check
```
Benefits:

- 🎯 Intelligent Decision Making: Agent chooses the best parameters and strategies
- 🔄 Error Recovery: Automatically retries failed steps with adjusted parameters
- 🔍 Context Awareness: Each tool receives relevant context from previous steps
- 🔒 Security: MCP provides secure tool execution
- 🔧 Modularity: Tools can be updated independently without breaking the workflow
- 📈 Scalability: Easy to add new tools (e.g., hyperparameter tuning, multi-GPU training)
## 🛠️ Tech Stack
## 📊 Performance Metrics
| Metric | Value | Details |
|---|---|---|
| Dataset Generation | 3x faster | Parallel processing with API keys |
| Training Speed | 2x faster | Unsloth optimization |
| Memory Usage | -70% | 4-bit quantization |
| Training Time | 10-20 min | For 500 samples on T4 GPU |
| Model Size | ~1-2 GB | Native HF format (safetensors) |
| Parameters Updated | 0.1% | LoRA efficiency |
| MCP Tools | 4 | Autonomous workflow management |
## 🔧 Supported Models & Tasks
### Prominent Models (1B-3B Parameters)
- `Qwen/Qwen2.5-1.5B-Instruct`
- `Qwen/Qwen2.5-3B-Instruct`
- `meta-llama/Llama-3.2-1B-Instruct`
- `meta-llama/Llama-3.2-3B-Instruct`
- `google/gemma-2-2b-it`
- `microsoft/Phi-3.5-mini-instruct`
### Popular Task Types
- text-generation: General text completion and content creation
- question-answering: Q&A pairs and knowledge retrieval
### Output Format
- Native Hugging Face format (safetensors + adapter weights)
- Immediately usable with transformers library
- Compatible with HF Inference API
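For example, loading the published adapter afterwards takes a few lines with `transformers` and `peft`; the adapter repo id below is a placeholder for whichever one Fistal prints after upload.

```python
# Load a Fistal-produced LoRA adapter on top of its base model.
# The adapter repo id is a placeholder, not a real repository.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, "your-username/your-fistal-model")

inputs = tokenizer("What is a Python decorator?", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=100)[0]))
```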
## 🎮 Try It Now
🚀 Launch Fistal AI Demo

📱 Read LinkedIn Post
Hosted on Hugging Face Spaces - No installation required!
## 📄 License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
## 🙏 Acknowledgments
- Anthropic MCP - For the powerful tool integration protocol
- Unsloth - For making fine-tuning accessible and fast
- Modal - For serverless GPU infrastructure
- Hugging Face - For model hosting and Spaces platform
- Google Gemini - For powerful API access
- LangGraph - For agentic orchestration framework
- Gradio - For building the interactive UI effortlessly
Powered by MCP • Unsloth • Modal • Hugging Face • Gemini API

❤️ Like our Space on Hugging Face • 🚀 Try the demo • 📱 Share on LinkedIn