FistalAI / README.md
mahreenfathima's picture
Update README.md
536f160 verified

A newer version of the Gradio SDK is available: 6.1.0

Upgrade
metadata
title: Fistal AI
emoji: ๐Ÿš€
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Finetuning Studio
python_version: 3.11
tags:
  - mcp-in-action-track-enterprise
  - mcp-in-action-track-consumer

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

๐Ÿš€ Fistal AI - Autonomous Fine-Tuning Platform

HF Space Python Modal Gemini Unsloth MCP Gradio Agentic AI 1B-3B Models Evaluation Report

Agentic AI that seamlessly finetunes LLM's with Unsloth and Modal

๐ŸŽฎ Try Demo โ€ข ๐Ÿ“ฑ LinkedIn Post


๐ŸŽฏ What is Fistal AI?

Fistal AI is an autonomous fine-tuning platform that transforms the complex process of training custom language models into a single-click experience. Simply specify your topic, and Fistal handles everything:

  • ๐Ÿค– Synthetic Dataset Generation - Creates high-quality training data using LLMs
  • ๐Ÿ”„ Automatic Data Formatting - Converts to chat/instruction format
  • ๐Ÿ‹๏ธ Serverless Training - Fine-tunes models on Modal's GPU infrastructure
  • ๐Ÿ“Š LLM-as-Judge Evaluation - Validates model performance
  • ๐Ÿค— Hugging Face Deployment - Publishes your model automatically

No ML expertise required. No infrastructure setup. Just results.


โœจ Features

๐ŸŽจ Intuitive Interface

  • Clean Gradio-based web UI hosted on Hugging Face Spaces
  • Real-time training progress with educational insights
  • Automatic Hugging Face integration with one-click model access
  • Direct model upload in native HF format (ready to use immediately)

โšก Blazing Fast Training

  • 3x faster dataset generation with parallel API calls (Gemini)
  • 2x faster training with Unsloth optimization and Modal GPU's
  • 70% less memory usage via 4-bit quantization
  • Training completes in 10-20 minutes for 500 samples

๐Ÿง  Smart Defaults

  • 4-bit quantization for optimal quality/size balance
  • LoRA fine-tuning (updates only 0.1% of parameters)
  • Supports 1B-3B parameter models (Qwen, Llama, Gemma, Phi)
  • Automatic hyperparameter optimization
  • Native HF format upload (no conversion needed)

๐Ÿ”ฌ Quality Assurance

  • LLM-as-judge evaluation system
  • Coherence, relevance, and accuracy testing
  • Comprehensive evaluation reports
  • Real-time monitoring of training metrics

๐Ÿ”Œ MCP-Powered Workflow

  • Agentic orchestration using Model Context Protocol (MCP)
  • 4 specialized MCP tools for end-to-end automation
  • Intelligent decision-making throughout the pipeline
  • Seamless tool coordination for optimal results

โœจ Sponsors:

  • Modal Labs : Seamless T4 GPU access
  • Gemini API : Handles majority of LLM tasks including data generation and agentic control

Watch Demo:

Note: The demo runs only 5 samples for speed, but you can scale it to 2000+ in real use.

Demo Fistal


How Fistal AI Works Behind the Scenes

  • Fistal AI runs on an agentic workflow powered by LangGraph.
  • Instead of a fixed script, an AI agent decides what step to run next.
  • All the actual work (dataset generation, formatting, training, evaluation) is done by MCP tools.
  • The agent just thinks โ†’ MCP tools do the work โ†’ agent continues automatically.

The MCP Server

  • Hosts four tools:

    • generate_json_data โ†’ creates synthetic training data
    • format_json โ†’ converts it to ChatML format
    • finetune_model โ†’ runs Unsloth training on Modal
    • llm_as_judge โ†’ evaluates the trained model
  • Each tool is isolated and safe.

  • Returns clean, structured results that the agent uses.


Pipeline Flow (Step-by-Step)

  • 1. Dataset Generation Agent calls the tool โ†’ LLMs generate 20โ€“500 examples in parallel.

  • 2. Dataset Formatting Agent calls next tool โ†’ raw dataset becomes ChatML/instruction format.

  • 3. Fine-Tuning Agent launches training on Modal using Unsloth + 4-bit QLoRA.

  • 4. Evaluation Agent runs LLM-as-judge โ†’ gets coherence/relevance/accuracy/ROUGE/BLEU scores with evaluate library.

  • 5. Final Output The model and adapters are automatically uploaded to the user's(mahreenfathima) Hugging Face account (based on the HF token provided). Automatic Evaluation Report generated.


๐Ÿ› ๏ธ The 4 MCP Tools

  1. generate_json_data

    • Purpose: Synthetic dataset generation
    • Input: Topic, sample count, task type
    • Process: Parallel API calls to Gemini + Groq with intelligent prompt engineering
    • Output: JSON dataset with diverse, high-quality examples
    • MCP Role: Agent invokes this tool first, receives confirmation, then proceeds
  2. format_json

    • Purpose: Convert raw data to training(ChatML) format
    • Input: Raw JSON dataset path
    • Process: Transforms to chat/instruction format optimized for fine-tuning
    • Output: Formatted dataset ready for training
    • MCP Role: Agent receives dataset path from previous tool, formats it automatically
  3. finetune_model

    • Purpose: Execute serverless training
    • Input: Formatted dataset, model name, hyperparameters
    • Process: Deploys training job to Modal with Unsloth optimization
    • Output: Fine-tuned model weights + training metrics
    • MCP Role: Agent monitors training progress, handles failures, manages GPU resources
    • Internal Functions (executed within Modal):
      • train_with_modal: Runs finetuning process with Unsloth and saves model in Volume
      • upload_to_hf_from_volume: Pushes the trained model weights to Hugging Face Hub repository
  4. llm_as_judge

    • Purpose: Quality evaluation
    • Input: Fine-tuned model path, test cases
    • Process: Generates test prompts, evaluates responses, scores quality
    • Output: Comprehensive evaluation report with metrics
    • MCP Role: Final validation step, agent parses results and presents to user
  • Internal Functions (executed within Modal):
    • evaluate_model: Runs validation metrics on the fine-tuned model during/after training

๐Ÿง  Fistal's Agentic Approach

# Agent makes decisions based on context
agent decides: "User wants Python dataset"
  โ†’ invokes generate_json_data with optimal parameters
  
agent observes: "Dataset generated successfully"
  โ†’ invokes format_json with received path
  
agent monitors: "Training at 50%, loss decreasing"
  โ†’ continues monitoring, adjusts if needed
  
agent validates: "Model trained, run evaluation"
  โ†’ invokes llm_as_judge for quality check

Benefits:

  • ๐ŸŽฏ Intelligent Decision Making: Agent chooses best parameters and strategies
  • ๐Ÿ”„ Error Recovery: Automatically retries failed steps with adjusted parameters
  • ๐Ÿ“Š Context Awareness: Each tool receives relevant context from previous steps
  • ๐Ÿ”’ Security: MCP provides secure tool execution
  • ๐Ÿ”ง Modularity: Tools can be updated independently without breaking the workflow
  • ๐Ÿ“ˆ Scalability: Easy to add new tools (e.g., hyperparameter tuning, multi-GPU training)

๐Ÿ› ๏ธ Tech Stack

Core Technologies

  • Unsloth - 2x faster training, 70% less VRAM
  • Modal - Serverless GPU infrastructure
  • Gradio - Web interface on HF Spaces
  • LangGraph - Agentic workflow orchestration
  • MCP - Tool integration protocol
  • HUGGING FACE - Uploads model into repository with hf tokens

AI Models & APIs

  • Gemini Flash 2.0 - Fast dataset generation
  • Groq (Llama 3.1 70B) - LLM evaluation
  • Hugging Face - Model hosting & deployment
  • 4-bit Quantization - Optimal quality/size balance
  • Native HF Upload - No format conversion needed

๐Ÿ“Š Performance Metrics

Metric Value Details
Dataset Generation 3x faster Parallel processing with API keys
Training Speed 2x faster Unsloth optimization
Memory Usage -70% 4-bit quantization
Training Time 10-20 min For 500 samples on T4 GPU
Model Size ~1-2 GB Native HF format (safetensors)
Parameters Updated 0.1% LoRA efficiency
MCP Tools 4 Autonomous workflow management

๐Ÿ”ง Supported Models & Tasks

Prominent Models (1B-3B Parameters)

  • Qwen/Qwen2.5-1.5B-Instruct
  • Qwen/Qwen2.5-3B-Instruct
  • meta-llama/Llama-3.2-1B-Instruct
  • meta-llama/Llama-3.2-3B-Instruct
  • google/gemma-2-2b-it
  • microsoft/Phi-3.5-mini-instruct

Popular Task Types

  • text-generation: General text completion and content creation
  • question-answering: Q&A pairs and knowledge retrieval

Output Format

  • Native Hugging Face format (safetensors + adapter weights)
  • Immediately usable with transformers library
  • Compatible with HF Inference API

๐ŸŽฎ Try It Now

๐Ÿš€ Launch Fistal AI Demo

๐Ÿ“ฑ Read LinkedIn Post

Hosted on Hugging Face Spaces - No installation required!


๐Ÿ“ License

This project is licensed under the APACHE License - see the LICENSE file for details.


๐Ÿ™ Acknowledgments

  • Anthropic MCP - For the powerful tool integration protocol
  • Unsloth - For making fine-tuning accessible and fast
  • Modal - For serverless GPU infrastructure
  • Hugging Face - For model hosting and Spaces platform
  • Google Gemini - For powerful API access
  • LangGraph - For agentic orchestration framework
  • Gradio - For building the interactive UI effortlessly

Powered by MCP โ€ข Unsloth โ€ข Modal โ€ข Hugging Face โ€ข Gemini API

โค๏ธ Like our space our HuggingFace โ€ข ๐Ÿš€ Try the demo โ€ข ๐Ÿ“ฑ Share on LinkedIn