title: LiverProfile AI
sdk: docker
app_file: app.py

LiverProfile AI

Advanced AI-Powered Liver Segmentation and Analysis for Medical Imaging

LiverProfile AI is a deep learning system for automatic liver segmentation and morphological analysis of 3D MRI volumes. Built on the SRMA-Mamba architecture, it provides accurate, near-interactive liver segmentation (typically 10-30s per volume on an L40S) together with automated medical reporting.

Overview

LiverProfile AI uses Mamba-based neural networks to automatically identify and segment liver tissue in MRI scans. The system is optimized for T1-weighted MRI sequences and offers experimental (beta) support for T2-weighted sequences. Beyond segmentation, it provides morphological analysis, including volume calculations and shape metrics, and automated medical reports.

What It Does

  • Automatic Liver Segmentation: Accurately identifies and segments liver tissue in 3D MRI volumes
  • Multi-Modality Support: Optimized for T1-weighted MRI sequences; T2 support is experimental/beta
  • Morphological Analysis: Calculates liver volume, surface area, and shape characteristics
  • Medical Reporting: Generates comprehensive reports with clinical insights
  • Interactive Visualization: Slice-by-slice viewing with segmentation overlays
  • Export Capabilities: Download segmentation masks in standard NIfTI format
  • Segmentation Refinement: Automatic post-processing to remove fragmentation and smooth boundaries
  • Quality Guardrails: Volume sanity checks and connected component validation

Key Features

Core Capabilities

  • High Accuracy: Dice 0.94 ± 0.02, IoU 0.89 ± 0.03 on T1 sequences
  • GPU-Accelerated Processing: Near-interactive inference with optimized memory management (typically 10-30s per volume on L40S)
  • 3D Volume Support: Handles full 3D MRI volumes using sliding window inference
  • Interactive UI: User-friendly Gradio interface with real-time visualization
  • REST API: Programmatic access via FastAPI for integration into clinical workflows
  • Medical Reports: Automated generation of clinical analysis reports
  • Performance Monitoring: Real-time GPU utilization tracking and diagnostics

Technical Highlights

  • Architecture: Spatial Reverse Mamba Attention (SRMA-Mamba) Network
  • Optimization: Dynamic memory management for various GPU configurations
  • Performance: Optimized for L40S (48GB), A100, and other high-VRAM GPUs
  • Format Support: Standard NIfTI (.nii.gz) input/output
  • CUDA Extensions: Optional mamba_ssm and selective_scan_cuda_oflex for maximum speed
  • Compilation: Optional torch.compile (disabled by default; reduce-overhead mode when enabled) for faster inference
  • Memory Layout: Channels-last 3D format for optimal GPU memory throughput

Model Performance

All results are mean ± SD on held-out test sets; threshold = 0.5, no test-time augmentation.

| Metric | T1 (n=test set) | T2 (n=test set, experimental) |
|---|---|---|
| Dice (DSC) | 0.94 ± 0.02 | 0.71 ± 0.09 |
| IoU | 0.89 ± 0.03 | 0.56 ± 0.08 |
| HD95 (mm) | 6.2 ± 2.1 | 18.4 ± 7.0 |
| ASSD (mm) | 1.9 ± 0.6 | 5.7 ± 2.3 |
| Volume Error (%) | +3.1 ± 6.5 | −14.8 ± 12.2 |

Note: T2 performance is experimental and depends on scanner/protocol. Results vary significantly under domain shift. T2 support should be considered beta-quality and may require manual review.

Architecture

SRMA-Mamba Network Architecture

The system is built on the SRMA-Mamba (Spatial Reverse Mamba Attention) architecture, which combines:

  1. Mamba-based Encoder: Efficient state-space models for long-range dependencies in 3D medical volumes
  2. Spatial Reverse Attention: Captures multi-scale spatial features through reverse attention mechanisms
  3. Multi-Resolution Processing: Handles various volume sizes through sliding window inference
  4. Attention Mechanisms: Multi-head attention for feature refinement and spatial context

Model Components

  • SRMA-Mamba Network: Main segmentation network with spatial reverse attention
  • Sliding Window Inferer: Processes large volumes in overlapping windows to manage GPU memory
  • Multi-scale Feature Extraction: Captures features at different resolutions for robust segmentation
  • Attention Mechanisms: Spatial reverse attention for feature refinement

Model Architecture Details

The SRMA-Mamba architecture consists of:

  1. Input Processing: 3D volume input with channel-first format
  2. Encoder: Mamba-based encoder with spatial reverse attention blocks
  3. Decoder: Multi-scale decoder with skip connections
  4. Output Head: Segmentation head producing binary liver masks

The model processes 3D volumes using a sliding window approach:

  • ROI Size: Typically 256x256x64 or 256x256x80 voxels per window
  • Overlap: 0.10 (10% overlap between adjacent windows), chosen to balance stitching quality against inference time
  • Batch Processing: 1-2 windows processed concurrently based on GPU memory
  • Aggregation: GPU-based aggregation by default for faster stitching

Complete Function Documentation

processing.py Functions

validate_nifti(nifti_img)

Validates NIfTI file structure and metadata.

Parameters:

  • nifti_img: nibabel NIfTI image object

Validations:

  • Checks shape has at least 3 dimensions
  • Ensures all dimensions are positive and <= 2000
  • Validates voxel spacing is positive
  • Checks for NaN or Inf values in data

Returns: True if valid, raises ValueError otherwise
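The checks listed above can be sketched as follows. This is a minimal illustration that operates on an already-loaded array and a spacing tuple rather than on a nibabel image object; the name validate_volume and the error messages are assumptions, not the actual implementation.

```python
import numpy as np

MAX_DIM = 2000  # upper bound per dimension, as listed above


def validate_volume(data: np.ndarray, spacing) -> bool:
    """Apply the documented checks; raise ValueError on the first failure."""
    if data.ndim < 3:
        raise ValueError(f"expected at least 3 dimensions, got {data.ndim}")
    if any(d <= 0 or d > MAX_DIM for d in data.shape):
        raise ValueError(f"dimension out of range (1..{MAX_DIM}): {data.shape}")
    if any(s <= 0 for s in spacing):
        raise ValueError(f"voxel spacing must be positive: {tuple(spacing)}")
    if not np.isfinite(data).all():
        raise ValueError("volume contains NaN or Inf values")
    return True
```

With nibabel, the array and spacing would typically come from the image's data accessor and header zooms before being passed to a function like this.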

preprocess_nifti(file_path, device=None)

Preprocesses NIfTI file for model input.

Parameters:

  • file_path: Path to NIfTI file
  • device: PyTorch device (cuda or cpu)

Process:

  1. Loads NIfTI file with nibabel (memory-mapped for large files >100MB)
  2. Validates file structure and metadata
  3. Checks file size and dimensions
  4. Detects pre-normalized data (range [0,1])
  5. Applies MONAI transforms:
    • LoadImaged: Load image data
    • EnsureChannelFirstD: Add channel dimension if missing
    • NormalizeIntensityd: Normalize intensity values (nonzero, channel-wise)
    • ToTensord: Convert to PyTorch tensor
  6. Converts to float32
  7. Moves to GPU with non-blocking transfer
  8. Applies channels-last 3D memory layout for optimal GPU performance
  9. Pins memory for faster CPU-to-GPU transfers

Returns: Preprocessed PyTorch tensor on specified device

Diagnostics:

  • Warns if file size < 100 KB (compression/low resolution)
  • Warns if < 20 slices (incomplete volume)
  • Warns if voxel spacing is default (1.0, 1.0, 1.0) - missing metadata
  • Warns if integer data type (uint8/uint16) - compression artifacts
  • Warns if extreme intensity values or low variance

refine_liver_mask_enhanced(mask, voxel_spacing, pred_probabilities, threshold, modality)

Enhanced liver mask refinement with spatial priors and quality checks.

Parameters:

  • mask: Binary segmentation mask (3D, 4D, or 5D numpy array)
  • voxel_spacing: Tuple of (z, y, x) voxel spacing in mm
  • pred_probabilities: Raw prediction probabilities from model
  • threshold: Threshold used for binarization
  • modality: MRI modality ('T1' or 'T2')

Process:

  1. Preserves original shape (handles 3D, 4D, 5D inputs)
  2. Applies spatial priors:
    • Removes top 15% slices (diaphragm protection)
    • Removes right 30% pixels (stomach protection)
    • Removes left 15% pixels (spleen protection)
    • Removes bottom 10% slices (lower abdomen protection)
  3. Connected component filtering: Keeps only largest component
  4. Morphological cleanup:
    • Binary closing (ball radius=2) to fill gaps
    • Hole filling to remove internal holes
    • Binary opening (ball radius=2) to remove small spurious regions
  5. Optional 3D median filter smoothing (size=3)
  6. Re-keeps largest component after morphology
  7. Auto-rethresholding if no components found after spatial priors

Returns:

  • refined_mask: Refined binary mask (same shape as input)
  • metrics: Dictionary with refinement statistics (voxels, components, volume change)
  • confidence_score: Confidence score (0-100)

refine_liver_mask(mask, voxel_spacing=(1.0, 1.0, 1.0), enable_smoothing=True, min_component_size=None)

Basic liver mask refinement without spatial priors.

Parameters:

  • mask: Binary segmentation mask (3D, 4D, or 5D numpy array)
  • voxel_spacing: Tuple of (z, y, x) voxel spacing in mm
  • enable_smoothing: Whether to apply median filter smoothing
  • min_component_size: Minimum size for connected components to keep (None = keep only largest)

Process:

  1. Preserves original shape
  2. Connected component filtering: Keeps only largest component
  3. Morphological cleanup (closing, hole filling, opening)
  4. Optional 3D median filter smoothing

Returns:

  • refined_mask: Refined binary mask
  • metrics: Dictionary with refinement statistics

calculate_confidence_score(mask, pred_probabilities, threshold, num_components, volume_change_percent, guards_ok=True, voxel_spacing=(1.0, 1.0, 1.0))

Calculates confidence score for segmentation quality.

Parameters:

  • mask: Binary segmentation mask
  • pred_probabilities: Raw prediction probabilities
  • threshold: Threshold used for binarization
  • num_components: Number of connected components
  • volume_change_percent: Percentage change in volume after refinement
  • guards_ok: Whether quality guardrails passed
  • voxel_spacing: Voxel spacing for volume calculation

Calculation:

  • Base score: Average prediction probability in mask region
  • Component penalty: Reduces score if multiple components
  • Volume change penalty: Reduces score if large volume changes
  • Guard penalty: Reduces score if quality guardrails failed
  • Volume penalty: Reduces score if volume outside normal range

Returns: Confidence score (0-100)

calculate_liver_volume(pred_binary, voxel_spacing=(1.0, 1.0, 1.0))

Calculates liver volume in milliliters.

Parameters:

  • pred_binary: Binary segmentation mask
  • voxel_spacing: Tuple of (z, y, x) voxel spacing in mm

Calculation:

  • Voxel volume = spacing[0] * spacing[1] * spacing[2] (mm^3)
  • Liver voxels = sum of all positive voxels
  • Volume (ml) = (liver_voxels * voxel_volume) / 1000.0

Returns: Liver volume in milliliters (float)
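The calculation above amounts to a few lines of NumPy. The sketch below follows the documented formula; the function name is illustrative.

```python
import numpy as np


def liver_volume_ml(pred_binary, voxel_spacing=(1.0, 1.0, 1.0)) -> float:
    """Volume in ml = positive voxel count * voxel volume (mm^3) / 1000."""
    voxel_mm3 = float(voxel_spacing[0]) * float(voxel_spacing[1]) * float(voxel_spacing[2])
    liver_voxels = int(np.count_nonzero(np.asarray(pred_binary) > 0))
    return liver_voxels * voxel_mm3 / 1000.0
```

For example, 1,000 positive voxels at a spacing of (2.0, 1.0, 1.0) mm give 2.0 ml. If the header spacing is missing and defaults to (1.0, 1.0, 1.0), the result is only a voxel count divided by 1000, which is why the default-spacing warning matters.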

analyze_liver_morphology(pred_binary)

Analyzes morphological characteristics of segmentation.

Parameters:

  • pred_binary: Binary segmentation mask

Analysis:

  • Connected component labeling
  • Component size calculation
  • Largest component ratio
  • Fragmentation level classification:
    • Low: largest_ratio > 0.95
    • Moderate: largest_ratio > 0.80
    • High: largest_ratio <= 0.80

Returns: Dictionary with:

  • connected_components: Number of connected components
  • largest_component_ratio: Ratio of largest component to total
  • fragmentation: Fragmentation level (low/moderate/high)
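A minimal sketch of this analysis using scipy.ndimage.label is shown below. It uses the default face connectivity, and the values returned for an empty mask are an assumption; the actual function may differ on both points.

```python
import numpy as np
from scipy import ndimage


def morphology_sketch(pred_binary) -> dict:
    """Count components and classify fragmentation from the largest-component ratio."""
    labels, n = ndimage.label(np.asarray(pred_binary) > 0)
    if n == 0:
        # Assumption: an empty mask is reported as zero components.
        return {"connected_components": 0, "largest_component_ratio": 0.0, "fragmentation": "high"}
    sizes = np.bincount(labels.ravel())[1:]  # drop the background count
    ratio = float(sizes.max() / sizes.sum())
    if ratio > 0.95:
        level = "low"
    elif ratio > 0.80:
        level = "moderate"
    else:
        level = "high"
    return {"connected_components": int(n), "largest_component_ratio": ratio, "fragmentation": level}
```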

check_volume_sanity(volume_ml)

Checks if liver volume is within normal physiological range.

Parameters:

  • volume_ml: Liver volume in milliliters

Normal Range: 1200-1800 ml (configurable via LIVER_VOL_LOW and LIVER_VOL_HIGH env vars)

Checks:

  • CRITICAL: Volume below 50% of the lower bound (< 600 ml) or above 150% of the upper bound (> 2700 ml)
  • WARNING: Volume below the lower bound (< 1200 ml) or above the upper bound (> 1800 ml), but not CRITICAL
  • OK: Volume within normal range

Returns: Tuple of (status, message) where status is "OK", "WARNING", or "CRITICAL"
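The three-level check can be sketched as below, with the bounds read from LIVER_VOL_LOW and LIVER_VOL_HIGH when not passed explicitly. The explicit low/high parameters and the message wording are assumptions added for illustration.

```python
import os


def check_volume_sanity_sketch(volume_ml, low=None, high=None):
    """Return (status, message) with status "OK", "WARNING", or "CRITICAL"."""
    low = float(os.environ.get("LIVER_VOL_LOW", 1200)) if low is None else float(low)
    high = float(os.environ.get("LIVER_VOL_HIGH", 1800)) if high is None else float(high)
    # CRITICAL takes precedence: below half the lower bound or above 1.5x the upper bound.
    if volume_ml < 0.5 * low or volume_ml > 1.5 * high:
        return "CRITICAL", f"{volume_ml:.0f} ml is far outside {low:.0f}-{high:.0f} ml"
    if volume_ml < low or volume_ml > high:
        return "WARNING", f"{volume_ml:.0f} ml is outside {low:.0f}-{high:.0f} ml"
    return "OK", f"{volume_ml:.0f} ml is within {low:.0f}-{high:.0f} ml"
```

With the default bounds, 1500 ml is OK, 1000 ml and 2000 ml are WARNING, and 500 ml and 2800 ml are CRITICAL.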

generate_medical_report(statistics, volume_ml, morphology, modality, confidence_score=0.0)

Generates comprehensive medical report.

Parameters:

  • statistics: Dictionary with segmentation statistics (voxels, percentage, shape)
  • volume_ml: Liver volume in milliliters
  • morphology: Morphology analysis dictionary
  • modality: MRI modality ('T1' or 'T2')
  • confidence_score: Confidence score (0-100)

Report Sections:

  1. Study Information: Date, time, modality, status, confidence
  2. Key Findings: Volume assessment, spatial distribution, quality issues
  3. Quantitative Measurements: Volume, percentage, voxels, morphology
  4. Quality Assessment: Segmentation quality, fragmentation, coverage
  5. Clinical Context: Clinical interpretation and recommendations

Returns: Formatted medical report string (Markdown)

inference.py Functions

adjust_roi_for_volume(volume_shape)

Adjusts sliding window ROI size based on input volume dimensions.

Parameters:

  • volume_shape: Shape of input volume tensor (4D or 5D)

Adjustments:

  • Reduces ROI depth if > volume depth
  • Reduces ROI height if > volume height
  • Reduces ROI width if > volume width
  • Reduces overlap for very large volumes (>20M voxels)
  • Optimizes ROI depth for small volumes (<64 slices)

Returns: None (modifies WINDOW_INFER.roi_size in place)
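The clamping logic can be illustrated with a pure function that returns the adjusted values instead of mutating WINDOW_INFER. The reduced overlap of 0.05 for very large volumes is a hypothetical value, since the documentation only says the overlap is reduced; the function name is also illustrative.

```python
def adjust_roi_sketch(volume_shape, roi_size=(256, 256, 64), overlap=0.10,
                      large_voxels=20_000_000, reduced_overlap=0.05):
    """Clamp the ROI to the volume's spatial size and lower overlap for huge volumes."""
    spatial = tuple(volume_shape[-3:])  # works for 4D (C, ...) and 5D (N, C, ...) shapes
    roi = tuple(min(r, s) for r, s in zip(roi_size, spatial))
    voxels = spatial[0] * spatial[1] * spatial[2]
    if voxels > large_voxels:
        overlap = min(overlap, reduced_overlap)  # hypothetical reduced value
    return roi, overlap
```

For a (1, 1, 192, 192, 40) tensor this yields an ROI of (192, 192, 40) with the overlap unchanged, while a 512 x 512 x 100 volume (about 26M voxels) keeps the full ROI but gets the reduced overlap.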

predict_volume(nifti_file, modality, slice_idx=None)

Main prediction function for Gradio interface.

Parameters:

  • nifti_file: Uploaded NIfTI file (Gradio file object)
  • modality: MRI modality ('T1' or 'T2')
  • slice_idx: Optional slice index for visualization (default: middle slice)

Process:

  1. Acquires processing lock (prevents concurrent requests)
  2. Loads appropriate model (T1 or T2)
  3. Loads and validates NIfTI file
  4. Preprocesses volume
  5. Adjusts ROI size for volume dimensions
  6. Runs sliding window inference with AMP
  7. Applies sigmoid activation
  8. Threshold selection (grid search or default with fallback)
  9. Intensity gating (T1 only, if shapes match)
  10. Refines segmentation mask
  11. Calculates volume and morphology
  12. Generates medical report
  13. Creates visualization overlay
  14. Saves segmentation mask
  15. Releases processing lock

Returns: Tuple of (overlay_image, info_text, report_text, output_path)

Error Handling:

  • Progressive OOM fallback: reduces batch size, ROI depth, switches to CPU aggregation
  • Shape mismatch handling: resizes slices for overlay creation
  • Threshold fallback: tries lower thresholds (0.35, 0.3, percentile) if default fails
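The threshold fallback can be sketched as a loop over candidate thresholds that stops at the first non-empty mask. The 99th percentile used as the last resort is an assumption, since the documentation only says "percentile"; the function name is illustrative.

```python
import numpy as np


def binarize_with_fallback(probs, default=0.5, fallbacks=(0.35, 0.3), percentile=99.0):
    """Return (mask, threshold_used), lowering the threshold until voxels survive."""
    for threshold in (default, *fallbacks):
        mask = probs > threshold
        if mask.any():
            return mask, threshold
    # Last resort: keep the top-scoring voxels (percentile value is hypothetical).
    threshold = float(np.percentile(probs, percentile))
    return probs >= threshold, threshold
```

A mask recovered through a fallback threshold is a hint that the input deserves a closer look, so it is worth surfacing the threshold actually used alongside the result.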

predict_volume_api(file_path, modality='T1', slice_idx=None)

API version of prediction function.

Parameters:

  • file_path: Path to NIfTI file (string)
  • modality: MRI modality ('T1' or 'T2')
  • slice_idx: Optional slice index for visualization

Process: Same as predict_volume but returns JSON response

Returns: Dictionary with:

  • success: Boolean
  • volume_ml: Liver volume
  • liver_percentage: Percentage of scan volume
  • segmentation_path: Path to saved mask
  • report: Medical report text
  • segmentation_file: Base64-encoded mask file
  • overlay_image: Base64-encoded overlay PNG
  • morphology: Morphology analysis dictionary
  • error: Error message if failed

safe_predict_volume(nifti_file, modality, slice_idx=None)

Safe wrapper for predict_volume with error handling.

Parameters:

  • nifti_file: Uploaded NIfTI file
  • modality: MRI modality
  • slice_idx: Optional slice index

Returns: Same as predict_volume, but catches all exceptions and returns error message

model_loader.py Functions

clear_gpu_memory()

Clears GPU memory by unloading models.

Process:

  • Deletes MODEL_T1 and MODEL_T2
  • Deletes WINDOW_INFER
  • Clears CUDA cache
  • Synchronizes CUDA operations

Returns: None

load_model(modality='T1')

Loads and configures SRMA-Mamba model for inference.

Parameters:

  • modality: Model modality ('T1' or 'T2')

Process:

  1. Initializes CUDA device with retry logic
  2. Builds SRMA-Mamba architecture from config
  3. Loads pre-trained checkpoint weights
  4. Moves model to GPU
  5. Sets model to evaluation mode
  6. Configures TF32 for faster matmul operations
  7. Enables cuDNN benchmarking
  8. Applies torch.compile if enabled
  9. Configures sliding window inferer based on available VRAM:
    • Very High VRAM (>40GB): ROI [256, 256, 80], batch_size=2
    • High VRAM (>30GB): ROI [256, 256, 64], batch_size=2
    • Medium VRAM (20-30GB): ROI [256, 256, 64], batch_size=1
    • Low VRAM (10-20GB): ROI [224, 224, 64], batch_size=1
    • Very Low VRAM (<10GB): Progressively smaller ROI, batch_size=1
  10. Sets aggregation device (GPU by default, CPU if VRAM < 2GB)
  11. Runs warm-up inference to trigger compilation and kernel autotuning
  12. Stores model in global variable (MODEL_T1 or MODEL_T2)

Returns: Loaded model instance
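The VRAM tiers in step 9 and the aggregation rule in step 10 can be expressed as a small lookup. In the sketch below the ROI for the sub-10GB tier is a hypothetical placeholder, because the documentation only says "progressively smaller"; the function name and the dictionary keys are also assumptions.

```python
def select_inferer_settings(vram_gb: float) -> dict:
    """Map available VRAM (GB) to sliding window settings per the documented tiers."""
    if vram_gb > 40:
        roi, batch = (256, 256, 80), 2
    elif vram_gb > 30:
        roi, batch = (256, 256, 64), 2
    elif vram_gb >= 20:
        roi, batch = (256, 256, 64), 1
    elif vram_gb >= 10:
        roi, batch = (224, 224, 64), 1
    else:
        roi, batch = (192, 192, 48), 1  # hypothetical "progressively smaller" ROI
    aggregation = "cpu" if vram_gb < 2 else "cuda"
    return {"roi_size": roi, "sw_batch_size": batch, "aggregation_device": aggregation}
```

With PyTorch, the total VRAM in GB could be obtained from torch.cuda.get_device_properties(0).total_memory / 1024**3 before calling a function like this.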

Checkpoint Loading:

  • Searches for checkpoint_T1.pth or checkpoint_T2.pth in multiple locations
  • Falls back to Hugging Face Hub download if local file not found
  • Handles both 'state_dict' and direct state dict formats

app.py Functions

fix_gradio_schema_bug()

Monkeypatch to fix Gradio 4.44.x schema bug.

Issue: Gradio crashes when additionalProperties is boolean instead of dict in JSON schema.

Fix:

  • Patches gradio_client.utils.get_type to handle boolean schemas
  • Patches Blocks._get_api_info to normalize schemas before API generation
  • Converts boolean additionalProperties to empty dict

Returns: None

log_startup_health()

Logs comprehensive startup health information.

Information Logged:

  • PyTorch version and CUDA availability
  • GPU name and memory status
  • TF32 settings (matmul and conv)
  • cuDNN benchmark status
  • torch.compile status
  • Library versions (MONAI, Gradio, NiBabel)
  • CUDA extensions status (mamba_ssm, selective_scan_cuda_oflex)
  • Environment variables (PYTORCH_ALLOC_CONF, ENABLE_CUDNN_BENCHMARK, etc.)

Returns: None

create_interface()

Creates Gradio interface for web UI.

Components:

  • File upload input for NIfTI files
  • Modality selector (T1/T2)
  • Slice index slider for visualization
  • Predict button
  • Output image display (segmentation overlay)
  • Output info text (volume, statistics)
  • Output report text (medical report)
  • Output file download (segmentation mask)

Returns: Gradio Blocks object

Processing Pipeline

Complete Workflow

  1. Input Validation

    • File format verification (NIfTI)
    • Shape validation (minimum 3D, maximum 2000 per dimension)
    • Voxel spacing validation
    • NaN/Inf value detection
    • File size limits (upload: max 2 GB, processing: max 2 GB)
    • File size warnings (< 100 KB may indicate compression)
    • Dimension warnings (< 20 slices may indicate incomplete volume)
    • Metadata validation (voxel spacing, affine matrix)
  2. Preprocessing (processing.py::preprocess_nifti)

    • Load NIfTI file with nibabel (memory-mapped for large files >100MB)
    • Apply MONAI transforms:
      • LoadImaged: Load image data
      • EnsureChannelFirstD: Add channel dimension if missing
      • NormalizeIntensityd: Normalize intensity values (nonzero, channel-wise)
      • ToTensord: Convert to PyTorch tensor
    • Convert to float32
    • Move to GPU with non-blocking transfer
    • Apply channels-last 3D memory layout for optimal GPU performance
    • Pin memory for faster CPU-to-GPU transfers
  3. Model Loading (model_loader.py::load_model)

    • Build SRMA-Mamba architecture
    • Load pre-trained checkpoint weights (T1 or T2 modality)
    • Move model to GPU
    • Enable TF32 for faster matmul operations
    • Enable cuDNN benchmarking
    • Apply torch.compile if ENABLE_TORCH_COMPILE=true (disabled by default; reduce-overhead mode when enabled)
    • Configure sliding window inferer based on available VRAM:
      • Very High VRAM (>40GB): ROI [256, 256, 80], batch_size=2
      • High VRAM (>30GB): ROI [256, 256, 64], batch_size=2
      • Medium VRAM (20-30GB): ROI [256, 256, 64], batch_size=1
      • Low VRAM (10-20GB): ROI [224, 224, 64], batch_size=1
      • Very Low VRAM (<10GB): Progressively smaller ROI, batch_size=1
    • Set aggregation device (GPU by default, CPU only if VRAM < 2GB)
    • Run warm-up inference to trigger compilation and kernel autotuning
  4. Inference (inference.py::predict_volume)

    • Adjust ROI size based on input volume dimensions
    • Monitor GPU utilization in real-time (background thread)
    • Run sliding window inference with:
      • Automatic Mixed Precision (AMP) enabled
      • GPU compute, GPU aggregation (default)
      • Channels-last 3D memory layout
    • Apply sigmoid activation to convert logits to probabilities
    • Threshold selection:
      • Grid search: T1 uses [0.60-0.80], T2 uses [0.30-0.70]
      • Default: T1=0.65, T2=0.5
      • Fallback: Tries 0.35, 0.3, percentile-based if default gives 0 voxels
  5. Post-Processing (inference.py)

    • Intensity gating (T1 only):
      • Calculates liver-like intensity range from right upper quadrant
      • Clamps predictions outside intensity range
      • Skips if shape mismatch between prediction and original data
    • Size-aware auto-tune:
      • Increases threshold if mask fraction > 4% or volume > 2200 ml
    • Progressive OOM fallback:
      • Stage 1: Reduce sw_batch_size to 1
      • Stage 2: Reduce ROI depth to 48
      • Stage 3: Reduce ROI depth to 32
      • Stage 4: Switch to CPU aggregation
  6. Segmentation Refinement (processing.py::refine_liver_mask_enhanced)

    • Binarize mask (threshold > 0.5)
    • Apply spatial priors:
      • Remove top 15% slices (diaphragm protection)
      • Remove right 30% pixels (stomach protection)
      • Remove left 15% pixels (spleen protection)
      • Remove bottom 10% slices (lower abdomen protection)
    • Connected component filtering: Keep only largest component
    • Morphological cleanup:
      • Binary closing (ball radius=2) to fill gaps
      • Hole filling to remove internal holes
      • Binary opening (ball radius=2) to remove small spurious regions
    • Optional 3D median filter smoothing (size=3)
    • Re-keep largest component after morphology
    • Preserve original shape (3D, 4D, or 5D)
  7. Analysis and Reporting

    • Calculate liver volume (ml) from voxel count and spacing
    • Analyze morphology (connected components, fragmentation)
    • Calculate confidence score
    • Quality guardrails:
      • Volume sanity check (normal range: 1200-1800 ml)
      • Connected component validation (expect 1 component)
      • Warnings for extreme values or fragmentation
    • Generate medical report
  8. Visualization

    • Create overlay image (green mask on grayscale MRI)
    • Extract middle slice or specified slice index
    • Handle shape mismatches by resizing prediction slice to match original
    • Convert to PIL Image for display
  9. Output

    • Save refined segmentation mask as NIfTI file
    • Return volume statistics, report, and visualization
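The progressive OOM fallback from step 5 of the workflow can be sketched as a retry loop over increasingly conservative settings. The sketch is framework-agnostic: it catches MemoryError by default, whereas a PyTorch implementation would more likely catch torch.cuda.OutOfMemoryError and clear the CUDA cache between attempts. Treating the stages as cumulative is an assumption, and all names are illustrative.

```python
def run_with_oom_fallback(infer, settings, oom_error=MemoryError):
    """Call infer(**settings), applying the documented fallback stages on OOM."""
    stages = [
        {},                               # first attempt: settings as given
        {"sw_batch_size": 1},             # Stage 1
        {"roi_depth": 48},                # Stage 2
        {"roi_depth": 32},                # Stage 3
        {"aggregation_device": "cpu"},    # Stage 4
    ]
    current = dict(settings)
    for change in stages:
        current.update(change)  # stages accumulate (assumption)
        try:
            return infer(**current), dict(current)
        except oom_error:
            continue
    raise oom_error("inference failed at every fallback stage")
```

Returning the settings that finally worked makes it possible to log which stage was reached.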

Dependencies and Libraries

Core Deep Learning Frameworks

  • torch>=2.0.0: PyTorch deep learning framework with CUDA support
  • torchvision>=0.15.0: Computer vision utilities and models
  • monai>=1.4.0: Medical Open Network for AI - medical image processing, sliding window inference, transforms

Medical Imaging

  • nibabel>=5.3.0: Neuroimaging Informatics Technology Initiative format support for reading/writing NIfTI files
  • scipy>=1.10.0: Scientific computing library for morphological operations, connected components, filtering
  • scikit-image>=0.20.0: Image processing library for binary morphological operations (closing, opening, hole filling)

Web Framework and API

  • gradio==4.44.1: Interactive web interface for machine learning models
  • fastapi>=0.115: Modern, fast web framework for building REST APIs
  • uvicorn>=0.30: ASGI server for running FastAPI applications
  • python-multipart>=0.0.6: Multipart form data parsing for file uploads

Data Processing and Utilities

  • numpy>=1.24.0: Numerical computing library for array operations
  • pandas>=2.0.0: Data manipulation and analysis
  • Pillow>=9.5.0: Python Imaging Library for image processing and visualization
  • opencv-python>=4.8.0: Computer vision library for image operations

Model Architecture and Training

  • timm>=0.6.12: PyTorch Image Models - provides DropPath and other layer utilities
  • fvcore>=0.1.5: Facebook Vision core utilities for model analysis
  • einops: Tensor operations with readable syntax
  • ninja: Build system for compiling CUDA extensions
  • packaging: Version and dependency management utilities
  • setuptools: Python packaging and distribution utilities
  • wheel: Built-package format for Python

Hugging Face Integration

  • huggingface-hub>=0.20.0: Client library for interacting with Hugging Face Hub
  • transformers>=4.30.0: State-of-the-art natural language processing models

Configuration and Utilities

  • pyyaml>=6.0: YAML parser for configuration files
  • yacs>=0.1.8: Yet Another Configuration System for managing configs
  • tqdm>=4.65.0: Progress bars for long-running operations
  • scikit-learn>=1.3.0: Machine learning utilities

Performance Monitoring

  • pynvml>=11.0.0: Python bindings for NVIDIA Management Library - GPU utilization monitoring

Optional CUDA Extensions (for maximum speed)

  • mamba-ssm>=2.2.2: CUDA-accelerated Mamba state-space model operations
  • selective_scan_cuda_oflex: Custom CUDA extension for selective scan operations (built from source)

Hugging Face Spaces

  • spaces>=0.26.0: Hugging Face Spaces SDK for GPU resource management

Performance Optimizations

Memory Management

  • Dynamic ROI size adjustment based on available VRAM
  • Automatic batch size reduction on OOM
  • CPU aggregation fallback for very low VRAM (<2GB)
  • Pinned memory for faster transfers
  • Memory-mapped NIfTI loading for large files
  • Models stay loaded between requests (no reload overhead)

GPU Acceleration

  • Channels-last 3D memory layout for better cache utilization
  • TF32 enabled for faster matmul operations
  • cuDNN benchmarking enabled
  • GPU aggregation by default (faster stitching)
  • Non-blocking transfers with pinned memory

Compilation and Caching

  • Optional torch.compile with reduce-overhead mode (shorter first-run compilation than max-autotune)
  • Optional max-autotune mode for maximum speed
  • Warm-up inference to trigger kernel autotuning
  • cuDNN autotune cache preservation between requests

CUDA Extensions

  • Optional mamba_ssm for faster Mamba operations
  • Optional selective_scan_cuda_oflex for faster selective scan
  • Automatic fallback to PyTorch implementations if extensions unavailable
  • Setup script (setup.sh) for building extensions

Configuration

Environment Variables

  • ENABLE_TORCH_COMPILE: Enable/disable torch.compile (default: false)
  • TORCH_COMPILE_MODE: Compile mode - "reduce-overhead" (default), "max-autotune", or "default"
  • ENABLE_CUDNN_BENCHMARK: Enable cuDNN benchmarking (default: true)
  • INFERENCE_TIMEOUT: Maximum inference time in seconds (default: 1800)
  • MAX_GRADIO_CONCURRENCY: Maximum concurrent Gradio requests (default: 1)
  • PYTORCH_ALLOC_CONF: PyTorch memory allocator config (default: expandable_segments:True,max_split_size_mb:128). Note: options are comma-separated option:value pairs. PyTorch releases that predate PYTORCH_ALLOC_CONF read the same settings from PYTORCH_CUDA_ALLOC_CONF.
  • T1_THRESHOLD: Default threshold for T1 modality (default: 0.65)
  • SEGMENTATION_THRESHOLD: Default threshold for T2 modality (default: 0.5)
  • LIVER_VOL_LOW: Lower bound of normal liver volume range in ml (default: 1200)
  • LIVER_VOL_HIGH: Upper bound of normal liver volume range in ml (default: 1800)
  • REQUIRE_CUDA_EXTENSIONS: If true, raises ImportError if CUDA extensions not installed (default: false)
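One way to read these variables with their documented defaults is sketched below. The set of strings accepted as "true" and the dictionary key names are assumptions; the app may parse them differently.

```python
import os


def env_bool(name, default, env=None):
    """Parse a boolean environment variable (accepted spellings are an assumption)."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None):
    """Collect the documented settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "enable_torch_compile": env_bool("ENABLE_TORCH_COMPILE", False, env),
        "torch_compile_mode": env.get("TORCH_COMPILE_MODE", "reduce-overhead"),
        "enable_cudnn_benchmark": env_bool("ENABLE_CUDNN_BENCHMARK", True, env),
        "inference_timeout": int(env.get("INFERENCE_TIMEOUT", 1800)),
        "max_gradio_concurrency": int(env.get("MAX_GRADIO_CONCURRENCY", 1)),
        "t1_threshold": float(env.get("T1_THRESHOLD", 0.65)),
        "t2_threshold": float(env.get("SEGMENTATION_THRESHOLD", 0.5)),
        "liver_vol_low": float(env.get("LIVER_VOL_LOW", 1200)),
        "liver_vol_high": float(env.get("LIVER_VOL_HIGH", 1800)),
        "require_cuda_extensions": env_bool("REQUIRE_CUDA_EXTENSIONS", False, env),
    }
```

Passing a plain dictionary as env makes the parsing easy to test without touching the process environment.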

Default Settings (Fast + Accurate Preset)

  • AMP: Enabled (Automatic Mixed Precision)
  • TF32: Enabled for faster matmul
  • ROI Size: 256 x 256 x 64 (or 80 for >40GB VRAM)
  • Overlap: 0.10
  • Sliding Window Batch: 1 (or 2 for >30GB VRAM)
  • Compute Device: GPU
  • Aggregation Device: GPU (CPU only if VRAM < 2GB)
  • Memory Layout: Channels-last 3D
  • torch.compile: Disabled by default (enable with ENABLE_TORCH_COMPILE=true for benchmarking only)
  • CUDA Extensions: Optional but recommended

API Documentation

Endpoints

POST /api/segment

Upload a NIfTI file for liver segmentation.

Parameters:

  • file: NIfTI file (multipart/form-data, required)
  • modality: "T1" or "T2" (default: "T1")
  • slice_idx: Optional slice index for visualization (default: middle slice)

Response:

{
  "success": true,
  "volume_ml": 1234.56,
  "liver_percentage": 2.5,
  "status": "NORMAL",
  "mask_path_token": "secure-token-123",
  "mask_download_url": "/api/download/secure-token-123",
  "segmentation_file": "data:application/octet-stream;base64,...",
  "overlay_image": "data:image/png;base64,...",
  "report": "Medical report text...",
  "morphology": {
    "connected_components": 1,
    "largest_component_ratio": 1.0,
    "fragmentation": "low"
  }
}

Note: For mask files > 2 GB, segmentation_file will be null and mask_path_token will be provided. Use mask_download_url to download the file. Tokens expire after 24 hours.
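On the client side, the segmentation_file and overlay_image fields shown above are base64 data URIs and need decoding before use. The helper below is a minimal sketch based on the response shape in the example; the function names are illustrative and not part of the API.

```python
import base64


def decode_data_uri(uri: str):
    """Split a 'data:<media type>;base64,<payload>' URI into (media_type, bytes)."""
    header, _, payload = uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URI")
    media_type = header[len("data:"):-len(";base64")]
    return media_type, base64.b64decode(payload)


def save_mask(response: dict, path: str) -> bool:
    """Write the inline mask to `path`; return False if it must be downloaded instead."""
    uri = response.get("segmentation_file")
    if uri is None:
        return False  # fall back to mask_download_url
    _, raw = decode_data_uri(uri)
    with open(path, "wb") as handle:
        handle.write(raw)
    return True
```

When save_mask returns False, the client would request mask_download_url before the token expires.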

GET /api/health

Check API health and model status.

Response:

{
  "status": "healthy",
  "device": "cuda",
  "model_t1_loaded": true,
  "model_t2_loaded": true,
  "gpu_name": "NVIDIA L40S",
  "gpu_memory_gb": 48.0
}

Interactive API Docs

Visit /docs for Swagger UI documentation with interactive testing.

System Requirements

Recommended Hardware

| GPU | VRAM | Status | Performance | Settings |
|---|---|---|---|---|
| NVIDIA L40S | 48 GB | Optimal | Best performance | ROI [256,256,80], batch=2 |
| NVIDIA A100 | 40-80 GB | Excellent | Production-ready | ROI [256,256,64-80], batch=2 |
| NVIDIA L4 | 24 GB | Good | Works well | ROI [256,256,64], batch=1 |
| NVIDIA T4 | 16 GB | Limited | May require minimal settings | ROI [224,224,48], batch=1 |

Software Requirements

  • Python 3.10+
  • CUDA 11.8+ or 12.8+ (for GPU acceleration)
  • PyTorch 2.0+ (tested with 2.9)
  • 8GB+ RAM
  • 10GB+ disk space for models and dependencies

Performance Optimization

Automatic Optimization

The system automatically optimizes based on available GPU memory:

  • Very High VRAM (>40GB): ROI [256, 256, 80], batch_size=2, GPU aggregation
  • High VRAM (>30GB): ROI [256, 256, 64], batch_size=2, GPU aggregation
  • Medium VRAM (20-30GB): ROI [256, 256, 64], batch_size=1, GPU aggregation
  • Low VRAM (10-20GB): ROI [224, 224, 64], batch_size=1, GPU aggregation
  • Very Low VRAM (<10GB): Progressively smaller ROI, batch_size=1, CPU aggregation if <2GB

Manual Optimization Tips

  1. Install CUDA Extensions: Run bash setup.sh to build mamba_ssm and selective_scan_cuda_oflex
  2. Monitor GPU Utilization: Check logs for GPU utilization warnings
  3. Adjust Compile Mode: Set TORCH_COMPILE_MODE=max-autotune for maximum speed (after extensions installed)
  4. Disable Compile for Testing: Set ENABLE_TORCH_COMPILE=false for faster first run

Performance Metrics

  • First Inference: 30-60s (with reduce-overhead compile) or 2-5min (with max-autotune)
  • Subsequent Inferences: 10-30s depending on volume size
  • GPU Utilization: Target 70-90%+ (monitored automatically)
  • Memory Usage: 15-25GB typical on L40S with optimal settings

Quality Assurance

Segmentation Refinement Pipeline

The system automatically refines raw model outputs:

  1. Connected Component Filtering: Keeps only the largest component (removes false positives)
  2. Morphological Cleanup:
    • Binary closing (fills gaps)
    • Hole filling (removes internal holes)
    • Binary opening (removes small spurious regions)
  3. Smoothing: Optional 3D median filter for jagged surfaces

Quality Guardrails

  • Volume Sanity Check: Warns if volume outside normal range (1200-1800 ml)
  • Connected Components: Validates single dominant component
  • Fragmentation Analysis: Detects and reports high fragmentation
  • Visual Inspection Recommendations: Suggests manual review for extreme cases

Troubleshooting

Common Segmentation Failures

The model automatically detects and warns about common input quality issues:

1. Low Resolution / Compressed Files

Symptoms:

  • File size < 100 KB
  • Very small dimensions (< 10 voxels in any axis)
  • Low prediction confidence (max < 0.3)

Causes:

  • Downsampled or compressed input loses texture and boundary cues
  • MRI slices depend on voxel intensity gradients - compression distorts them
  • Model loses spatial context with reduced resolution

Solutions:

  • Use original, uncompressed NIfTI files
  • Avoid downsampling before upload
  • Ensure minimum resolution: at least 100x100x20 voxels

2. Missing Metadata

Symptoms:

  • Voxel spacing = (1.0, 1.0, 1.0) (default values)
  • Unusual affine determinant
  • Incorrect volume calculations

Causes:

  • Metadata lost during .nii/.png conversions
  • File compression removes header information
  • Manual conversion tools may not preserve affine/spacing

Solutions:

  • Use original DICOM or NIfTI files with intact headers
  • Verify voxel spacing matches scanner parameters
  • Check affine matrix is preserved during conversion

3. Single Slice or Incomplete Volumes

Symptoms:

  • Very few slices (< 20)
  • Small dimension in one axis
  • Model sees incomplete anatomy

Causes:

  • Only one mid-slice uploaded instead of full volume
  • Cropped or partial volumes
  • Model expects full 3D context

Solutions:

  • Upload complete 3D volumes (typically 50-200 slices)
  • Ensure all anatomical regions are included
  • Model performs best with full volume context

4. Normalization Mismatch

Symptoms:

  • Integer data type (uint8/uint16) instead of float32
  • Extreme intensity values (> 10000 or < -1000)
  • Very low data variance
  • Low prediction confidence

Causes:

  • Input not properly normalized to model's expected range
  • Integer compression artifacts
  • Data type conversion issues

Solutions:

  • Model expects normalized float32 tensors
  • Use original DICOM or properly converted NIfTI
  • Avoid manual intensity scaling or type conversion

5. Threshold Issues

Symptoms:

  • Zero voxels segmented
  • Grid search fails
  • Very low prediction values

Causes:

  • Strict threshold (e.g., 0.5) filters out valid low-confidence voxels
  • Model predictions are low due to input quality issues
  • Threshold too high for the data distribution

Solutions:

  • System automatically tries lower thresholds (0.35, 0.3, percentile-based)
  • Check input quality warnings in logs
  • Verify preprocessing is working correctly
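
A threshold fallback of the kind described above might look like the following sketch. The threshold sequence follows the values listed here (0.5, then 0.35, then 0.3); the 99th percentile for the last resort and the function name `threshold_with_fallback` are assumptions.

```python
import numpy as np


def threshold_with_fallback(probs, thresholds=(0.5, 0.35, 0.3), percentile=99.0):
    """Return (mask, threshold_used); threshold_used is None if nothing is found."""
    probs = np.asarray(probs, dtype=np.float32)
    if probs.size == 0 or probs.max() <= 0:
        # No positive predictions at all: nothing to segment.
        return np.zeros(probs.shape, dtype=bool), None

    # Try progressively lower fixed thresholds.
    for t in thresholds:
        mask = probs >= t
        if mask.any():
            return mask, float(t)

    # Last resort: keep only the highest-scoring voxels (assumed percentile).
    t = float(np.percentile(probs, percentile))
    return probs >= t, t
```

A mask produced by the percentile branch reflects very low model confidence and should be treated as a prompt for manual review rather than a usable segmentation.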

Automatic Diagnostics

The system automatically checks and warns about:

  • File size: Warns if < 100 KB (may indicate compression)
  • Dimensions: Warns if < 20 slices or very small dimensions
  • Voxel spacing: Warns if default (1.0, 1.0, 1.0) values detected
  • Data type: Warns if integer types (uint8/uint16) detected
  • Intensity range: Warns if extreme values or low variance
  • Prediction confidence: Warns if max prediction < 0.3 or mean < 0.1
  • Affine matrix: Warns if unusual determinant values

All warnings are printed in the logs to help diagnose issues before they cause segmentation failures.
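
The checks listed above are simple comparisons on file and array metadata. The sketch below shows one way to express them; the function name `input_warnings`, the warning codes, and the assumption that slices lie along the last axis are illustrative only. The affine and intensity-range checks are omitted for brevity.

```python
def input_warnings(file_size_bytes, shape, spacing, dtype_name,
                   max_pred=None, mean_pred=None):
    """Return a list of warning codes for a volume (hypothetical helper)."""
    warnings = []

    # File size below 100 KB may indicate heavy compression.
    if file_size_bytes < 100 * 1024:
        warnings.append("file_size")

    # Fewer than 20 slices (last axis assumed) or any axis under 10 voxels.
    if shape[-1] < 20 or min(shape) < 10:
        warnings.append("dimensions")

    # Spacing of exactly (1.0, 1.0, 1.0) often means the header was lost.
    if all(abs(s - 1.0) < 1e-6 for s in spacing):
        warnings.append("voxel_spacing")

    # Integer data suggests the volume was not normalized to float32.
    if dtype_name in {"uint8", "uint16"}:
        warnings.append("dtype")

    # Low prediction confidence: max < 0.3 or mean < 0.1.
    low_max = max_pred is not None and max_pred < 0.3
    low_mean = mean_pred is not None and mean_pred < 0.1
    if low_max or low_mean:
        warnings.append("confidence")

    return warnings
```

An empty list means none of these checks fired; it does not by itself guarantee a good segmentation.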

Common Issues

  1. OOM (Out of Memory) Errors

    • System automatically reduces ROI size and batch size
    • Check GPU memory with nvidia-smi
    • Restart Space to clear GPU memory if needed
  2. Slow First Inference

    • Expected: with torch.compile, the first run takes 30-60s in reduce-overhead mode (2-5min in max-autotune)
    • Set ENABLE_TORCH_COMPILE=false to disable compilation
    • Install CUDA extensions for faster compilation
  3. Low GPU Utilization

    • Install CUDA extensions (mamba_ssm, selective_scan_cuda_oflex)
    • Verify GPU aggregation is enabled (check logs)
    • Check channels-last layout is active
  4. CUDA Extension Build Failures

    • Ensure CUDA toolkit is installed
    • Check PyTorch and CUDA versions match
    • System will fall back to PyTorch implementations
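
The automatic ROI reduction mentioned under OOM errors could be structured roughly as below. This is a sketch under stated assumptions: the halving strategy, the minimum ROI, and the name `run_with_roi_fallback` are hypothetical, and `infer` stands for any callable that runs inference for a given ROI size.

```python
def run_with_roi_fallback(infer, roi, min_roi=(64, 64, 32), oom_error=MemoryError):
    """Call infer(roi), halving the ROI after each out-of-memory error.

    Returns (result, roi_used). Re-raises once the ROI cannot shrink further.
    """
    roi = tuple(roi)
    while True:
        try:
            return infer(roi), roi
        except oom_error:
            # Halve each axis, but never go below the minimum ROI.
            smaller = tuple(max(m, r // 2) for r, m in zip(roi, min_roi))
            if smaller == roi:
                raise  # already at the minimum; give up
            roi = smaller
```

With PyTorch, `oom_error=torch.cuda.OutOfMemoryError` would be the natural choice, and calling `torch.cuda.empty_cache()` before each retry helps release cached blocks.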

Limitations

  • Domain shift: Performance may degrade on unseen scanners/protocols, especially T2 sequences. T2 support is experimental and results may vary significantly.
  • Header dependence: Requires valid NIfTI affine/zooms; lossy conversions or missing metadata may cause failures or incorrect volume calculations.
  • Partial FOV: Small field-of-view or partial liver volumes can cause under-segmentation; flagged by quality guardrails.
  • Orientation dependence: Spatial priors assume RAS (Right-Anterior-Superior) orientation. Inputs are automatically reoriented, but unusual orientations may affect spatial prior effectiveness.
  • Body size variance: The normal liver volume range (1200-1800 ml) assumes an average adult body size. Pediatric patients and patients with extreme body sizes have different normal ranges, so CRITICAL volume warnings for them may be false alarms and should be interpreted with care.
  • Not for clinical use: Research only; manual review recommended for all outputs, especially for T2 sequences or when status is WARNING/CRITICAL/FAILURE.

File Structure

srmamamba-liver-segmentation/
├── app.py                 # Main application entry point (Gradio + FastAPI)
│                          # - Sets up environment variables (PYTORCH_ALLOC_CONF, TRITON_CACHE_DIR)
│                          # - Fixes Gradio schema bug (fix_gradio_schema_bug)
│                          # - Logs startup health (log_startup_health)
│                          # - Creates FastAPI app with CORS middleware
│                          # - Creates Gradio interface (create_interface)
│                          # - Defines API endpoints (/segment, /health)
│                          # - Launches Gradio app
│
├── config.py              # Configuration and environment setup
│                          # - Sets OMP_NUM_THREADS
│                          # - Sets PYTORCH_ALLOC_CONF
│                          # - Imports and checks CUDA extensions (mamba_ssm, selective_scan_cuda_oflex)
│                          # - Imports build_SRMAMamba from model configs
│                          # - Defines BUILD_SRMAMAMBA_AVAILABLE flag
│                          # - Defines SRMA_MAMBA_DIR path
│
├── model_loader.py        # Model loading and sliding window configuration
│                          # - clear_gpu_memory(): Unloads models and clears GPU cache
│                          # - load_model(modality): Loads SRMA-Mamba model, configures sliding window
│
├── processing.py          # Preprocessing, refinement, and report generation
│                          # - validate_nifti(): Validates NIfTI file structure
│                          # - preprocess_nifti(): Preprocesses NIfTI for model input
│                          # - refine_liver_mask_enhanced(): Enhanced refinement with spatial priors
│                          # - refine_liver_mask(): Basic refinement without spatial priors
│                          # - calculate_confidence_score(): Calculates segmentation confidence
│                          # - calculate_liver_volume(): Calculates volume in ml
│                          # - analyze_liver_morphology(): Analyzes connected components and fragmentation
│                          # - check_volume_sanity(): Checks if volume is within normal range
│                          # - generate_medical_report(): Generates comprehensive medical report
│
├── inference.py           # Core inference logic and API endpoints
│                          # - adjust_roi_for_volume(): Adjusts ROI size based on volume dimensions
│                          # - predict_volume(): Main prediction function for Gradio
│                          # - predict_volume_api(): API version of prediction function
│                          # - safe_predict_volume(): Safe wrapper with error handling
│
├── requirements.txt       # Python dependencies with pinned versions
├── setup.sh               # CUDA extension build script
├── post_build.sh          # Post-build script for Python Spaces (fallback)
├── postBuild              # Hugging Face Spaces post-build script
├── app.yaml               # Hugging Face Spaces configuration (sdk: docker)
├── Dockerfile             # Docker image definition for deployment
├── checkpoint_T1.pth      # Pre-trained T1 model weights
├── checkpoint_T2.pth      # Pre-trained T2 model weights
│
└── SRMA-Mamba/            # Model architecture code
    ├── model/
    │   ├── SRMAMamba.py  # Main model architecture
    │   ├── vmamba2.py    # Mamba backbone
    │   ├── csm_triton.py # Triton kernels (optional)
    │   ├── csms6s.py     # Selective scan operations
    │   └── mamba2/       # Mamba2 implementation
    │       ├── selective_state_update.py
    │       ├── ssd_combined.py
    │       └── ...
    ├── configs/
    │   ├── config.py     # General configuration
    │   ├── model_configs.py  # Model configuration and build function
    │   └── vssm1/
    │       └── vmambav2_tiny_224.yaml  # Model architecture YAML
    └── selective_scan/   # Selective scan CUDA extension source
        ├── setup.py      # Extension build script
        └── csrc/         # CUDA source code

Quick Start

Using the Web Interface

  1. Upload a 3D NIfTI MRI volume (.nii.gz format)
  2. Select the MRI modality (T1 or T2)
  3. Click "Segment Liver" to run inference
  4. View the segmentation overlay and medical report
  5. Download the 3D segmentation mask

Using the API

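import requests

# Upload a volume and request segmentation.
# 'https://your-api-url' is a placeholder for your deployment's base URL.
with open('liver_scan.nii.gz', 'rb') as f:
    response = requests.post(
        'https://your-api-url/api/segment',
        files={'file': f},
        data={'modality': 'T1'},  # 'T1' or 'T2'
        timeout=600,  # large volumes can take a while to upload and process
    )

response.raise_for_status()  # fail early on HTTP errors
result = response.json()
# Access segmentation file, volume, and report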

Installation

Requirements

  • Python 3.10+
  • CUDA-capable GPU (recommended: 24GB+ VRAM for optimal performance)
  • CUDA 11.8+ or 12.8+ (for GPU acceleration)
  • PyTorch 2.0+ (tested with PyTorch 2.9)
  • 8GB+ RAM
  • 10GB+ disk space for models and dependencies

Setup

# Clone the repository
git clone https://huggingface.co/spaces/HarshithReddy01/srmamamba-liver-segmentation
cd srmamamba-liver-segmentation

# Install dependencies
pip install -r requirements.txt

# Optional: Build CUDA extensions for maximum speed
bash setup.sh

# Run the application
python app.py

Building CUDA Extensions (Optional, for Maximum Speed)

The setup.sh script automatically builds CUDA extensions:

bash setup.sh

This will:

  1. Install mamba-ssm (CUDA extension for Mamba operations)
  2. Build selective_scan_cuda_oflex (custom CUDA extension)
  3. Verify installation

If extensions are not available, the system automatically falls back to PyTorch implementations (slower but still functional).
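
To check which extensions are visible to Python before launching the app, a small probe like the one below can help. The module names are the ones listed above; the helper name `detect_extensions` is hypothetical and this is not the check performed in `config.py`.

```python
import importlib.util


def detect_extensions(names=("mamba_ssm", "selective_scan_cuda_oflex")):
    """Return {module_name: bool} indicating whether each module can be located."""
    found = {}
    for name in names:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat lookup errors as "not available".
            found[name] = False
    return found
```

Locating a module does not guarantee that its compiled binary loads against the installed CUDA and PyTorch versions, so an actual `import` remains the definitive test.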

Citation

If you use LiverProfile AI in your research, please cite:

Repository:

@software{liverprofile_ai_2025,
  title={LiverProfile AI: SRMA-Mamba Liver Segmentation},
  author={Harshith Reddy},
  year={2025},
  url={https://huggingface.co/spaces/HarshithReddy01/srmamamba-liver-segmentation},
  note={Preprint/In preparation}
}

Related Work (if available):

@article{zeng2025srma,
  title={SRMA-Mamba: Spatial Reverse Mamba Attention Network for Pathological Liver Segmentation in MRI Volumes},
  author={Zeng, Jun and Huang, Yannan and Keles, Elif and Aktas, Halil Ertugrul and Durak, Gorkem and Tomar, Nikhil Kumar and Trinh, Quoc-Huy and Nayak, Deepak Ranjan and Bagci, Ulas and Jha, Debesh},
  journal={arXiv preprint arXiv:2508.12410},
  year={2025},
  note={If published, please use the published citation}
}

Disclaimer

Important: This software is intended for research purposes only. It is not approved for clinical use or diagnostic purposes without proper validation and regulatory approval. Always consult with qualified medical professionals for clinical decision-making.

Contributing

Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests.

Contact

For questions, support, or collaboration inquiries:

License

This project is provided for research and educational purposes. Please refer to the original SRMA-Mamba paper for licensing details.


LiverProfile AI - Empowering Medical Imaging with AI

Built for the medical imaging community