title: LiverProfile AI
sdk: docker
app_file: app.py
LiverProfile AI
Advanced AI-Powered Liver Segmentation and Analysis for Medical Imaging
LiverProfile AI is a state-of-the-art deep learning system designed for automatic liver segmentation and morphological analysis from 3D MRI volumes. Built on the SRMA-Mamba architecture, it provides accurate, fast liver segmentation with comprehensive medical reporting capabilities.
Overview
LiverProfile AI leverages cutting-edge Mamba-based neural networks to automatically identify and segment liver tissue in MRI scans. The system supports both T1-weighted and T2-weighted MRI sequences, making it versatile for various clinical imaging protocols. Beyond segmentation, LiverProfile AI provides detailed morphological analysis including volume calculations, shape metrics, and automated medical reports.
What It Does
- Automatic Liver Segmentation: Accurately identifies and segments liver tissue in 3D MRI volumes
- Multi-Modality Support: Optimized for T1-weighted MRI sequences; T2 support is experimental/beta
- Morphological Analysis: Calculates liver volume, surface area, and shape characteristics
- Medical Reporting: Generates comprehensive reports with clinical insights
- Interactive Visualization: Slice-by-slice viewing with segmentation overlays
- Export Capabilities: Download segmentation masks in standard NIfTI format
- Segmentation Refinement: Automatic post-processing to remove fragmentation and smooth boundaries
- Quality Guardrails: Volume sanity checks and connected component validation
Key Features
Core Capabilities
- High Accuracy: Dice 0.94 ± 0.02, IoU 0.89 ± 0.03 on T1 sequences
- GPU-Accelerated Processing: Near-interactive inference with optimized memory management (typically 10-30s per volume on L40S)
- 3D Volume Support: Handles full 3D MRI volumes using sliding window inference
- Interactive UI: User-friendly Gradio interface with real-time visualization
- REST API: Programmatic access via FastAPI for integration into clinical workflows
- Medical Reports: Automated generation of clinical analysis reports
- Performance Monitoring: Real-time GPU utilization tracking and diagnostics
Technical Highlights
- Architecture: Spatial Reverse Mamba Attention (SRMA-Mamba) Network
- Optimization: Dynamic memory management for various GPU configurations
- Performance: Optimized for L40S (48GB), A100, and other high-VRAM GPUs
- Format Support: Standard NIfTI (.nii.gz) input/output
- CUDA Extensions: Optional mamba_ssm and selective_scan_cuda_oflex for maximum speed
- Compilation: Optional torch.compile with reduce-overhead mode for faster inference (disabled by default)
- Memory Layout: Channels-last 3D format for optimal GPU memory throughput
Model Performance
All results are mean ± SD on held-out test sets; threshold = 0.5, no test-time augmentation.
| Metric | T1 (n=test set) | T2 (n=test set, experimental) |
|---|---|---|
| Dice (DSC) | 0.94 ± 0.02 | 0.71 ± 0.09 |
| IoU | 0.89 ± 0.03 | 0.56 ± 0.08 |
| HD95 (mm) | 6.2 ± 2.1 | 18.4 ± 7.0 |
| ASSD (mm) | 1.9 ± 0.6 | 5.7 ± 2.3 |
| Volume Error (%) | +3.1 ± 6.5 | −14.8 ± 12.2 |
Note: T2 performance is experimental and depends on scanner/protocol. Results vary significantly under domain shift. T2 support should be considered beta-quality and may require manual review.
Architecture
SRMA-Mamba Network Architecture
The system is built on the SRMA-Mamba (Spatial Reverse Mamba Attention) architecture, which combines:
- Mamba-based Encoder: Efficient state-space models for long-range dependencies in 3D medical volumes
- Spatial Reverse Attention: Captures multi-scale spatial features through reverse attention mechanisms
- Multi-Resolution Processing: Handles various volume sizes through sliding window inference
- Attention Mechanisms: Multi-head attention for feature refinement and spatial context
Model Components
- SRMA-Mamba Network: Main segmentation network with spatial reverse attention
- Sliding Window Inferer: Processes large volumes in overlapping windows to manage GPU memory
- Multi-scale Feature Extraction: Captures features at different resolutions for robust segmentation
- Attention Mechanisms: Spatial reverse attention for feature refinement
Model Architecture Details
The SRMA-Mamba architecture consists of:
- Input Processing: 3D volume input with channel-first format
- Encoder: Mamba-based encoder with spatial reverse attention blocks
- Decoder: Multi-scale decoder with skip connections
- Output Head: Segmentation head producing binary liver masks
The model processes 3D volumes using a sliding window approach:
- ROI Size: Typically 256x256x64 or 256x256x80 voxels per window
- Overlap: 0.10 (10% overlap between windows) for optimal balance
- Batch Processing: 1-2 windows processed concurrently based on GPU memory
- Aggregation: GPU-based aggregation by default for faster stitching
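For reference, a minimal sketch of how such a sliding window setup might be configured with MONAI's SlidingWindowInferer; the exact arguments used in this project may differ:

import torch
from monai.inferers import SlidingWindowInferer

# Illustrative configuration mirroring the defaults described above.
window_infer = SlidingWindowInferer(
    roi_size=(256, 256, 64),   # per-window patch size
    sw_batch_size=1,           # windows processed concurrently
    overlap=0.10,              # 10% overlap between adjacent windows
    mode="gaussian",           # blending of overlapping window predictions (assumed)
    device=torch.device("cuda" if torch.cuda.is_available() else "cpu"),  # aggregation device
)

# Usage: logits = window_infer(inputs=volume_tensor, network=model)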
Complete Function Documentation
processing.py Functions
validate_nifti(nifti_img)
Validates NIfTI file structure and metadata.
Parameters:
nifti_img: nibabel NIfTI image object
Validations:
- Checks shape has at least 3 dimensions
- Ensures all dimensions are positive and <= 2000
- Validates voxel spacing is positive
- Checks for NaN or Inf values in data
Returns: True if valid, raises ValueError otherwise
preprocess_nifti(file_path, device=None)
Preprocesses NIfTI file for model input.
Parameters:
file_path: Path to NIfTI file
device: PyTorch device (cuda or cpu)
Process:
- Loads NIfTI file with nibabel (memory-mapped for large files >100MB)
- Validates file structure and metadata
- Checks file size and dimensions
- Detects pre-normalized data (range [0,1])
- Applies MONAI transforms:
- LoadImaged: Load image data
- EnsureChannelFirstD: Add channel dimension if missing
- NormalizeIntensityd: Normalize intensity values (nonzero, channel-wise)
- ToTensord: Convert to PyTorch tensor
- Converts to float32
- Moves to GPU with non-blocking transfer
- Applies channels-last 3D memory layout for optimal GPU performance
- Pins memory for faster CPU-to-GPU transfers
Returns: Preprocessed PyTorch tensor on specified device
Diagnostics:
- Warns if file size < 100 KB (compression/low resolution)
- Warns if < 20 slices (incomplete volume)
- Warns if voxel spacing is default (1.0, 1.0, 1.0) - missing metadata
- Warns if integer data type (uint8/uint16) - compression artifacts
- Warns if extreme intensity values or low variance
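A minimal sketch of the MONAI transform chain described above, assuming the dictionary key "image"; the production code adds size checks, diagnostics, and pre-normalization detection:

import torch
from monai.transforms import (
    Compose, LoadImaged, EnsureChannelFirstd, NormalizeIntensityd, ToTensord,
)

def preprocess_sketch(file_path: str, device: str = "cuda") -> torch.Tensor:
    # Load -> ensure channel-first -> normalize nonzero voxels channel-wise -> tensor
    transforms = Compose([
        LoadImaged(keys="image"),
        EnsureChannelFirstd(keys="image"),
        NormalizeIntensityd(keys="image", nonzero=True, channel_wise=True),
        ToTensord(keys="image"),
    ])
    volume = transforms({"image": file_path})["image"].float().unsqueeze(0)  # (1, C, H, W, D)
    # Move to device and apply channels-last 3D layout for 5D tensors
    return volume.to(device=device, non_blocking=True,
                     memory_format=torch.channels_last_3d)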
refine_liver_mask_enhanced(mask, voxel_spacing, pred_probabilities, threshold, modality)
Enhanced liver mask refinement with spatial priors and quality checks.
Parameters:
mask: Binary segmentation mask (3D, 4D, or 5D numpy array)
voxel_spacing: Tuple of (z, y, x) voxel spacing in mm
pred_probabilities: Raw prediction probabilities from model
threshold: Threshold used for binarization
modality: MRI modality ('T1' or 'T2')
Process:
- Preserves original shape (handles 3D, 4D, 5D inputs)
- Applies spatial priors:
- Removes top 15% slices (diaphragm protection)
- Removes right 30% pixels (stomach protection)
- Removes left 15% pixels (spleen protection)
- Removes bottom 10% slices (lower abdomen protection)
- Connected component filtering: Keeps only largest component
- Morphological cleanup:
- Binary closing (ball radius=2) to fill gaps
- Hole filling to remove internal holes
- Binary opening (ball radius=2) to remove small spurious regions
- Optional 3D median filter smoothing (size=3)
- Re-keeps largest component after morphology
- Auto-rethresholding if no components found after spatial priors
Returns:
refined_mask: Refined binary mask (same shape as input)
metrics: Dictionary with refinement statistics (voxels, components, volume change)
confidence_score: Confidence score (0-100)
refine_liver_mask(mask, voxel_spacing=(1.0, 1.0, 1.0), enable_smoothing=True, min_component_size=None)
Basic liver mask refinement without spatial priors.
Parameters:
mask: Binary segmentation mask (3D, 4D, or 5D numpy array)
voxel_spacing: Tuple of (z, y, x) voxel spacing in mm
enable_smoothing: Whether to apply median filter smoothing
min_component_size: Minimum size for connected components to keep (None = keep only largest)
Process:
- Preserves original shape
- Connected component filtering: Keeps only largest component
- Morphological cleanup (closing, hole filling, opening)
- Optional 3D median filter smoothing
Returns:
refined_mask: Refined binary mask
metrics: Dictionary with refinement statistics
calculate_confidence_score(mask, pred_probabilities, threshold, num_components, volume_change_percent, guards_ok=True, voxel_spacing=(1.0, 1.0, 1.0))
Calculates confidence score for segmentation quality.
Parameters:
mask: Binary segmentation mask
pred_probabilities: Raw prediction probabilities
threshold: Threshold used for binarization
num_components: Number of connected components
volume_change_percent: Percentage change in volume after refinement
guards_ok: Whether quality guardrails passed
voxel_spacing: Voxel spacing for volume calculation
Calculation:
- Base score: Average prediction probability in mask region
- Component penalty: Reduces score if multiple components
- Volume change penalty: Reduces score if large volume changes
- Guard penalty: Reduces score if quality guardrails failed
- Volume penalty: Reduces score if volume outside normal range
Returns: Confidence score (0-100)
calculate_liver_volume(pred_binary, voxel_spacing=(1.0, 1.0, 1.0))
Calculates liver volume in milliliters.
Parameters:
pred_binary: Binary segmentation mask
voxel_spacing: Tuple of (z, y, x) voxel spacing in mm
Calculation:
- Voxel volume = spacing[0] * spacing[1] * spacing[2] (mm^3)
- Liver voxels = sum of all positive voxels
- Volume (ml) = (liver_voxels * voxel_volume) / 1000.0
Returns: Liver volume in milliliters (float)
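The calculation above amounts to a few lines of NumPy; a sketch (not the exact implementation):

import numpy as np

def liver_volume_ml(pred_binary: np.ndarray, voxel_spacing=(1.0, 1.0, 1.0)) -> float:
    # Volume of a single voxel in cubic millimetres
    voxel_volume_mm3 = float(voxel_spacing[0] * voxel_spacing[1] * voxel_spacing[2])
    liver_voxels = int(np.sum(pred_binary > 0))
    return liver_voxels * voxel_volume_mm3 / 1000.0  # mm^3 -> ml

# Example: 1,500,000 voxels at 1.0 x 1.0 x 1.0 mm spacing -> 1500.0 ml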
analyze_liver_morphology(pred_binary)
Analyzes morphological characteristics of segmentation.
Parameters:
pred_binary: Binary segmentation mask
Analysis:
- Connected component labeling
- Component size calculation
- Largest component ratio
- Fragmentation level classification:
- Low: largest_ratio > 0.95
- Moderate: largest_ratio > 0.80
- High: largest_ratio <= 0.80
Returns: Dictionary with:
connected_components: Number of connected components
largest_component_ratio: Ratio of largest component to total
fragmentation: Fragmentation level (low/moderate/high)
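A sketch of this analysis using scipy.ndimage; thresholds match the classification above, other details are illustrative:

import numpy as np
from scipy import ndimage

def morphology_sketch(pred_binary: np.ndarray) -> dict:
    labeled, num_components = ndimage.label(pred_binary > 0)
    if num_components == 0:
        return {"connected_components": 0, "largest_component_ratio": 0.0, "fragmentation": "high"}
    sizes = ndimage.sum(pred_binary > 0, labeled, index=range(1, num_components + 1))
    largest_ratio = float(np.max(sizes) / np.sum(sizes))
    if largest_ratio > 0.95:
        fragmentation = "low"
    elif largest_ratio > 0.80:
        fragmentation = "moderate"
    else:
        fragmentation = "high"
    return {
        "connected_components": int(num_components),
        "largest_component_ratio": largest_ratio,
        "fragmentation": fragmentation,
    }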
check_volume_sanity(volume_ml)
Checks if liver volume is within normal physiological range.
Parameters:
volume_ml: Liver volume in milliliters
Normal Range: 1200-1800 ml (configurable via LIVER_VOL_LOW and LIVER_VOL_HIGH env vars)
Checks:
- CRITICAL: Volume < 50% of normal (< 600 ml) or > 150% of normal (> 2700 ml)
- WARNING: Volume < normal (< 1200 ml) or > normal (> 1800 ml)
- OK: Volume within normal range
Returns: Tuple of (status, message) where status is "OK", "WARNING", or "CRITICAL"
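A sketch of this check, assuming the bounds come from the LIVER_VOL_LOW/LIVER_VOL_HIGH environment variables documented below:

import os

def volume_sanity_sketch(volume_ml: float):
    low = float(os.environ.get("LIVER_VOL_LOW", 1200))
    high = float(os.environ.get("LIVER_VOL_HIGH", 1800))
    # CRITICAL below 50% of the lower bound or above 150% of the upper bound
    if volume_ml < 0.5 * low or volume_ml > 1.5 * high:
        return "CRITICAL", f"Volume {volume_ml:.0f} ml is far outside the normal range"
    if volume_ml < low or volume_ml > high:
        return "WARNING", f"Volume {volume_ml:.0f} ml is outside the normal range ({low:.0f}-{high:.0f} ml)"
    return "OK", f"Volume {volume_ml:.0f} ml is within the normal range"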
generate_medical_report(statistics, volume_ml, morphology, modality, confidence_score=0.0)
Generates comprehensive medical report.
Parameters:
statistics: Dictionary with segmentation statistics (voxels, percentage, shape)
volume_ml: Liver volume in milliliters
morphology: Morphology analysis dictionary
modality: MRI modality ('T1' or 'T2')
confidence_score: Confidence score (0-100)
Report Sections:
- Study Information: Date, time, modality, status, confidence
- Key Findings: Volume assessment, spatial distribution, quality issues
- Quantitative Measurements: Volume, percentage, voxels, morphology
- Quality Assessment: Segmentation quality, fragmentation, coverage
- Clinical Context: Clinical interpretation and recommendations
Returns: Formatted medical report string (Markdown)
inference.py Functions
adjust_roi_for_volume(volume_shape)
Adjusts sliding window ROI size based on input volume dimensions.
Parameters:
volume_shape: Shape of input volume tensor (4D or 5D)
Adjustments:
- Reduces ROI depth if > volume depth
- Reduces ROI height if > volume height
- Reduces ROI width if > volume width
- Reduces overlap for very large volumes (>20M voxels)
- Optimizes ROI depth for small volumes (<64 slices)
Returns: None (modifies WINDOW_INFER.roi_size in place)
predict_volume(nifti_file, modality, slice_idx=None)
Main prediction function for Gradio interface.
Parameters:
nifti_file: Uploaded NIfTI file (Gradio file object)
modality: MRI modality ('T1' or 'T2')
slice_idx: Optional slice index for visualization (default: middle slice)
Process:
- Acquires processing lock (prevents concurrent requests)
- Loads appropriate model (T1 or T2)
- Loads and validates NIfTI file
- Preprocesses volume
- Adjusts ROI size for volume dimensions
- Runs sliding window inference with AMP
- Applies sigmoid activation
- Threshold selection (grid search or default with fallback)
- Intensity gating (T1 only, if shapes match)
- Refines segmentation mask
- Calculates volume and morphology
- Generates medical report
- Creates visualization overlay
- Saves segmentation mask
- Releases processing lock
Returns: Tuple of (overlay_image, info_text, report_text, output_path)
Error Handling:
- Progressive OOM fallback: reduces batch size, ROI depth, switches to CPU aggregation
- Shape mismatch handling: resizes slices for overlay creation
- Threshold fallback: tries lower thresholds (0.35, 0.3, percentile) if default fails
predict_volume_api(file_path, modality='T1', slice_idx=None)
API version of prediction function.
Parameters:
file_path: Path to NIfTI file (string)
modality: MRI modality ('T1' or 'T2')
slice_idx: Optional slice index for visualization
Process: Same as predict_volume but returns JSON response
Returns: Dictionary with:
success: Boolean
volume_ml: Liver volume
liver_percentage: Percentage of scan volume
segmentation_path: Path to saved mask
report: Medical report text
segmentation_file: Base64-encoded mask file
overlay_image: Base64-encoded overlay PNG
morphology: Morphology analysis dictionary
error: Error message if failed
safe_predict_volume(nifti_file, modality, slice_idx=None)
Safe wrapper for predict_volume with error handling.
Parameters:
nifti_file: Uploaded NIfTI file
modality: MRI modality
slice_idx: Optional slice index
Returns: Same as predict_volume, but catches all exceptions and returns error message
model_loader.py Functions
clear_gpu_memory()
Clears GPU memory by unloading models.
Process:
- Deletes MODEL_T1 and MODEL_T2
- Deletes WINDOW_INFER
- Clears CUDA cache
- Synchronizes CUDA operations
Returns: None
load_model(modality='T1')
Loads and configures SRMA-Mamba model for inference.
Parameters:
modality: Model modality ('T1' or 'T2')
Process:
- Initializes CUDA device with retry logic
- Builds SRMA-Mamba architecture from config
- Loads pre-trained checkpoint weights
- Moves model to GPU
- Sets model to evaluation mode
- Configures TF32 for faster matmul operations
- Enables cuDNN benchmarking
- Applies torch.compile if enabled
- Configures sliding window inferer based on available VRAM:
- Very High VRAM (>40GB): ROI [256, 256, 80], batch_size=2
- High VRAM (>30GB): ROI [256, 256, 64], batch_size=2
- Medium VRAM (20-30GB): ROI [256, 256, 64], batch_size=1
- Low VRAM (10-20GB): ROI [224, 224, 64], batch_size=1
- Very Low VRAM (<10GB): Progressively smaller ROI, batch_size=1
- Sets aggregation device (GPU by default, CPU if VRAM < 2GB)
- Runs warm-up inference to trigger compilation and kernel autotuning
- Stores model in global variable (MODEL_T1 or MODEL_T2)
Returns: Loaded model instance
Checkpoint Loading:
- Searches for checkpoint_T1.pth or checkpoint_T2.pth in multiple locations
- Falls back to Hugging Face Hub download if local file not found
- Handles both 'state_dict' and direct state dict formats
app.py Functions
fix_gradio_schema_bug()
Monkeypatch to fix Gradio 4.44.x schema bug.
Issue: Gradio crashes when additionalProperties is boolean instead of dict in JSON schema.
Fix:
- Patches gradio_client.utils.get_type to handle boolean schemas
- Patches Blocks._get_api_info to normalize schemas before API generation
- Converts boolean additionalProperties to empty dict
Returns: None
log_startup_health()
Logs comprehensive startup health information.
Information Logged:
- PyTorch version and CUDA availability
- GPU name and memory status
- TF32 settings (matmul and conv)
- cuDNN benchmark status
- torch.compile status
- Library versions (MONAI, Gradio, NiBabel)
- CUDA extensions status (mamba_ssm, selective_scan_cuda_oflex)
- Environment variables (PYTORCH_ALLOC_CONF, ENABLE_CUDNN_BENCHMARK, etc.)
Returns: None
create_interface()
Creates Gradio interface for web UI.
Components:
- File upload input for NIfTI files
- Modality selector (T1/T2)
- Slice index slider for visualization
- Predict button
- Output image display (segmentation overlay)
- Output info text (volume, statistics)
- Output report text (medical report)
- Output file download (segmentation mask)
Returns: Gradio Blocks object
Processing Pipeline
Complete Workflow
Input Validation
- File format verification (NIfTI)
- Shape validation (minimum 3D, maximum 2000 per dimension)
- Voxel spacing validation
- NaN/Inf value detection
- File size limits (upload: max 2 GB, processing: max 2 GB)
- File size warnings (< 100 KB may indicate compression)
- Dimension warnings (< 20 slices may indicate incomplete volume)
- Metadata validation (voxel spacing, affine matrix)
Preprocessing (processing.py::preprocess_nifti)
- Load NIfTI file with nibabel (memory-mapped for large files >100MB)
- Apply MONAI transforms:
- LoadImaged: Load image data
- EnsureChannelFirstD: Add channel dimension if missing
- NormalizeIntensityd: Normalize intensity values (nonzero, channel-wise)
- ToTensord: Convert to PyTorch tensor
- Convert to float32
- Move to GPU with non-blocking transfer
- Apply channels-last 3D memory layout for optimal GPU performance
- Pin memory for faster CPU-to-GPU transfers
Model Loading (model_loader.py::load_model)
- Build SRMA-Mamba architecture
- Load pre-trained checkpoint weights (T1 or T2 modality)
- Move model to GPU
- Enable TF32 for faster matmul operations
- Enable cuDNN benchmarking
- Apply torch.compile (reduce-overhead mode by default)
- Configure sliding window inferer based on available VRAM:
- Very High VRAM (>40GB): ROI [256, 256, 80], batch_size=2
- High VRAM (>30GB): ROI [256, 256, 64], batch_size=2
- Medium VRAM (20-30GB): ROI [256, 256, 64], batch_size=1
- Low VRAM (<20GB): Progressively smaller ROI and batch_size=1
- Set aggregation device (GPU by default, CPU only if VRAM < 2GB)
- Run warm-up inference to trigger compilation and kernel autotuning
Inference (inference.py::predict_volume)
- Adjust ROI size based on input volume dimensions
- Monitor GPU utilization in real-time (background thread)
- Run sliding window inference with:
- Automatic Mixed Precision (AMP) enabled
- GPU compute, GPU aggregation (default)
- Channels-last 3D memory layout
- Apply sigmoid activation to convert logits to probabilities
- Threshold selection:
- Grid search: T1 uses [0.60-0.80], T2 uses [0.30-0.70]
- Default: T1=0.65, T2=0.5
- Fallback: Tries 0.35, 0.3, percentile-based if default gives 0 voxels
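A sketch of the fallback logic in the last step above (the exact grid values and percentile are illustrative):

import numpy as np

def pick_threshold_sketch(probabilities: np.ndarray, default: float = 0.65) -> float:
    # Try the modality default first, then progressively lower thresholds,
    # and finally a percentile-based value if everything yields 0 voxels.
    for threshold in (default, 0.35, 0.30):
        if np.any(probabilities > threshold):
            return threshold
    return float(np.percentile(probabilities, 99.5))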
Post-Processing (inference.py)
- Intensity gating (T1 only):
- Calculates liver-like intensity range from right upper quadrant
- Clamps predictions outside intensity range
- Skips if shape mismatch between prediction and original data
- Size-aware auto-tune:
- Increases threshold if mask fraction > 4% or volume > 2200 ml
- Progressive OOM fallback:
- Stage 1: Reduce sw_batch_size to 1
- Stage 2: Reduce ROI depth to 48
- Stage 3: Reduce ROI depth to 32
- Stage 4: Switch to CPU aggregation
Segmentation Refinement (processing.py::refine_liver_mask_enhanced)
- Binarize mask (threshold > 0.5)
- Apply spatial priors:
- Remove top 15% slices (diaphragm protection)
- Remove right 30% pixels (stomach protection)
- Remove left 15% pixels (spleen protection)
- Remove bottom 10% slices (lower abdomen protection)
- Connected component filtering: Keep only largest component
- Morphological cleanup:
- Binary closing (ball radius=2) to fill gaps
- Hole filling to remove internal holes
- Binary opening (ball radius=2) to remove small spurious regions
- Optional 3D median filter smoothing (size=3)
- Re-keep largest component after morphology
- Preserve original shape (3D, 4D, or 5D)
Analysis and Reporting
- Calculate liver volume (ml) from voxel count and spacing
- Analyze morphology (connected components, fragmentation)
- Calculate confidence score
- Quality guardrails:
- Volume sanity check (normal range: 1200-1800 ml)
- Connected component validation (expect 1 component)
- Warnings for extreme values or fragmentation
- Generate medical report
Visualization
- Create overlay image (green mask on grayscale MRI)
- Extract middle slice or specified slice index
- Handle shape mismatches by resizing prediction slice to match original
- Convert to PIL Image for display
Output
- Save refined segmentation mask as NIfTI file
- Return volume statistics, report, and visualization
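A minimal sketch of the overlay step (green mask over a grayscale slice), assuming a 2D MRI slice and a mask slice of matching shape:

import numpy as np
from PIL import Image

def overlay_sketch(mri_slice: np.ndarray, mask_slice: np.ndarray, alpha: float = 0.4) -> Image.Image:
    # Normalize the MRI slice to 0-255 grayscale
    lo, hi = float(mri_slice.min()), float(mri_slice.max())
    gray = ((mri_slice - lo) / (hi - lo + 1e-8) * 255).astype(np.uint8)
    rgb = np.stack([gray, gray, gray], axis=-1).astype(np.float32)
    # Blend a green overlay wherever the mask is positive
    green = np.array([0, 255, 0], dtype=np.float32)
    rgb[mask_slice > 0] = (1 - alpha) * rgb[mask_slice > 0] + alpha * green
    return Image.fromarray(rgb.astype(np.uint8))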
Dependencies and Libraries
Core Deep Learning Frameworks
- torch>=2.0.0: PyTorch deep learning framework with CUDA support
- torchvision>=0.15.0: Computer vision utilities and models
- monai>=1.4.0: Medical Open Network for AI - medical image processing, sliding window inference, transforms
Medical Imaging
- nibabel>=5.3.0: Neuroimaging Informatics Technology Initiative format support for reading/writing NIfTI files
- scipy>=1.10.0: Scientific computing library for morphological operations, connected components, filtering
- scikit-image>=0.20.0: Image processing library for binary morphological operations (closing, opening, hole filling)
Web Framework and API
- gradio==4.44.1: Interactive web interface for machine learning models
- fastapi>=0.115: Modern, fast web framework for building REST APIs
- uvicorn>=0.30: ASGI server for running FastAPI applications
- python-multipart>=0.0.6: Multipart form data parsing for file uploads
Data Processing and Utilities
- numpy>=1.24.0: Numerical computing library for array operations
- pandas>=2.0.0: Data manipulation and analysis
- Pillow>=9.5.0: Python Imaging Library for image processing and visualization
- opencv-python>=4.8.0: Computer vision library for image operations
Model Architecture and Training
- timm>=0.6.12: PyTorch Image Models - provides DropPath and other layer utilities
- fvcore>=0.1.5: Facebook Vision core utilities for model analysis
- einops: Tensor operations with readable syntax
- ninja: Build system for compiling CUDA extensions
- packaging: Version and dependency management utilities
- setuptools: Python packaging and distribution utilities
- wheel: Built-package format for Python
Hugging Face Integration
- huggingface-hub>=0.20.0: Client library for interacting with Hugging Face Hub
- transformers>=4.30.0: State-of-the-art natural language processing models
Configuration and Utilities
- pyyaml>=6.0: YAML parser for configuration files
- yacs>=0.1.8: Yet Another Configuration System for managing configs
- tqdm>=4.65.0: Progress bars for long-running operations
- scikit-learn>=1.3.0: Machine learning utilities
Performance Monitoring
- pynvml>=11.0.0: Python bindings for NVIDIA Management Library - GPU utilization monitoring
Optional CUDA Extensions (for maximum speed)
- mamba-ssm>=2.2.2: CUDA-accelerated Mamba state-space model operations
- selective_scan_cuda_oflex: Custom CUDA extension for selective scan operations (built from source)
Hugging Face Spaces
- spaces>=0.26.0: Hugging Face Spaces SDK for GPU resource management
Performance Optimizations
Memory Management
- Dynamic ROI size adjustment based on available VRAM
- Automatic batch size reduction on OOM
- CPU aggregation fallback for very low VRAM (<2GB)
- Pinned memory for faster transfers
- Memory-mapped NIfTI loading for large files
- Models stay loaded between requests (no reload overhead)
GPU Acceleration
- Channels-last 3D memory layout for better cache utilization
- TF32 enabled for faster matmul operations
- cuDNN benchmarking enabled
- GPU aggregation by default (faster stitching)
- Non-blocking transfers with pinned memory
Compilation and Caching
- Optional torch.compile with reduce-overhead mode (faster first run than max-autotune)
- Optional max-autotune mode for maximum speed
- Warm-up inference to trigger kernel autotuning
- cuDNN autotune cache preservation between requests
CUDA Extensions
- Optional mamba_ssm for faster Mamba operations
- Optional selective_scan_cuda_oflex for faster selective scan
- Automatic fallback to PyTorch implementations if extensions unavailable
- Setup script (setup.sh) for building extensions
Configuration
Environment Variables
- ENABLE_TORCH_COMPILE: Enable/disable torch.compile (default: false)
- TORCH_COMPILE_MODE: Compile mode - "reduce-overhead" (default), "max-autotune", or "default"
- ENABLE_CUDNN_BENCHMARK: Enable cuDNN benchmarking (default: true)
- INFERENCE_TIMEOUT: Maximum inference time in seconds (default: 1800)
- MAX_GRADIO_CONCURRENCY: Maximum concurrent Gradio requests (default: 1)
- PYTORCH_ALLOC_CONF: PyTorch memory allocator config (default: expandable_segments:True,max_split_size_mb=128). Note: PyTorch uses PYTORCH_ALLOC_CONF for CUDA allocator configuration.
- T1_THRESHOLD: Default threshold for T1 modality (default: 0.65)
- SEGMENTATION_THRESHOLD: Default threshold for T2 modality (default: 0.5)
- LIVER_VOL_LOW: Lower bound of normal liver volume range in ml (default: 1200)
- LIVER_VOL_HIGH: Upper bound of normal liver volume range in ml (default: 1800)
- REQUIRE_CUDA_EXTENSIONS: If true, raises ImportError if CUDA extensions not installed (default: false)
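These variables can be set before launch; a sketch of how the application might read them (names and defaults match the list above, parsing details are illustrative):

import os

ENABLE_TORCH_COMPILE = os.environ.get("ENABLE_TORCH_COMPILE", "false").lower() == "true"
TORCH_COMPILE_MODE = os.environ.get("TORCH_COMPILE_MODE", "reduce-overhead")
ENABLE_CUDNN_BENCHMARK = os.environ.get("ENABLE_CUDNN_BENCHMARK", "true").lower() == "true"
INFERENCE_TIMEOUT = int(os.environ.get("INFERENCE_TIMEOUT", "1800"))
MAX_GRADIO_CONCURRENCY = int(os.environ.get("MAX_GRADIO_CONCURRENCY", "1"))
T1_THRESHOLD = float(os.environ.get("T1_THRESHOLD", "0.65"))
SEGMENTATION_THRESHOLD = float(os.environ.get("SEGMENTATION_THRESHOLD", "0.5"))
LIVER_VOL_LOW = float(os.environ.get("LIVER_VOL_LOW", "1200"))
LIVER_VOL_HIGH = float(os.environ.get("LIVER_VOL_HIGH", "1800"))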
Default Settings (Fast + Accurate Preset)
- AMP: Enabled (Automatic Mixed Precision)
- TF32: Enabled for faster matmul
- ROI Size: 256 x 256 x 64 (or 80 for >40GB VRAM)
- Overlap: 0.10
- Sliding Window Batch: 1 (or 2 for >30GB VRAM)
- Compute Device: GPU
- Aggregation Device: GPU (CPU only if VRAM < 2GB)
- Memory Layout: Channels-last 3D
- torch.compile: Disabled by default (enable with ENABLE_TORCH_COMPILE=true for benchmarking only)
- CUDA Extensions: Optional but recommended
API Documentation
Endpoints
POST /api/segment
Upload a NIfTI file for liver segmentation.
Parameters:
file: NIfTI file (multipart/form-data, required)
modality: "T1" or "T2" (default: "T1")
slice_idx: Optional slice index for visualization (default: middle slice)
Response:
{
"success": true,
"volume_ml": 1234.56,
"liver_percentage": 2.5,
"status": "NORMAL",
"mask_path_token": "secure-token-123",
"mask_download_url": "/api/download/secure-token-123",
"segmentation_file": "data:application/octet-stream;base64,...",
"overlay_image": "data:image/png;base64,...",
"report": "Medical report text...",
"morphology": {
"connected_components": 1,
"largest_component_ratio": 1.0,
"fragmentation": "low"
}
}
Note: For mask files > 2 GB, segmentation_file will be null and mask_path_token will be provided. Use mask_download_url to download the file. Tokens expire after 24 hours.
GET /api/health
Check API health and model status.
Response:
{
"status": "healthy",
"device": "cuda",
"model_t1_loaded": true,
"model_t2_loaded": true,
"gpu_name": "NVIDIA L40S",
"gpu_memory_gb": 48.0
}
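For example, a quick health check from Python (replace the base URL with your deployment):

import requests

response = requests.get("https://your-api-url/api/health", timeout=30)
response.raise_for_status()
health = response.json()
print(health["status"], health.get("gpu_name"), health.get("gpu_memory_gb"))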
Interactive API Docs
Visit /docs for Swagger UI documentation with interactive testing.
System Requirements
Recommended Hardware
| GPU | VRAM | Status | Performance | Settings |
|---|---|---|---|---|
| NVIDIA L40S | 48 GB | Optimal | Best performance | ROI [256,256,80], batch=2 |
| NVIDIA A100 | 40-80 GB | Excellent | Production-ready | ROI [256,256,64-80], batch=2 |
| NVIDIA L4 | 24 GB | Good | Works well | ROI [256,256,64], batch=1 |
| NVIDIA T4 | 16 GB | Limited | May require minimal settings | ROI [224,224,48], batch=1 |
Software Requirements
- Python 3.10+
- CUDA 11.8+ or 12.8+ (for GPU acceleration)
- PyTorch 2.0+ (tested with 2.9)
- 8GB+ RAM
- 10GB+ disk space for models and dependencies
Performance Optimization
Automatic Optimization
The system automatically optimizes based on available GPU memory:
- Very High VRAM (>40GB): ROI [256, 256, 80], batch_size=2, GPU aggregation
- High VRAM (>30GB): ROI [256, 256, 64], batch_size=2, GPU aggregation
- Medium VRAM (20-30GB): ROI [256, 256, 64], batch_size=1, GPU aggregation
- Low VRAM (10-20GB): ROI [224, 224, 64], batch_size=1, GPU aggregation
- Very Low VRAM (<10GB): Progressively smaller ROI, batch_size=1, CPU aggregation if <2GB
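A sketch of how this VRAM-based selection might be implemented; the tier thresholds mirror the list above, and the very-low-VRAM ROI is illustrative:

import torch

def select_window_settings_sketch():
    if not torch.cuda.is_available():
        return {"roi_size": (128, 128, 32), "sw_batch_size": 1, "aggregate_on": "cpu"}
    vram_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    if vram_gb > 40:
        return {"roi_size": (256, 256, 80), "sw_batch_size": 2, "aggregate_on": "cuda"}
    if vram_gb > 30:
        return {"roi_size": (256, 256, 64), "sw_batch_size": 2, "aggregate_on": "cuda"}
    if vram_gb > 20:
        return {"roi_size": (256, 256, 64), "sw_batch_size": 1, "aggregate_on": "cuda"}
    if vram_gb > 10:
        return {"roi_size": (224, 224, 64), "sw_batch_size": 1, "aggregate_on": "cuda"}
    # Below 10 GB: progressively smaller ROI (example value); CPU aggregation below 2 GB
    return {"roi_size": (192, 192, 48), "sw_batch_size": 1,
            "aggregate_on": "cpu" if vram_gb < 2 else "cuda"}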
Manual Optimization Tips
- Install CUDA Extensions: Run bash setup.sh to build mamba_ssm and selective_scan_cuda_oflex
- Monitor GPU Utilization: Check logs for GPU utilization warnings
- Adjust Compile Mode: Set TORCH_COMPILE_MODE=max-autotune for maximum speed (after extensions are installed)
- Disable Compile for Testing: Set ENABLE_TORCH_COMPILE=false for a faster first run
Performance Metrics
- First Inference: 30-60s (with reduce-overhead compile) or 2-5min (with max-autotune)
- Subsequent Inferences: 10-30s depending on volume size
- GPU Utilization: Target 70-90%+ (monitored automatically)
- Memory Usage: 15-25GB typical on L40S with optimal settings
Quality Assurance
Segmentation Refinement Pipeline
The system automatically refines raw model outputs:
- Connected Component Filtering: Keeps only the largest component (removes false positives)
- Morphological Cleanup:
- Binary closing (fills gaps)
- Hole filling (removes internal holes)
- Binary opening (removes small spurious regions)
- Smoothing: Optional 3D median filter for jagged surfaces
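A sketch of these refinement steps with scikit-image and SciPy; the production pipeline adds spatial priors, shape handling, and metrics:

import numpy as np
from scipy import ndimage
from skimage.morphology import ball, binary_closing, binary_opening

def refine_mask_sketch(mask: np.ndarray, smoothing: bool = True) -> np.ndarray:
    mask = mask.astype(bool)
    # Keep only the largest connected component
    labeled, n = ndimage.label(mask)
    if n > 1:
        sizes = ndimage.sum(mask, labeled, index=range(1, n + 1))
        mask = labeled == (int(np.argmax(sizes)) + 1)
    # Morphological cleanup: close gaps, fill holes, remove small spurious regions
    mask = binary_closing(mask, ball(2))
    mask = ndimage.binary_fill_holes(mask)
    mask = binary_opening(mask, ball(2))
    # Optional smoothing of jagged surfaces
    if smoothing:
        mask = ndimage.median_filter(mask.astype(np.uint8), size=3) > 0
    return mask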
Quality Guardrails
- Volume Sanity Check: Warns if volume outside normal range (1200-1800 ml)
- Connected Components: Validates single dominant component
- Fragmentation Analysis: Detects and reports high fragmentation
- Visual Inspection Recommendations: Suggests manual review for extreme cases
Troubleshooting
Common Segmentation Failures
The model automatically detects and warns about common input quality issues:
1. Low Resolution / Compressed Files
Symptoms:
- File size < 100 KB
- Very small dimensions (< 10 voxels in any axis)
- Low prediction confidence (max < 0.3)
Causes:
- Downsampled or compressed input loses texture and boundary cues
- MRI slices depend on voxel intensity gradients - compression distorts them
- Model loses spatial context with reduced resolution
Solutions:
- Use original, uncompressed NIfTI files
- Avoid downsampling before upload
- Ensure minimum resolution: at least 100x100x20 voxels
2. Missing Metadata
Symptoms:
- Voxel spacing = (1.0, 1.0, 1.0) (default values)
- Unusual affine determinant
- Incorrect volume calculations
Causes:
- Metadata lost during .nii/.png conversions
- File compression removes header information
- Manual conversion tools may not preserve affine/spacing
Solutions:
- Use original DICOM or NIfTI files with intact headers
- Verify voxel spacing matches scanner parameters
- Check affine matrix is preserved during conversion
3. Single Slice or Incomplete Volumes
Symptoms:
- Very few slices (< 20)
- Small dimension in one axis
- Model sees incomplete anatomy
Causes:
- Only one mid-slice uploaded instead of full volume
- Cropped or partial volumes
- Model expects full 3D context
Solutions:
- Upload complete 3D volumes (typically 50-200 slices)
- Ensure all anatomical regions are included
- Model performs best with full volume context
4. Normalization Mismatch
Symptoms:
- Integer data type (uint8/uint16) instead of float32
- Extreme intensity values (> 10000 or < -1000)
- Very low data variance
- Low prediction confidence
Causes:
- Input not properly normalized to model's expected range
- Integer compression artifacts
- Data type conversion issues
Solutions:
- Model expects normalized float32 tensors
- Use original DICOM or properly converted NIfTI
- Avoid manual intensity scaling or type conversion
5. Threshold Issues
Symptoms:
- Zero voxels segmented
- Grid search fails
- Very low prediction values
Causes:
- Strict threshold (e.g., 0.5) filters out valid low-confidence voxels
- Model predictions are low due to input quality issues
- Threshold too high for the data distribution
Solutions:
- System automatically tries lower thresholds (0.35, 0.3, percentile-based)
- Check input quality warnings in logs
- Verify preprocessing is working correctly
Automatic Diagnostics
The system automatically checks and warns about:
- File size: Warns if < 100 KB (may indicate compression)
- Dimensions: Warns if < 20 slices or very small dimensions
- Voxel spacing: Warns if default (1.0, 1.0, 1.0) values detected
- Data type: Warns if integer types (uint8/uint16) detected
- Intensity range: Warns if extreme values or low variance
- Prediction confidence: Warns if max prediction < 0.3 or mean < 0.1
- Affine matrix: Warns if unusual determinant values
All warnings are printed in the logs to help diagnose issues before they cause segmentation failures.
Common Issues
OOM (Out of Memory) Errors
- System automatically reduces ROI size and batch size
- Check GPU memory with nvidia-smi
- Restart Space to clear GPU memory if needed
Slow First Inference
- Normal: torch.compile takes 30-60s on first run
- Set ENABLE_TORCH_COMPILE=false to disable compilation
- Install CUDA extensions for faster compilation
Low GPU Utilization
- Install CUDA extensions (mamba_ssm, selective_scan_cuda_oflex)
- Verify GPU aggregation is enabled (check logs)
- Check channels-last layout is active
CUDA Extension Build Failures
- Ensure CUDA toolkit is installed
- Check PyTorch and CUDA versions match
- System will fall back to PyTorch implementations
Limitations
- Domain shift: Performance may degrade on unseen scanners/protocols, especially T2 sequences. T2 support is experimental and results may vary significantly.
- Header dependence: Requires valid NIfTI affine/zooms; lossy conversions or missing metadata may cause failures or incorrect volume calculations.
- Partial FOV: Small field-of-view or partial liver volumes can cause under-segmentation; flagged by quality guardrails.
- Orientation dependence: Spatial priors assume RAS (Right-Anterior-Superior) orientation. Inputs are automatically reoriented, but unusual orientations may affect spatial prior effectiveness.
- Body size variance: The normal liver volume range (1200-1800 ml) assumes an average adult body size. Pediatric patients or extreme body sizes have different normal ranges, so WARNING/CRITICAL flags for such cases may be false alarms; adjust LIVER_VOL_LOW/LIVER_VOL_HIGH if needed.
- Not for clinical use: Research only; manual review recommended for all outputs, especially for T2 sequences or when status is WARNING/CRITICAL/FAILURE.
File Structure
srmamamba-liver-segmentation/
├── app.py                 # Main application entry point (Gradio + FastAPI)
│                          # - Sets up environment variables (PYTORCH_ALLOC_CONF, TRITON_CACHE_DIR)
│                          # - Fixes Gradio schema bug (fix_gradio_schema_bug)
│                          # - Logs startup health (log_startup_health)
│                          # - Creates FastAPI app with CORS middleware
│                          # - Creates Gradio interface (create_interface)
│                          # - Defines API endpoints (/segment, /health)
│                          # - Launches Gradio app
│
├── config.py              # Configuration and environment setup
│                          # - Sets OMP_NUM_THREADS
│                          # - Sets PYTORCH_ALLOC_CONF
│                          # - Imports and checks CUDA extensions (mamba_ssm, selective_scan_cuda_oflex)
│                          # - Imports build_SRMAMamba from model configs
│                          # - Defines BUILD_SRMAMAMBA_AVAILABLE flag
│                          # - Defines SRMA_MAMBA_DIR path
│
├── model_loader.py        # Model loading and sliding window configuration
│                          # - clear_gpu_memory(): Unloads models and clears GPU cache
│                          # - load_model(modality): Loads SRMA-Mamba model, configures sliding window
│
├── processing.py          # Preprocessing, refinement, and report generation
│                          # - validate_nifti(): Validates NIfTI file structure
│                          # - preprocess_nifti(): Preprocesses NIfTI for model input
│                          # - refine_liver_mask_enhanced(): Enhanced refinement with spatial priors
│                          # - refine_liver_mask(): Basic refinement without spatial priors
│                          # - calculate_confidence_score(): Calculates segmentation confidence
│                          # - calculate_liver_volume(): Calculates volume in ml
│                          # - analyze_liver_morphology(): Analyzes connected components and fragmentation
│                          # - check_volume_sanity(): Checks if volume is within normal range
│                          # - generate_medical_report(): Generates comprehensive medical report
│
├── inference.py           # Core inference logic and API endpoints
│                          # - adjust_roi_for_volume(): Adjusts ROI size based on volume dimensions
│                          # - predict_volume(): Main prediction function for Gradio
│                          # - predict_volume_api(): API version of prediction function
│                          # - safe_predict_volume(): Safe wrapper with error handling
│
├── requirements.txt       # Python dependencies with pinned versions
├── setup.sh               # CUDA extension build script
├── post_build.sh          # Post-build script for Python Spaces (fallback)
├── postBuild              # Hugging Face Spaces post-build script
├── app.yaml               # Hugging Face Spaces configuration (sdk: docker)
├── Dockerfile             # Docker image definition for deployment
├── checkpoint_T1.pth      # Pre-trained T1 model weights
├── checkpoint_T2.pth      # Pre-trained T2 model weights
│
└── SRMA-Mamba/            # Model architecture code
    ├── model/
    │   ├── SRMAMamba.py             # Main model architecture
    │   ├── vmamba2.py               # Mamba backbone
    │   ├── csm_triton.py            # Triton kernels (optional)
    │   ├── csms6s.py                # Selective scan operations
    │   └── mamba2/                  # Mamba2 implementation
    │       ├── selective_state_update.py
    │       ├── ssd_combined.py
    │       └── ...
    ├── configs/
    │   ├── config.py                # General configuration
    │   ├── model_configs.py         # Model configuration and build function
    │   └── vssm1/
    │       └── vmambav2_tiny_224.yaml   # Model architecture YAML
    └── selective_scan/              # Selective scan CUDA extension source
        ├── setup.py                 # Extension build script
        └── csrc/                    # CUDA source code
Quick Start
Using the Web Interface
- Upload a 3D NIfTI MRI volume (.nii.gz format)
- Select the MRI modality (T1 or T2)
- Click "Segment Liver" to run inference
- View the segmentation overlay and medical report
- Download the 3D segmentation mask
Using the API
import requests
# Upload and segment
with open('liver_scan.nii.gz', 'rb') as f:
    response = requests.post(
        'https://your-api-url/api/segment',
        files={'file': f},
        data={'modality': 'T1'},
    )
result = response.json()
# Access segmentation file, volume, and report
Installation
Requirements
- Python 3.10+
- CUDA-capable GPU (recommended: 24GB+ VRAM for optimal performance)
- CUDA 11.8+ or 12.8+ (for GPU acceleration)
- PyTorch 2.0+ (tested with PyTorch 2.9)
- 8GB+ RAM
- 10GB+ disk space for models and dependencies
Setup
# Clone the repository
git clone https://huggingface.co/spaces/HarshithReddy01/srmamamba-liver-segmentation
cd srmamamba-liver-segmentation
# Install dependencies
pip install -r requirements.txt
# Optional: Build CUDA extensions for maximum speed
bash setup.sh
# Run the application
python app.py
Building CUDA Extensions (Optional, for Maximum Speed)
The setup.sh script automatically builds CUDA extensions:
bash setup.sh
This will:
- Install mamba-ssm (CUDA extension for Mamba operations)
- Build selective_scan_cuda_oflex (custom CUDA extension)
- Verify installation
If extensions are not available, the system automatically falls back to PyTorch implementations (slower but still functional).
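A sketch of the kind of availability check and fallback this relies on (module names are the ones used by this project; the actual logic lives in config.py):

# Probe optional CUDA extensions and fall back to pure-PyTorch paths if missing.
try:
    import mamba_ssm  # CUDA-accelerated Mamba operations
    MAMBA_SSM_AVAILABLE = True
except ImportError:
    MAMBA_SSM_AVAILABLE = False

try:
    import selective_scan_cuda_oflex  # custom selective-scan CUDA kernel
    SELECTIVE_SCAN_AVAILABLE = True
except ImportError:
    SELECTIVE_SCAN_AVAILABLE = False

print(f"mamba_ssm available: {MAMBA_SSM_AVAILABLE}, "
      f"selective_scan_cuda_oflex available: {SELECTIVE_SCAN_AVAILABLE}")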
Citation
If you use LiverProfile AI in your research, please cite:
Repository:
@software{liverprofile_ai_2025,
title={LiverProfile AI: SRMA-Mamba Liver Segmentation},
author={Harshith Reddy},
year={2025},
url={https://huggingface.co/spaces/HarshithReddy01/srmamamba-liver-segmentation},
note={Preprint/In preparation}
}
Related Work (if available):
@article{zeng2025srma,
title={SRMA-Mamba: Spatial Reverse Mamba Attention Network for Pathological Liver Segmentation in MRI Volumes},
author={Zeng, Jun and Huang, Yannan and Keles, Elif and Aktas, Halil Ertugrul and Durak, Gorkem and Tomar, Nikhil Kumar and Trinh, Quoc-Huy and Nayak, Deepak Ranjan and Bagci, Ulas and Jha, Debesh},
journal={arXiv preprint arXiv:2508.12410},
year={2025},
note={If published, please use the published citation}
}
Disclaimer
Important: This software is intended for research purposes only. It is not approved for clinical use or diagnostic purposes without proper validation and regulatory approval. Always consult with qualified medical professionals for clinical decision-making.
Contributing
Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests.
Contact
For questions, support, or collaboration inquiries:
- Email: harshithreddy0117@gmail.com
- Hugging Face Space: srmamamba-liver-segmentation
License
This project is provided for research and educational purposes. Please refer to the original SRMA-Mamba paper for licensing details.
LiverProfile AI - Empowering Medical Imaging with AI
Built for the medical imaging community