
SimpleTool: Parallel Decoding for Real-Time LLM Function Calling

Hugging Face | ModelScope | GitHub

This repository contains the weights for RT-Qwen (RealtimeTool), a series of models optimized for low-latency, parallel LLM function calling.

πŸ“ Model Directory Structure

The models are organized by scale, quantization format, and inference framework.

1. SFT & AWQ Models (vLLM / Transformers)

Use these folders directly for inference with vLLM or Transformers.

  • RT-Qwen2.5-0.5B / -0.5B-AWQ
  • RT-Qwen2.5-1.5B / -1.5B-AWQ
  • RT-Qwen2.5-3B / -3B-AWQ
  • RT-Qwen2.5-7B / -7B-AWQ
  • RT-Qwen2.5-14B / -14B-AWQ
  • RT-Qwen3-4B / -4B-AWQ
  • RT-Qwen3-30B / -30B-AWQ

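As a minimal sketch, loading one of the SFT checkpoints with Transformers might look like the following. The model id below is a hypothetical placeholder (the exact Hub path is not yet documented here), and the generic chat-template call stands in for the model's function-calling prompt format, which will be covered in the GitHub documentation:

```python
# Sketch: loading an RT-Qwen SFT checkpoint with Transformers.
# MODEL_ID is a hypothetical placeholder -- substitute the actual
# Hugging Face Hub path or local folder for the checkpoint you want.
MODEL_ID = "RT-Qwen2.5-0.5B"


def load(model_id: str = MODEL_ID):
    """Load tokenizer and model; imports kept local so the sketch
    can be read without transformers installed."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load()
    # Generic chat usage; the dedicated function-calling format is
    # not yet documented for these checkpoints.
    messages = [{"role": "user", "content": "What's the weather in Paris?"}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=256)
    print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The `-AWQ` folders can be loaded the same way (Transformers picks up the AWQ quantization config automatically when `autoawq` is installed), or served with vLLM for higher throughput.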
2. GGUF Models (llama.cpp)

  • gguf_models/: Full-precision (F16) GGUF files for all versions.
  • gguf_quantized/: Quantized GGUF versions including Q4_K_M, Q5_K_M, and Q8_0.
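A hedged sketch of running one of the quantized GGUF files via llama-cpp-python (the Python bindings for llama.cpp). The file path below is a hypothetical placeholder for a file in `gguf_quantized/`; adjust it to the checkpoint and quantization level you downloaded:

```python
# Sketch: chat completion with a quantized GGUF checkpoint via
# llama-cpp-python. GGUF_PATH is a hypothetical placeholder filename.
GGUF_PATH = "gguf_quantized/RT-Qwen2.5-0.5B-Q4_K_M.gguf"


def run(prompt: str, model_path: str = GGUF_PATH) -> str:
    """Run a single chat turn; import kept local so the sketch can be
    read without llama-cpp-python installed."""
    from llama_cpp import Llama  # pip install llama-cpp-python

    llm = Llama(model_path=model_path, n_ctx=4096, verbose=False)
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=256,
    )
    return out["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(run("List three uses of function calling."))
```

The same files also work with the stock llama.cpp CLI and server binaries; Q4_K_M trades the most size for a modest quality drop, while Q8_0 stays closest to the F16 originals.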

πŸ“ TODO

  • Release arXiv paper
  • Complete GitHub Documentation
  • Add Performance Benchmarks
  • Provide Citation Info

License: Apache-2.0
Status: Models Uploading / Placeholder README
