---
base_model: Qwen/Qwen3-14B
library_name: transformers
tags:
  - generated_from_trainer
  - open-r1
  - Text2SQL
  - Reasoning
license: apache-2.0
language:
  - en
---

# Model Information

This model is the reasoning model for the Text-to-SQL task introduced in *Think2SQL: Blueprinting Reward Density and Advantage Scaling for Effective Text-to-SQL Reasoning*.

This model is a fine-tuned version of [Qwen/Qwen3-14B](https://huggingface.co/Qwen/Qwen3-14B) (with thinking disabled), trained on the BIRD dataset using [TRL](https://github.com/huggingface/trl).
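If you build prompts with the tokenizer yourself rather than with the pipeline shown below, Qwen3's chat template exposes an `enable_thinking` flag that can be set to match the non-thinking training setup. A minimal sketch; the placeholder `messages` content is illustrative:

```python
from transformers import AutoTokenizer

model_id = "anonymous-2321/Think2SQL-14B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [{"role": "user", "content": "placeholder user turn"}]

# Qwen3 chat templates accept an enable_thinking flag; this checkpoint was
# trained in non-thinking mode, so native thinking is switched off here and
# the system prompt's <reasoning> tags carry the reasoning instead.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
```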

## Quick start

The model performs best with its intended system and user prompts. It expects three inputs: the question, the evidence, and the database schema.

`transformers >= 4.51.0` is required for Qwen3 support. Make sure to update your installation via `pip install --upgrade transformers`.
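To verify the installed version programmatically, a minimal check (using `packaging`, which ships as a transformers dependency):

```python
import transformers
from packaging import version

# Fail fast if the installed transformers predates Qwen3 support.
assert version.parse(transformers.__version__) >= version.parse("4.51.0"), (
    f"transformers {transformers.__version__} found; >= 4.51.0 is required for Qwen3"
)
```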

```python
import torch
import transformers

model_id = "anonymous-2321/Think2SQL-14B"

# Load the model as a chat text-generation pipeline in bfloat16.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

system_message ="""
You are a data science expert that provides well-reasoned and detailed responses. Your task is to understand the schema and generate a valid SQL query to answer the question.
You first think about the reasoning process as an internal monologue and then provide the user with the answer.
Respond in the following format:
<reasoning>
    ...
</reasoning>
<answer>
    ...
</answer>
""".strip()

user_message = """
Answer the following question with the SQL code. Use the piece of evidence and base your answer on the database schema.
Given the question, the evidence and the database schema, return in the <answer> tags only the SQL script that addresses the question.

Database Engine:
SQLite

Question:
Return the product name, sorted alphabetically and by price in descending order.


Evidence:


Database Schema:
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    price REAL NOT NULL
);

CREATE TABLE customers (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    email TEXT NOT NULL
);
"""


messages = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": user_message},
]

outputs = pipeline(
    messages,
    max_new_tokens=4096,
    temperature=0.6,
    top_p=0.95,
    top_k=20
)
print(outputs[0]["generated_text"][-1])  # the last chat turn is the assistant's reply
```
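The assistant's reply follows the `<reasoning>`/`<answer>` format requested in the system prompt. A minimal sketch for pulling the SQL out of the `<answer>` tags and, optionally, executing it with SQLite; the `extract_sql` helper and the `example.db` path are illustrative, not part of the model card:

```python
import re
import sqlite3

def extract_sql(response_text: str) -> str:
    """Return the SQL between <answer> tags, or an empty string if absent."""
    match = re.search(r"<answer>(.*?)</answer>", response_text, re.DOTALL)
    return match.group(1).strip() if match else ""

sql = extract_sql(outputs[0]["generated_text"][-1]["content"])
print(sql)

# Optionally run the prediction against a local SQLite database.
with sqlite3.connect("example.db") as conn:  # illustrative database file
    print(conn.execute(sql).fetchall())
```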

## 📖 Overview

Think2SQL is a systematic study of how to inject reasoning capabilities into Text-to-SQL through Reinforcement Learning with Verifiable Rewards (RLVR). We uncover the critical interplay between reward density, advantage scaling, and model capacity, and propose novel execution-guided dense rewards and optimal scaling strategies. Our 4B-parameter model achieves reasoning performance competitive with state-of-the-art models, and the study provides a comprehensive analysis for optimizing Text-to-SQL reasoning under computational constraints.

**Key Contributions:**

- Execution-guided dense reward function that outperforms binary signals (see the sketch after this list)
- Analysis of advantage-scaling mechanics for models of different sizes
- Evaluation of cold-start effects and supervised fine-tuning impact
- Pareto-frontier mapping for training efficiency optimization
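To illustrate the dense-versus-binary distinction, here is a minimal sketch (not the paper's exact formulation) of an execution-guided reward: the predicted and gold queries are run against SQLite, invalid SQL receives zero, and the dense variant scores partial correctness with a set-based F1 over the returned rows instead of an all-or-nothing match. The function names and the `db_path` argument are illustrative.

```python
import sqlite3

def binary_reward(pred_rows, gold_rows):
    # Sparse signal: 1 only if the result sets match exactly.
    return 1.0 if set(pred_rows) == set(gold_rows) else 0.0

def dense_reward(pred_rows, gold_rows):
    # Denser signal: set-based F1 between predicted and gold rows,
    # so partially correct queries still receive a useful learning signal.
    pred, gold = set(pred_rows), set(gold_rows)
    if not pred and not gold:
        return 1.0  # both queries return nothing: treat as a match
    if not pred or not gold:
        return 0.0
    precision = len(pred & gold) / len(pred)
    recall = len(pred & gold) / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def execution_reward(pred_sql, gold_sql, db_path, dense=True):
    # Execute both queries; invalid predicted SQL gets no reward.
    try:
        with sqlite3.connect(db_path) as conn:
            pred_rows = conn.execute(pred_sql).fetchall()
            gold_rows = conn.execute(gold_sql).fetchall()
    except sqlite3.Error:
        return 0.0
    return dense_reward(pred_rows, gold_rows) if dense else binary_reward(pred_rows, gold_rows)
```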