Gemma-2-2B Text-to-SQL QLoRA Fine-tuned Model

  • Developed by: rajaykumar12959
  • License: apache-2.0
  • Finetuned from model: unsloth/gemma-2-2b-it-bnb-4bit
  • Dataset: gretelai/synthetic_text_to_sql
  • Task: Text-to-SQL Generation
  • Fine-tuning Method: QLoRA (Quantized Low-Rank Adaptation)

This Gemma-2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Model Description

This model is specifically fine-tuned to generate SQL queries from natural language questions and database schemas. It excels at handling complex multi-table queries requiring JOINs, aggregations, filtering, and advanced SQL operations.

Key Features

  • ✅ Multi-table JOINs (INNER, LEFT, RIGHT)
  • ✅ Aggregation functions (SUM, COUNT, AVG, MIN, MAX)
  • ✅ GROUP BY and HAVING clauses
  • ✅ Complex WHERE conditions
  • ✅ Subqueries and CTEs
  • ✅ Date/time operations
  • ✅ String functions and pattern matching

Training Configuration

The model was fine-tuned using QLoRA with the following configuration:

# LoRA Configuration
r = 16  # Rank: 16 is a good balance for 2B models
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
lora_alpha = 16
lora_dropout = 0
bias = "none"
use_gradient_checkpointing = "unsloth"

# Training Parameters
max_seq_length = 2048
per_device_train_batch_size = 2
gradient_accumulation_steps = 4  # Effective batch size = 8
warmup_steps = 5
max_steps = 100  # Demo configuration - increase to 300+ for production
learning_rate = 2e-4
optim = "adamw_8bit"  # 8-bit optimizer for memory efficiency
weight_decay = 0.01
lr_scheduler_type = "linear"
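
For orientation, the sketch below shows how these values are typically wired into Unsloth's get_peft_model and TRL's SFTTrainer. It assumes the base model and tokenizer have already been loaded with FastLanguageModel.from_pretrained and that the dataset contains a formatted "text" column (see Training Process below); exact trainer argument names can vary across TRL versions, so treat this as a sketch rather than the exact notebook code.

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Assumes `model` and `tokenizer` were loaded from unsloth/gemma-2-2b-it-bnb-4bit,
# and `dataset` has a "text" column produced by formatting_prompts_func.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
)

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 100,
        learning_rate = 2e-4,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        output_dir = "outputs",  # placeholder output directory
    ),
)
trainer.train()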

Installation

pip install unsloth transformers torch trl datasets

Usage

Loading the Model

from unsloth import FastLanguageModel
import torch

max_seq_length = 2048
dtype = None
load_in_4bit = True

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "rajaykumar12959/gemma-2-2b-text-to-sql-qlora",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

FastLanguageModel.for_inference(model)  # Enable faster inference

Inference Function

def inference_text_to_sql(model, tokenizer, schema, question, max_new_tokens=300):
    """
    Perform inference to generate SQL from natural language question and database schema.
    
    Args:
        model: Fine-tuned Gemma model
        tokenizer: Model tokenizer
        schema: Database schema as string
        question: Natural language question
        max_new_tokens: Maximum tokens to generate
    
    Returns:
        Generated SQL query as string
    """
    # Format the input prompt
    input_prompt = f"""<start_of_turn>user
You are a powerful text-to-SQL model. Your job is to answer questions about a database. You are given a question and context regarding one or more tables.

### Schema:
{schema}

### Question:
{question}<end_of_turn>
<start_of_turn>model
"""
    
    # Tokenize input
    inputs = tokenizer([input_prompt], return_tensors="pt").to("cuda")
    
    # Generate output
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            use_cache=True,
            do_sample=True,
            temperature=0.1,  # Low temperature for more deterministic output
            top_p=0.9,
            pad_token_id=tokenizer.eos_token_id
        )
    
    # Decode and clean the result
    result = tokenizer.batch_decode(outputs)[0]
    sql_query = result.split("<start_of_turn>model")[-1].replace("<end_of_turn>", "").strip()
    
    return sql_query

Example Usage

Example 1: Simple Single-Table Query

# Simple employee database
simple_schema = """
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name TEXT,
    department TEXT,
    salary DECIMAL,
    hire_date DATE
);
"""

simple_question = "Find all employees in the 'Engineering' department with salary greater than 75000"

sql_result = inference_text_to_sql(model, tokenizer, simple_schema, simple_question)
print(f"Generated SQL:\n{sql_result}")

Expected Output:

SELECT * FROM employees 
WHERE department = 'Engineering' 
AND salary > 75000;
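
Example 2: Aggregation with GROUP BY (illustrative)

A second, illustrative call exercising aggregation and GROUP BY, two of the capabilities listed under Key Features. The schema and question below are hypothetical, and the generated SQL will vary from run to run because sampling is enabled.

# Illustrative sales schema for an aggregation query
sales_schema = """
CREATE TABLE sales (
    sale_id INT PRIMARY KEY,
    region TEXT,
    product TEXT,
    amount DECIMAL,
    sale_date DATE
);
"""

sales_question = "Show the total sales amount per region, ordered from highest to lowest"

sql_result = inference_text_to_sql(model, tokenizer, sales_schema, sales_question)
print(f"Generated SQL:\n{sql_result}")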

Training Details

Dataset

  • Source: gretelai/synthetic_text_to_sql
  • Size: 100,000 synthetic text-to-SQL examples
  • Columns used:
    • sql_context: Database schema
    • sql_prompt: Natural language question
    • sql: Target SQL query
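
These three fields can be pulled straight from the Hugging Face Hub with the datasets library; the train split name below is an assumption based on the dataset's default configuration.

from datasets import load_dataset

# Load the Gretel synthetic text-to-SQL dataset (default train split assumed)
dataset = load_dataset("gretelai/synthetic_text_to_sql", split="train")

# Inspect the three fields used for fine-tuning
example = dataset[0]
print(example["sql_context"])  # database schema (CREATE TABLE statements)
print(example["sql_prompt"])   # natural language question
print(example["sql"])          # target SQL query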

Training Process

The training data is structured with a custom formatting function; the gemma_prompt template and EOS_TOKEN it references are sketched after the snippet:

def formatting_prompts_func(examples):
    schemas   = examples["sql_context"]
    questions = examples["sql_prompt"] 
    outputs   = examples["sql"]

    texts = []
    for schema, question, output in zip(schemas, questions, outputs):
        text = gemma_prompt.format(schema, question, output) + EOS_TOKEN
        texts.append(text)
    return { "text" : texts, }

Hardware Requirements

  • GPU: Single GPU with 8GB+ VRAM
  • Training Time: ~30 minutes for 100 steps
  • Memory Optimization: 4-bit quantization + 8-bit optimizer

Performance Characteristics

Strengths

  • Excellent performance on multi-table JOINs
  • Accurate aggregation and GROUP BY operations
  • Proper handling of foreign key relationships
  • Good understanding of filtering logic (WHERE/HAVING)

Model Capabilities Test

The model was tested on a complex 4-table JOIN query requiring:

  1. Multi-table JOINs (users → orders → order_items → products)
  2. Category filtering (WHERE p.category = 'Electronics')
  3. User grouping (GROUP BY user fields)
  4. Aggregation (SUM of price × quantity)
  5. Aggregate filtering (HAVING total > 500)
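
To sketch what that test looks like in practice, the call below builds a comparable 4-table schema and question; the column names are illustrative rather than the exact test data, and the generated SQL will vary with sampling.

# Illustrative 4-table schema mirroring the capabilities test described above
complex_schema = """
CREATE TABLE users (user_id INT PRIMARY KEY, name TEXT, email TEXT);
CREATE TABLE orders (order_id INT PRIMARY KEY, user_id INT, order_date DATE);
CREATE TABLE order_items (item_id INT PRIMARY KEY, order_id INT, product_id INT, quantity INT);
CREATE TABLE products (product_id INT PRIMARY KEY, product_name TEXT, category TEXT, price DECIMAL);
"""

complex_question = (
    "For each user, show the total amount spent on 'Electronics' products, "
    "including only users whose total exceeds 500."
)

sql_result = inference_text_to_sql(model, tokenizer, complex_schema, complex_question)
print(f"Generated SQL:\n{sql_result}")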

Limitations

  • Training Scale: Trained with only 100 steps for demonstration. For production use, increase max_steps to 300+
  • Context Length: Limited to 2048 tokens maximum sequence length
  • SQL Dialects: Primarily trained on standard SQL syntax
  • Complex Subqueries: May require additional fine-tuning for highly complex nested queries

Reproduction

To reproduce this training:

  1. Clone the notebook: Use the provided Fine_tune_qlora.ipynb
  2. Install dependencies:
    pip install unsloth transformers torch trl datasets
    
  3. Configure training: Adjust max_steps in TrainingArguments for longer training
  4. Run training: Execute all cells in the notebook

Production Training Recommendations

# For production use, update these parameters:
max_steps = 300  # Increase from 100
warmup_steps = 10  # Increase warmup
per_device_train_batch_size = 4  # If you have more GPU memory

Model Card

Parameter              Value
Base Model             Gemma-2-2B (4-bit quantized)
Fine-tuning Method     QLoRA
LoRA Rank              16
Training Steps         100 (demo)
Learning Rate          2e-4
Batch Size             8 (effective)
Max Sequence Length    2048
Dataset Size           100k examples

Citation

@misc{gemma-2-2b-text-to-sql-qlora,
  author = {rajaykumar12959},
  title = {Gemma-2-2B Text-to-SQL QLoRA Fine-tuned Model},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/rajaykumar12959/gemma-2-2b-text-to-sql-qlora}},
}

Acknowledgments

  • Base Model: Google's Gemma-2-2B via Unsloth optimization
  • Dataset: Gretel AI's synthetic text-to-SQL dataset
  • Framework: Unsloth for efficient fine-tuning and TRL for training
  • Method: QLoRA for parameter-efficient training

License

This model is licensed under Apache 2.0. See the LICENSE file for details.


This model is intended for research and educational purposes. Please ensure compliance with your organization's data and AI usage policies when using in production environments.
