Gemma-2-2B Text-to-SQL QLoRA Fine-tuned Model

  • Developed by: rajaykumar12959
  • License: apache-2.0
  • Finetuned from model: unsloth/gemma-2-2b-it-bnb-4bit
  • Dataset: gretelai/synthetic_text_to_sql
  • Task: Text-to-SQL Generation
  • Fine-tuning Method: QLoRA (Quantized Low-Rank Adaptation)

This Gemma-2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Model Description

This model is specifically fine-tuned to generate SQL queries from natural language questions and database schemas. It excels at handling complex multi-table queries requiring JOINs, aggregations, filtering, and advanced SQL operations.

Key Features

  • ✅ Multi-table JOINs (INNER, LEFT, RIGHT)
  • ✅ Aggregation functions (SUM, COUNT, AVG, MIN, MAX)
  • ✅ GROUP BY and HAVING clauses
  • ✅ Complex WHERE conditions
  • ✅ Subqueries and CTEs
  • ✅ Date/time operations
  • ✅ String functions and pattern matching

Training Configuration

The model was fine-tuned using QLoRA with the following configuration:

# LoRA Configuration
r = 16  # Rank: 16 is a good balance for 2B models
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
lora_alpha = 16
lora_dropout = 0
bias = "none"
use_gradient_checkpointing = "unsloth"

# Training Parameters
max_seq_length = 2048
per_device_train_batch_size = 2
gradient_accumulation_steps = 4  # Effective batch size = 8
warmup_steps = 5
max_steps = 100  # Demo configuration - increase to 300+ for production
learning_rate = 2e-4
optim = "adamw_8bit"  # 8-bit optimizer for memory efficiency
weight_decay = 0.01
lr_scheduler_type = "linear"
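
For orientation, the sketch below shows how these values are typically wired into Unsloth's get_peft_model and TRL's SFTTrainer. It assumes the base model and tokenizer have already been loaded with FastLanguageModel.from_pretrained and that the dataset contains a formatted "text" column (see Training Process below); exact trainer argument names can vary across TRL versions, so treat this as a sketch rather than the exact notebook code.

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Assumes `model` and `tokenizer` were loaded from unsloth/gemma-2-2b-it-bnb-4bit,
# and `dataset` has a "text" column produced by formatting_prompts_func.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
)

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 100,
        learning_rate = 2e-4,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        output_dir = "outputs",  # placeholder output directory
    ),
)
trainer.train()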

Installation

pip install unsloth transformers torch trl datasets

Usage

Loading the Model

from unsloth import FastLanguageModel
import torch

max_seq_length = 2048
dtype = None
load_in_4bit = True

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "rajaykumar12959/gemma-2-2b-text-to-sql-qlora",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

FastLanguageModel.for_inference(model)  # Enable faster inference

Inference Function

def inference_text_to_sql(model, tokenizer, schema, question, max_new_tokens=300):
    """
    Perform inference to generate SQL from natural language question and database schema.
    
    Args:
        model: Fine-tuned Gemma model
        tokenizer: Model tokenizer
        schema: Database schema as string
        question: Natural language question
        max_new_tokens: Maximum tokens to generate
    
    Returns:
        Generated SQL query as string
    """
    # Format the input prompt
    input_prompt = f"""<start_of_turn>user
You are a powerful text-to-SQL model. Your job is to answer questions about a database. You are given a question and context regarding one or more tables.

### Schema:
{schema}

### Question:
{question}<end_of_turn>
<start_of_turn>model
"""
    
    # Tokenize input
    inputs = tokenizer([input_prompt], return_tensors="pt").to("cuda")
    
    # Generate output
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            use_cache=True,
            do_sample=True,
            temperature=0.1,  # Low temperature for more deterministic output
            top_p=0.9,
            pad_token_id=tokenizer.eos_token_id
        )
    
    # Decode and clean the result
    result = tokenizer.batch_decode(outputs)[0]
    sql_query = result.split("<start_of_turn>model")[-1].replace("<end_of_turn>", "").strip()
    
    return sql_query

Example Usage

Example 1: Simple Single-Table Query

# Simple employee database
simple_schema = """
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name TEXT,
    department TEXT,
    salary DECIMAL,
    hire_date DATE
);
"""

simple_question = "Find all employees in the 'Engineering' department with salary greater than 75000"

sql_result = inference_text_to_sql(model, tokenizer, simple_schema, simple_question)
print(f"Generated SQL:\n{sql_result}")

Expected Output:

SELECT * FROM employees 
WHERE department = 'Engineering' 
AND salary > 75000;
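
Example 2: Aggregation with GROUP BY (illustrative)

A second, illustrative call exercising aggregation and GROUP BY, two of the capabilities listed under Key Features. The schema and question below are hypothetical, and the generated SQL will vary from run to run because sampling is enabled.

# Illustrative sales schema for an aggregation query
sales_schema = """
CREATE TABLE sales (
    sale_id INT PRIMARY KEY,
    region TEXT,
    product TEXT,
    amount DECIMAL,
    sale_date DATE
);
"""

sales_question = "Show the total sales amount per region, ordered from highest to lowest"

sql_result = inference_text_to_sql(model, tokenizer, sales_schema, sales_question)
print(f"Generated SQL:\n{sql_result}")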

Training Details

Dataset

  • Source: gretelai/synthetic_text_to_sql
  • Size: 100,000 synthetic text-to-SQL examples
  • Columns used:
    • sql_context: Database schema
    • sql_prompt: Natural language question
    • sql: Target SQL query
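
These three fields can be pulled straight from the Hugging Face Hub with the datasets library; the train split name below is an assumption based on the dataset's default configuration.

from datasets import load_dataset

# Load the Gretel synthetic text-to-SQL dataset (default train split assumed)
dataset = load_dataset("gretelai/synthetic_text_to_sql", split="train")

# Inspect the three fields used for fine-tuning
example = dataset[0]
print(example["sql_context"])  # database schema (CREATE TABLE statements)
print(example["sql_prompt"])   # natural language question
print(example["sql"])          # target SQL query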

Training Process

The training data is structured with a custom formatting function; the gemma_prompt template and EOS_TOKEN it references are sketched after the snippet:

def formatting_prompts_func(examples):
    schemas   = examples["sql_context"]
    questions = examples["sql_prompt"] 
    outputs   = examples["sql"]

    texts = []
    for schema, question, output in zip(schemas, questions, outputs):
        text = gemma_prompt.format(schema, question, output) + EOS_TOKEN
        texts.append(text)
    return { "text" : texts, }

Hardware Requirements

  • GPU: Single GPU with 8GB+ VRAM
  • Training Time: ~30 minutes for 100 steps
  • Memory Optimization: 4-bit quantization + 8-bit optimizer

Performance Characteristics

Strengths

  • Excellent performance on multi-table JOINs
  • Accurate aggregation and GROUP BY operations
  • Proper handling of foreign key relationships
  • Good understanding of filtering logic (WHERE/HAVING)

Model Capabilities Test

The model was tested on a complex 4-table JOIN query requiring:

  1. Multi-table JOINs (users → orders → order_items → products)
  2. Category filtering (WHERE p.category = 'Electronics')
  3. User grouping (GROUP BY user fields)
  4. Aggregation (SUM of price × quantity)
  5. Aggregate filtering (HAVING total > 500)
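
To sketch what that test looks like in practice, the call below builds a comparable 4-table schema and question; the column names are illustrative rather than the exact test data, and the generated SQL will vary with sampling.

# Illustrative 4-table schema mirroring the capabilities test described above
complex_schema = """
CREATE TABLE users (user_id INT PRIMARY KEY, name TEXT, email TEXT);
CREATE TABLE orders (order_id INT PRIMARY KEY, user_id INT, order_date DATE);
CREATE TABLE order_items (item_id INT PRIMARY KEY, order_id INT, product_id INT, quantity INT);
CREATE TABLE products (product_id INT PRIMARY KEY, product_name TEXT, category TEXT, price DECIMAL);
"""

complex_question = (
    "For each user, show the total amount spent on 'Electronics' products, "
    "including only users whose total exceeds 500."
)

sql_result = inference_text_to_sql(model, tokenizer, complex_schema, complex_question)
print(f"Generated SQL:\n{sql_result}")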

Limitations

  • Training Scale: Trained with only 100 steps for demonstration. For production use, increase max_steps to 300+
  • Context Length: Limited to 2048 tokens maximum sequence length
  • SQL Dialects: Primarily trained on standard SQL syntax
  • Complex Subqueries: May require additional fine-tuning for highly complex nested queries

Reproduction

To reproduce this training:

  1. Clone the notebook: Use the provided Fine_tune_qlora.ipynb
  2. Install dependencies:
    pip install unsloth transformers torch trl datasets
    
  3. Configure training: Adjust max_steps in TrainingArguments for longer training
  4. Run training: Execute all cells in the notebook

Production Training Recommendations

# For production use, update these parameters:
max_steps = 300  # Increase from 100
warmup_steps = 10  # Increase warmup
per_device_train_batch_size = 4  # If you have more GPU memory

Model Card

Parameter              Value
Base Model             Gemma-2-2B (4-bit quantized)
Fine-tuning Method     QLoRA
LoRA Rank              16
Training Steps         100 (demo)
Learning Rate          2e-4
Batch Size             8 (effective)
Max Sequence Length    2048
Dataset Size           100k examples

Citation

@misc{gemma-2-2b-text-to-sql-qlora,
  author = {rajaykumar12959},
  title = {Gemma-2-2B Text-to-SQL QLoRA Fine-tuned Model},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/rajaykumar12959/gemma-2-2b-text-to-sql-qlora}},
}

Acknowledgments

  • Base Model: Google's Gemma-2-2B via Unsloth optimization
  • Dataset: Gretel AI's synthetic text-to-SQL dataset
  • Framework: Unsloth for efficient fine-tuning and TRL for training
  • Method: QLoRA for parameter-efficient training

License

This model is licensed under Apache 2.0. See the LICENSE file for details.


This model is intended for research and educational purposes. Please ensure compliance with your organization's data and AI usage policies when using in production environments.
