researchaudio's picture
Create README.md
3379b4e verified
metadata
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
tags:
  - code
  - fine-tuned
  - qwen
  - python
  - programming
  - lora
  - production-ready
language:
  - en
pipeline_tag: text-generation
library_name: transformers
datasets:
  - glaiveai/glaive-code-assistant-v2

Qwen2.5-Coder-7B-ResearchAudio-v2

Production-Ready Code Generation Model

Fine-tuned from Qwen2.5-Coder-7B-Instruct to produce complete, clean, production-ready code.

Why This Model?

Base Model This Model
Verbose explanations Concise, to-the-point
Often truncates mid-function Complete implementations
Theory-heavy for system design Working code
Needs post-processing Copy-paste ready

Real-World Comparison

We tested both models on 10 coding problems. This model won 9/10.

Example: Rate Limiter

Prompt: "Design a rate limiter"

Base Model Output:

Designing a rate limiter is crucial for maintaining performance...

### 1. Define Requirements
- **Rate**: Maximum requests allowed per unit time
- **Time Window**: Duration over which rate limit applies
- **Concurrency**: How many clients can be processed...

❌ Theory only, no code

This Model Output:

import time

class RateLimiter:
    def __init__(self, max_requests, period):
        self.max_requests = max_requests
        self.period = period
        self.requests = []

    def is_allowed(self):
        current_time = int(time.time())
        self.requests = [r for r in self.requests if current_time - r < self.period]
        if len(self.requests) >= self.max_requests:
            return False
        self.requests.append(current_time)
        return True

βœ… Working implementation


Example: Binary Search

Base Model: Long explanation, truncated code

Sure! Binary search is an efficient algorithm for finding an item 
from a sorted list of items. It works by repeatedly dividing...

def binary_search(arr, target):
    """
    Perform binary search on a sorted array to find the index...
    
    Parameters:
    arr (list): The sorted list of elements.
    target (int/float): The value to search for...

❌ Verbose, truncated

This Model:

def binary_search(arr, x):
    low = 0
    high = len(arr) - 1

    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == x:
            return mid
        elif arr[mid] < x:
            low = mid + 1
        else:
            high = mid - 1
    return -1

βœ… Clean, complete


Example: Binary Tree Serialization

Base Model: Only showed TreeNode class definition, no actual serialization code

This Model:

class Node:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

def serialize(root):
    if root is None:
        return '#'
    left = serialize(root.left)
    right = serialize(root.right)
    return str(root.val) + ',' + left + ',' + right

def deserialize(data):
    data_list = data.split(',')
    index = [0]

    def helper():
        if data_list[index[0]] == '#':
            index[0] += 1
            return None
        node = Node(int(data_list[index[0]]))
        index[0] += 1
        node.left = helper()
        node.right = helper()
        return node
    
    return helper()

βœ… Complete serialize AND deserialize


Test Results Summary

Problem Base v2 Winner
LRU Cache Truncated Complete βœ… v2
Binary Search Verbose, truncated Clean, complete βœ… v2
Rate Limiter Theory only Working code βœ… v2
Merge Sort Truncated More complete βœ… v2
Trie Truncated at insert Insert + search βœ… v2
Thread-safe Singleton Complete Complete Tie
Dijkstra Truncated More complete βœ… v2
Retry Decorator Verbose docstrings Concise, working βœ… v2
Connection Pool Truncated Get + release βœ… v2
Binary Tree Serialize TreeNode only Full implementation βœ… v2

Score: 9/10 wins


Training Details

Parameter Value
Base Model Qwen2.5-Coder-7B-Instruct
Dataset glaive-code-assistant-v2
Samples 50,000
Epochs 2
Method LoRA (r=16, alpha=32)
Batch Size 16
Learning Rate 2e-4
Hardware NVIDIA H200
Training Time ~4 hours

Benchmark Comparison

General benchmarks show slight decrease (expected when specializing for code):

Benchmark Base v2 Delta
MMLU 64.6% 62.8% -1.8%
HellaSwag 74.6% 72.8% -1.8%
Winogrande 70.2% 67.5% -2.8%
ARC-Challenge 48.5% 48.4% -0.1%

Trade-off: Small general knowledge drop β†’ Much better code output quality

For a code-focused model, this is the right trade-off.


Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "researchaudio/qwen2.5-coder-7b-researchaudio-v2",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    "researchaudio/qwen2.5-coder-7b-researchaudio-v2",
    trust_remote_code=True
)

prompt = "Implement a thread-safe queue in Python"
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=500, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Best For

βœ… Code generation APIs βœ… IDE extensions / autocomplete βœ… CI/CD automation βœ… System design implementations βœ… Prototyping βœ… Learning algorithms (clear, complete examples)

Not Recommended For

❌ General knowledge Q&A ❌ Long explanations / tutorials ❌ Non-code tasks


Version History

Version Base Dataset Focus
v1 Qwen2.5-Coder-7B 500K mixed (Magicoder, Nemotron, etc.) General code
v2 v1 50K Glaive Production-ready output

Citation

@misc{qwen2.5-coder-researchaudio-v2,
  author = {ResearchAudio},
  title = {Qwen2.5-Coder-7B-ResearchAudio-v2: Production-Ready Code Generation},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/researchaudio/qwen2.5-coder-7b-researchaudio-v2}
}

License

Apache 2.0


Built by ResearchAudio