---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
tags:
- code
- fine-tuned
- qwen
- python
- programming
- lora
- production-ready
language:
- en
pipeline_tag: text-generation
library_name: transformers
datasets:
- glaiveai/glaive-code-assistant-v2
---

# Qwen2.5-Coder-7B-ResearchAudio-v2

**Production-Ready Code Generation Model**

Fine-tuned from [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) to produce **complete, clean, production-ready code**.

## Why This Model?

| Base Model | This Model |
|------------|------------|
| Verbose explanations | Concise, to-the-point |
| Often truncates mid-function | **Complete implementations** |
| Theory-heavy for system design | **Working code** |
| Needs post-processing | Copy-paste ready |

## Real-World Comparison

We tested both models on 10 coding problems. **This model won 9/10.**

### Example: Rate Limiter

**Prompt:** "Design a rate limiter"

**Base Model Output:**
```
Designing a rate limiter is crucial for maintaining performance...

### 1. Define Requirements
- **Rate**: Maximum requests allowed per unit time
- **Time Window**: Duration over which rate limit applies
- **Concurrency**: How many clients can be processed...
```
❌ Theory only, no code

**This Model Output:**
```python
import time

class RateLimiter:
    def __init__(self, max_requests, period):
        self.max_requests = max_requests
        self.period = period
        self.requests = []

    def is_allowed(self):
        current_time = int(time.time())
        self.requests = [r for r in self.requests if current_time - r < self.period]
        if len(self.requests) >= self.max_requests:
            return False
        self.requests.append(current_time)
        return True
```
✅ Working implementation
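
As a quick sanity check (our illustration, not part of the model's output), the generated class can be exercised like this:

```python
# Allow at most 3 requests per rolling 10-second window.
limiter = RateLimiter(max_requests=3, period=10)

for i in range(5):
    print(f"request {i}: {'allowed' if limiter.is_allowed() else 'rejected'}")
# The first three requests are allowed; the remaining two are rejected
# until enough of the 10-second window has elapsed.
```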

---

### Example: Binary Search

**Base Model:** Long explanation, truncated code
```
Sure! Binary search is an efficient algorithm for finding an item
from a sorted list of items. It works by repeatedly dividing...

def binary_search(arr, target):
    """
    Perform binary search on a sorted array to find the index...

    Parameters:
    arr (list): The sorted list of elements.
    target (int/float): The value to search for...
```
❌ Verbose, truncated

**This Model:**
```python
def binary_search(arr, x):
    low = 0
    high = len(arr) - 1

    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == x:
            return mid
        elif arr[mid] < x:
            low = mid + 1
        else:
            high = mid - 1
    return -1
```
✅ Clean, complete

---

### Example: Binary Tree Serialization

**Base Model:** Only showed the TreeNode class definition, no actual serialization code

**This Model:**
```python
class Node:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

def serialize(root):
    if root is None:
        return '#'
    left = serialize(root.left)
    right = serialize(root.right)
    return str(root.val) + ',' + left + ',' + right

def deserialize(data):
    data_list = data.split(',')
    index = [0]

    def helper():
        if data_list[index[0]] == '#':
            index[0] += 1
            return None
        node = Node(int(data_list[index[0]]))
        index[0] += 1
        node.left = helper()
        node.right = helper()
        return node

    return helper()
```
✅ Complete serialize AND deserialize
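
Again for illustration (not model output), the generated pair round-trips correctly:

```python
# Build a small tree:   1
#                      / \
#                     2   3
root = Node(1)
root.left = Node(2)
root.right = Node(3)

encoded = serialize(root)
print(encoded)                                # 1,2,#,#,3,#,#
restored = deserialize(encoded)
print(restored.left.val, restored.right.val)  # 2 3
```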

---

## Test Results Summary

| Problem | Base | v2 | Winner |
|---------|------|-----|--------|
| LRU Cache | Truncated | Complete | ✅ v2 |
| Binary Search | Verbose, truncated | Clean, complete | ✅ v2 |
| Rate Limiter | Theory only | Working code | ✅ v2 |
| Merge Sort | Truncated | More complete | ✅ v2 |
| Trie | Truncated at insert | Insert + search | ✅ v2 |
| Thread-safe Singleton | Complete | Complete | Tie |
| Dijkstra | Truncated | More complete | ✅ v2 |
| Retry Decorator | Verbose docstrings | Concise, working | ✅ v2 |
| Connection Pool | Truncated | Get + release | ✅ v2 |
| Binary Tree Serialize | TreeNode only | Full implementation | ✅ v2 |

**Score: 9/10 wins**

---

## Training Details

| Parameter | Value |
|-----------|-------|
| Base Model | Qwen2.5-Coder-7B-Instruct |
| Dataset | [glaive-code-assistant-v2](https://huggingface.co/datasets/glaiveai/glaive-code-assistant-v2) |
| Samples | 50,000 |
| Epochs | 2 |
| Method | LoRA (r=16, alpha=32) |
| Batch Size | 16 |
| Learning Rate | 2e-4 |
| Hardware | NVIDIA H200 |
| Training Time | ~4 hours |
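
For anyone reproducing a similar run, here is a minimal sketch of a comparable `peft` LoRA configuration. The r and alpha values come from the table above; the target modules and dropout are our assumptions, since the card does not specify them:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# r=16 and lora_alpha=32 match the training table above;
# target_modules and lora_dropout are assumed, not stated in the card.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters train
```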

---

## Benchmark Comparison

General benchmarks show a slight decrease (expected when specializing for code):

| Benchmark | Base | v2 | Delta |
|-----------|------|-----|-------|
| MMLU | 64.6% | 62.8% | -1.8% |
| HellaSwag | 74.6% | 72.8% | -1.8% |
| Winogrande | 70.2% | 67.5% | -2.7% |
| ARC-Challenge | 48.5% | 48.4% | -0.1% |

**Trade-off:** small general-knowledge drop → much better code output quality.

For a **code-focused model**, this is the right trade-off.

---

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "researchaudio/qwen2.5-coder-7b-researchaudio-v2",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    "researchaudio/qwen2.5-coder-7b-researchaudio-v2",
    trust_remote_code=True
)

prompt = "Implement a thread-safe queue in Python"
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=500, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
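
Note that `outputs[0]` contains the prompt tokens followed by the completion, so the decode above echoes the chat template back. Continuing from the snippet above, this prints only the model's answer:

```python
# Drop the prompt tokens so only the newly generated completion is decoded.
prompt_len = inputs["input_ids"].shape[1]
completion = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print(completion)
```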

---

## Best For

- ✅ Code generation APIs
- ✅ IDE extensions / autocomplete
- ✅ CI/CD automation
- ✅ System design implementations
- ✅ Prototyping
- ✅ Learning algorithms (clear, complete examples)

## Not Recommended For

- ❌ General knowledge Q&A
- ❌ Long explanations / tutorials
- ❌ Non-code tasks

---

## Version History

| Version | Base | Dataset | Focus |
|---------|------|---------|-------|
| v1 | Qwen2.5-Coder-7B | 500K mixed (Magicoder, Nemotron, etc.) | General code |
| **v2** | v1 | 50K Glaive | **Production-ready output** |

---

## Citation

```bibtex
@misc{qwen2.5-coder-researchaudio-v2,
  author = {ResearchAudio},
  title = {Qwen2.5-Coder-7B-ResearchAudio-v2: Production-Ready Code Generation},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/researchaudio/qwen2.5-coder-7b-researchaudio-v2}
}
```

---

## License

Apache 2.0

---

**Built by [ResearchAudio](https://researchaudio.io)**