walidsobhie-code and Claude Opus 4.6 committed
Commit 97fa10c · 1 Parent(s): 729d832

Fix Gradio/huggingface_hub version compatibility

- Add gradio>=4.12.0 and huggingface_hub>=0.20.0 to requirements
- Update Dockerfile to upgrade both packages before install
- Add MODEL_CARD.md and merge_adapter.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Files changed (6)
  1. Dockerfile +1 -0
  2. MODEL_CARD.md +88 -0
  3. merge_adapter.py +33 -0
  4. requirements.txt +4 -0
  5. src/cli/main.py +2 -2
  6. test_model.py +1 -1
Dockerfile CHANGED
@@ -3,6 +3,7 @@ FROM python:3.10-slim
  WORKDIR /app

  COPY requirements.txt .
+ RUN pip install --no-cache-dir --upgrade gradio huggingface_hub
  RUN pip install --no-cache-dir -r requirements.txt

  COPY . .
MODEL_CARD.md ADDED
@@ -0,0 +1,88 @@
+ ---
+ language:
+ - en
+ - code
+ library_name: transformers
+ license: apache-2.0
+ tags:
+ - code generation
+ - python
+ - qwen
+ - fine-tuned
+ - stack-overflow
+ - coding-assistant
+ ---
+
+ # Stack 2.9 Fine-tuned
+
+ A fine-tuned version of [Qwen2.5-Coder-1.5B](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B) trained on Stack Overflow data.
+
+ ## Model Details
+
+ - **Base Model:** Qwen/Qwen2.5-Coder-1.5B
+ - **Architecture:** Transformer decoder with grouped query attention
+ - **Parameters:** 1.5B
+ - **Context Length:** 8,192 tokens
+ - **Precision:** FP16
+ - **Trained on:** Stack Overflow Q&A data
+
+ ## Capabilities
+
+ ✅ **Code Generation** - Write Python, SQL, JavaScript, and more
+ ✅ **Code Completion** - Complete functions and snippets
+ ✅ **Programming Help** - Debug, explain, and refactor code
+ ✅ **Natural Language** - Answer questions and chat
+
+ ## Usage
+
+ ### Python (Transformers)
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model = AutoModelForCausalLM.from_pretrained("my-ai-stack/stack-2-9-finetuned")
+ tokenizer = AutoTokenizer.from_pretrained("my-ai-stack/stack-2-9-finetuned")
+
+ prompt = "def quick_sort(arr):"
+ inputs = tokenizer(prompt, return_tensors="pt")
+ outputs = model.generate(**inputs, max_new_tokens=100)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
+ ### Interactive Chat
+
+ ```python
+ # See chat.py in the repo for an interactive CLI
+ ```
+
+ ## Training Details
+
+ - **Method:** LoRA fine-tuning
+ - **Rank:** 8
+ - **Epochs:** ~0.8
+ - **Final Loss:** 0.0205
+ - **Data:** Stack Overflow code Q&A
+
+ ## Limitations
+
+ ⚠️ **Training Contamination** - May occasionally repeat training examples
+ ⚠️ **Small Model** - 1.5B params; larger models (7B, 32B) perform better
+ ⚠️ **Single Language** - Primarily trained on Python-heavy data
+ ⚠️ **No Tool Use** - This is a base model, not an agent
+
+ ## Citation
+
+ ```bibtex
+ @misc{my-ai-stack/stack-2-9-finetuned,
+ author = {Walid Sobhi},
+ title = {Stack 2.9 Fine-tuned on Stack Overflow},
+ year = {2026},
+ publisher = {HuggingFace},
+ url = {https://huggingface.co/my-ai-stack/stack-2-9-finetuned}
+ }
+ ```
+
+ ## Contact
+
+ - **GitHub:** [my-ai-stack/stack-2.9](https://github.com/my-ai-stack/stack-2.9)
+ - **Author:** Walid Sobhi
merge_adapter.py ADDED
@@ -0,0 +1,33 @@
+ from peft import PeftModel
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+ import argparse
+
+ def main():
+     parser = argparse.ArgumentParser()
+     parser.add_argument("--base-model", required=True)
+     parser.add_argument("--adapter-path", required=True)
+     parser.add_argument("--output-path", required=True)
+     args = parser.parse_args()
+
+     print(f"Loading base model: {args.base_model}")
+     # Load without device_map to avoid the error
+     model = AutoModelForCausalLM.from_pretrained(
+         args.base_model,
+         torch_dtype=torch.float32  # Use FP32 for stability
+     )
+     tokenizer = AutoTokenizer.from_pretrained(args.base_model)
+
+     print(f"Loading adapter: {args.adapter_path}")
+     model = PeftModel.from_pretrained(model, args.adapter_path)
+
+     print("Merging...")
+     model = model.merge_and_unload()
+
+     print(f"Saving to: {args.output_path}")
+     model.save_pretrained(args.output_path)
+     tokenizer.save_pretrained(args.output_path)
+     print("✅ Done!")
+
+ if __name__ == "__main__":
+     main()
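Note on merge_adapter.py: the script relies on `merge_and_unload()` to fold the LoRA adapter into the base weights. As a rough illustration of what that means numerically, the sketch below applies the standard LoRA update, W + (alpha / r) * (B @ A), to a toy 2x2 matrix in plain Python. It assumes plain LoRA scaling and is not the PEFT implementation; the function names, matrices, and the `alpha` value are made up for the example (the model card states only rank 8).

```python
# Toy illustration of a LoRA merge: W_merged = W + (alpha / r) * (B @ A).
# All names and values here are illustrative, not taken from the repository.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Fold the low-rank update into the base weight and return a new matrix."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # shape: (out_features, in_features)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

# 2x2 base weight with a rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]        # (rank, in_features)
B = [[0.5], [0.25]]     # (out_features, rank)
merged = merge_lora(W, A, B, alpha=2, rank=1)
print(merged)
```

After merging, the adapter matrices are no longer needed at inference time, which is why the script can save a single standalone checkpoint.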
requirements.txt CHANGED
@@ -3,6 +3,10 @@
  # Core
  stack-cli>=2.9.0

+ # Gradio UI
+ gradio>=4.12.0
+ huggingface_hub>=0.20.0
+
  # Training & ML
  torch>=2.0.0
  transformers>=4.35.0
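Note on requirements.txt: one way to confirm that an environment actually satisfies the new minimums is to query the installed versions at startup. The sketch below uses only the standard library's `importlib.metadata`. The helper names are hypothetical, and the version parser is deliberately naive: it compares numeric components only and ignores pre-release ordering, so it is a sanity check rather than a replacement for pip's resolver.

```python
from importlib import metadata

# Minimums copied from the requirements.txt hunk above.
MINIMUMS = {"gradio": "4.12.0", "huggingface_hub": "0.20.0"}

def parse_version(text):
    """Turn '4.12.0' into (4, 12, 0); stops at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def meets_minimum(installed, minimum):
    """Compare numeric components, padding the shorter version with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b

def check_installed(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, ok)} for each listed package."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, meets_minimum(installed, minimum))
    return report

if __name__ == "__main__":
    for name, (installed, ok) in check_installed().items():
        print(f"{name}: installed={installed} ok={ok}")
```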
src/cli/main.py CHANGED
@@ -12,8 +12,8 @@ from typing import Optional

  # Add parent directories to path
  sys.path.insert(0, str(Path(__file__).parent))
- sys.path.insert(0, str(Path(__file__).parent.parent / "stack-2.9-eval"))
- sys.path.insert(0, str(Path(__file__).parent.parent / "stack-2.9-training"))
+ sys.path.insert(0, str(Path(__file__).parent.parent / "stack" / "eval"))
+ sys.path.insert(0, str(Path(__file__).parent.parent / "stack" / "training"))

  from model_client import create_model_client, ChatMessage
  from pattern_miner import PatternMiner
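Note on src/cli/main.py: `sys.path.insert` accepts paths that do not exist, so a renamed directory only surfaces later as an `ImportError`. A possible guard, sketched below, adds only directories that are present and reports the rest. The helper `add_existing_dirs` is hypothetical and not part of the repository; the demo builds a throwaway tree that mimics the new `stack/eval` layout.

```python
import sys
import tempfile
from pathlib import Path

def add_existing_dirs(base, relative_dirs):
    """Prepend each existing directory under `base` to sys.path; return the missing ones."""
    missing = []
    for rel in relative_dirs:
        candidate = Path(base) / rel
        if candidate.is_dir():
            path_str = str(candidate)
            if path_str not in sys.path:
                sys.path.insert(0, path_str)
        else:
            missing.append(candidate)
    return missing

# Demo: only stack/eval exists, so stack/training is reported as missing.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "stack" / "eval").mkdir(parents=True)
    missing = add_existing_dirs(
        tmp, [Path("stack") / "eval", Path("stack") / "training"]
    )
    missing_names = [p.name for p in missing]
    print(missing_names)
```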
test_model.py CHANGED
@@ -410,7 +410,7 @@ def run_test(model, tokenizer, test_config: Dict) -> Dict:
  start_time = time.time()

  # Generate completion
- completions = generate_completion(model, tokenizer, prompt, max_tokens=max_tokens)
+ completions = generate_completion(model, tokenizer, prompt, max_new_tokens=max_tokens)
  elapsed = time.time() - start_time

  # Extract and check code
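Note on test_model.py: the diff does not show the definition of `generate_completion`, but the rename suggests it declares a `max_new_tokens` parameter (matching the `max_new_tokens` argument of `model.generate` used in the model card). Under that assumption, the old call would fail with a `TypeError`. The stub below is a hypothetical stand-in used only to demonstrate the failure mode.

```python
# Hypothetical stand-in for the repository's generate_completion helper;
# the real signature is not shown in this diff.
def generate_completion(model, tokenizer, prompt, max_new_tokens=128):
    return [f"{prompt} ... (up to {max_new_tokens} new tokens)"]

def call_with_old_keyword():
    """Call the helper the way the removed line did and report the outcome."""
    try:
        generate_completion(None, None, "def add(a, b):", max_tokens=64)
    except TypeError as exc:
        return f"TypeError: {exc}"
    return "no error"

old_result = call_with_old_keyword()
new_result = generate_completion(None, None, "def add(a, b):", max_new_tokens=64)
print(old_result)
print(new_result[0])
```

If the real helper instead forwards arbitrary keyword arguments to `model.generate`, the failure would surface differently, so this is best read as one plausible explanation for the fix.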