---
language:
- en
license: apache-2.0
tags:
- text-generation
- technical-documentation
- readme
- qwen
- qlora
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
model-index:
- name: Tech-Scribe-v1
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: collected_data_external
      type: tech-docs
    metrics:
    - type: loss
      value: 1.1258
---

# Tech Scribe (Qwen 2.5 7B Fine-tune)

**Tech Scribe** is a specialized language model fine-tuned to generate high-quality, structured technical documentation (READMEs, model cards) from simple project descriptions. It is built on top of `Qwen/Qwen2.5-Coder-7B-Instruct` using QLoRA.

## Usage

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Configure 4-bit (NF4) quantized loading
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
)

# Load the base model
base_model_name = "Qwen/Qwen2.5-Coder-7B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the Tech Scribe LoRA adapter
adapter_name = "Darmm/tech-scribe-v1"  # Example path
model = PeftModel.from_pretrained(model, adapter_name)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Generate
project_idea = "A Python library for real-time sentiment analysis using websockets"
prompt = (
    "### Instruction:\n"
    f"Write a high-quality technical README or Model Card for the project \"{project_idea}\".\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True).split("### Response:")[1])
```
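The Alpaca-style prompt template above can be factored into small helpers so the formatting and answer extraction stay in sync. A minimal sketch (the function names `build_prompt` and `extract_response` are illustrative, not part of the released package):

```python
RESPONSE_MARKER = "### Response:\n"


def build_prompt(project_idea: str) -> str:
    """Format a project description into the instruction template used above."""
    return (
        "### Instruction:\n"
        f"Write a high-quality technical README or Model Card for the project \"{project_idea}\".\n\n"
        + RESPONSE_MARKER
    )


def extract_response(decoded: str) -> str:
    """Return only the model's answer, dropping the echoed prompt."""
    # partition() is safe even if the marker is missing: fall back to the full text
    _, marker, answer = decoded.partition(RESPONSE_MARKER)
    return answer.strip() if marker else decoded.strip()
```

Keeping the marker in one constant avoids the brittle hard-coded `.split("### Response:")[1]`, which raises `IndexError` if the marker is ever absent from the decoded output.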

## Model Description

- **Developed by:** Darmm Lab
- **Base Model:** `Qwen/Qwen2.5-Coder-7B-Instruct`
- **Fine-tuning Method:** QLoRA (4-bit quantization with LoRA adapters)
- **Task:** Technical Documentation Generation
- **Language:** English

## Training (summary)

The model was fine-tuned on a curated dataset of high-quality READMEs from top open-source repositories (e.g., PyTorch, FastAPI, React, Hugging Face Transformers).

- **Epochs:** 1 (prototype run)
- **Batch size:** 1 (gradient accumulation: 8)
- **Learning rate:** 2e-4
- **Optimizer:** AdamW
- **Hardware:** NVIDIA A100 80GB
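The hyperparameters above map onto a standard `peft`/`transformers` QLoRA setup. A minimal configuration sketch, assuming typical QLoRA defaults for the parts the card does not state (LoRA rank, alpha, dropout, and target modules are assumptions):

```python
from peft import LoraConfig
from transformers import TrainingArguments

# Assumed LoRA settings -- common QLoRA defaults, not confirmed by this card
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Values stated in the card: batch size 1, grad accumulation 8, lr 2e-4, 1 epoch
training_args = TrainingArguments(
    output_dir="tech-scribe-v1",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # effective batch size = 1 * 8 = 8
    learning_rate=2e-4,
    num_train_epochs=1,
    optim="adamw_torch",
    fp16=True,
)
```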

## Metrics

```json
{
  "eval_loss": 1.1258,
  "train_loss": 1.2937,
  "epoch": 0.73
}
```
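Since the loss is a per-token cross-entropy, the eval loss corresponds to a token-level perplexity of `exp(loss)`:

```python
import math

eval_loss = 1.1258  # from the metrics above
perplexity = math.exp(eval_loss)
print(f"eval perplexity = {perplexity:.2f}")  # about 3.08
```

In other words, on the evaluation set the model is, on average, about as uncertain as a uniform choice among roughly three tokens.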

## Intended Use

- Rapidly generating boilerplate documentation for new software projects.
- Converting rough notes into structured Markdown documentation.
- Learning best practices for technical writing structure.

## Limitations

- **Prototype Status:** This model was trained on a small subset of data for demonstration purposes.
- **Hallucination:** Like all LLMs, it may generate plausible-sounding but incorrect installation instructions or API calls. Always verify the generated code.

## Citation

```bibtex
@misc{techscribe2026,
  author       = {Darmm Lab},
  title        = {Tech Scribe: Automated Technical Documentation Generator},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Darmm/tech-scribe-v1}}
}
```