MainStack
/

marvy-1-14B-lora

+---
+license: apache-2.0
+base_model: Qwen/Qwen2.5-14B-Instruct
+base_model_relation: adapter
+library_name: peft
+pipeline_tag: text-generation
+language:
+  - en
+tags:
+  - servicenow
+  - itsm
+  - csdm
+  - delivery
+  - lora
+  - adapter
+  - qwen2.5
+  - mlx
+---
+# marvy-14B-lora
+LoRA adapter for [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct),
+specializing it for the **full ServiceNow delivery lifecycle**.
+This is the **adapter-only** release (~175 MB). For ready-to-run weights see the
+merged model [`MainStack/marvy-14B`](https://huggingface.co/MainStack/marvy-14B)
+or the quantized [`MainStack/marvy-14B-GGUF`](https://huggingface.co/MainStack/marvy-14B-GGUF).
+> Released under **Apache-2.0**. Built with Qwen — see `NOTICE`.
+📖 **Full usage** (all runtimes + OpenCode wiring): [`USAGE.md`](./USAGE.md) ·
+**Validate it works:** [`VALIDATION.md`](./VALIDATION.md)
+## What it does
+Fine-tunes the base for business analysis, requirements, stakeholder mapping,
+systems inventory, Solution Design Documents, user stories with acceptance
+criteria, implementation planning, test-case generation, validation/critique,
+and end-to-end delivery chains (story → implementation → test).
+## Usage
+### MLX (Apple Silicon)
+```bash
+pip install mlx-lm
+python -m mlx_lm generate \
+  --model Qwen/Qwen2.5-14B-Instruct \
+  --adapter-path .  \
+  --system-prompt "You are a senior ServiceNow delivery consultant..." \
+  --prompt "Write a user story with acceptance criteria for P1 SLA escalation." \
+  --max-tokens 1024 --temp 0.4
+```
+### PEFT (Transformers)
+```python
+from peft import PeftModel
+from transformers import AutoModelForCausalLM, AutoTokenizer
+base = "Qwen/Qwen2.5-14B-Instruct"
+tok = AutoTokenizer.from_pretrained(base)
+model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto", device_map="auto")
+model = PeftModel.from_pretrained(model, "MainStack/marvy-14B-lora")
+```
+> Note: the adapter was trained with MLX-LM. The MLX `adapter_config.json` /
+> `adapters.safetensors` are included. A PEFT-format conversion is provided for
+> Transformers users where available; otherwise prefer the MLX path or the
+> merged model.
+## Training summary
+| Setting | Value |
+|---|---|
+| Method | LoRA SFT (rank 32, scale 20, dropout 0.0) |
+| Target keys | q/k/v/o_proj, gate/up/down_proj (top 16 layers) |
+| Max seq length | 8,192 |
+| Effective batch | 16 (batch 1 × grad-accum 16) |
+| Best checkpoint | iter 150 (best validation loss) |
+| Framework | MLX-LM 0.31.3 on Apple Silicon |
+See the merged model card for full dataset, evaluation, and limitations.