---
license: apache-2.0
base_model: Qwen/Qwen2.5-14B-Instruct
base_model_relation: adapter
library_name: peft
pipeline_tag: text-generation
language:
  - en
tags:
  - servicenow
  - itsm
  - csdm
  - delivery
  - lora
  - adapter
  - qwen2.5
  - mlx
---

# marvy-1-14B-lora

**LoRA adapter for marvy-1-14B — the first open model for the full ServiceNow delivery lifecycle. Compose on top of Qwen2.5-14B-Instruct.**

This is the **adapter-only** release (~175 MB). Apply it on
[`Qwen/Qwen2.5-14B-Instruct`](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
to specialize the base for end-to-end ServiceNow delivery work. For ready-to-run
weights see the merged model
[`MainStack/marvy-1-14B`](https://huggingface.co/MainStack/marvy-1-14B) or the
quantized [`MainStack/marvy-1-14B-GGUF`](https://huggingface.co/MainStack/marvy-1-14B-GGUF).

> Released under **Apache-2.0**. Built with Qwen — see `NOTICE`.

📖 **Full usage** (all runtimes + OpenCode wiring): [`USAGE.md`](./USAGE.md) ·
**Validate it works:** [`VALIDATION.md`](./VALIDATION.md)

## What it does

Fine-tunes the base for business analysis, requirements, stakeholder mapping,
systems inventory, Solution Design Documents, user stories with acceptance
criteria, implementation planning, test-case generation, validation/critique,
and end-to-end delivery chains (story → implementation → test).

## Usage

### MLX (Apple Silicon)

```bash
pip install mlx-lm
python -m mlx_lm generate \
  --model Qwen/Qwen2.5-14B-Instruct \
  --adapter-path .  \
  --system-prompt "You are a senior ServiceNow delivery consultant..." \
  --prompt "Write a user story with acceptance criteria for P1 SLA escalation." \
  --max-tokens 1024 --temp 0.4
```

### PEFT (Transformers)

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "Qwen/Qwen2.5-14B-Instruct"
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, "MainStack/marvy-1-14B-lora")
```

> Note: the adapter was trained with MLX-LM. The MLX `adapter_config.json` /
> `adapters.safetensors` are included. A PEFT-format conversion is provided for
> Transformers users where available; otherwise prefer the MLX path or the
> merged model.

## Training summary

| Setting | Value |
|---|---|
| Method | LoRA SFT (rank 32, scale 20, dropout 0.0) |
| Target keys | q/k/v/o_proj, gate/up/down_proj (top 16 layers) |
| Max seq length | 8,192 |
| Effective batch | 16 (batch 1 × grad-accum 16) |
| Best checkpoint | iter 150 (best validation loss) |
| Framework | MLX-LM 0.31.3 on Apple Silicon |

See the merged model card for full dataset, evaluation, and limitations.

## License & attribution

Dual-licensed: **weights Apache-2.0**, **MainStack contributions (cards, docs,
benchmark) CC-BY-4.0** — see [`LICENSING.md`](./LICENSING.md). **If you use
marvy-1-14B as a baseline, fine-tune it, distill from it, or evaluate against
it, please credit MainStack** and link to
https://huggingface.co/MainStack/marvy-1-14B. Keep the `NOTICE` file intact
(required by Apache-2.0 §4) and cite the entry on the
[merged model card](https://huggingface.co/MainStack/marvy-1-14B#citation).