marvy-1-14B-lora / README.md
tgetsov's picture
Upload README.md with huggingface_hub
740fa44 verified
---
license: apache-2.0
base_model: Qwen/Qwen2.5-14B-Instruct
base_model_relation: adapter
library_name: peft
pipeline_tag: text-generation
language:
- en
tags:
- servicenow
- itsm
- csdm
- delivery
- lora
- adapter
- qwen2.5
- mlx
---
# marvy-1-14B-lora
**LoRA adapter for marvy-1-14B β€” the first open model for the full ServiceNow delivery lifecycle. Compose on top of Qwen2.5-14B-Instruct.**
This is the **adapter-only** release (~175 MB). Apply it on
[`Qwen/Qwen2.5-14B-Instruct`](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
to specialize the base for end-to-end ServiceNow delivery work. For ready-to-run
weights see the merged model
[`MainStack/marvy-1-14B`](https://huggingface.co/MainStack/marvy-1-14B) or the
quantized [`MainStack/marvy-1-14B-GGUF`](https://huggingface.co/MainStack/marvy-1-14B-GGUF).
> Released under **Apache-2.0**. Built with Qwen β€” see `NOTICE`.
πŸ“– **Full usage** (all runtimes + OpenCode wiring): [`USAGE.md`](./USAGE.md) Β·
**Validate it works:** [`VALIDATION.md`](./VALIDATION.md)
## What it does
Fine-tunes the base for business analysis, requirements, stakeholder mapping,
systems inventory, Solution Design Documents, user stories with acceptance
criteria, implementation planning, test-case generation, validation/critique,
and end-to-end delivery chains (story β†’ implementation β†’ test).
## Usage
### MLX (Apple Silicon)
```bash
pip install mlx-lm
python -m mlx_lm generate \
--model Qwen/Qwen2.5-14B-Instruct \
--adapter-path . \
--system-prompt "You are a senior ServiceNow delivery consultant..." \
--prompt "Write a user story with acceptance criteria for P1 SLA escalation." \
--max-tokens 1024 --temp 0.4
```
### PEFT (Transformers)
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
base = "Qwen/Qwen2.5-14B-Instruct"
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, "MainStack/marvy-1-14B-lora")
```
> Note: the adapter was trained with MLX-LM. The MLX `adapter_config.json` /
> `adapters.safetensors` are included. A PEFT-format conversion is provided for
> Transformers users where available; otherwise prefer the MLX path or the
> merged model.
## Training summary
| Setting | Value |
|---|---|
| Method | LoRA SFT (rank 32, scale 20, dropout 0.0) |
| Target keys | q/k/v/o_proj, gate/up/down_proj (top 16 layers) |
| Max seq length | 8,192 |
| Effective batch | 16 (batch 1 Γ— grad-accum 16) |
| Best checkpoint | iter 150 (best validation loss) |
| Framework | MLX-LM 0.31.3 on Apple Silicon |
See the merged model card for full dataset, evaluation, and limitations.
## License & attribution
Dual-licensed: **weights Apache-2.0**, **MainStack contributions (cards, docs,
benchmark) CC-BY-4.0** β€” see [`LICENSING.md`](./LICENSING.md). **If you use
marvy-1-14B as a baseline, fine-tune it, distill from it, or evaluate against
it, please credit MainStack** and link to
https://huggingface.co/MainStack/marvy-1-14B. Keep the `NOTICE` file intact
(required by Apache-2.0 Β§4) and cite the entry on the
[merged model card](https://huggingface.co/MainStack/marvy-1-14B#citation).