MainStack
/

marvy-1-14B-lora

Text Generation

Model card Files Files and versions

marvy-1-14B-lora / README.md

tgetsov's picture

Upload README.md with huggingface_hub

740fa44 verified 1 day ago

|

history blame contribute delete

3.26 kB

	---
	license: apache-2.0
	base_model: Qwen/Qwen2.5-14B-Instruct
	base_model_relation: adapter
	library_name: peft
	pipeline_tag: text-generation
	language:
	- en
	tags:
	- servicenow
	- itsm
	- csdm
	- delivery
	- lora
	- adapter
	- qwen2.5
	- mlx
	---

	# marvy-1-14B-lora

	LoRA adapter for marvy-1-14B — the first open model for the full ServiceNow delivery lifecycle. Compose on top of Qwen2.5-14B-Instruct.

	This is the adapter-only release (~175 MB). Apply it on
	[`Qwen/Qwen2.5-14B-Instruct`](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
	to specialize the base for end-to-end ServiceNow delivery work. For ready-to-run
	weights see the merged model
	[`MainStack/marvy-1-14B`](https://huggingface.co/MainStack/marvy-1-14B) or the
	quantized [`MainStack/marvy-1-14B-GGUF`](https://huggingface.co/MainStack/marvy-1-14B-GGUF).

	> Released under Apache-2.0. Built with Qwen — see `NOTICE`.

	📖 Full usage (all runtimes + OpenCode wiring): [`USAGE.md`](./USAGE.md) ·
	Validate it works: [`VALIDATION.md`](./VALIDATION.md)

	## What it does

	Fine-tunes the base for business analysis, requirements, stakeholder mapping,
	systems inventory, Solution Design Documents, user stories with acceptance
	criteria, implementation planning, test-case generation, validation/critique,
	and end-to-end delivery chains (story → implementation → test).

	## Usage

	### MLX (Apple Silicon)

	```bash
	pip install mlx-lm
	python -m mlx_lm generate \
	--model Qwen/Qwen2.5-14B-Instruct \
	--adapter-path . \
	--system-prompt "You are a senior ServiceNow delivery consultant..." \
	--prompt "Write a user story with acceptance criteria for P1 SLA escalation." \
	--max-tokens 1024 --temp 0.4
	```

	### PEFT (Transformers)

	```python
	from peft import PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer

	base = "Qwen/Qwen2.5-14B-Instruct"
	tok = AutoTokenizer.from_pretrained(base)
	model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto", device_map="auto")
	model = PeftModel.from_pretrained(model, "MainStack/marvy-1-14B-lora")
	```

	> Note: the adapter was trained with MLX-LM. The MLX `adapter_config.json` /
	> `adapters.safetensors` are included. A PEFT-format conversion is provided for
	> Transformers users where available; otherwise prefer the MLX path or the
	> merged model.

	## Training summary

	\| Setting \| Value \|
	\|---\|---\|
	\| Method \| LoRA SFT (rank 32, scale 20, dropout 0.0) \|
	\| Target keys \| q/k/v/o_proj, gate/up/down_proj (top 16 layers) \|
	\| Max seq length \| 8,192 \|
	\| Effective batch \| 16 (batch 1 × grad-accum 16) \|
	\| Best checkpoint \| iter 150 (best validation loss) \|
	\| Framework \| MLX-LM 0.31.3 on Apple Silicon \|

	See the merged model card for full dataset, evaluation, and limitations.

	## License & attribution

	Dual-licensed: weights Apache-2.0, **MainStack contributions (cards, docs,
	benchmark) CC-BY-4.0 — see [`LICENSING.md`](./LICENSING.md). If you use
	marvy-1-14B as a baseline, fine-tune it, distill from it, or evaluate against
	it, please credit MainStack** and link to
	https://huggingface.co/MainStack/marvy-1-14B. Keep the `NOTICE` file intact
	(required by Apache-2.0 §4) and cite the entry on the
	[merged model card](https://huggingface.co/MainStack/marvy-1-14B#citation).