tgetsov commited on
Commit
8cbeed7
·
verified ·
1 Parent(s): 2b2d10e

Upload README.md with huggingface_hub

Browse files
Files changed (1) hide show
  1. README.md +83 -0
README.md ADDED
@@ -0,0 +1,83 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ base_model: Qwen/Qwen2.5-14B-Instruct
4
+ base_model_relation: adapter
5
+ library_name: peft
6
+ pipeline_tag: text-generation
7
+ language:
8
+ - en
9
+ tags:
10
+ - servicenow
11
+ - itsm
12
+ - csdm
13
+ - delivery
14
+ - lora
15
+ - adapter
16
+ - qwen2.5
17
+ - mlx
18
+ ---
19
+
20
+ # marvy-14B-lora
21
+
22
+ LoRA adapter for [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct),
23
+ specializing it for the **full ServiceNow delivery lifecycle**.
24
+
25
+ This is the **adapter-only** release (~175 MB). For ready-to-run weights see the
26
+ merged model [`MainStack/marvy-14B`](https://huggingface.co/MainStack/marvy-14B)
27
+ or the quantized [`MainStack/marvy-14B-GGUF`](https://huggingface.co/MainStack/marvy-14B-GGUF).
28
+
29
+ > Released under **Apache-2.0**. Built with Qwen — see `NOTICE`.
30
+
31
+ 📖 **Full usage** (all runtimes + OpenCode wiring): [`USAGE.md`](./USAGE.md) ·
32
+ **Validate it works:** [`VALIDATION.md`](./VALIDATION.md)
33
+
34
+ ## What it does
35
+
36
+ Fine-tunes the base for business analysis, requirements, stakeholder mapping,
37
+ systems inventory, Solution Design Documents, user stories with acceptance
38
+ criteria, implementation planning, test-case generation, validation/critique,
39
+ and end-to-end delivery chains (story → implementation → test).
40
+
41
+ ## Usage
42
+
43
+ ### MLX (Apple Silicon)
44
+
45
+ ```bash
46
+ pip install mlx-lm
47
+ python -m mlx_lm generate \
48
+ --model Qwen/Qwen2.5-14B-Instruct \
49
+ --adapter-path . \
50
+ --system-prompt "You are a senior ServiceNow delivery consultant..." \
51
+ --prompt "Write a user story with acceptance criteria for P1 SLA escalation." \
52
+ --max-tokens 1024 --temp 0.4
53
+ ```
54
+
55
+ ### PEFT (Transformers)
56
+
57
+ ```python
58
+ from peft import PeftModel
59
+ from transformers import AutoModelForCausalLM, AutoTokenizer
60
+
61
+ base = "Qwen/Qwen2.5-14B-Instruct"
62
+ tok = AutoTokenizer.from_pretrained(base)
63
+ model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto", device_map="auto")
64
+ model = PeftModel.from_pretrained(model, "MainStack/marvy-14B-lora")
65
+ ```
66
+
67
+ > Note: the adapter was trained with MLX-LM. The MLX `adapter_config.json` /
68
+ > `adapters.safetensors` are included. A PEFT-format conversion is provided for
69
+ > Transformers users where available; otherwise prefer the MLX path or the
70
+ > merged model.
71
+
72
+ ## Training summary
73
+
74
+ | Setting | Value |
75
+ |---|---|
76
+ | Method | LoRA SFT (rank 32, scale 20, dropout 0.0) |
77
+ | Target keys | q/k/v/o_proj, gate/up/down_proj (top 16 layers) |
78
+ | Max seq length | 8,192 |
79
+ | Effective batch | 16 (batch 1 × grad-accum 16) |
80
+ | Best checkpoint | iter 150 (best validation loss) |
81
+ | Framework | MLX-LM 0.31.3 on Apple Silicon |
82
+
83
+ See the merged model card for full dataset, evaluation, and limitations.