tgetsov committed on
Commit 81b1a6d · verified · 1 Parent(s): f5a43f2

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +11 -11
README.md CHANGED
@@ -1,6 +1,6 @@
  ---
  license: apache-2.0
- base_model: MainStack/marvy-14B
  base_model_relation: quantized
  pipeline_tag: text-generation
  language:
@@ -17,11 +17,11 @@ tags:
  - qwen2.5
  ---

- # marvy-14B-GGUF

- **GGUF quants of marvy-14B, the first open LLM for the full ServiceNow delivery lifecycle. Run it locally and privately on Apple Silicon, LM Studio, or Ollama.**

- GGUF quantizations of [`MainStack/marvy-14B`](https://huggingface.co/MainStack/marvy-14B)
  for use with [llama.cpp](https://github.com/ggerganov/llama.cpp),
  [Ollama](https://ollama.com), [LM Studio](https://lmstudio.ai), and compatible runtimes.

@@ -31,30 +31,30 @@ for use with [llama.cpp](https://github.com/ggerganov/llama.cpp),

  | File | Quant | Size (approx) | Use when |
  |---|---|---|---|
- | `marvy-14B-Q4_K_M.gguf` | Q4_K_M | ~9 GB | Default — best size/quality balance, laptops |
- | `marvy-14B-Q8_0.gguf` | Q8_0 | ~16 GB | Highest fidelity, near-FP16 quality |

  ## Quick start

  ### Ollama

  ```bash
- ollama run hf.co/MainStack/marvy-14B-GGUF:Q4_K_M
  ```

  ### llama.cpp

  ```bash
- ./llama-cli -hf MainStack/marvy-14B-GGUF:Q4_K_M \
  -p "Write a ServiceNow user story with acceptance criteria for P1 SLA escalation." \
  --temp 0.4
  ```

  ### LM Studio

- 1. In the model browser, search `MainStack/marvy-14B-GGUF` and download a quant
  (`Q4_K_M` recommended), **or** drop the `.gguf` into
- `~/.lmstudio/models/MainStack/marvy-14B-GGUF/`.
  2. Load it, set the system prompt below, temperature ~0.4.
  3. To use from code/OpenCode, start the local server:
  ```bash
@@ -82,7 +82,7 @@ professional English.

  ## Provenance & limitations

- See the [merged model card](https://huggingface.co/MainStack/marvy-14B) for the
  full training data, anonymization methodology, evaluation (test ppl 13.107 on a
  project-disjoint split), and limitations. Quantization adds the usual minor
  quality reduction versus the FP16 model.
 
  ---
  license: apache-2.0
+ base_model: MainStack/marvy-1-14B
  base_model_relation: quantized
  pipeline_tag: text-generation
  language:
 
  - qwen2.5
  ---

+ # marvy-1-14B-GGUF

+ **GGUF quants of marvy-1-14B, the first open LLM for the full ServiceNow delivery lifecycle. Run it locally and privately on Apple Silicon, LM Studio, or Ollama.**

+ GGUF quantizations of [`MainStack/marvy-1-14B`](https://huggingface.co/MainStack/marvy-1-14B)
  for use with [llama.cpp](https://github.com/ggerganov/llama.cpp),
  [Ollama](https://ollama.com), [LM Studio](https://lmstudio.ai), and compatible runtimes.

 

  | File | Quant | Size (approx) | Use when |
  |---|---|---|---|
+ | `marvy-1-14B-Q4_K_M.gguf` | Q4_K_M | ~9 GB | Default — best size/quality balance, laptops |
+ | `marvy-1-14B-Q8_0.gguf` | Q8_0 | ~16 GB | Highest fidelity, near-FP16 quality |

  ## Quick start

  ### Ollama

  ```bash
+ ollama run hf.co/MainStack/marvy-1-14B-GGUF:Q4_K_M
  ```

  ### llama.cpp

  ```bash
+ ./llama-cli -hf MainStack/marvy-1-14B-GGUF:Q4_K_M \
  -p "Write a ServiceNow user story with acceptance criteria for P1 SLA escalation." \
  --temp 0.4
  ```

  ### LM Studio

+ 1. In the model browser, search `MainStack/marvy-1-14B-GGUF` and download a quant
  (`Q4_K_M` recommended), **or** drop the `.gguf` into
+ `~/.lmstudio/models/MainStack/marvy-1-14B-GGUF/`.
  2. Load it, set the system prompt below, temperature ~0.4.
  3. To use from code/OpenCode, start the local server:
  ```bash
 

  ## Provenance & limitations

+ See the [merged model card](https://huggingface.co/MainStack/marvy-1-14B) for the
  full training data, anonymization methodology, evaluation (test ppl 13.107 on a
  project-disjoint split), and limitations. Quantization adds the usual minor
  quality reduction versus the FP16 model.
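
Every changed line in this commit swaps `marvy-14B` for `marvy-1-14B`. A rename like this is easy to leave half-done, so a quick check for leftovers of the old name can be useful. The sketch below is only an illustration and is not part of the commit: the helper name `find_stale_references` is hypothetical, and the two sample strings are copied from the Ollama lines in the diff above.

```python
import re

# Hypothetical helper (not part of the commit): flags lines that still use
# the pre-rename model name. "marvy-1-14B" does not contain the substring
# "marvy-14B", so renamed lines are not reported.
OLD_NAME = re.compile(r"marvy-14B")


def find_stale_references(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that still mention the old name."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if OLD_NAME.search(line)
    ]


# Sample lines taken from the removed and added sides of the diff.
old_line = "ollama run hf.co/MainStack/marvy-14B-GGUF:Q4_K_M"
new_line = "ollama run hf.co/MainStack/marvy-1-14B-GGUF:Q4_K_M"

stale = find_stale_references(old_line + "\n" + new_line)
print(stale)  # [(1, 'ollama run hf.co/MainStack/marvy-14B-GGUF:Q4_K_M')]
```

To check the real file, pass the contents of the updated `README.md` to the helper. An empty list would suggest that no reference to the old name remains.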