Maelstrome committed on
Commit e4aa4d1 · verified · 1 Parent(s): ae87a6f

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -41,13 +41,13 @@ This repo is the single home for the r16 fine-tune. Everything lives here:
  | `tokenizer.json`, `tokenizer_config.json`, `chat_template.jinja`, `processor_config.json` (root) | Gemma 4 tokenizer + chat template | required for any inference path |
  | [`gguf/`](./tree/main/gguf) | Q4_K_M GGUF (~3.27 GB, single file) + Ollama Modelfile | llama.cpp / Ollama / LM Studio |
 
- > The `Maelstrome/lora-wave-session-gguf` sibling is **deprecated** — its contents were moved into this repo's `gguf/` subdirectory. The rank-32 variant has the same layout at [`Maelstrome/lora-wave-session-r32`](https://huggingface.co/Maelstrome/lora-wave-session-r32).
+ > The previously-published `Maelstrome/lora-wave-session-gguf` sibling has been **consolidated into this repo and deleted**. The rank-32 variant has the same layout at [`Maelstrome/lora-wave-session-r32`](https://huggingface.co/Maelstrome/lora-wave-session-r32). Any external link to the old sibling URL will 404.
  >
- > Note: the GGUF here is a **single 3.27 GB file**, not pre-split. It works directly with llama.cpp/Ollama but **will not load in [wllama](https://github.com/ngxson/wllama)** because it exceeds the 2 GB ArrayBuffer per-file limit. If you want to run this rank-16 build in the browser, split it first with `llama-gguf-split --split-max-size 512M`. The rank-32 sibling is pre-split in its `gguf/` subdir if you'd rather just use that.
+ > **Note on browser use:** the GGUF here is a **single 3.27 GB file**, not pre-split. It works directly with llama.cpp / Ollama / LM Studio but **will not load in [wllama](https://github.com/ngxson/wllama)** because it exceeds the 2 GB-per-file `ArrayBuffer` limit. To run this r16 build in-browser, either split it first with `llama-gguf-split --split-max-size 512M` or use the [r32 sibling](https://huggingface.co/Maelstrome/lora-wave-session-r32), which ships pre-split.
 
  ## Sibling runs
 
- This is the **rank-16 / 3-epoch RTX 5080** training of the WAVE corpus. The rank-32 / 1-epoch A100 sibling lives at [`Maelstrome/lora-wave-session-r32`](https://huggingface.co/Maelstrome/lora-wave-session-r32) and wins on every probability metric on the same frozen 428-row test split:
+ This is the **rank-16 / 3-epoch RTX 5080** training of the WAVE corpus. The rank-32 / 1-epoch A100 sibling lives at [`Maelstrome/lora-wave-session-r32`](https://huggingface.co/Maelstrome/lora-wave-session-r32) (same subdir layout: adapter at root, `gguf/` subdir; plus `mediapipe/` and `report/`). On the same frozen 428-row test split, r32 wins on every probability metric:
 
  | | **rank-16 (this run)** | rank-32 (sibling) |
  |---|---|---|
@@ -57,7 +57,7 @@ This is the **rank-16 / 3-epoch RTX 5080** training of the WAVE corpus. The rank
  | Mean NLL Δ vs base | **0.327 nats** | 0.508 nats |
  | Sign-test p-value | **9.5 × 10⁻⁷¹** | 2.9 × 10⁻¹²⁹ |
 
- See [`Maelstrome/lora-wave-session-r32-report`](https://huggingface.co/Maelstrome/lora-wave-session-r32-report) for the full head-to-head report (recipes, generation eval, reproducibility check).
+ Full head-to-head in [`Maelstrome/lora-wave-session-r32/report/`](https://huggingface.co/Maelstrome/lora-wave-session-r32/tree/main/report) (the comparison + run-report markdown documents).
 
  ## Provenance and intended use
 
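
The browser-use note in this change comes down to a size check followed by a split. A minimal Python sketch of that decision follows; it is an illustration, not part of the repo. The file names are hypothetical, the 2 GB limit is assumed here to mean 2 GiB, and the `llama-gguf-split` argument order (`--split`, then input file, then output prefix) reflects the tool's usual usage rather than anything stated in the README.

```python
import math

# Assumption: the README's "2 GB" per-file ArrayBuffer limit is read as 2 GiB.
PER_FILE_LIMIT_BYTES = 2 * 1024**3


def needs_split(file_size_bytes: int, limit_bytes: int = PER_FILE_LIMIT_BYTES) -> bool:
    """Return True if a single GGUF file is too large for the per-file limit."""
    return file_size_bytes > limit_bytes


def min_shard_count(file_size_bytes: int, shard_max_bytes: int) -> int:
    """Lower bound on the number of shards for a given maximum shard size.

    The real splitter cuts on tensor boundaries, so it may emit more shards.
    """
    return math.ceil(file_size_bytes / shard_max_bytes)


def split_command(src: str, out_prefix: str, max_size: str = "512M") -> list[str]:
    """Build (but do not run) a llama-gguf-split invocation as an argv list."""
    return ["llama-gguf-split", "--split", "--split-max-size", max_size, src, out_prefix]


if __name__ == "__main__":
    size = int(3.27e9)  # approximate size quoted in the README
    if needs_split(size):
        # Hypothetical file names, used only for illustration.
        print(" ".join(split_command("model-q4_k_m.gguf", "model-q4_k_m-split")))
        print("at least", min_shard_count(size, 512 * 1000**2), "shards")
```

Passing the resulting list to `subprocess.run(..., check=True)` would perform the split, provided `llama-gguf-split` from a llama.cpp build is on `PATH`.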
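
For readers who want to sanity-check the sign-test row, here is a hedged sketch of an exact two-sided sign test in Python. The README does not say how its p-values were computed, so the two-sided form, the dropping of ties, and the `wins`/`losses` inputs are all assumptions made for illustration.

```python
from fractions import Fraction
from math import comb


def sign_test_p_value(wins: int, losses: int) -> Fraction:
    """Exact two-sided sign test under H0: a win and a loss are equally likely.

    `wins` and `losses` are counts of paired rows where the fine-tune scored
    better or worse than the base model. Ties are assumed to be dropped
    before counting.
    """
    n = wins + losses
    k = max(wins, losses)
    # Upper-tail count of outcomes at least as extreme as the observed split.
    tail = sum(comb(n, i) for i in range(k, n + 1))
    # Double the tail for a two-sided test and cap at 1.
    return min(Fraction(2 * tail, 2**n), Fraction(1))


if __name__ == "__main__":
    # If all 428 test rows favoured the adapter, the p-value would be 2**-427.
    print(float(sign_test_p_value(428, 0)))
```

Under these assumptions, a clean sweep of the 428-row split gives 2⁻⁴²⁷, roughly 2.9 × 10⁻¹²⁹. That matches the rank-32 column, which suggests (but does not prove) that the reported figure is a two-sided sign test in which every row favoured the adapter.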