Commit 31c1f72 (verified), committed by Maelstrome · 1 Parent(s): 3a226e2

Upload report/README.md with huggingface_hub

Files changed (1)
  1. report/README.md +12 -8
report/README.md CHANGED
@@ -14,9 +14,11 @@ datasets:
  - Maelstrome/lora-wave-session-dataset
  ---

- # lora-wave-session-r32 - training report
+ # `report/` - Training & evaluation write-up

- Documentation-only repo. Contains the full training/eval write-up for the **rank-32 / 1-epoch A100** WAVE fine-tune of Gemma 4 E2B Instruct, plus the head-to-head comparison against its rank-16 sibling.
+ The full training/eval documentation for the **rank-32 / 1-epoch A100** WAVE fine-tune of Gemma 4 E2B Instruct, plus the head-to-head comparison against its rank-16 sibling.
+
+ Originally a standalone repo (`Maelstrome/lora-wave-session-r32-report`); now lives as a subdirectory of the consolidated [`Maelstrome/lora-wave-session-r32`](https://huggingface.co/Maelstrome/lora-wave-session-r32) repo alongside the adapter weights, GGUF, and MediaPipe artifacts.

  ## Documents

@@ -26,13 +28,15 @@ Documentation-only repo. Contains the full training/eval write-up for the **rank
  | [`COMPARISON.md`](./COMPARISON.md) | Head-to-head vs the rank-16 / 3-epoch sibling run (`lora-wave-session`). Same dataset, same seed, same test split. r32 wins on every probability metric. |
  | [`MORNING_REPORT.md`](./MORNING_REPORT.md) | First-pass overnight summary written immediately after training completed. Preserved for history; superseded by `REPORT.md`. |

- ## Linked artifacts
+ ## Sibling artifacts in this repo
+
+ - 🦥 **PEFT adapter** - at the [repo root](../) (~194 MB). Pairs with `unsloth/gemma-4-E2B-it`.
+ - 🧊 **GGUF Q4_K_M (5-shard split)** - [`gguf/`](../gguf) (~3.2 GB). For llama.cpp / Ollama / [wllama](https://github.com/ngxson/wllama) browser runtime.
+ - 📱 **MediaPipe LiteRT bundle** - [`mediapipe/`](../mediapipe) (~4.95 GB). For MediaPipe LLM Inference (Android / iOS / web).
+ - 📚 **Dataset:** [`Maelstrome/lora-wave-session-dataset`](https://huggingface.co/datasets/Maelstrome/lora-wave-session-dataset) - 4,277 examples, frozen splits (seed `7`).
+ - 🌊 **Sibling rank-16 run:** [`Maelstrome/lora-wave-session`](https://huggingface.co/Maelstrome/lora-wave-session) - same dataset, different recipe (rank-16 / 3-epoch RTX 5080). Has the same subdir layout.

- - 🦥 **Adapter:** [`Maelstrome/lora-wave-session-r32`](https://huggingface.co/Maelstrome/lora-wave-session-r32) - PEFT adapter (~194 MB)
- - 🛻 **Merged bf16:** [`Maelstrome/lora-wave-session-r32-merged`](https://huggingface.co/Maelstrome/lora-wave-session-r32-merged) - drop-in for `transformers`/vLLM (~10 GB)
- - 🧊 **GGUF Q4_K_M:** [`Maelstrome/lora-wave-session-r32-gguf`](https://huggingface.co/Maelstrome/lora-wave-session-r32-gguf) - llama.cpp / Ollama / wllama (~4 GB)
- - 📚 **Dataset:** [`Maelstrome/lora-wave-session-dataset`](https://huggingface.co/datasets/Maelstrome/lora-wave-session-dataset) - 4,277 examples, frozen splits (seed `7`)
- - 🌊 **Sibling run (rank-16):** [`Maelstrome/lora-wave-session`](https://huggingface.co/Maelstrome/lora-wave-session) - same dataset, different recipe
+ > The `-merged` and `-gguf` sibling repos referenced in older versions of this doc were consolidated and deleted. The current `gguf/` subdir is a fresh build from a PEFT re-merge (the original unsloth-merged base produced corrupt all-`<pad>` output and is no longer published).

  ## Headline numbers
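
Reviewer note: the report now shares a repo with multi-gigabyte `gguf/` and `mediapipe/` artifacts, so readers may want to fetch only the `report/` documents. The sketch below is one possible way to do that with `huggingface_hub.snapshot_download` and its `allow_patterns` filter. The helper names `subdir_patterns` and `fetch_subdir` are hypothetical and are not part of this commit, and the sketch assumes the `report/` layout described in the README is what is actually published.

```python
from typing import List


def subdir_patterns(subdir: str) -> List[str]:
    """Build glob patterns that select only files under one repo subdirectory.

    Hypothetical helper for illustration; not part of the repository.
    """
    clean = subdir.strip("/")  # tolerate "report", "report/", "/report"
    return [f"{clean}/*"]


def fetch_subdir(repo_id: str, subdir: str) -> str:
    """Download only `subdir` from a Hub repo and return the local snapshot path.

    Assumes `huggingface_hub` is installed and the repo is reachable.
    """
    # Imported lazily so subdir_patterns() works without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=subdir_patterns(subdir),  # skip weights, GGUF, MediaPipe files
    )


if __name__ == "__main__":
    # Repo id taken from the README text in this diff.
    path = fetch_subdir("Maelstrome/lora-wave-session-r32", "report")
    print(path)
```

Filtering by pattern keeps the download limited to the Markdown write-ups instead of pulling the adapter and quantized model files.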