faxenoff committed on
Commit f900094 · verified · 1 Parent(s): 0807aba

card: accurate file list + tokenizer (SP) note

Files changed (1)
  1. README.md +7 -4
README.md CHANGED
@@ -68,8 +68,10 @@ This model trades long-context capability for raw throughput on short code units
  passage embeddings, unlike the teacher whose prefix is query-only). Mean-pool → **L2-normalize**.
  - For smaller indexes, truncate to **256** or **512** dims (MRL) before normalizing.
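
As a rough illustration of the pooling and truncation steps named above, the sketch below mean-pools token embeddings under an attention mask, keeps the first 256 dimensions (MRL), and then L2-normalizes. The function names `mean_pool` and `truncate_and_normalize`, the `[B, T, 768]` input shape, and the random stand-in data are assumptions made for illustration and are not part of the model card; an engine that already emits pooled `[B, 768]` vectors would skip the first step.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Masked mean over the sequence axis: [B, T, D] and [B, T] -> [B, D]."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Guard against all-padding rows so the division never hits zero.
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts

def truncate_and_normalize(pooled, dims=256):
    """Keep the first `dims` components (MRL), then L2-normalize each row."""
    truncated = pooled[:, :dims]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

# Stand-in data only: random values in place of real model output.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((2, 5, 768)).astype(np.float32)
mask = np.array([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])

pooled = mean_pool(tokens, mask)                  # shape [2, 768]
small = truncate_and_normalize(pooled, dims=256)  # shape [2, 256], unit-length rows
```

The order matters: slicing a vector that was already normalized at full width generally leaves it with a norm below 1, so dot products would no longer equal cosine similarities.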
 
- Primarily consumed by the UltraCode daemon via the bundled engines. For standalone use, run
- `model.onnx` with `onnxruntime` + the bundled `sentencepiece.bpe.model`:
 
 
  ```python
  import onnxruntime as ort, sentencepiece as spm, numpy as np
@@ -99,8 +101,9 @@ hardware. No compilation on the user's machine.
  - **TVM** `*_tvm_vulkan.{dll,so}` – Vulkan fallback for non-TRT / older NVIDIA & other GPUs, per bucket.
  - **OpenVINO** `*.xml` + `*.bin` – Intel **CPU / iGPU / NPU**, per bucket.
  - **Metal** `*_tvm_metal.*` – Apple Silicon (macOS), per bucket.
- - **Source / tokenizer** – `model.onnx` (+ `model.onnx.data`) FP32 · `model_int8qdt.onnx` INT8 Q/DQ ·
- `sentencepiece.bpe.model` · `tokenizer.json`.
 
 
  ## Evaluation – in-scope CoIR (sub-CoIR)
 
 
  passage embeddings, unlike the teacher whose prefix is query-only). Mean-pool → **L2-normalize**.
  - For smaller indexes, truncate to **256** or **512** dims (MRL) before normalizing.
 
+ The daemon runs the bundled engines directly (this repo is its CDN). The embedding recipe below is
+ illustrative – `model.onnx` is **not bundled** here; it shows how an engine maps text → vector
+ (tokenize with the bundled `sentencepiece.bpe.model`, run, the pooled `[B,768]` is already produced,
+ then L2-normalize):
 
  ```python
  import onnxruntime as ort, sentencepiece as spm, numpy as np
 
  - **TVM** `*_tvm_vulkan.{dll,so}` – Vulkan fallback for non-TRT / older NVIDIA & other GPUs, per bucket.
  - **OpenVINO** `*.xml` + `*.bin` – Intel **CPU / iGPU / NPU**, per bucket.
  - **Metal** `*_tvm_metal.*` – Apple Silicon (macOS), per bucket.
+ - **Tokenizer** – `sentencepiece.bpe.model` (the model's SentencePiece; specials baked at
+ pad=0 / unk=1 / bos=2 / eos=3, byte-fallback) + `tokenizer_config.json`. The daemon loads the SP
+ directly; the FP32 `model.onnx` source is not bundled here (this repo is the daemon's engine CDN).
 
  ## Evaluation – in-scope CoIR (sub-CoIR)