bhavnicksm committed on
Commit 665a5e6 · verified · 1 Parent(s): b050c80

Update model card README with v0.0.4 API

Files changed (1)
  1. README.md +51 -2
README.md CHANGED
@@ -1,9 +1,58 @@
  ---
- license: mit
  tags:
  - tokie
+ library_name: tokie
  ---
+
  <p align="center">
- <img src="tokie-banner.png" alt="tokie banner">
+ <img src="tokie-banner.png" alt="tokie" width="600">
  </p>
 
+ # voyage-code-2
+
+ Pre-built [tokie](https://github.com/chonkie-inc/tokie) tokenizer for [voyageai/voyage-code-2](https://huggingface.co/voyageai/voyage-code-2).
+
+ ## Quick Start (Python)
+
+ ```bash
+ pip install tokie
+ ```
+
+ ```python
+ import tokie
+
+ tokenizer = tokie.Tokenizer.from_pretrained("tokiers/voyage-code-2")
+ encoding = tokenizer.encode("Hello, world!")
+ print(encoding.ids)
+ print(encoding.attention_mask)
+ ```
+
+ ## Quick Start (Rust)
+
+ ```toml
+ [dependencies]
+ tokie = { version = "0.0.4", features = ["hf"] }
+ ```
+
+ ```rust
+ use tokie::Tokenizer;
+
+ let tokenizer = Tokenizer::from_pretrained("tokiers/voyage-code-2").unwrap();
+ let encoding = tokenizer.encode("Hello, world!", true);
+ println!("{:?}", encoding.ids);
+ ```
+
+ ## Files
+
+ - `tokenizer.tkz` — tokie binary format (~10x smaller, loads in ~5ms)
+ - `tokenizer.json` — original HuggingFace tokenizer (if available)
+
+ ## About tokie
+
+ **50x faster tokenization, 10x smaller model files, 100% accurate.**
+
+ tokie is a drop-in replacement for HuggingFace tokenizers, built in Rust. See [GitHub](https://github.com/chonkie-inc/tokie) for benchmarks and documentation.
+
+ ## License
+
+ MIT OR Apache-2.0 (tokie library). Original model files retain their original license from [voyageai/voyage-code-2](https://huggingface.co/voyageai/voyage-code-2).
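
The Python quick start added in this commit prints `encoding.ids` and `encoding.attention_mask` for a single string. As a rough illustration of how those two lists usually relate once several inputs are padded into one batch, here is a minimal sketch in plain Python. It does not call tokie: `pad_batch`, `pad_id`, and the sample token ids are hypothetical names and values chosen for illustration, and this commit does not say whether tokie ships its own batching or padding helper.

```python
from typing import List, Tuple


def pad_batch(
    batch_ids: List[List[int]], pad_id: int = 0
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad token id lists to a common length and build attention masks.

    Assumption: pad_id=0 is a placeholder; the real padding id depends on
    the tokenizer's vocabulary and is not given in the README.
    """
    max_len = max((len(ids) for ids in batch_ids), default=0)
    padded: List[List[int]] = []
    masks: List[List[int]] = []
    for ids in batch_ids:
        n_pad = max_len - len(ids)
        # Real tokens keep their ids; padding positions get pad_id.
        padded.append(ids + [pad_id] * n_pad)
        # Mask is 1 for real tokens and 0 for padding positions.
        masks.append([1] * len(ids) + [0] * n_pad)
    return padded, masks


# Made-up ids standing in for what `encoding.ids` might return.
sample = [[11, 22, 33], [44, 55]]
padded, masks = pad_batch(sample)
print(padded)
print(masks)
```

With a real tokenizer, each inner list would come from `tokenizer.encode(text).ids`, and the padding id would need to match whatever the downstream model expects.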