metadata
tags:
  - tokie
library_name: tokie

voyage-code-2

Pre-built tokie tokenizer for voyageai/voyage-code-2.

Quick Start (Python)

pip install tokie

import tokie

tokenizer = tokie.Tokenizer.from_pretrained("tokiers/voyage-code-2")
encoding = tokenizer.encode("Hello, world!")
print(encoding.ids)
print(encoding.attention_mask)
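
As a rough sketch of one common use for `encoding.ids`, the helper below counts tokens per snippet and filters snippets against a token budget. It relies only on the `encode(text)` call and the `.ids` attribute shown above. `StubTokenizer`, `StubEncoding`, `token_counts`, and `within_budget` are hypothetical names introduced for illustration and are not part of tokie; the stub splits on whitespace so the example runs without downloading anything, and its counts will differ from the real tokenizer's.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StubEncoding:
    """Hypothetical stand-in for the object returned by encode()."""
    ids: List[int]
    attention_mask: List[int]


class StubTokenizer:
    """Hypothetical whitespace tokenizer, used only so this sketch runs offline.

    Swap in tokie.Tokenizer.from_pretrained("tokiers/voyage-code-2") for real counts.
    """

    def encode(self, text: str) -> StubEncoding:
        words = text.split()
        # One fake id per whitespace-separated word; the mask marks every position as real.
        return StubEncoding(ids=list(range(len(words))), attention_mask=[1] * len(words))


def token_counts(tokenizer, texts: List[str]) -> List[int]:
    """Return the number of token ids produced for each text."""
    return [len(tokenizer.encode(text).ids) for text in texts]


def within_budget(tokenizer, texts: List[str], max_tokens: int) -> List[str]:
    """Keep only the texts whose token count does not exceed max_tokens."""
    return [text for text in texts if len(tokenizer.encode(text).ids) <= max_tokens]


if __name__ == "__main__":
    snippets = ["def add(a, b): return a + b", "x = 1"]
    tok = StubTokenizer()
    print(token_counts(tok, snippets))
    print(within_budget(tok, snippets, max_tokens=4))
```

Because the helpers only touch `encode` and `.ids`, any tokenizer object exposing that same shape should work in place of the stub.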

Quick Start (Rust)

# Cargo.toml
[dependencies]
tokie = { version = "0.0.4", features = ["hf"] }

// src/main.rs
use tokie::Tokenizer;

fn main() {
    let tokenizer = Tokenizer::from_pretrained("tokiers/voyage-code-2").unwrap();
    let encoding = tokenizer.encode("Hello, world!", true);
    println!("{:?}", encoding.ids);
}

Files

  • tokenizer.tkz — tokie binary format (~10x smaller, loads in ~5ms)
  • tokenizer.json — original HuggingFace tokenizer (if available)

About tokie

50x faster tokenization, 10x smaller model files, 100% accurate.

tokie is a drop-in replacement for HuggingFace tokenizers, built in Rust. See GitHub for benchmarks and documentation.

License

MIT OR Apache-2.0 (tokie library). Original model files retain their original license from voyageai/voyage-code-2.