Cohere-embed-multilingual-light-v3.0
Pre-built tokie tokenizer for Cohere/Cohere-embed-multilingual-light-v3.0.
Quick Start (Python)
pip install tokie
import tokie
tokenizer = tokie.Tokenizer.from_pretrained("tokiers/Cohere-embed-multilingual-light-v3.0")
encoding = tokenizer.encode("Hello, world!")
print(encoding.ids)
print(encoding.attention_mask)
Quick Start (Rust)
[dependencies]
tokie = { version = "0.0.4", features = ["hf"] }
use tokie::Tokenizer;
let tokenizer = Tokenizer::from_pretrained("tokiers/Cohere-embed-multilingual-light-v3.0").unwrap();
let encoding = tokenizer.encode("Hello, world!", true);
println!("{:?}", encoding.ids);
Files
tokenizer.tkzโ tokie binary format (~10x smaller, loads in ~5ms)tokenizer.jsonโ original HuggingFace tokenizer (if available)
About tokie
50x faster tokenization, 10x smaller model files, 100% accurate.
tokie is a drop-in replacement for HuggingFace tokenizers, built in Rust. See GitHub for benchmarks and documentation.
License
MIT OR Apache-2.0 (tokie library). Original model files retain their original license from Cohere/Cohere-embed-multilingual-light-v3.0.
- Downloads last month
- 26
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐ Ask for provider support