---
license: mit
tags:
- tensalang
- llm-inference
- mlir
- safetensors
language:
- en
---

# TensaLang Example Models

Example model weights for [TensaLang](https://github.com/BenChaliah/Tensa-Lang), a programming language for LLM inference.

## What is TensaLang?

TensaLang is a programming language for LLM inference. It lets you implement new models with ease and compile them through MLIR to CUDA, CPU-SIMD, MLX, or ROCm. The runtime is the program.

```
fn attention_f16(q: Tensor, key_cache: Tensor, value_cache: Tensor,
                 layer: i32, pos: i32, H: i32, scale: f32) -> Tensor
    with tile=[8, 64], parallel=[h, t]
{
    var att: Tensor = zeros([H, SeqLen])

    # Compute attention scores
    att[h, t] = if t > pos { -inf } else {
        sum(i) q[h * Dh + i] * (key_cache[layer, t, h * Dh + i] as f32) * scale
    }

    var weights: Tensor = softmax(att)
    # ... weighted sum over values
}
```

From the creator of [Datarus-R1-14B](https://huggingface.co/Datarus/Datarus-R1-14B).

## Example models

| Model | Parameters | Format | Description |
|-------|------------|--------|-------------|
| `llama2_7b_f16.safetensors` | 7B | FP16 | Llama2-7B |
| `qwen2.5_coder_0.5b_bf16.safetensors` | 0.5B | BF16 | Qwen2.5-Coder-0.5B-Instruct |

## Usage

```bash
# Clone and build TensaLang
git clone https://github.com/BenChaliah/Tensa-Lang.git
cd Tensa-Lang && ./build.sh

# Download all models
huggingface-cli download BenChaliah/TensaLang-models --local-dir ./models

# Or download a specific model
huggingface-cli download BenChaliah/TensaLang-models llama2_7b_f16.safetensors --local-dir ./Llama2-assets
```

### Run Llama2

```bash
./bin/tensalang-run examples/llama2_manual_tiling_fp16.tl \
  --model Llama2-assets/llama2_7b_f16.safetensors \
  --tokenizer Llama2-assets/tokenizer.json \
  --prompt "Once upon a time" \
  --target cuda \
  --steps 128 \
  --fused-attention 2 \
  --cuda-arch sm_89
```

### Run Qwen2.5-Coder

```bash
./bin/tensalang-run examples/qwen25_coder_bf16.tl \
  --model Qwen25-assets/qwen2.5_coder_0.5b_bf16.safetensors \
  --tokenizer Qwen25-assets/tokenizer.json \
  --prompt "def quicksort(arr):" \
  --target cuda \
  --steps 64 \
  --cuda-arch sm_89
```

## Source

Weights converted from:

- [meta-llama/Llama-2-7b](https://huggingface.co/meta-llama/Llama-2-7b)
- [Qwen/Qwen2.5-Coder-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B-Instruct)

## License

Model weights retain their original licenses. The TensaLang compiler is MIT-licensed.

## Links

- [TensaLang GitHub](https://github.com/BenChaliah/Tensa-Lang)
- [Documentation](https://tensa-lang.org/docs.html)
- [Website](https://tensa-lang.org/)
- [Datarus-R1-14B](https://huggingface.co/Datarus/Datarus-R1-14B)