# llamatelemetry Binaries v0.1.0
Pre-compiled CUDA binaries for **llamatelemetry**, a CUDA-first OpenTelemetry Python SDK for LLM inference observability.
## 📦 Available Binaries
| Version | File | Size | Target Platform | SHA256 |
|---|---|---|---|---|
| v0.1.0 | llamatelemetry-v0.1.0-cuda12-kaggle-t4x2.tar.gz | 1.4 GB | Kaggle 2× Tesla T4, CUDA 12.5 | 31889a86116818be5a42a7bd4a20fde14be25f27348cabf2644259625374b355 |
## 🚀 Auto-Download (Recommended)
These binaries are automatically downloaded when you install llamatelemetry:
```bash
# Install on Kaggle with the GPU T4 × 2 accelerator
pip install --no-cache-dir --force-reinstall \
    git+https://github.com/llamatelemetry/llamatelemetry.git@v0.1.0
```
On first `import llamatelemetry`, the package will:

- Detect your GPU (Tesla T4 required)
- Check for cached binaries in `~/.cache/llamatelemetry/`
- Download from the HuggingFace CDN (this repo; fast, ~2-5 MB/s)
- Fall back to GitHub Releases if needed
- Verify the SHA256 checksum: `31889a86116818be5a42a7bd4a20fde14be25f27348cabf2644259625374b355`
- Extract 13 binaries + libraries to the package directory
- Configure environment variables
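The cache-then-download flow above can be sketched roughly as follows. This is a minimal illustration, not the package's real internals: `fetch_from_cdn` and `fetch_from_github` are hypothetical stand-ins for the actual download logic.

```python
import tarfile
from pathlib import Path

ARCHIVE = "llamatelemetry-v0.1.0-cuda12-kaggle-t4x2.tar.gz"

def ensure_archive(cache_dir, fetch_from_cdn, fetch_from_github):
    """Return the cached archive path, downloading it on a cache miss.

    Tries the HuggingFace CDN first and falls back to GitHub Releases,
    mirroring the order described above.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / ARCHIVE
    if path.is_file():
        return path                # cache hit: no download needed
    try:
        fetch_from_cdn(path)       # primary source: HuggingFace CDN
    except OSError:
        fetch_from_github(path)    # fallback: GitHub Releases
    return path

def extract_bundle(archive_path, dest):
    """Unpack the binaries and shared libraries into the package directory."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest)
```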
## 📥 Manual Download
### Using `huggingface_hub`
```python
from huggingface_hub import hf_hub_download

binary_path = hf_hub_download(
    repo_id="waqasm86/llamatelemetry-binaries",
    filename="v0.1.0/llamatelemetry-v0.1.0-cuda12-kaggle-t4x2.tar.gz",
    cache_dir="/kaggle/working/cache",
)
print(f"Downloaded to: {binary_path}")
```
### Direct Download URL
```bash
wget https://huggingface.co/waqasm86/llamatelemetry-binaries/resolve/main/v0.1.0/llamatelemetry-v0.1.0-cuda12-kaggle-t4x2.tar.gz
```
### Verify Checksum
```bash
# Download the checksum file
wget https://huggingface.co/waqasm86/llamatelemetry-binaries/resolve/main/v0.1.0/llamatelemetry-v0.1.0-cuda12-kaggle-t4x2.tar.gz.sha256

# Verify the archive against it
sha256sum -c llamatelemetry-v0.1.0-cuda12-kaggle-t4x2.tar.gz.sha256
```
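The same verification can be done from Python with the standard library, which is handy inside a notebook. This is a small sketch that assumes the `.sha256` file uses the usual `<hash>  <filename>` format and sits next to the archive:

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in 1 MiB chunks so the 1.4 GB archive fits in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def check_sha256_file(sha256_file):
    """Verify each '<hash>  <filename>' line, like `sha256sum -c`."""
    sha256_file = Path(sha256_file)
    for line in sha256_file.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        # Strip sha256sum's optional binary-mode marker '*' from the name.
        if sha256_of(sha256_file.parent / name.lstrip("*")) != expected:
            return False
    return True
```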
## 📊 Build Information
| Property | Value |
|---|---|
| Version | 0.1.0 |
| CUDA Version | 12.5 |
| Compute Capability | SM 7.5 (Tesla T4) |
| llama.cpp Version | b7760 (commit 388ce82) |
| Build Date | 2026-02-03 |
| Target Platform | Kaggle dual Tesla T4 GPUs (2× 15 GB VRAM) |
| Binaries Included | 13 (llama-server, llama-cli, llama-bench, etc.) |
| Libraries | CUDA shared libraries + dependencies |
## 🔧 What's Inside
The binary bundle contains:
### Executables (13 binaries)
- `llama-server` - OpenAI-compatible API server
- `llama-cli` - CLI inference tool
- `llama-bench` - Benchmarking utility
- `llama-quantize` - Model quantization tool
- And 9 more utilities
### Shared Libraries
- CUDA 12.5 shared libraries
- cuBLAS, cuDNN dependencies
- llama.cpp runtime libraries
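Once the bundle is installed and `llama-server` is running, it can be queried with plain standard-library HTTP. A minimal sketch, assuming the server listens on its default port 8080 and exposes the OpenAI-compatible `/v1/chat/completions` route; the port and payload fields here are assumptions, not guaranteed by this repo:

```python
import json
import urllib.request

# Assumed base URL for a locally running llama-server instance.
BASE_URL = "http://localhost:8080"

def build_chat_request(prompt: str, max_tokens: int = 64) -> dict:
    """Assemble an OpenAI-style chat-completion payload."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def chat(prompt: str) -> str:
    """POST the prompt to llama-server and return the first reply's text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```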
## 🔗 Links
- GitHub Repository: https://github.com/llamatelemetry/llamatelemetry
- GitHub Releases: https://github.com/llamatelemetry/llamatelemetry/releases/tag/v0.1.0
- Installation Guide: KAGGLE_INSTALL_GUIDE.md
- Models Repository: https://huggingface.co/waqasm86/llamatelemetry-models
- Documentation: https://llamatelemetry.github.io (planned)
## 🎯 Supported Platforms
| Platform | GPU | CUDA | Status |
|---|---|---|---|
| Kaggle Notebooks | 2× Tesla T4 (SM 7.5) | 12.5 | ✅ Supported |
| Google Colab | Tesla T4 (SM 7.5) | 12.x | 🔜 Planned (v0.2.0) |
| Local Workstation | Tesla T4, RTX 4000+ | 12.x+ | 🔜 Planned (v0.2.0) |
| Other GPUs | SM < 7.5 | Any | ❌ Not supported |
## 📄 License
MIT License - See LICENSE
## 🔍 Troubleshooting
### Binary Download Fails
- **Check internet connection** in Kaggle notebook settings
- **Retry the import:** `import llamatelemetry` (automatic retry logic)
- **Manual download:** use the `hf_hub_download()` method above
- **GitHub fallback:** binaries are also available at GitHub Releases
### GPU Not Detected
```python
from llamatelemetry import check_cuda_available, get_cuda_device_info

print(f"CUDA Available: {check_cuda_available()}")
print(f"GPU Info: {get_cuda_device_info()}")
```
Expected output on Kaggle with GPU T4 × 2:

```text
CUDA Available: True
GPU Info: {'gpu_name': 'Tesla T4', 'cuda_version': '12.5', 'compute_capability': '7.5'}
```
### Incompatible GPU Error
llamatelemetry v0.1.0 requires Tesla T4 (SM 7.5) or newer. If you see "GPU compute capability < 7.5", you're running on an incompatible GPU.
**Solution:** Use Kaggle with the "GPU T4 × 2" accelerator setting.
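The compute-capability gate amounts to a version comparison against the SM 7.5 floor. A hypothetical sketch of that check; llamatelemetry's actual internal implementation may differ:

```python
# Minimum supported compute capability: SM 7.5 (Tesla T4).
MIN_COMPUTE_CAPABILITY = (7, 5)

def is_supported(compute_capability: str) -> bool:
    """Compare a 'major.minor' string such as '7.5' against the SM floor."""
    major, minor = (int(part) for part in compute_capability.split("."))
    return (major, minor) >= MIN_COMPUTE_CAPABILITY

print(is_supported("7.5"))  # Tesla T4: supported
print(is_supported("6.0"))  # Pascal P100: below the floor
```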
**Maintained by:** waqasm86  
**Version:** 0.1.0  
**Last Updated:** 2026-02-03  
**Status:** Active Development