ttm-rs
Pure Rust converter and inference engine for ibm-granite/granite-timeseries-ttm-r2.
Pre-converted GGUF files are available at amaye15/ttm-gguf. ttm-rs produces GGUF v3 files and runs native forecasting, with no Python required.
Build
cargo build --release
Convert
Downloads the model from HuggingFace and writes a GGUF file:
# F16 (recommended)
./target/release/ttm-rs convert --model ibm-granite/granite-timeseries-ttm-r2 --dtype f16 --output gguf/ttm-f16.gguf
# Q8_0 (smallest)
./target/release/ttm-rs convert --dtype q8 --output gguf/ttm-q8.gguf
# F32 (full precision)
./target/release/ttm-rs convert --dtype f32 --output gguf/ttm-f32.gguf
To convert all dtypes at once:
./scripts/convert_all.sh
HuggingFace token (optional for public models):
HF_TOKEN=hf_... ./scripts/convert_all.sh
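To give a rough sense of how the three dtypes compare on disk, the sketch below estimates tensor payload size from a parameter count. It assumes the standard ggml layouts (4 bytes per F32 weight, 2 bytes per F16 weight, and Q8_0 blocks of 32 int8 values plus one 2-byte scale, so 34 bytes per 32 weights). It ignores GGUF metadata and any tensors a converter may keep at higher precision. The function name and the 1,000,000-parameter figure are illustrative and not measured from ttm-rs output.

```python
# Rough payload-size estimate per dtype; a sketch, not ttm-rs's actual accounting.
import math

# (bytes per block, weights per block) for each layout, per the standard ggml formats.
LAYOUTS = {
    "f32": (4, 1),
    "f16": (2, 1),
    "q8": (34, 32),  # Q8_0: 32 int8 values + one 2-byte f16 scale
}


def estimate_bytes(num_params: int, dtype: str) -> int:
    """Estimate tensor payload bytes, rounding up to whole blocks."""
    block_bytes, block_weights = LAYOUTS[dtype]
    return math.ceil(num_params / block_weights) * block_bytes


if __name__ == "__main__":
    # 1_000_000 is an illustrative parameter count, not the exact model size.
    for dtype in ("f32", "f16", "q8"):
        print(dtype, estimate_bytes(1_000_000, dtype))
```

Under these assumptions Q8_0 costs about 1.06 bytes per weight, which is why it yields the smallest file of the three.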
Inspect tensors
Print all tensor names and shapes from a safetensors file:
./target/release/ttm-rs inspect-tensors models/model.safetensors
Infer
Run forecasting by piping a JSON request to stdin; its context field holds the comma-separated series values. Each series must contain at least patch_length steps (defined in config.json):
echo '{"context": [1.0, 1.2, 1.5, 1.3, 1.8, 2.0, 1.9, 2.1], "horizon": 96}' \
| ./target/release/ttm-rs infer \
--gguf gguf/ttm-f16.gguf \
--config models/config.json
Output is JSON in an OpenAI-compatible forecast format:
{
"id": "forecast-000001932b7a1234",
"object": "forecast",
"created": 1749686400,
"model": "ttm",
"choices": [{
"index": 0,
"forecast": {
"point": [2.1, 2.3, 2.5, "..."],
"quantiles": {}
},
"finish_reason": "stop"
}],
"usage": {"context_length": 8, "forecast_length": 96}
}
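As a minimal sketch of consuming this response from Python, the snippet below parses the JSON and collects the point forecasts for each choice. The sample is copied from the example output, with the elided "..." entry dropped so that every element is numeric (so it holds 3 values rather than the full 96). The helper name point_forecasts is hypothetical and not part of ttm-rs.

```python
import json

# Sample response based on the example output; the elided "..." entry is
# dropped, so "point" holds 3 values here instead of the full horizon.
SAMPLE = """
{
  "id": "forecast-000001932b7a1234",
  "object": "forecast",
  "created": 1749686400,
  "model": "ttm",
  "choices": [{
    "index": 0,
    "forecast": {"point": [2.1, 2.3, 2.5], "quantiles": {}},
    "finish_reason": "stop"
  }],
  "usage": {"context_length": 8, "forecast_length": 96}
}
"""


def point_forecasts(response: dict) -> list[list[float]]:
    """Return one list of point forecasts per choice, ordered by index."""
    choices = sorted(response["choices"], key=lambda c: c["index"])
    return [[float(v) for v in c["forecast"]["point"]] for c in choices]


response = json.loads(SAMPLE)
print(point_forecasts(response))
```

Sorting by index keeps the forecasts aligned with the input series when the response contains several choices.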
For batch inference, pass multiple series as a nested array to get one Choice per series:
echo '{"context": [[1.0, 1.2, 1.5], [2.0, 2.2, 2.5]], "horizon": 96}' \
| ./target/release/ttm-rs infer \
--gguf gguf/ttm-f16.gguf \
--config models/config.json
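When requests are generated programmatically, it can help to check the minimum-length rule before calling the CLI. The sketch below builds the JSON payload for either a single series or a batch and rejects any series shorter than patch_length. The helper build_request is hypothetical, and the caller is assumed to have read patch_length from config.json; the value 3 in the usage line is illustrative only.

```python
import json


def build_request(context, horizon: int, patch_length: int) -> str:
    """Serialize a single series or a batch (nested list) as a JSON request.

    patch_length is assumed to be read from config.json by the caller.
    """
    # A nested list means a batch; wrap a flat list for validation only.
    is_batch = len(context) > 0 and isinstance(context[0], (list, tuple))
    series_list = context if is_batch else [context]
    for i, series in enumerate(series_list):
        if len(series) < patch_length:
            raise ValueError(
                f"series {i} has {len(series)} steps, need at least {patch_length}"
            )
    # The payload keeps the caller's original shape (flat or nested).
    return json.dumps({"context": context, "horizon": horizon})


# patch_length=3 is an illustrative value, not the model's actual setting.
payload = build_request([[1.0, 1.2, 1.5], [2.0, 2.2, 2.5]], horizon=96, patch_length=3)
print(payload)
```

The resulting string can then be written to the stdin of ttm-rs infer, for example with subprocess.run(..., input=payload, text=True).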
Python bindings
Install with maturin inside a virtual environment:
python -m venv .venv && source .venv/bin/activate
pip install maturin
maturin develop --features python
import ttm_rs
model = ttm_rs.Ttm("gguf/ttm-f16.gguf", "models/config.json")
result = model.forecast([1.0, 1.2, 1.5, 1.3, 1.8, 2.0], horizon=96)
point = result["choices"][0]["forecast"]["point"]
# Batch: one Choice per series
result = model.forecast([[1.0, 1.2, 1.5], [2.0, 2.2, 2.5]], horizon=96)
forecast returns a Python dict in the same OpenAI-compatible format as the CLI.
Architecture notes
TTM (Tiny Time-Mixer) is a compact encoder-decoder model:
- Encoder: Multi-layer mixer blocks operating across the patch dimension; adaptive patching levels allow different temporal resolutions simultaneously
- Decoder: Lightweight projection from encoder representations to the forecast horizon
- Scale: ~1M parameters, orders of magnitude smaller than transformer-based foundation models, yet competitive on short-horizon benchmarks
- Config: context_length, prediction_length, patch_length, patch_stride, d_model, num_layers, decoder_num_layers, and adaptive_patching_levels are loaded from config.json
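To illustrate how patch_length and patch_stride relate to the context window, the sketch below splits a series into sliding-window patches. It assumes the usual patching rule, where the number of patches is (context_length - patch_length) // patch_stride + 1, and it is not taken from the ttm-rs source. The function name make_patches and the sizes used are illustrative.

```python
def make_patches(series, patch_length: int, patch_stride: int):
    """Split a series into sliding-window patches (assumed standard rule)."""
    if len(series) < patch_length:
        raise ValueError("series is shorter than patch_length")
    # Number of full windows that fit when stepping by patch_stride.
    num_patches = (len(series) - patch_length) // patch_stride + 1
    return [
        series[i * patch_stride : i * patch_stride + patch_length]
        for i in range(num_patches)
    ]


# Illustrative sizes only; real values come from config.json.
context = list(range(8))
overlapping = make_patches(context, patch_length=4, patch_stride=2)
disjoint = make_patches(context, patch_length=4, patch_stride=4)
print(len(overlapping), len(disjoint))
```

With a stride equal to the patch length the patches do not overlap; a smaller stride produces overlapping patches and therefore more of them for the same context.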