Ready to chat
IBM Granite 4.0 Hybrid · 350M params
Running on ONNX Runtime · CPU
Enter to send · Shift+Enter for newline · streaming enabled
Live Performance
— tokens/second
Server Metrics
Uptime: —
Total Requests: 0
Active: 0
Tokens Generated: 0
Avg Latency: —
Errors: 0
Generation Settings
Max Tokens: 256
Temperature: 0.7
Model Info
Format: ONNX Q4
Params: 350M
Architecture: Hybrid Mamba-2/Transformer (dense)
Device: CPU
onnx-community/granite-4.0-h-350m-ONNX