title: TwoCentsHustler AI
emoji: 📈
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
license: apache-2.0

TwoCentsHustler AI Space

Local inference on cpu-basic (free, unlimited). Runs gemma-4-E4B-it-Q4_K_M.gguf (~2.7 GB) via llama-cpp-python.

Fallback provider for the TwoCentsHustler financial news platform.

Endpoint

POST /api/ai with a JSON body: { "operation": "analyze"|"summarize"|"cluster", "payload": {...} }
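A minimal client sketch for the endpoint above, using only the Python standard library. The base URL is a placeholder, the contents of `payload` are not specified in this README (the `{"text": ...}` shape in the usage note is purely illustrative), and the assumption that the response is JSON is mine. The helper names `build_request` and `call_ai` are hypothetical.

```python
import json
import urllib.request

# Placeholder host: replace with the actual URL of the Space.
BASE_URL = "https://example.invalid"

# The three operations listed in the endpoint description.
VALID_OPERATIONS = ("analyze", "summarize", "cluster")


def build_request(operation: str, payload: dict,
                  base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /api/ai request."""
    if operation not in VALID_OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")
    body = json.dumps({"operation": operation, "payload": payload}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/ai",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_ai(operation: str, payload: dict,
            base_url: str = BASE_URL, timeout: float = 120.0) -> dict:
    """Send the request and decode the reply.

    Assumes the service answers with a JSON object. The generous default
    timeout reflects the ~20-40 s per-call latency noted under Hardware.
    """
    req = build_request(operation, payload, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `call_ai("summarize", {"text": "..."})` would post a summarize request, provided the service accepts that payload shape.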

Environment Variables

| Variable | Default | Description |
|---|---|---|
| GGUF_REPO | unsloth/gemma-4-E4B-it-GGUF | HF repo containing the GGUF file |
| GGUF_FILE | gemma-4-E4B-it-Q4_K_M.gguf | Quantization variant to load |
| N_THREADS | 2 | CPU threads for inference |
| N_CTX | 4096 | Context window size |
| HF_TOKEN | (none) | Optional: for gated models |
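The variables above could be read and applied roughly as follows. This is a sketch under stated assumptions, not the Space's actual startup code: `Settings`, `load_settings`, and `load_model` are hypothetical names, and treating an empty `HF_TOKEN` as unset is my choice. The model-loading step uses `hf_hub_download` from `huggingface_hub` and the `Llama` class from `llama-cpp-python`, imported lazily so the configuration part runs without either package installed.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    gguf_repo: str
    gguf_file: str
    n_threads: int
    n_ctx: int
    hf_token: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    return Settings(
        gguf_repo=env.get("GGUF_REPO", "unsloth/gemma-4-E4B-it-GGUF"),
        gguf_file=env.get("GGUF_FILE", "gemma-4-E4B-it-Q4_K_M.gguf"),
        n_threads=int(env.get("N_THREADS", "2")),
        n_ctx=int(env.get("N_CTX", "4096")),
        # Assumption: an empty token is treated the same as no token.
        hf_token=env.get("HF_TOKEN") or None,
    )


def load_model(settings: Settings):
    """Download the GGUF file and load it with llama-cpp-python."""
    # Third-party imports are kept local so load_settings() works without them.
    from huggingface_hub import hf_hub_download
    from llama_cpp import Llama

    model_path = hf_hub_download(
        repo_id=settings.gguf_repo,
        filename=settings.gguf_file,
        token=settings.hf_token,
    )
    return Llama(
        model_path=model_path,
        n_ctx=settings.n_ctx,
        n_threads=settings.n_threads,
    )
```

Passing a plain dict to `load_settings` instead of `os.environ` makes the defaults easy to check in isolation.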

Hardware

cpu-basic: 2 vCPU, 16 GB RAM. Inference: ~20-40 s per call.