title: TwoCentsHustler AI
emoji: 📈
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
license: apache-2.0

TwoCentsHustler AI Space

Local inference on cpu-basic (free, unlimited). Runs gemma-4-E4B-it-Q4_K_M.gguf (~2.7 GB) via llama-cpp-python.

Fallback provider for the TwoCentsHustler financial news platform.

`POST /api/ai` accepts a JSON body: `{ "operation": "analyze" | "summarize" | "cluster", "payload": {...} }`
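A minimal client sketch for the endpoint above, using only the Python standard library. The base URL, the `{"text": ...}` payload shape, the JSON response format, and the helper names `build_request` and `call_ai` are assumptions for illustration; this README does not specify them.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the real URL of the Space.
BASE_URL = "https://example-space.hf.space"

# The three operations listed in this README.
VALID_OPERATIONS = {"analyze", "summarize", "cluster"}


def build_request(operation, payload, base_url=BASE_URL):
    """Build a POST request for /api/ai without sending it."""
    if operation not in VALID_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation!r}")
    body = json.dumps({"operation": operation, "payload": payload}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/ai",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_ai(operation, payload, base_url=BASE_URL, timeout=120):
    """Send the request and decode the response (assumed here to be JSON)."""
    req = build_request(operation, payload, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Usage would look like `call_ai("summarize", {"text": "..."})`. The 120 second timeout is a guess chosen to sit well above the 20-40 s inference time quoted for cpu-basic; a Space that is asleep may need longer to wake up.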

| Variable | Default | Description |
|---|---|---|
| `GGUF_REPO` | `unsloth/gemma-4-E4B-it-GGUF` | HF repo containing the GGUF file |
| `GGUF_FILE` | `gemma-4-E4B-it-Q4_K_M.gguf` | Quantization variant to load |
| `N_THREADS` | `2` | CPU threads for inference |
| `N_CTX` | `4096` | Context window size |
| `HF_TOKEN` | (none) | Optional: for gated models |
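One way these variables could be read and handed to llama-cpp-python is sketched below. The `ModelConfig` dataclass and the `load_config` and `load_model` helpers are hypothetical names, not taken from this Space's code; the actual server may wire the settings differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    gguf_repo: str
    gguf_file: str
    n_threads: int
    n_ctx: int


def load_config(env=None):
    """Read settings from the environment, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return ModelConfig(
        gguf_repo=env.get("GGUF_REPO", "unsloth/gemma-4-E4B-it-GGUF"),
        gguf_file=env.get("GGUF_FILE", "gemma-4-E4B-it-Q4_K_M.gguf"),
        n_threads=int(env.get("N_THREADS", "2")),
        n_ctx=int(env.get("N_CTX", "4096")),
    )


def load_model(config):
    """Download the GGUF file if needed and load it (assumed wiring)."""
    # Imported lazily so the config helpers work without llama-cpp-python installed.
    from llama_cpp import Llama

    # HF_TOKEN is not passed explicitly: huggingface_hub, which performs the
    # download, reads it from the environment when it is set.
    return Llama.from_pretrained(
        repo_id=config.gguf_repo,
        filename=config.gguf_file,
        n_threads=config.n_threads,
        n_ctx=config.n_ctx,
    )
```

Passing a plain dict to `load_config`, for example `load_config({"N_THREADS": "4"})`, makes the defaults easy to check without touching the real environment.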

Hardware: cpu-basic (2 vCPU, 16 GB RAM). Inference takes roughly 20-40 s per call.