| title: TwoCentsHustler AI | |
| emoji: π | |
| colorFrom: blue | |
| colorTo: indigo | |
| sdk: docker | |
| pinned: false | |
| license: apache-2.0 | |
| # TwoCentsHustler AI Space | |
| Local inference on **cpu-basic** (free, unlimited). | |
| Runs `gemma-4-E4B-it-Q4_K_M.gguf` (~2.7 GB) via `llama-cpp-python`. | |
| Fallback provider for the TwoCentsHustler financial news platform. | |
| ## Endpoint | |
| `POST /api/ai` → `{ "operation": "analyze"|"summarize"|"cluster", "payload": {...} }` | |
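A minimal client sketch for the request shape above, using only the Python standard library. The `base_url` argument, the `call_space` and `build_request_body` names, and the 120-second timeout are assumptions for illustration. The README does not document the keys inside `payload` or the response format, so the response is returned as decoded JSON without further interpretation.

```python
import json
import urllib.request

# Operations listed in this README; anything else is rejected client-side.
ALLOWED_OPERATIONS = {"analyze", "summarize", "cluster"}


def build_request_body(operation: str, payload: dict) -> bytes:
    """Serialize the documented request shape to UTF-8 JSON bytes."""
    if operation not in ALLOWED_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation!r}")
    return json.dumps({"operation": operation, "payload": payload}).encode("utf-8")


def call_space(base_url: str, operation: str, payload: dict, timeout: float = 120.0):
    """POST to /api/ai and return the decoded JSON response as-is.

    base_url is a placeholder for wherever the Space is served.
    The timeout is an assumed value, chosen to leave headroom for slow CPU inference.
    """
    request = urllib.request.Request(
        url=base_url.rstrip("/") + "/api/ai",
        data=build_request_body(operation, payload),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call might look like `call_space(base_url, "summarize", {"text": "..."})`, where the `text` key is hypothetical and should be replaced with whatever fields the server actually expects.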
| ## Environment Variables | |
| | Variable | Default | Description | | |
| |----------|---------|-------------| | |
| | `GGUF_REPO` | `unsloth/gemma-4-E4B-it-GGUF` | HF repo containing the GGUF file | | |
| | `GGUF_FILE` | `gemma-4-E4B-it-Q4_K_M.gguf` | Quantization variant to load | | |
| | `N_THREADS` | `2` | CPU threads for inference | | |
| | `N_CTX` | `4096` | Context window size | | |
| | `HF_TOKEN` | (none) | Optional; only needed for gated models | |
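The variables in the table could be read and applied roughly as follows. This is a sketch, not the Space's actual startup code: the `Settings`, `load_settings`, and `load_model` names are hypothetical, and it assumes the GGUF file is fetched with `huggingface_hub.hf_hub_download` and then opened with `llama_cpp.Llama`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    gguf_repo: str
    gguf_file: str
    n_threads: int
    n_ctx: int
    hf_token: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    return Settings(
        gguf_repo=env.get("GGUF_REPO", "unsloth/gemma-4-E4B-it-GGUF"),
        gguf_file=env.get("GGUF_FILE", "gemma-4-E4B-it-Q4_K_M.gguf"),
        n_threads=int(env.get("N_THREADS", "2")),
        n_ctx=int(env.get("N_CTX", "4096")),
        # An unset or empty HF_TOKEN means anonymous access.
        hf_token=env.get("HF_TOKEN") or None,
    )


def load_model(settings: Settings):
    """Download the GGUF file (cached after the first run) and load it on CPU."""
    # Imported lazily so the settings logic can be used without these packages.
    from huggingface_hub import hf_hub_download
    from llama_cpp import Llama

    model_path = hf_hub_download(
        repo_id=settings.gguf_repo,
        filename=settings.gguf_file,
        token=settings.hf_token,
    )
    return Llama(
        model_path=model_path,
        n_ctx=settings.n_ctx,
        n_threads=settings.n_threads,
    )
```

Keeping the parsing in `load_settings` separate from `load_model` means the configuration can be checked without downloading the model file.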
| ## Hardware | |
| `cpu-basic`: 2 vCPU, 16 GB RAM. | |
| Inference takes roughly 20-40 s per call. | |