--- title: Solace Space emoji: 💛 colorFrom: yellow colorTo: blue sdk: docker app_port: 7860 python_version: 3.10 app_file: app.py fullWidth: true header: mini short_description: A local emotional-support companion powered by SolaceLLM. models: - build-small-hackathon/solace-llm-GGUF tags: - gradio - llama-cpp - gguf - mental-health - emotional-support - local-llm - track:backyard - sponsor:openbmb - sponsor:openai - sponsor:modal - achievement:offgrid - achievement:welltuned - achievement:offbrand - achievement:llama - achievement:fieldnotes preload_from_hub: - build-small-hackathon/solace-llm-GGUF --- # Solace Space Solace Space is a local Gradio chat app for an emotional companion experience powered by a GGUF model through `llama-cpp-python`. [More details about Solce Space](https://huggingface.co/blog/build-small-hackathon/solace-space) [X Post](https://x.com/surajjadhav63/status/2066235204017287609?s=20) ## Hugging Face Space This repository is configured as a Gradio Space through the YAML block at the top of this `README.md`. Hugging Face reads that block to decide how the Space is built and displayed. Important Space settings used here: - `sdk: docker`: runs the project as a Docker Space. - `app_file: app.py`: uses `app.py` as the launch entrypoint. - `python_version: 3.10`: asks Hugging Face to run the Space with Python 3.10. - `sdk_version: 5.0.0`: pins the Gradio runtime family used by the Space. - `fullWidth: true`: gives the chat UI enough horizontal room. - `header: mini`: keeps the Hugging Face frame compact so the app feels more immersive. - `models`: declares the GGUF model repository used by the app. - `preload_from_hub`: asks Hugging Face to preload the model repo during build/startup to reduce first-request download time. ## Setup 1. Create and activate a Python environment: ```bash python -m venv .venv source .venv/bin/activate ``` 2. Install dependencies: ```bash pip install -r requirements.txt ``` 3. By default, the app loads the GGUF model from Hugging Face: ```text build-small-hackathon/solace-llm-GGUF ``` You can also point to a local GGUF file: ```bash export SOLACE_MODEL_PATH=/absolute/path/to/your-model.gguf ``` ## Run ```bash python app.py ``` The app launches with `demo.queue().launch()` and prints a local Gradio URL in the terminal. ## Configuration Optional environment variables: - `SOLACE_MODEL_PATH`: optional local GGUF model path. When set, this takes priority over the Hugging Face repo. - `SOLACE_MODEL_REPO`: Hugging Face GGUF repo. Defaults to `build-small-hackathon/solace-llm-GGUF`. - `SOLACE_MODEL_FILE`: GGUF filename or glob inside the repo. Defaults to `*Q4_K_M.gguf`. - `SOLACE_N_CTX`: context window. Defaults to `2048`. - `SOLACE_MAX_TOKENS`: maximum generated tokens per reply. Defaults to `220`. - `SOLACE_TEMPERATURE`: sampling temperature. Defaults to `0.72`. - `SOLACE_TOP_P`: nucleus sampling value. Defaults to `0.92`. - `SOLACE_REPEAT_PENALTY`: llama-cpp repeat penalty. Defaults to `1.15`. - `SOLACE_FREQUENCY_PENALTY`: discourages repeated tokens. Defaults to `0.15`. - `SOLACE_PRESENCE_PENALTY`: lightly encourages new wording. Defaults to `0.05`. - `SOLACE_MAX_REPEATED_SENTENCES`: app-side repeated sentence cutoff. Defaults to `2`; set to `0` to disable. - `SOLACE_HISTORY_TOKEN_BUFFER`: token estimate reserved for prompt formatting and safety margin. Defaults to `128`. Conversation history is trimmed by estimated context budget, not by a fixed message count. Increase `SOLACE_N_CTX` to keep more prior conversation in the model prompt. If llama.cpp prints a line like this: ```text llama_context: n_ctx_seq (2048) < n_ctx_train (131072) -- the full capacity of the model will not be utilized ``` that is informational, not a crash. It means the app is using a smaller context window than the model's training maximum. Increase `SOLACE_N_CTX` if you need more conversation history and have enough RAM/VRAM, for example: ```bash export SOLACE_N_CTX=8192 ``` Using the full `131072` context is usually very memory intensive. ## Safety Note Solace Space is not a medical, therapy, or crisis service. Its system prompt is designed for counseling-style emotional support, coping strategies, and reflection, but it should not be treated as a substitute for professional help. The app includes a hard-coded crisis bypass for self-harm and immediate danger language.