title: Solace Space
emoji: 💛
colorFrom: yellow
colorTo: blue
sdk: docker
app_port: 7860
python_version: 3.1
app_file: app.py
fullWidth: true
header: mini
short_description: A local emotional-support companion powered by SolaceLLM.
models:
- build-small-hackathon/solace-llm-GGUF
tags:
- gradio
- llama-cpp
- gguf
- mental-health
- emotional-support
- local-llm
- track:backyard
- sponsor:openbmb
- sponsor:openai
- sponsor:modal
- achievement:offgrid
- achievement:welltuned
- achievement:offbrand
- achievement:llama
- achievement:fieldnotes
preload_from_hub:
- build-small-hackathon/solace-llm-GGUF
Solace Space
Solace Space is a local Gradio chat app for an emotional companion experience powered by a GGUF model through llama-cpp-python.
More details about Solce Space
Hugging Face Space
This repository is configured as a Gradio Space through the YAML block at the
top of this README.md. Hugging Face reads that block to decide how the Space
is built and displayed.
Important Space settings used here:
sdk: docker: runs the project as a Docker Space.app_file: app.py: usesapp.pyas the launch entrypoint.python_version: 3.10: asks Hugging Face to run the Space with Python 3.10.sdk_version: 5.0.0: pins the Gradio runtime family used by the Space.fullWidth: true: gives the chat UI enough horizontal room.header: mini: keeps the Hugging Face frame compact so the app feels more immersive.models: declares the GGUF model repository used by the app.preload_from_hub: asks Hugging Face to preload the model repo during build/startup to reduce first-request download time.
Setup
- Create and activate a Python environment:
python -m venv .venv
source .venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
- By default, the app loads the GGUF model from Hugging Face:
build-small-hackathon/solace-llm-GGUF
You can also point to a local GGUF file:
export SOLACE_MODEL_PATH=/absolute/path/to/your-model.gguf
Run
python app.py
The app launches with demo.queue().launch() and prints a local Gradio URL in the terminal.
Configuration
Optional environment variables:
SOLACE_MODEL_PATH: optional local GGUF model path. When set, this takes priority over the Hugging Face repo.SOLACE_MODEL_REPO: Hugging Face GGUF repo. Defaults tobuild-small-hackathon/solace-llm-GGUF.SOLACE_MODEL_FILE: GGUF filename or glob inside the repo. Defaults to*Q4_K_M.gguf.SOLACE_N_CTX: context window. Defaults to2048.SOLACE_MAX_TOKENS: maximum generated tokens per reply. Defaults to220.SOLACE_TEMPERATURE: sampling temperature. Defaults to0.72.SOLACE_TOP_P: nucleus sampling value. Defaults to0.92.SOLACE_REPEAT_PENALTY: llama-cpp repeat penalty. Defaults to1.15.SOLACE_FREQUENCY_PENALTY: discourages repeated tokens. Defaults to0.15.SOLACE_PRESENCE_PENALTY: lightly encourages new wording. Defaults to0.05.SOLACE_MAX_REPEATED_SENTENCES: app-side repeated sentence cutoff. Defaults to2; set to0to disable.SOLACE_HISTORY_TOKEN_BUFFER: token estimate reserved for prompt formatting and safety margin. Defaults to128.
Conversation history is trimmed by estimated context budget, not by a fixed
message count. Increase SOLACE_N_CTX to keep more prior conversation in the
model prompt.
If llama.cpp prints a line like this:
llama_context: n_ctx_seq (2048) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
that is informational, not a crash. It means the app is using a smaller context
window than the model's training maximum. Increase SOLACE_N_CTX if you need
more conversation history and have enough RAM/VRAM, for example:
export SOLACE_N_CTX=8192
Using the full 131072 context is usually very memory intensive.
Safety Note
Solace Space is not a medical, therapy, or crisis service. Its system prompt is designed for counseling-style emotional support, coping strategies, and reflection, but it should not be treated as a substitute for professional help. The app includes a hard-coded crisis bypass for self-harm and immediate danger language.