Spaces:

build-small-hackathon
/

solace-space

Running

App Files Files Community

solace-space / README.md

Surajgjadhav

Updated tags in README.md

9368ca2 verified 5 days ago

preview code

Raw

History Blame Contribute Delete

4.43 kB

	---
	title: Solace Space
	emoji: 💛
	colorFrom: yellow
	colorTo: blue
	sdk: docker
	app_port: 7860
	python_version: 3.10
	app_file: app.py
	fullWidth: true
	header: mini
	short_description: A local emotional-support companion powered by SolaceLLM.
	models:
	- build-small-hackathon/solace-llm-GGUF
	tags:
	- gradio
	- llama-cpp
	- gguf
	- mental-health
	- emotional-support
	- local-llm
	- track:backyard
	- sponsor:openbmb
	- sponsor:openai
	- sponsor:modal
	- achievement:offgrid
	- achievement:welltuned
	- achievement:offbrand
	- achievement:llama
	- achievement:fieldnotes
	preload_from_hub:
	- build-small-hackathon/solace-llm-GGUF
	---

	# Solace Space

	Solace Space is a local Gradio chat app for an emotional companion experience powered by a GGUF model through `llama-cpp-python`.

	[More details about Solce Space](https://huggingface.co/blog/build-small-hackathon/solace-space)

	[X Post](https://x.com/surajjadhav63/status/2066235204017287609?s=20)

	## Hugging Face Space

	This repository is configured as a Gradio Space through the YAML block at the
	top of this `README.md`. Hugging Face reads that block to decide how the Space
	is built and displayed.

	Important Space settings used here:

	- `sdk: docker`: runs the project as a Docker Space.
	- `app_file: app.py`: uses `app.py` as the launch entrypoint.
	- `python_version: 3.10`: asks Hugging Face to run the Space with Python 3.10.
	- `sdk_version: 5.0.0`: pins the Gradio runtime family used by the Space.
	- `fullWidth: true`: gives the chat UI enough horizontal room.
	- `header: mini`: keeps the Hugging Face frame compact so the app feels more immersive.
	- `models`: declares the GGUF model repository used by the app.
	- `preload_from_hub`: asks Hugging Face to preload the model repo during build/startup to reduce first-request download time.

	## Setup

	1. Create and activate a Python environment:

	```bash
	python -m venv .venv
	source .venv/bin/activate
	```

	2. Install dependencies:

	```bash
	pip install -r requirements.txt
	```

	3. By default, the app loads the GGUF model from Hugging Face:

	```text
	build-small-hackathon/solace-llm-GGUF
	```

	You can also point to a local GGUF file:

	```bash
	export SOLACE_MODEL_PATH=/absolute/path/to/your-model.gguf
	```

	## Run

	```bash
	python app.py
	```

	The app launches with `demo.queue().launch()` and prints a local Gradio URL in the terminal.

	## Configuration

	Optional environment variables:

	- `SOLACE_MODEL_PATH`: optional local GGUF model path. When set, this takes priority over the Hugging Face repo.
	- `SOLACE_MODEL_REPO`: Hugging Face GGUF repo. Defaults to `build-small-hackathon/solace-llm-GGUF`.
	- `SOLACE_MODEL_FILE`: GGUF filename or glob inside the repo. Defaults to `*Q4_K_M.gguf`.
	- `SOLACE_N_CTX`: context window. Defaults to `2048`.
	- `SOLACE_MAX_TOKENS`: maximum generated tokens per reply. Defaults to `220`.
	- `SOLACE_TEMPERATURE`: sampling temperature. Defaults to `0.72`.
	- `SOLACE_TOP_P`: nucleus sampling value. Defaults to `0.92`.
	- `SOLACE_REPEAT_PENALTY`: llama-cpp repeat penalty. Defaults to `1.15`.
	- `SOLACE_FREQUENCY_PENALTY`: discourages repeated tokens. Defaults to `0.15`.
	- `SOLACE_PRESENCE_PENALTY`: lightly encourages new wording. Defaults to `0.05`.
	- `SOLACE_MAX_REPEATED_SENTENCES`: app-side repeated sentence cutoff. Defaults to `2`; set to `0` to disable.
	- `SOLACE_HISTORY_TOKEN_BUFFER`: token estimate reserved for prompt formatting and safety margin. Defaults to `128`.

	Conversation history is trimmed by estimated context budget, not by a fixed
	message count. Increase `SOLACE_N_CTX` to keep more prior conversation in the
	model prompt.

	If llama.cpp prints a line like this:

	```text
	llama_context: n_ctx_seq (2048) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
	```

	that is informational, not a crash. It means the app is using a smaller context
	window than the model's training maximum. Increase `SOLACE_N_CTX` if you need
	more conversation history and have enough RAM/VRAM, for example:

	```bash
	export SOLACE_N_CTX=8192
	```

	Using the full `131072` context is usually very memory intensive.

	## Safety Note

	Solace Space is not a medical, therapy, or crisis service. Its system prompt is designed for counseling-style emotional support, coping strategies, and reflection, but it should not be treated as a substitute for professional help. The app includes a hard-coded crisis bypass for self-harm and immediate danger language.