---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- creativity
- cross-domain-analogy
- cognitive-architecture
- knowledge-distillation
- qlora
- qwen2
datasets:
- custom
base_model: Qwen/Qwen2.5-1.5B-Instruct
pipeline_tag: text-generation
model-index:
- name: CreativitySLM
  results:
  - task:
      type: text-generation
      name: Creative Reasoning
    metrics:
    - name: Structural Validity
      type: accuracy
      value: 96.1
    - name: Average Latency
      type: latency
      value: 2.38
      unit: seconds
---

# CreativitySLM

**A 1.5B parameter language model fine-tuned to think creatively through cross-domain analogy, constraint violation, and novelty-coherence optimization.**

CreativitySLM is not a general-purpose LLM. It is a specialized model that has learned *creative cognitive patterns* — the structural operations underlying creative ideation — through distillation from a frontier model.

## Key Results

| Metric | Value |
|--------|-------|
| Structural Validity | **96.1%** on held-out test set |
| Average Latency | **2.38s** on A10G GPU |
| End-to-End Pipeline | **11.8s** for full 10-layer creative pipeline |
| Training Data | **764 examples** across 5 sub-tasks |
| Training Time | **2 min 19 sec** on A100-80GB |
| Training Cost | **$11.50 total** |
| Trainable Parameters | **73.9M** (4.57% of 1.62B) |

## What Makes This Different

Standard LLMs treat creativity as an incidental capability. CreativitySLM treats it as a **learnable cognitive pattern**.

The model was trained on 5 structured sub-tasks derived from a 10-layer cognitive architecture:

1. **Domain Detection & Query Generation** — Identify the domain and generate diverse search queries, including deliberately *distant* domains
2. **Pattern Extraction, Abstraction & Analogy** — Extract structural patterns, identify universal principles, generate cross-domain analogies
3. **Constraint Violation** — Identify domain conventions and purposefully invert them
4. **Reasoning & Taste Evaluation** — Score ideas on validity, surprise, familiarity balance, emotional resonance, internal consistency
5. **Creative Expression** — Synthesize insights into compelling natural language with explicit cross-domain attribution
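
For a multi-task setup like this, distilled training examples are commonly stored as chat-style records tagged with their sub-task. The sketch below is illustrative only; the field names (`task`, `messages`) are hypothetical and not the card's actual dataset schema.

```python
# Hypothetical record format for a multi-task distillation dataset.
# Field names ("task", "messages") are illustrative, not the actual schema.
SUB_TASKS = [
    "domain_queries",
    "pattern_abstraction_analogy",
    "constraint_violation",
    "reasoning_taste",
    "creative_expression",
]

example = {
    "task": "constraint_violation",  # one of the 5 sub-tasks above
    "messages": [
        {"role": "system", "content": "You are a creative constraint analyst..."},
        {"role": "user", "content": "List the conventions of urban transit design and invert one."},
        {"role": "assistant", "content": "Convention: fixed routes. Inversion: routes that reshape themselves around demand."},
    ],
}

assert example["task"] in SUB_TASKS
```

Tagging each record with its sub-task lets a single fine-tune learn all five roles while keeping evaluation per-task.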

## The Ten-Layer Architecture

```
[User Prompt]
      |
L10: Input/Output (parse prompt, detect domain)
      |
L1: Data (live retrieval via Tavily API)
      |
L2+L3+L4: Pattern Recognition + Abstraction + Cross-Domain Analogy  [Model Call 1]
      |
L5: Constraint Violation  [Model Call 2]
      |
L6: Novelty Detection (novelty x coherence scoring)
      |
L7+L8: Reasoning + Taste Evaluation  [Model Call 3]
      |   |
      |   (backtrack to L2-L4 if invalid)
      |
L9: Language Expression  [Model Call 4]
      |
[Creative Output]
```
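
The control flow above can be sketched as a short driver loop. The function names, retry limit, and validity check below are illustrative assumptions, not the repository's actual code; real calls would hit the fine-tuned model (the L6 novelty scoring step is omitted for brevity).

```python
# Minimal orchestration sketch of the 10-layer pipeline (stubbed model calls).
def call_model(task: str, payload: str) -> str:
    """Placeholder for a CreativitySLM call on one sub-task."""
    return f"[{task}] {payload}"

def run_pipeline(prompt: str, max_retries: int = 2) -> str:
    domain = call_model("domain_queries", prompt)                       # L10 + L1
    for attempt in range(max_retries + 1):
        analogy = call_model("pattern_abstraction_analogy", domain)     # L2-L4, call 1
        inversion = call_model("constraint_violation", analogy)         # L5,    call 2
        verdict = call_model("reasoning_taste", inversion)              # L7+L8, call 3
        if "invalid" not in verdict:   # backtrack to L2-L4 if rejected
            break
    return call_model("creative_expression", verdict)                   # L9,    call 4

out = run_pipeline("How can music theory inspire new programming languages?")
```

The four model calls map onto the diagram's bracketed annotations; the loop implements the "backtrack to L2-4 if invalid" edge.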

## Example Output

**Prompt**: "How can I build an AI model that replicates the human brain?"

**CreativitySLM produces**: *"The Forest Mind: How Nature's Self-Organization Can Rebuild AI"*

> The model draws an analogy between ecosystem self-organization and neural architecture design. It identifies the convention "fully supervised model training" and proposes its inversion: autonomous self-organizing clusters that emerge from edge-to-edge connectivity, like a forest growing itself rather than being engineered.

> *"Stop trying to engineer the forest, and start letting it engineer itself."*

This demonstrates cross-domain transfer (ecology → AI), purposeful constraint violation (breaking the "design everything" convention), and coherent creative expression.

## Training Details

- **Base Model**: Qwen2.5-1.5B-Instruct
- **Method**: QLoRA (4-bit NF4, rank 64, alpha 128)
- **Target Modules**: All attention (q, k, v, o) + MLP (gate, up, down)
- **Data**: 764 examples distilled from Claude Sonnet across 153 creative prompts spanning 12 domains
- **Split**: 612 train / 76 val / 76 test
- **Epochs**: 3 (cosine LR, peak 2e-4, 10% warmup)
- **Hardware**: Single NVIDIA A100-80GB
- **Training Time**: 2 minutes 19 seconds
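
A minimal configuration sketch matching these settings, using `bitsandbytes` and `peft`. The compute dtype, double quantization, and dropout value are assumptions not stated in the card.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization (QLoRA). Compute dtype and double quant are assumptions.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,          # assumption
    bnb_4bit_compute_dtype=torch.bfloat16,   # assumption
)

# LoRA adapters on all attention and MLP projections, rank 64, alpha 128.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,  # assumption: dropout not stated in the card
    bias="none",
    task_type="CAUSAL_LM",
)
```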

### Training Loss

| Epoch | Train Loss | Eval Loss |
|-------|-----------|-----------|
| 1 | 2.263 | 2.020 |
| 2 | 1.720 | 1.772 |
| 3 | 1.930 | 1.744 |

## Per-Task Performance

| Task | N | Accuracy | Avg Latency |
|------|---|----------|-------------|
| Domain & Queries | 23 | 95.7% | 0.62s |
| Pattern/Abstraction/Analogy | 13 | 84.6% | 2.99s |
| Constraint Violation | 10 | 100% | 2.28s |
| Reasoning & Taste | 13 | 100% | 3.20s |
| Creative Expression | 17 | 100% | 3.74s |
| **Overall** | **76** | **96.1%** | **2.38s** |
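
The **Overall** row is the example-weighted average of the per-task rows, which can be checked directly:

```python
# Verify the Overall row as example-weighted averages of the per-task rows.
tasks = [
    # (n_examples, accuracy_%, avg_latency_s)
    (23, 95.7, 0.62),   # Domain & Queries
    (13, 84.6, 2.99),   # Pattern/Abstraction/Analogy
    (10, 100.0, 2.28),  # Constraint Violation
    (13, 100.0, 3.20),  # Reasoning & Taste
    (17, 100.0, 3.74),  # Creative Expression
]

total = sum(n for n, _, _ in tasks)
overall_acc = sum(n * a for n, a, _ in tasks) / total
overall_lat = sum(n * t for n, _, t in tasks) / total

print(total, round(overall_acc, 1), round(overall_lat, 2))  # 76 96.1 2.38
```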

## What Fine-tuning Teaches

The fine-tuning does **not** add new knowledge. The base Qwen model already knows about ecology, neuroscience, architecture, etc. What the fine-tuning adds is a **cognitive routine**:

1. Seek connections to *distant* domains
2. Extract *structural* relationships, not facts
3. Identify conventions and propose their inversions
4. Score ideas on a multi-dimensional quality metric
5. Express insights with explicit cross-domain attribution
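
Step 4's multi-dimensional scoring could look roughly like the sketch below. The dimension names follow the Reasoning & Taste sub-task (validity, surprise, familiarity balance, emotional resonance, internal consistency), but the equal weighting and the multiplicative novelty-coherence composite are illustrative assumptions, not the model's actual scoring rule.

```python
# Illustrative multi-dimensional taste score; equal weights are an assumption.
def taste_score(scores: dict) -> float:
    """Average the five taste dimensions (each in [0, 1])."""
    dims = ["validity", "surprise", "familiarity_balance",
            "emotional_resonance", "internal_consistency"]
    return sum(scores[d] for d in dims) / len(dims)

def novelty_coherence(novelty: float, coherence: float) -> float:
    """L6-style composite: multiplicative, so both factors must be non-trivial."""
    return novelty * coherence

idea = {"validity": 0.9, "surprise": 0.8, "familiarity_balance": 0.7,
        "emotional_resonance": 0.6, "internal_consistency": 0.9}
print(round(taste_score(idea), 2), round(novelty_coherence(0.8, 0.9), 2))
```

A multiplicative composite penalizes ideas that are novel but incoherent (or coherent but stale) more sharply than an average would.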

We verified this by comparing base Qwen vs. CreativitySLM on identical prompts. The base model produces generic informational responses. The fine-tuned model produces structured cross-domain analogies with novel connections.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "bdeepakreddy/creativity-slm",
    device_map="auto",  # place on GPU if available (requires accelerate)
)
tokenizer = AutoTokenizer.from_pretrained("bdeepakreddy/creativity-slm")

messages = [
    {"role": "system", "content": "You are a creative domain analyst..."},
    {"role": "user", "content": "Analyze this creative prompt: 'How can music theory inspire new programming languages?'"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Sampling must be enabled for temperature to take effect.
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Tech Stack

| Component | Technology |
|-----------|------------|
| Base Model | Qwen2.5-1.5B-Instruct |
| Fine-tuning | QLoRA (bitsandbytes, peft, trl) |
| Training Platform | Modal.com (A100-80GB) |
| Inference | vLLM on Modal.com (A10G) |
| Frontend | Next.js 15 + Tailwind + shadcn/ui |
| Backend | Supabase + Drizzle ORM |
| Search | Tavily API |
| Embeddings | text-embedding-3-large |

## Citation

```bibtex
@misc{bandi2026creativityslm,
  title={Teaching Small Language Models to Think Creatively: A Multi-Task Cognitive Architecture for Cross-Domain Analogy Generation},
  author={Bandi, Deepak},
  year={2026},
  note={University of Waterloo}
}
```

## Paper

The full research paper is available in the `paper/` directory of the repository.

## License

Apache 2.0

## Author

**Deepak Bandi** — University of Waterloo — research@fr1.ai
|