---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- creativity
- cross-domain-analogy
- cognitive-architecture
- knowledge-distillation
- qlora
- qwen2
datasets:
- custom
base_model: Qwen/Qwen2.5-1.5B-Instruct
pipeline_tag: text-generation
model-index:
- name: CreativitySLM
  results:
  - task:
      type: text-generation
      name: Creative Reasoning
    metrics:
    - name: Structural Validity
      type: accuracy
      value: 96.1
    - name: Average Latency
      type: latency
      value: 2.38
      unit: seconds
---

# CreativitySLM

**A 1.5B parameter language model fine-tuned to think creatively through cross-domain analogy, constraint violation, and novelty-coherence optimization.**

CreativitySLM is not a general-purpose LLM. It is a specialized model that has learned *creative cognitive patterns* — the structural operations underlying creative ideation — through distillation from a frontier model.

## Key Results

| Metric | Value |
|--------|-------|
| Structural Validity | **96.1%** on held-out test set |
| Average Latency | **2.38s** on A10G GPU |
| End-to-End Pipeline | **11.8s** for full 10-layer creative pipeline |
| Training Data | **764 examples** across 5 sub-tasks |
| Training Time | **2 min 19 sec** on A100-80GB |
| Training Cost | **$11.50 total** |
| Trainable Parameters | **73.9M** (4.57% of 1.62B) |

## What Makes This Different

Standard LLMs treat creativity as an incidental capability. CreativitySLM treats it as a **learnable cognitive pattern**.

The model was trained on 5 structured sub-tasks derived from a 10-layer cognitive architecture:

1. **Domain Detection & Query Generation** — Identify the domain and generate diverse search queries, including deliberately *distant* domains
2. **Pattern Extraction, Abstraction & Analogy** — Extract structural patterns, identify universal principles, generate cross-domain analogies
3. **Constraint Violation** — Identify domain conventions and purposefully invert them
4. **Reasoning & Taste Evaluation** — Score ideas on validity, surprise, familiarity balance, emotional resonance, internal consistency
5. **Creative Expression** — Synthesize insights into compelling natural language with explicit cross-domain attribution
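
For a multi-task setup like this, distilled training examples are commonly stored as chat-style records tagged with their sub-task. The sketch below is illustrative only; the field names (`task`, `messages`) are hypothetical and not the card's actual dataset schema.

```python
# Hypothetical record format for a multi-task distillation dataset.
# Field names ("task", "messages") are illustrative, not the actual schema.
SUB_TASKS = [
    "domain_queries",
    "pattern_abstraction_analogy",
    "constraint_violation",
    "reasoning_taste",
    "creative_expression",
]

example = {
    "task": "constraint_violation",  # one of the 5 sub-tasks above
    "messages": [
        {"role": "system", "content": "You are a creative constraint analyst..."},
        {"role": "user", "content": "List the conventions of urban transit design and invert one."},
        {"role": "assistant", "content": "Convention: fixed routes. Inversion: routes that reshape themselves around demand."},
    ],
}

assert example["task"] in SUB_TASKS
```

Tagging each record with its sub-task lets a single fine-tune learn all five roles while keeping evaluation per-task.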

## The Ten-Layer Architecture

```
[User Prompt]
      |
L10: Input/Output (parse prompt, detect domain)
      |
L1: Data (live retrieval via Tavily API)
      |
L2+L3+L4: Pattern Recognition + Abstraction + Cross-Domain Analogy  [Model Call 1]
      |
L5: Constraint Violation  [Model Call 2]
      |
L6: Novelty Detection (novelty x coherence scoring)
      |
L7+L8: Reasoning + Taste Evaluation  [Model Call 3]
      |   |
      |   (backtrack to L2-L4 if invalid)
      |
L9: Language Expression  [Model Call 4]
      |
[Creative Output]
```
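
The control flow above can be sketched as a short driver loop. The function names, retry limit, and validity check below are illustrative assumptions, not the repository's actual code; real calls would hit the fine-tuned model (the L6 novelty scoring step is omitted for brevity).

```python
# Minimal orchestration sketch of the 10-layer pipeline (stubbed model calls).
def call_model(task: str, payload: str) -> str:
    """Placeholder for a CreativitySLM call on one sub-task."""
    return f"[{task}] {payload}"

def run_pipeline(prompt: str, max_retries: int = 2) -> str:
    domain = call_model("domain_queries", prompt)                       # L10 + L1
    for attempt in range(max_retries + 1):
        analogy = call_model("pattern_abstraction_analogy", domain)     # L2-L4, call 1
        inversion = call_model("constraint_violation", analogy)         # L5,    call 2
        verdict = call_model("reasoning_taste", inversion)              # L7+L8, call 3
        if "invalid" not in verdict:   # backtrack to L2-L4 if rejected
            break
    return call_model("creative_expression", verdict)                   # L9,    call 4

out = run_pipeline("How can music theory inspire new programming languages?")
```

The four model calls map onto the diagram's bracketed annotations; the loop implements the "backtrack to L2-4 if invalid" edge.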

## Example Output

**Prompt**: "How can I build an AI model that replicates the human brain?"

**CreativitySLM produces**: *"The Forest Mind: How Nature's Self-Organization Can Rebuild AI"*

> The model draws an analogy between ecosystem self-organization and neural architecture design. It identifies the convention "fully supervised model training" and proposes its inversion: autonomous self-organizing clusters that emerge from edge-to-edge connectivity, like a forest growing itself rather than being engineered.

> *"Stop trying to engineer the forest, and start letting it engineer itself."*

This demonstrates cross-domain transfer (ecology → AI), purposeful constraint violation (breaking the "design everything" convention), and coherent creative expression.

## Training Details

- **Base Model**: Qwen2.5-1.5B-Instruct
- **Method**: QLoRA (4-bit NF4, rank 64, alpha 128)
- **Target Modules**: All attention (q, k, v, o) + MLP (gate, up, down)
- **Data**: 764 examples distilled from Claude Sonnet across 153 creative prompts spanning 12 domains
- **Split**: 612 train / 76 val / 76 test
- **Epochs**: 3 (cosine LR, peak 2e-4, 10% warmup)
- **Hardware**: Single NVIDIA A100-80GB
- **Training Time**: 2 minutes 19 seconds
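
A minimal configuration sketch matching these settings, using `bitsandbytes` and `peft`. The compute dtype, double quantization, and dropout value are assumptions not stated in the card.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization (QLoRA). Compute dtype and double quant are assumptions.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,          # assumption
    bnb_4bit_compute_dtype=torch.bfloat16,   # assumption
)

# LoRA adapters on all attention and MLP projections, rank 64, alpha 128.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,  # assumption: dropout not stated in the card
    bias="none",
    task_type="CAUSAL_LM",
)
```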

### Training Loss

| Epoch | Train Loss | Eval Loss |
|-------|-----------|-----------|
| 1 | 2.263 | 2.020 |
| 2 | 1.720 | 1.772 |
| 3 | 1.930 | 1.744 |

## Per-Task Performance

| Task | N | Accuracy | Avg Latency |
|------|---|----------|-------------|
| Domain & Queries | 23 | 95.7% | 0.62s |
| Pattern/Abstraction/Analogy | 13 | 84.6% | 2.99s |
| Constraint Violation | 10 | 100% | 2.28s |
| Reasoning & Taste | 13 | 100% | 3.20s |
| Creative Expression | 17 | 100% | 3.74s |
| **Overall** | **76** | **96.1%** | **2.38s** |
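
The **Overall** row is the example-weighted average of the per-task rows, which can be checked directly:

```python
# Verify the Overall row as example-weighted averages of the per-task rows.
tasks = [
    # (n_examples, accuracy_%, avg_latency_s)
    (23, 95.7, 0.62),   # Domain & Queries
    (13, 84.6, 2.99),   # Pattern/Abstraction/Analogy
    (10, 100.0, 2.28),  # Constraint Violation
    (13, 100.0, 3.20),  # Reasoning & Taste
    (17, 100.0, 3.74),  # Creative Expression
]

total = sum(n for n, _, _ in tasks)
overall_acc = sum(n * a for n, a, _ in tasks) / total
overall_lat = sum(n * t for n, _, t in tasks) / total

print(total, round(overall_acc, 1), round(overall_lat, 2))  # 76 96.1 2.38
```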

## What Fine-tuning Teaches

The fine-tuning does **not** add new knowledge. The base Qwen model already knows about ecology, neuroscience, architecture, etc. What the fine-tuning adds is a **cognitive routine**:

1. Seek connections to *distant* domains
2. Extract *structural* relationships, not facts
3. Identify conventions and propose their inversions
4. Score ideas on a multi-dimensional quality metric
5. Express insights with explicit cross-domain attribution
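
Step 4's multi-dimensional scoring could look roughly like the sketch below. The dimension names follow the Reasoning & Taste sub-task (validity, surprise, familiarity balance, emotional resonance, internal consistency), but the equal weighting and the multiplicative novelty-coherence composite are illustrative assumptions, not the model's actual scoring rule.

```python
# Illustrative multi-dimensional taste score; equal weights are an assumption.
def taste_score(scores: dict) -> float:
    """Average the five taste dimensions (each in [0, 1])."""
    dims = ["validity", "surprise", "familiarity_balance",
            "emotional_resonance", "internal_consistency"]
    return sum(scores[d] for d in dims) / len(dims)

def novelty_coherence(novelty: float, coherence: float) -> float:
    """L6-style composite: multiplicative, so both factors must be non-trivial."""
    return novelty * coherence

idea = {"validity": 0.9, "surprise": 0.8, "familiarity_balance": 0.7,
        "emotional_resonance": 0.6, "internal_consistency": 0.9}
print(round(taste_score(idea), 2), round(novelty_coherence(0.8, 0.9), 2))
```

A multiplicative composite penalizes ideas that are novel but incoherent (or coherent but stale) more sharply than an average would.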

We verified this by comparing base Qwen vs. CreativitySLM on identical prompts. The base model produces generic informational responses. The fine-tuned model produces structured cross-domain analogies with novel connections.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "bdeepakreddy/creativity-slm",
    device_map="auto",  # place on GPU if available (requires accelerate)
)
tokenizer = AutoTokenizer.from_pretrained("bdeepakreddy/creativity-slm")

messages = [
    {"role": "system", "content": "You are a creative domain analyst..."},
    {"role": "user", "content": "Analyze this creative prompt: 'How can music theory inspire new programming languages?'"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Sampling must be enabled for temperature to take effect.
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Tech Stack

| Component | Technology |
|-----------|------------|
| Base Model | Qwen2.5-1.5B-Instruct |
| Fine-tuning | QLoRA (bitsandbytes, peft, trl) |
| Training Platform | Modal.com (A100-80GB) |
| Inference | vLLM on Modal.com (A10G) |
| Frontend | Next.js 15 + Tailwind + shadcn/ui |
| Backend | Supabase + Drizzle ORM |
| Search | Tavily API |
| Embeddings | text-embedding-3-large |

## Citation

```bibtex
@misc{bandi2026creativityslm,
  title={Teaching Small Language Models to Think Creatively: A Multi-Task Cognitive Architecture for Cross-Domain Analogy Generation},
  author={Bandi, Deepak},
  year={2026},
  note={University of Waterloo}
}
```

## Paper

The full research paper is available in the `paper/` directory of the repository.

## License

Apache 2.0

## Author

**Deepak Bandi** — University of Waterloo — research@fr1.ai
|