---
license: mit
datasets:
- cognitivecomputations/dolphin-r1
- GeneralReasoning/GeneralThought-430K
- gustavecortal/PsychologicalReasoning-15k
language:
- en
pipeline_tag: text-generation
base_model:
- Qwen/Qwen3-4B
tags:
- philosophy
- psychology
- reasoning
- social
- emotion
---
## Presentation
Piaget is a language model finetuned on 15k psychological and philosophical reasoning traces.
Piaget is based on Qwen3 and was finetuned on a subset of open reasoning traces from [Dolphin R1](https://huggingface.co/datasets/cognitivecomputations/dolphin-r1) and [General Reasoning](https://huggingface.co/datasets/GeneralReasoning/GeneralThought-430K).
Available sizes are: [0.6B](https://huggingface.co/gustavecortal/Piaget-0.6B), [1.7B](https://huggingface.co/gustavecortal/Piaget-1.7B), [4B](https://huggingface.co/gustavecortal/Piaget-4B), [8B](https://huggingface.co/gustavecortal/Piaget-8B).
## How to use
```py
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch
repo = "gustavecortal/Piaget-4B"
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
repo, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
prompt = tokenizer.apply_chat_template(
[
{
"role": "user",
"content": "Create a new psychotherapeutic technique based on cybernetic principles",
}
],
tokenize=False,
add_generation_prompt=True,
enable_thinking=True,
)
print(pipe(prompt, max_new_tokens=2048, do_sample=True)[0]["generated_text"])
```
## Methodology
We performed domain filtering on [Dolphin R1](https://huggingface.co/datasets/cognitivecomputations/dolphin-r1) and [General Reasoning](https://huggingface.co/datasets/GeneralReasoning/GeneralThought-430K).
Prompts were embedded, clustered with k-means (k = 20,000), and each cluster was assigned a domain label by majority vote using [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B), following the [Intelligent Internet pipeline](https://huggingface.co/Intelligent-Internet/II-Medical-8B-1706).
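The cluster-then-vote step can be sketched as follows. This is a minimal illustration, not the exact pipeline: the prompts, cluster ids, and per-prompt labels below are hypothetical stand-ins for the k-means assignments over prompt embeddings and the Qwen3-1.7B labels.

```py
from collections import Counter, defaultdict

# Toy stand-ins: (prompt, cluster_id, llm_label). In the real pipeline the
# cluster ids come from k-means over prompt embeddings (k = 20,000) and the
# labels from an LLM; both are hypothetical here.
labeled = [
    ("What is self-image?", 0, "psychology"),
    ("Describe attachment styles.", 0, "psychology"),
    ("Explain Kant's categorical imperative.", 1, "philosophy"),
    ("Does free will exist?", 1, "philosophy"),
    ("Prove the quadratic formula.", 2, "math"),
    ("Is consciousness physical?", 1, "philosophy"),
    ("Solve 2x + 3 = 7.", 2, "math"),
    ("Why do we dream?", 0, "neuroscience"),  # outvoted within its cluster
]

# Majority-vote a single domain label per cluster.
votes = defaultdict(Counter)
for _, cluster, label in labeled:
    votes[cluster][label] += 1
cluster_domain = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}

# Retain every prompt whose cluster was voted psychology or philosophy.
kept = [p for p, c, _ in labeled if cluster_domain[c] in {"psychology", "philosophy"}]
print(cluster_domain)  # {0: 'psychology', 1: 'philosophy', 2: 'math'}
print(len(kept))       # 6
```

Voting at the cluster level rather than per prompt means an occasional off-domain prompt (like the neuroscience one above) rides along with its cluster, which is the trade-off of the cluster-majority approach.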
Clusters tagged as psychology or philosophy were retained for LoRA finetuning (rank 8, alpha 16, max sequence length 2048, 1 epoch, batch size 16).
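The stated hyperparameters map onto a Hugging Face `peft` configuration roughly as below; the target module list is an assumption (a typical choice for Qwen-style attention layers), not something the card specifies.

```py
from peft import LoraConfig

# Hypothetical sketch of the stated LoRA setup; target_modules is assumed.
lora_config = LoraConfig(
    r=8,                # rank 8
    lora_alpha=16,      # alpha 16
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
# Training used max sequence length 2048, 1 epoch, batch size 16.
```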
This work was performed using HPC resources (Jean Zay supercomputer) from GENCI-IDRIS (Grant 20XX-AD011014205).
## Inspiration
Piaget aims to reason about psychological and philosophical concepts such as self-image, emotion, and existence.
Piaget was inspired by my position paper on emotion analysis: [Improving Language Models for Emotion Analysis: Insights from Cognitive Science](https://aclanthology.org/2024.cmcl-1.23/).
## Contact
Mail: gustave.cortal@ens-paris-saclay.fr
Website: [gustavecortal.com](https://gustavecortal.com)