---
license: mit
datasets:
- cognitivecomputations/dolphin-r1
- GeneralReasoning/GeneralThought-430K
- gustavecortal/PsychologicalReasoning-15k
language:
- en
pipeline_tag: text-generation
base_model:
- Qwen/Qwen3-8B
tags:
- philosophy
- psychology
- reasoning
- social
- emotion
---
## Presentation
Piaget is a language model finetuned on 15k psychological and philosophical reasoning traces.
Piaget is based on Qwen3 and was finetuned on a subset of open reasoning traces from [Dolphin R1](https://huggingface.co/datasets/cognitivecomputations/dolphin-r1) and [General Reasoning](https://huggingface.co/datasets/GeneralReasoning/GeneralThought-430K).
Available sizes are: [0.6B](https://huggingface.co/gustavecortal/Piaget-0.6B), [1.7B](https://huggingface.co/gustavecortal/Piaget-1.7B), [4B](https://huggingface.co/gustavecortal/Piaget-4B), [8B](https://huggingface.co/gustavecortal/Piaget-8B).
## How to use
```py
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

repo = "gustavecortal/Piaget-8B"

tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = tokenizer.apply_chat_template(
    [
        {
            "role": "user",
            "content": "Create a new psychotherapeutic technique based on cybernetic principles",
        }
    ],
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # emit Qwen3 reasoning (<think>) tokens before the answer
)
print(pipe(prompt, max_new_tokens=2048, do_sample=True)[0]["generated_text"])
```
## Methodology
We performed domain filtering on [Dolphin R1](https://huggingface.co/datasets/cognitivecomputations/dolphin-r1) and [General Reasoning](https://huggingface.co/datasets/GeneralReasoning/GeneralThought-430K).
Prompts were embedded, clustered with k-means (k = 20,000), and each cluster was assigned a domain label by majority vote using [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B), following the [Intelligent Internet pipeline](https://huggingface.co/Intelligent-Internet/II-Medical-8B-1706).
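The cluster-level majority vote can be sketched in a few lines. This is an illustrative reconstruction, not the actual pipeline code: assume each prompt has a cluster id (from k-means over its embedding) and a domain label (from the LLM judge); a cluster is retained when its most frequent label is psychology or philosophy.

```python
from collections import Counter

KEEP = {"psychology", "philosophy"}

def filter_clusters(cluster_ids, domain_labels):
    """Return the set of cluster ids whose majority domain label is retained."""
    votes = {}
    for cid, label in zip(cluster_ids, domain_labels):
        votes.setdefault(cid, Counter())[label] += 1
    return {cid for cid, counts in votes.items()
            if counts.most_common(1)[0][0] in KEEP}

# Toy example: cluster 0 is majority psychology, cluster 1 is code,
# cluster 2 is majority philosophy.
cluster_ids = [0, 0, 0, 1, 1, 2, 2, 2]
domain_labels = ["psychology", "psychology", "math",
                 "code", "code",
                 "philosophy", "philosophy", "biology"]
kept = filter_clusters(cluster_ids, domain_labels)
print(kept)  # {0, 2}
```

Prompts falling in the kept clusters form the 15k-trace finetuning subset.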
Clusters tagged psychology or philosophy were retained for LoRA finetuning (rank=8, alpha=16, max length=2048, epoch=1, batch size=16).
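The LoRA hyperparameters above map onto a standard PEFT configuration; a minimal sketch, assuming the Hugging Face `peft` library (`target_modules` is an assumption, listing the attention projections typically adapted in Qwen-style models, and was not specified above):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# rank=8, alpha=16 as stated in the methodology; other values are illustrative.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B")
model = get_peft_model(base, lora_config)
```

Training then proceeds on the filtered traces with max length 2048, batch size 16, for one epoch.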
This work was performed using HPC resources (Jean Zay supercomputer) from GENCI-IDRIS (Grant 20XX-AD011014205).
## Inspiration
Piaget aims to reason about psychological and philosophical concepts such as self-image, emotion, and existence.
Piaget was inspired by my position paper on emotion analysis: [Improving Language Models for Emotion Analysis: Insights from Cognitive Science](https://aclanthology.org/2024.cmcl-1.23/).
## Contact
Mail: gustave.cortal@ens-paris-saclay.fr
Website: [gustavecortal.com](https://gustavecortal.com)