---
pipeline_tag: text-classification
tags:
- memory
- text-classification
- roberta
- cognitive-nlp
- noetiv
license: mit
library_name: transformers
language:
- en
metrics:
- accuracy
---

### 🧠 About NOETIV

This project is part of the **NOETIV** initiative — a modular AI platform for healthcare professionals.  
🔗 Visit us at [noetiv.com](https://www.noetiv.com)

# 🧠 MemoryBERT

A RoBERTa-based transformer model for **Cognitive Memory Recognition (CMR)** – classifying natural language into six memory categories inspired by cognitive science.

---

## 🧭 Overview

MemoryBERT is fine-tuned to classify user-generated text into:
- **Episodic memory**
- **Semantic memory**
- **Spatial memory**
- **Emotional memory**
- **Associative memory**
- **Non-memory**

This model supports research into memory-type classification, schema formation, and personalized AI interaction systems.

## 🧪 Model Details

- **Base model**: `roberta-base`
- **Task**: Multi-class sequence classification
- **Classes**: 6
- **Max sequence length**: 128 tokens
- **Training epochs**: 1.5
- **Label smoothing**: 0.1
- **Loss function**: CrossEntropyLoss
- **Optimizer**: AdamW
- **Batch size**: 8
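
One detail worth unpacking is the label smoothing. With ε = 0.1, the one-hot target is mixed with a uniform distribution over the 6 classes before the cross-entropy is taken, which discourages over-confident predictions. A minimal pure-Python sketch of the smoothed loss (illustrative only; in practice this is handled by PyTorch's `CrossEntropyLoss`, presumably via its `label_smoothing` argument):

```python
import math

def label_smoothed_ce(logits, true_idx, eps=0.1):
    # numerically stable softmax over the raw logits
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    probs = [e / sum(exps) for e in exps]
    # smoothed target: (1 - eps) on the true class, eps spread uniformly
    loss = 0.0
    for i, p in enumerate(probs):
        q = (1 - eps) * (1.0 if i == true_idx else 0.0) + eps / len(logits)
        loss -= q * math.log(p)
    return loss

# a confident, correct prediction still incurs a small penalty under smoothing
label_smoothed_ce([10.0, 0, 0, 0, 0, 0], true_idx=0)
```

This also explains why a non-trivial eval loss can coexist with perfect classification accuracy: the smoothed target never lets the loss reach zero.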

---

## 📊 Evaluation Results

On a synthetic 400-example test set (roughly 40 examples per memory class, plus 200 non-memory examples):

| Class         | Precision | Recall | F1-score | Support |
|---------------|-----------|--------|----------|---------|
| Associative   | 1.00      | 1.00   | 1.00     | 39      |
| Emotional     | 1.00      | 1.00   | 1.00     | 40      |
| Episodic      | 1.00      | 1.00   | 1.00     | 39      |
| Non-memory    | 1.00      | 1.00   | 1.00     | 200     |
| Semantic      | 1.00      | 1.00   | 1.00     | 40      |
| Spatial       | 1.00      | 1.00   | 1.00     | 42      |

- **Macro F1**: 1.00  
- **Eval loss**: 0.423  
- **Epochs**: 1.5  
- **Accuracy**: 100%
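
For reference, macro F1 gives each class equal weight regardless of support, so the 200-example non-memory class counts the same as each ~40-example memory class. A minimal sketch of the computation:

```python
def f1(precision, recall):
    # harmonic mean of precision and recall; 0 if both are 0
    return 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)

def macro_f1(per_class):
    # per_class: list of (precision, recall) pairs, one per class;
    # macro averaging weights every class equally, regardless of support
    return sum(f1(p, r) for p, r in per_class) / len(per_class)

# the six per-class scores reported in the table above
macro_f1([(1.0, 1.0)] * 6)
```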

> ⚠️ Note: These results are from a synthetic test set. Real-world validation, along with an expansion of the baseline dataset used for version 1 of MemoryBERT, is ongoing.

---

## 🧠 Dataset

MemoryBERT was trained on a synthetic dataset of 4,000 curated examples (2,000 memory and 2,000 non-memory).

Each entry is labeled with one of six memory types and tagged by domain and span group.

---

## 🚀 Usage

```python
import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

model = RobertaForSequenceClassification.from_pretrained("DimitriosPanagoulias/MemoryBERT")
tokenizer = RobertaTokenizer.from_pretrained("DimitriosPanagoulias/MemoryBERT")
model.eval()  # switch off dropout for inference

def predict_memory_type(text: str) -> str:
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)
    predicted_id = outputs.logits.argmax(dim=-1).item()
    return model.config.id2label[predicted_id]

predict_memory_type("Without a map, I navigated the winding back roads to reach my childhood home.")
```
Or via the Hugging Face `pipeline` helper:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline
import torch
device = 0 if torch.cuda.is_available() else -1  # 0 = GPU, -1 = CPU
pipe = pipeline("text-classification", model="DimitriosPanagoulias/MemoryBERT", device=device)
pipe("I remember the long walk to my childhood school.")
```
which outputs:
```text
[{'label': 'episodic', 'score': 0.9272529482841492}]
```
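
The single `score` above is the softmax probability of the top class. If you want the full distribution over all six labels, the logits can be normalized by hand; a minimal pure-Python sketch (the label order below is an assumption for illustration — in practice read it from `model.config.id2label`):

```python
import math

# Assumed id -> label mapping; in practice use model.config.id2label.
ID2LABEL = {0: "associative", 1: "emotional", 2: "episodic",
            3: "non-memory", 4: "semantic", 5: "spatial"}

def logits_to_scores(logits):
    # numerically stable softmax over the raw logits
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return {ID2LABEL[i]: e / z for i, e in enumerate(exps)}

scores = logits_to_scores([-1.2, 0.3, 2.5, -0.8, 0.1, -0.4])
max(scores, key=scores.get)  # the label the pipeline would report
```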

## Authors

- **Dimitrios P. Panagoulias**, Department of Informatics, University of Piraeus  
- **Persephone Papatheodosiou**, Sleep Research Unit, Department of Psychiatry, National and Kapodistrian University of Athens  
- **Anastasios Bonakis**, Second Department of Neurology, National and Kapodistrian University of Athens  
- **Dimitris Dikeos**, Sleep Research Unit, Department of Psychiatry, National and Kapodistrian University of Athens  
- **Maria Virvou**, Lab of Software Engineering, Department of Informatics, University of Piraeus  
- **George A. Tsihrintzis**, Lab of Pattern Recognition and Machine Learning – Multimedia Systems, Department of Informatics, University of Piraeus

## Citation

You can cite either or both of the following related works:

- Panagoulias, D.P., et al. "Memory and Schema in Human–Generative Artificial Intelligence Interactions." 2024 IEEE ICTAI Conference. Available at: https://ieeexplore.ieee.org/document/10849404
- Panagoulias, D.P., et al. "Mathematical representation of memory and schema for improving human-generative AI interactions." 2024 IEEE IISA Conference. Available at: https://ieeexplore.ieee.org/document/10786703