---
pipeline_tag: text-classification
tags:
- memory
- text-classification
- roberta
- cognitive-nlp
- noetiv
license: mit
library_name: transformers
language:
- en
metrics:
- accuracy
---
### 🧠 About NOETIV
This project is part of the **NOETIV** initiative, a modular AI platform for healthcare professionals.
🔗 Visit us at [noetiv.com](https://www.noetiv.com)
# 🧠 MemoryBERT
A RoBERTa-based transformer model for **Cognitive Memory Recognition (CMR)** – classifying natural language into six memory categories inspired by cognitive science.
---
## 🧭 Overview
MemoryBERT is fine-tuned to classify user-generated text into:
- **Episodic memory**
- **Semantic memory**
- **Spatial memory**
- **Emotional memory**
- **Associative memory**
- **Non-memory**
This model supports research into memory-type classification, schema formation, and personalized AI interaction systems.
## 🧪 Model Details
- **Base model**: `roberta-base`
- **Task**: Multi-class sequence classification
- **Classes**: 6
- **Max sequence length**: 128 tokens
- **Training epochs**: 1.5
- **Label smoothing**: 0.1
- **Loss function**: CrossEntropyLoss
- **Optimizer**: AdamW
- **Batch size**: 8
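As a rough illustration, this configuration maps onto Hugging Face `TrainingArguments` along the following lines (a minimal sketch, not the exact training script; the output directory and dataset wiring are assumptions):
```python
# Minimal sketch of the training setup described above (not the exact script).
from transformers import (
    RobertaForSequenceClassification,
    Trainer,
    TrainingArguments,
)

model = RobertaForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=6  # six memory classes
)

training_args = TrainingArguments(
    output_dir="memorybert",        # assumed path
    num_train_epochs=1.5,           # fractional epochs are supported
    per_device_train_batch_size=8,
    label_smoothing_factor=0.1,     # applied inside Trainer's cross-entropy loss
    optim="adamw_torch",            # AdamW optimizer
)

# trainer = Trainer(model=model, args=training_args, train_dataset=...)
# trainer.train()
```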
---
## 📊 Evaluation Results
On a synthetic 400-example test set (200 non-memory examples and roughly 40 per memory class):
| Class | Precision | Recall | F1-score | Support |
|---------------|-----------|--------|----------|---------|
| Associative | 1.00 | 1.00 | 1.00 | 39 |
| Emotional | 1.00 | 1.00 | 1.00 | 40 |
| Episodic | 1.00 | 1.00 | 1.00 | 39 |
| Non-memory | 1.00 | 1.00 | 1.00 | 200 |
| Semantic | 1.00 | 1.00 | 1.00 | 40 |
| Spatial | 1.00 | 1.00 | 1.00 | 42 |
- **Macro F1**: 1.00
- **Eval loss**: 0.423
- **Epochs**: 1.5
- **Accuracy**: 100%
> ⚠️ Note: These results are from a synthetic dataset. Real-world validation is ongoing, along with expansion of the baseline dataset used for version 1 of MemoryBERT.
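For reference, a per-class report like the table above can be generated with scikit-learn's `classification_report` (a minimal sketch; `y_true` and `y_pred` are placeholders for test-set labels and model predictions):
```python
# Sketch: per-class precision/recall/F1 report with scikit-learn.
from sklearn.metrics import classification_report

label_names = ["associative", "emotional", "episodic", "non-memory", "semantic", "spatial"]

# Placeholder label indices; in practice these come from the test set and the model.
y_true = [0, 1, 2, 3, 4, 5]
y_pred = [0, 1, 2, 3, 4, 5]

print(classification_report(y_true, y_pred, target_names=label_names))
```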
---
## 🧠 Dataset
MemoryBERT was trained on a synthetic dataset of 4,000 curated examples (2,000 memory and 2,000 non-memory).
Each entry is labeled with one of six memory types and tagged by domain and span group.
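For illustration, one entry might look like the sketch below (hypothetical; the field names `domain` and `span_group` are assumptions based on the description above, not the released schema):
```python
# Hypothetical dataset entry; field names and values are illustrative only.
example = {
    "text": "I remember the long walk to my childhood school.",
    "label": "episodic",        # one of the six memory types
    "domain": "everyday-life",  # domain tag (assumed value)
    "span_group": "childhood",  # span-group tag (assumed value)
}
```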
---
## 🚀 Usage
```python
import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

model = RobertaForSequenceClassification.from_pretrained("DimitriosPanagoulias/MemoryBERT")
tokenizer = RobertaTokenizer.from_pretrained("DimitriosPanagoulias/MemoryBERT")
model.eval()  # inference mode

def predict_memory_type(text):
    # Tokenize and run a forward pass without tracking gradients
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)
    with torch.no_grad():
        outputs = model(**inputs)
    predicted_id = outputs.logits.argmax(dim=-1).item()
    return model.config.id2label[predicted_id]

predict_memory_type("Without a map, I navigated the winding back roads to reach my childhood home.")
```
Or via the Hugging Face `pipeline` helper:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline
import torch
device = 0 if torch.cuda.is_available() else -1 # 0 = GPU, -1 = CPU
pipe = pipeline("text-classification", model="DimitriosPanagoulias/MemoryBERT", device=device)
pipe("I remember the long walk to my childhood school.")
```
Output:
```python
[{'label': 'episodic', 'score': 0.9272529482841492}]
```
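To get scores for all six classes rather than only the top label, pass `top_k=None` to the pipeline call:
```python
# Return scores for every class instead of only the top prediction.
pipe("I remember the long walk to my childhood school.", top_k=None)
```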
## Authors
- **Dimitrios P. Panagoulias**, Department of Informatics, University of Piraeus
- **Persephone Papatheodosiou**, Sleep Research Unit, Department of Psychiatry, National and Kapodistrian University of Athens
- **Anastasios Bonakis**, Second Department of Neurology, National and Kapodistrian University of Athens
- **Dimitris Dikeos**, Sleep Research Unit, Department of Psychiatry, National and Kapodistrian University of Athens
- **Maria Virvou**, Lab of Software Engineering, Department of Informatics, University of Piraeus
- **George A. Tsihrintzis**, Lab of Pattern Recognition and Machine Learning – Multimedia Systems, Department of Informatics, University of Piraeus
## Citation
You can cite one or both of the following related works:
- Panagoulias, D.P. et al., "Memory and Schema in Human–Generative Artificial Intelligence Interactions,"
  2024 IEEE ICTAI Conference (in press).
  Available at: https://ieeexplore.ieee.org/document/10849404
- Panagoulias, D.P. et al., "Mathematical representation of memory and schema for improving human-generative AI interactions,"
  2024 IEEE IISA Conference (in press).
  Available at: https://ieeexplore.ieee.org/document/10786703