---
pipeline_tag: text-classification
tags:
- memory
- text-classification
- roberta
- cognitive-nlp
- noetiv
license: mit
library_name: transformers
language:
- en
metrics:
- accuracy
---
### 🧠 About NOETIV
This project is part of the **NOETIV** initiative, a modular AI platform for healthcare professionals.
🔗 Visit us at [noetiv.com](https://www.noetiv.com)
# 🧠 MemoryBERT
A RoBERTa-based transformer model for **Cognitive Memory Recognition (CMR)**: classifying natural language into six categories (five memory types plus a non-memory class) inspired by cognitive science.

---
## 🧭 Overview
MemoryBERT is fine-tuned to classify user-generated text into:
- **Episodic memory**
- **Semantic memory**
- **Spatial memory**
- **Emotional memory**
- **Associative memory**
- **Non-memory**
This model supports research into memory-type classification, schema formation, and personalized AI interaction systems.
## 🧪 Model Details
- **Base model**: `roberta-base`
- **Task**: Multi-class sequence classification
- **Classes**: 6
- **Max sequence length**: 128 tokens
- **Training epochs**: 1.5
- **Label smoothing**: 0.1
- **Loss function**: CrossEntropyLoss
- **Optimizer**: AdamW
- **Batch size**: 8
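
To make the label-smoothing setting above concrete, here is a minimal sketch in plain Python of cross-entropy with label smoothing 0.1 over 6 classes. It assumes the common formulation in which every class receives `eps / K` of the target mass and the true class receives the remaining `1 - eps` on top, which is the target distribution used by PyTorch's `CrossEntropyLoss(label_smoothing=...)`. Whether MemoryBERT's training code used exactly this formulation is an assumption, and the function names and the sample prediction are illustrative. With these settings the smallest attainable loss is about 0.421, so a label-smoothed loss does not approach zero even when every prediction is correct. If the evaluation loss reported in this card was computed with the same smoothing (the card does not say), a value near 0.42 would be consistent with confident, correct predictions.

```python
import math

NUM_CLASSES = 6   # from the model details above
EPSILON = 0.1     # label smoothing factor from the model details above


def smoothed_targets(true_idx, num_classes=NUM_CLASSES, eps=EPSILON):
    """Target distribution: eps/K on every class, plus 1 - eps on the true class."""
    targets = [eps / num_classes] * num_classes
    targets[true_idx] += 1.0 - eps
    return targets


def label_smoothed_cross_entropy(probs, true_idx, eps=EPSILON):
    """Cross-entropy between the smoothed targets and the predicted probabilities."""
    targets = smoothed_targets(true_idx, len(probs), eps)
    return -sum(t * math.log(p) for t, p in zip(targets, probs))


# Hypothetical prediction that puts 95% of the mass on the true class (index 0).
probs = [0.95] + [0.01] * 5
loss = label_smoothed_cross_entropy(probs, true_idx=0)

# The loss is smallest when the prediction equals the smoothed target itself.
floor = label_smoothed_cross_entropy(smoothed_targets(0), true_idx=0)
print(f"loss={loss:.3f}, floor={floor:.3f}")
```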
---
## 📊 Evaluation Results
On a synthetic 400-example test set, balanced between memory (200) and non-memory (200) examples:

| Class | Precision | Recall | F1-score | Support |
|---------------|-----------|--------|----------|---------|
| Associative | 1.00 | 1.00 | 1.00 | 39 |
| Emotional | 1.00 | 1.00 | 1.00 | 40 |
| Episodic | 1.00 | 1.00 | 1.00 | 39 |
| Non-memory | 1.00 | 1.00 | 1.00 | 200 |
| Semantic | 1.00 | 1.00 | 1.00 | 40 |
| Spatial | 1.00 | 1.00 | 1.00 | 42 |
- **Macro F1**: 1.00
- **Eval loss**: 0.423
- **Epochs**: 1.5
- **Accuracy**: 100%
> ⚠️ Note: These results come from a synthetic dataset. Real-world validation is ongoing, as is expansion of the baseline dataset used for version 1 of MemoryBERT.
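
As a quick sanity check on the table above, the sketch below recomputes the aggregate figures from the per-class rows: total support, the memory versus non-memory split, and macro F1 as the unweighted mean of the per-class F1 scores. The dictionary simply transcribes the table, and the variable names are illustrative. Because Non-memory accounts for 200 of the 400 examples, macro F1 is the more informative aggregate here: it gives each of the five smaller memory classes the same weight as the majority class.

```python
# Per-class (F1, support) pairs transcribed from the evaluation table above.
results = {
    "Associative": (1.00, 39),
    "Emotional": (1.00, 40),
    "Episodic": (1.00, 39),
    "Non-memory": (1.00, 200),
    "Semantic": (1.00, 40),
    "Spatial": (1.00, 42),
}

total_support = sum(support for _, support in results.values())
memory_support = sum(
    support for name, (_, support) in results.items() if name != "Non-memory"
)

# Macro F1 weights every class equally, regardless of its support.
macro_f1 = sum(f1 for f1, _ in results.values()) / len(results)

# Weighted F1 weights each class by its share of the test set.
weighted_f1 = sum(f1 * support for f1, support in results.values()) / total_support

print(total_support, memory_support, macro_f1, weighted_f1)
```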
---
## 🧠 Dataset
MemoryBERT was trained on a synthetic dataset of 4,000 curated examples (2,000 memory and 2,000 non-memory).
Each entry is labeled with one of the six classes (five memory types or non-memory) and tagged by domain and span group.

---
## 🚀 Usage
```python
import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = RobertaForSequenceClassification.from_pretrained("DimitriosPanagoulias/MemoryBERT")
tokenizer = RobertaTokenizer.from_pretrained("DimitriosPanagoulias/MemoryBERT")

def predict_memory_type(text):
    # Tokenize, truncating to the model's 128-token maximum sequence length
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)
    with torch.no_grad():  # gradients are not needed at inference time
        outputs = model(**inputs)
    predicted_id = outputs.logits.argmax(dim=-1).item()
    return model.config.id2label[predicted_id]

print(predict_memory_type("Without a map, I navigated the winding back roads to reach my childhood home."))
```
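
The function above returns only the top label. When the confidence behind that label matters, the logits can be converted to probabilities with a softmax. The sketch below does this in plain Python so it runs without downloading the model. The logits and the `id2label` mapping are made-up placeholders: the real mapping lives in `model.config.id2label`, and real logits would come from something like `outputs.logits[0].tolist()`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(logits, id2label):
    """Return (label, probability) pairs sorted from most to least likely."""
    probs = softmax(logits)
    pairs = [(id2label[i], p) for i, p in enumerate(probs)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Placeholder mapping and logits for illustration only; check
# model.config.id2label for the actual label names and order.
id2label = {0: "associative", 1: "emotional", 2: "episodic",
            3: "non-memory", 4: "semantic", 5: "spatial"}
logits = [0.2, -1.0, 0.5, -2.0, 0.1, 3.5]

for label, prob in rank_labels(logits, id2label):
    print(f"{label:12s} {prob:.3f}")
```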
Alternatively, use the Hugging Face `pipeline` helper:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline
import torch
device = 0 if torch.cuda.is_available() else -1 # 0 = GPU, -1 = CPU
pipe = pipeline("text-classification", model="DimitriosPanagoulias/MemoryBERT", device=device)
pipe("I remember the long walk to my childhood school.")
```
Example output:
```text
[{'label': 'episodic', 'score': 0.9272529482841492}]
```
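
The pipeline returns a list of dictionaries with `label` and `score` keys, one per input. One possible way to consume that output, sketched below, is to accept a prediction only when its score clears a threshold and to flag it for review otherwise. The threshold of 0.8, the `"uncertain"` tag, the helper name, and the second sample prediction are assumptions for illustration and are not part of MemoryBERT.

```python
def accept_confident(predictions, threshold=0.8):
    """Keep a label only if its score reaches the threshold, else 'uncertain'."""
    return [
        pred["label"] if pred["score"] >= threshold else "uncertain"
        for pred in predictions
    ]


# The first entry is the pipeline output shown above; the second is a
# made-up low-confidence prediction added for illustration.
predictions = [
    {"label": "episodic", "score": 0.9272529482841492},
    {"label": "spatial", "score": 0.41},
]

print(accept_confident(predictions))
```

A suitable threshold depends on the application and would need to be tuned on real data.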
## Authors
- **Dimitrios P. Panagoulias**, Department of Informatics, University of Piraeus
- **Persephone Papatheodosiou**, Sleep Research Unit, Department of Psychiatry, National and Kapodistrian University of Athens
- **Anastasios Bonakis**, Second Department of Neurology, National and Kapodistrian University of Athens
- **Dimitris Dikeos**, Sleep Research Unit, Department of Psychiatry, National and Kapodistrian University of Athens
- **Maria Virvou**, Lab of Software Engineering, Department of Informatics, University of Piraeus
- **George A. Tsihrintzis**, Lab of Pattern Recognition and Machine Learning – Multimedia Systems, Department of Informatics, University of Piraeus
## Citation
You can cite either or both of the following related works:
- Panagoulias, D.P. et al. “Memory and Schema in Human–Generative Artificial Intelligence Interactions.”
  2024 IEEE ICTAI Conference (in press).
  Available at: https://ieeexplore.ieee.org/document/10849404
- Panagoulias, D.P. et al. “Mathematical representation of memory and schema for improving human-generative AI interactions.”
  2024 IEEE IISA Conference (in press).
  Available at: https://ieeexplore.ieee.org/document/10786703