Lettuce-512D-v2 (Mobile Optimized)
This is a mobile-first sentence-transformers model designed for Memory Systems and Roleplay RAG applications.
It maps sentences and paragraphs to a 512-dimensional dense vector space. Unlike standard mobile models, it has been surgically altered to support 4096 tokens of context and fine-tuned specifically for narrative flow and character interactions.
Model Details
Key Specifications
- Model Type: Sentence Transformer (Distilled & Fine-tuned)
- Base Architecture: all-MiniLM-L6-v2 (BERT) with custom surgery.
- Maximum Sequence Length: 4096 tokens (expanded from 512).
- Output Dimensionality: 512 (Projected via Dense Layer).
- Size: ~23 MB (Int8 Quantized ONNX).
- Latency: ~1.3ms on CPU (Short text).
- Similarity Function: Cosine Similarity.
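The ~23 MB footprint comes from int8 quantization of the float32 weights (roughly a 4x reduction over float32). A generic symmetric-quantization sketch to illustrate the idea; this is not the actual ONNX export pipeline, and `quantize_int8` is a hypothetical helper:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w is approximated by scale * q."""
    scale = np.abs(w).max() / 127.0            # one step of the int8 grid
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

w = np.linspace(-1.0, 1.0, 8).astype(np.float32)   # toy weight tensor
q, scale = quantize_int8(w)
w_hat = q.astype(np.float32) * scale               # dequantized approximation
assert np.max(np.abs(w - w_hat)) < scale           # error bounded by one grid step
```

Each float32 weight (4 bytes) becomes one int8 value (1 byte) plus a shared scale, which is where the size reduction comes from.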
Why this model?
Standard mobile models (like MiniLM) fall short in two ways: context (limited to 512 tokens) and narrative logic (they favor keyword matching over storytelling tone).
Lettuce-v2 solves this via a 4-step engineering process:
- Surgery: Expanded absolute position embeddings to 4096 using Copy-Repeat initialization.
- Projection: Added a trainable Dense Layer to project 384d -> 512d.
- Distillation: Knowledge distilled from BAAI/bge-m3 (State-of-the-Art) to retain general semantic logic.
- Fine-Tuning: Trained on the Augmental (Steins;Gate) dataset mixed with NLI data to prioritize Narrative/Roleplay semantic alignment.
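The surgery step above can be sketched as follows. This is an illustrative reconstruction of Copy-Repeat initialization (repeat the 512 learned position vectors until a 4096-slot table is filled), not the exact script used; `expand_position_embeddings` is a hypothetical helper:

```python
def expand_position_embeddings(table, new_len):
    """Copy-Repeat init: repeat the learned position rows until new_len slots are filled."""
    expanded = []
    while len(expanded) < new_len:
        expanded.extend(list(row) for row in table)
    return expanded[:new_len]

# Toy example: a 4-position table expanded to 10 positions.
old = [[0.1], [0.2], [0.3], [0.4]]
new = expand_position_embeddings(old, 10)
print(len(new))          # 10
print(new[4] == old[0])  # True: positions 4-7 copy positions 0-3
```

In the real model the same tiling would be applied to the 512 x 384 BERT position-embedding matrix, after which further training lets the repeated rows differentiate.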
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 4096, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, ...})
(2): Dense({'in_features': 384, 'out_features': 512, 'bias': True, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
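Modules (1) and (2) above can be sketched in a few lines: mean pooling averages the token vectors while skipping padding, and the Dense layer applies a plain linear map from 384 to 512 dimensions (identity activation). A minimal numpy illustration with random stand-in weights, not the library implementation:

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padded positions (pooling_mode_mean_tokens)."""
    mask = np.asarray(attention_mask, dtype=np.float32)[:, None]  # (seq_len, 1)
    return (token_embeddings * mask).sum(axis=0) / mask.sum()

def dense_project(x, weight, bias):
    """Dense module: 384 -> 512 linear projection with identity activation."""
    return x @ weight + bias

rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 384)).astype(np.float32)  # 5 token vectors from the transformer
mask = [1, 1, 1, 0, 0]                                 # last two tokens are padding
pooled = mean_pool(tokens, mask)                       # shape (384,)
weight = rng.normal(size=(384, 512)).astype(np.float32)
bias = np.zeros(512, dtype=np.float32)
sentence_embedding = dense_project(pooled, weight, bias)
print(sentence_embedding.shape)  # (512,)
```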
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Zeolit/lettuce-emb-512d-v2")
# Run inference on Narrative/RP text
sentences = [
'"You raised a flag."',
'*I take a deep breath, my mind racing with possibilities...* "Leap to an even earlier time."',
'The quick brown fox jumps over the lazy dog.'
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 512)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
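In a Memory/RAG pipeline, the typical retrieval step ranks stored memory embeddings by cosine similarity against a query embedding and keeps the top hits. A minimal sketch over precomputed toy vectors (the actual `model.encode` call is omitted, and `top_k` is a hypothetical helper):

```python
import math

def cosine(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query_vec, memory_vecs, k=2):
    """Return indices of the k stored vectors most similar to the query."""
    scores = [(cosine(query_vec, v), i) for i, v in enumerate(memory_vecs)]
    return [i for _, i in sorted(scores, reverse=True)[:k]]

memories = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]  # stand-ins for stored embeddings
print(top_k([1.0, 0.1], memories, k=2))          # [0, 1]
```

With the real model, the vectors would simply be the 512-dimensional outputs of `model.encode`.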
Training Details
Training Dataset
The model was trained on a hybrid dataset to balance Narrative Style with General Logic:
- Primary: Heralax/Augmental-Dataset (Visual Novel / Roleplay logs).
- Foundation: sentence-transformers/all-nli (logic & reasoning anchors).
- Total Size: 18,798 training samples.
Training Logs
Training loss converged smoothly; the run was stopped at epoch 2, the point of diminishing returns, to prevent overfitting.
| Epoch | Step | Training Loss |
|---|---|---|
| 0.1064 | 500 | 0.0021 |
| 0.2128 | 1000 | 0.0016 |
| 0.3191 | 1500 | 0.0012 |
| 0.4255 | 2000 | 0.0011 |
| 0.5319 | 2500 | 0.0010 |
| 0.6383 | 3000 | 0.0009 |
| 0.7447 | 3500 | 0.0010 |
| 0.8511 | 4000 | 0.0009 |
| 0.9574 | 4500 | 0.0009 |
| 1.0638 | 5000 | 0.0006 |
| 1.1702 | 5500 | 0.0005 |
| 1.2766 | 6000 | 0.0005 |
| 1.3830 | 6500 | 0.0006 |
| 1.4894 | 7000 | 0.0006 |
| 1.5957 | 7500 | 0.0005 |
| 1.7021 | 8000 | 0.0005 |
| 1.8085 | 8500 | 0.0005 |
| 1.9149 | 9000 | 0.0005 |
Training Hyperparameters
- per_device_train_batch_size: 4 (gradient accumulation simulated larger batches)
- num_train_epochs: 2
- fp16: True
- loss: CosineSimilarityLoss
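CosineSimilarityLoss trains the model so that the cosine of an embedding pair matches a gold similarity score, via mean squared error. A single-pair sketch of what the loss computes (illustrative, not the sentence-transformers source):

```python
import math

def cosine(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def cosine_similarity_loss(emb_a, emb_b, label):
    """Squared error between cos(u, v) and the gold similarity score in [0, 1]."""
    return (cosine(emb_a, emb_b) - label) ** 2

print(cosine_similarity_loss([1.0, 0.0], [1.0, 0.0], 1.0))  # 0.0: perfect pair, no loss
```

Minimizing this pushes embeddings of pairs labeled similar toward each other and pairs labeled dissimilar apart.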
Citation
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Model tree for Zeolit/lettuce-emb-512d-v2
- Base model: sentence-transformers/all-MiniLM-L6-v2