Lettuce-512D-v2 (Mobile Optimized)

This is a mobile-first sentence-transformers model designed for Memory Systems and Roleplay RAG applications.

It maps sentences & paragraphs to a 512-dimensional dense vector space. Unlike standard models, this model has been surgically altered to support 4096 tokens of context and fine-tuned specifically for narrative flow and character interactions.

Model Details

Key Specifications

  • Model Type: Sentence Transformer (Distilled & Fine-tuned)
  • Base Architecture: all-MiniLM-L6-v2 (BERT) with custom surgery.
  • Maximum Sequence Length: 4096 tokens (Expanded from 512).
  • Output Dimensionality: 512 (Projected via Dense Layer).
  • Size: ~23 MB (Int8 Quantized ONNX).
  • Latency: ~1.3ms on CPU (Short text).
  • Similarity Function: Cosine Similarity.

Why this model?

Standard mobile models (like MiniLM) fail at two things: Context (limited to 512 tokens) and Narrative Logic (they match keywords rather than follow narrative flow).

Lettuce-v2 solves this via a 4-step engineering process:

  1. Surgery: Expanded absolute position embeddings to 4096 using Copy-Repeat initialization.
  2. Projection: Added a trainable Dense Layer to project 384d -> 512d.
  3. Distillation: Knowledge distilled from BAAI/bge-m3 (State-of-the-Art) to retain general semantic logic.
  4. Fine-Tuning: Trained on the Augmental (Steins;Gate) dataset mixed with NLI data to prioritize Narrative/Roleplay semantic alignment.
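Steps 1 and 3 above can be sketched in PyTorch. This is an illustrative sketch, not the actual training code: the function name, tensor shapes, and the teacher stand-in are assumptions based on the dimensions stated in this card (512 -> 4096 positions, 512-d outputs).

```python
import torch

def copy_repeat_expand(pos_emb: torch.Tensor, new_len: int) -> torch.Tensor:
    """Step 1 (Surgery): tile the learned position-embedding table out to new_len rows."""
    old_len, dim = pos_emb.shape
    repeats = -(-new_len // old_len)              # ceiling division
    return pos_emb.repeat(repeats, 1)[:new_len]

old_pos = torch.randn(512, 384)                   # stand-in for the original 512-row table
new_pos = copy_repeat_expand(old_pos, 4096)
print(new_pos.shape)                              # torch.Size([4096, 384])

# Step 3 (Distillation): pull student embeddings toward the teacher's (MSE sketch;
# the teacher tensor is a stand-in for bge-m3 targets, not real data)
student_emb = torch.randn(8, 512, requires_grad=True)
teacher_emb = torch.randn(8, 512)
distill_loss = torch.nn.functional.mse_loss(student_emb, teacher_emb)
distill_loss.backward()
```

With Copy-Repeat initialization, every block of 512 positions starts from the same learned table, so rows 512-1023 are an exact copy of rows 0-511 before fine-tuning adjusts them.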

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 4096, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, ...})
  (2): Dense({'in_features': 384, 'out_features': 512, 'bias': True, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
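Functionally, modules (1) and (2) above reduce to mean pooling over non-padding tokens followed by a 384 -> 512 linear projection with identity activation. A minimal pure-torch sketch, where the BertModel token outputs are replaced by a random stand-in tensor:

```python
import torch

# Stand-in for BertModel token outputs: batch=2, seq_len=10, hidden=384
token_embeddings = torch.randn(2, 10, 384)
attention_mask = torch.ones(2, 10)
attention_mask[1, 6:] = 0             # second sequence is only 6 tokens long

# Module (1): mean pooling over non-padding tokens only
mask = attention_mask.unsqueeze(-1)
pooled = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

# Module (2): Dense 384 -> 512, bias=True, identity activation
dense = torch.nn.Linear(384, 512, bias=True)
embeddings = dense(pooled)
print(embeddings.shape)               # torch.Size([2, 512])
```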

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Zeolit/lettuce-emb-512d-v2")

# Run inference on Narrative/RP text
sentences = [
    '"You raised a flag."',
    '*I take a deep breath, my mind racing with possibilities...* "Leap to an even earlier time."',
    'The quick brown fox jumps over the lazy dog.'
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 512]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
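In a Memory System / RAG pipeline, retrieval with this model reduces to ranking stored memory embeddings by cosine similarity against the query embedding. A minimal NumPy sketch of that ranking step, using random 512-d vectors as stand-ins for `model.encode(...)` output:

```python
import numpy as np

def cosine_top_k(query: np.ndarray, memory: np.ndarray, k: int = 2) -> np.ndarray:
    """Return indices of the k memory vectors most similar to the query."""
    q = query / np.linalg.norm(query)
    m = memory / np.linalg.norm(memory, axis=1, keepdims=True)
    scores = m @ q
    return np.argsort(-scores)[:k]

# Stand-ins for encoded memory entries (5 entries x 512 dims)
rng = np.random.default_rng(0)
memory = rng.normal(size=(5, 512))
query = memory[3] + 0.01 * rng.normal(size=512)   # near-duplicate of entry 3

print(cosine_top_k(query, memory))                # entry 3 ranks first
```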

Training Details

Training Dataset

The model was trained on a hybrid dataset to balance Narrative Style with General Logic:

  • Primary: Heralax/Augmental-Dataset (Visual Novel / Roleplay logs).
  • Foundation: sentence-transformers/all-nli (Logic & Reasoning anchors).
  • Total Size: 18,798 training samples.

Training Logs

Training converged smoothly: the loss plateaus around 0.0005 after the first epoch, so training was stopped at epoch 2 to prevent overfitting.

Epoch Step Training Loss
0.1064 500 0.0021
0.2128 1000 0.0016
0.3191 1500 0.0012
0.4255 2000 0.0011
0.5319 2500 0.0010
0.6383 3000 0.0009
0.7447 3500 0.0010
0.8511 4000 0.0009
0.9574 4500 0.0009
1.0638 5000 0.0006
1.1702 5500 0.0005
1.2766 6000 0.0005
1.3830 6500 0.0006
1.4894 7000 0.0006
1.5957 7500 0.0005
1.7021 8000 0.0005
1.8085 8500 0.0005
1.9149 9000 0.0005

Training Hyperparameters

  • per_device_train_batch_size: 4 (with gradient accumulation to simulate a larger effective batch)
  • num_train_epochs: 2
  • fp16: True
  • loss: CosineSimilarityLoss
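The gradient-accumulation trick mentioned above can be verified in a few lines: summing backward passes over N micro-batches, each with the loss divided by N, yields the same gradients as one pass over the full batch. The accumulation factor of 4 below is an assumed example; the card does not state the actual factor.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(512, 1)
accum_steps = 4                                   # assumed: 4 micro-batches of 4 = effective batch 16
micro_batches = [torch.randn(4, 512) for _ in range(accum_steps)]

# Accumulate: backward() adds into .grad until the optimizer step
model.zero_grad()
for x in micro_batches:
    loss = model(x).pow(2).mean() / accum_steps   # divide so gradients average, not sum
    loss.backward()
accum_grad = model.weight.grad.clone()

# Reference: a single backward pass over the full batch of 16
model.zero_grad()
model(torch.cat(micro_batches)).pow(2).mean().backward()
print(torch.allclose(accum_grad, model.weight.grad, atol=1e-5))   # True
```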

Citation

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}