OLIFANT EduFineweb Chatbot

OLIFANT (Memory-Based Language Model) is a CPU-based, fully explainable language model that replaces neural networks with memory-based learning. Every prediction can be traced back to specific training examples, providing complete transparency.

Model Description

This model is trained on EduFineweb (high-quality educational web text) combined with chatbot instruction data, enabling conversational text generation with full explainability. Three model sizes are available. All models are based on the TiMBL memory-based learning engine and use IGTree as their classifier; IGTree is TiMBL's fast decision-tree approximation of k-nearest-neighbor classification. All three models use the GPT-2 tokenizer.

1. XS model, edufineweb_chatbot_71M.l4r0.igtree.ibase

| Feature | Value |
|---|---|
| Context Window | 4 tokens |
| Training Data | EduFineweb shard 1, first 50M tokens + Chatbot/Instruct Data (~21M tokens) |
| Model Size (file) | ~1.4 GB |

2. S model, edufineweb_chatbot_121M.l4r0.igtree.ibase

| Feature | Value |
|---|---|
| Context Window | 4 tokens |
| Training Data | EduFineweb shard 1, 100M tokens + Chatbot/Instruct Data (~21M tokens) |
| Model Size (file) | ~2.4 GB |

3. M model, edufineweb_train_1-3_chatbot_tok.l16r0.igtree.ibase

| Feature | Value |
|---|---|
| Context Window | 16 tokens |
| Training Data | EduFineweb shards 1-3, 300M tokens + Chatbot/Instruct Data (~21M tokens) |
| Model Size (file) | ~8.1 GB |
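The `l4r0`/`l16r0` suffixes reflect TiMBL's windowing scheme: each training instance consists of N left-context tokens as features and the next token as the class label. The helper below is a hypothetical illustration of this windowing, not the actual training pipeline:

```python
def make_instances(tokens, left=4, pad="_"):
    """Convert a token sequence into TiMBL-style l{left}r0 instances:
    `left` context features followed by the next token as the class."""
    padded = [pad] * left + tokens
    instances = []
    for i in range(left, len(padded)):
        context = padded[i - left:i]   # the feature window
        target = padded[i]             # the class label to predict
        instances.append((context, target))
    return instances

# Example: windowing a short sentence (l4r0)
instances = make_instances(["The", "capital", "of", "France", "is", "Paris"])
# First instance: (["_", "_", "_", "_"], "The")
```

Every token in the corpus yields one instance, so the number of instances equals the number of training tokens.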

Key Features

  • 🔍 Full Explainability: Every prediction includes references to specific training examples with similarity scores
  • 🌱 Eco-Friendly: ~1,000x lower CO2 emissions than neural LLMs, thanks to CPU-only training and inference
  • 📋 Regulatory Compliance: Complete audit trail for healthcare, finance, and legal applications
  • 💻 No GPU Required: Runs on standard CPUs with ~8-10 GB RAM

Intended Use

  • Conversational AI with explainable outputs
  • Regulated industries requiring decision audit trails
  • Edge computing and resource-constrained environments
  • Green AI applications prioritizing sustainability
  • Research into interpretable language models

How to Use

With the Gradio Demo

Try the interactive demo: antalvdb/olifant-generate

Programmatic Usage

from transformers import GPT2Tokenizer
import timbl

# Load tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Load OLIFANT model
classifier = timbl.TimblClassifier(
    "olifant",
    "-a1 +D +vdb+di"  # -a1 = IGTree; +D/+vdb/+di report class distributions and distances
)
classifier.load("edufineweb_train_1-3_chatbot_tok.l16r0.igtree.ibase")

# Prepare context (16 tokens, underscore-padded)
prompt = "The capital of France is"
tokens = tokenizer.tokenize(prompt)
context = ["_"] * (16 - len(tokens)) + tokens[-16:]

# Predict next token
result = classifier.classify(context)
predicted_token = result[0]
print(f"Predicted: {tokenizer.convert_tokens_to_string([predicted_token])}")
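To generate longer completions, the single-token prediction above can be wrapped in a greedy loop. This is a minimal sketch: it assumes a `classify` callable (such as `classifier.classify` from the example above) that takes a list of `window` features and returns a tuple whose first element is the predicted label; sampling and stopping criteria are left out:

```python
def greedy_generate(classify, tokens, n_new=20, window=16, pad="_"):
    """Greedily extend `tokens` by repeatedly predicting the next token.

    classify: callable taking a list of `window` features, returning
              a tuple whose first element is the predicted class label.
    """
    out = list(tokens)
    for _ in range(n_new):
        # Underscore-pad on the left, then keep the last `window` tokens
        context = [pad] * max(0, window - len(out)) + out[-window:]
        result = classify(context)
        out.append(result[0])
    return out
```

With the model loaded as above, usage would look like `generated = greedy_generate(classifier.classify, tokenizer.tokenize(prompt))`, followed by `tokenizer.convert_tokens_to_string(generated)`.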

Training Data

  • EduFineweb: High-quality educational web text (shards 1-3)
  • Chatbot Instructions: Conversational prompt-response pairs
  • Total: ~71M (XS) to ~321M (M) tokens per model, indexed in a prefix trie structure
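Storing instances in a prefix trie means contexts that share a prefix share nodes, which keeps the index compact and lets prediction back off to shorter matched contexts. The sketch below illustrates the idea only; it is not TiMBL's actual IGTree implementation, in which features are additionally ordered by information gain before the tree is built:

```python
from collections import defaultdict

class TrieNode:
    def __init__(self):
        self.children = {}
        self.class_counts = defaultdict(int)  # next-token frequencies at this node

def insert(root, context, target):
    """Store one (context, next-token) instance; shared prefixes share nodes."""
    node = root
    node.class_counts[target] += 1
    for feat in context:
        node = node.children.setdefault(feat, TrieNode())
        node.class_counts[target] += 1

def predict(root, context):
    """Walk as deep as the context matches, then return the majority class
    at the deepest matched node (a simple back-off)."""
    node = root
    for feat in context:
        if feat not in node.children:
            break
        node = node.children[feat]
    return max(node.class_counts, key=node.class_counts.get)
```

Because every prediction ends at a concrete node built from specific training instances, the supporting examples can be read off directly, which is the source of the model's explainability.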

Performance

| Metric | Value |
|---|---|
| Inference Speed | 10-50 tokens/sec (CPU) |
| RAM Required | ~8-10 GB |
| Accuracy | Approaching GPT-2 level |
| Best Use | Short-form completions (20-50 tokens) |

Limitations

  • Context window: 4-16 tokens (considerably shorter than modern neural LLMs)
  • Creativity: Memory-based retrieval limits novel generation, stays close to training data
  • Optimal for: Factual completions, recitations and structured responses
  • Dependencies: Requires TiMBL system package for training

Environmental Impact

OLIFANT achieves a roughly 1,000x lower carbon footprint than GPU-based neural language models:

  • No GPU required for training or inference
  • Efficient prefix trie storage
  • Minimal compute requirements

Citation

@article{vandenbosch2025olifant,
  title={Memory-based Language Models: An Efficient, Explainable, and Eco-friendly Approach to Large Language Modeling},
  author={Van den Bosch, Antal and Risco Pat{\'o}n, Alejandro and Buijse, Thom and Berck, Peter and Van Gompel, Maarten},
  journal={arXiv preprint arXiv:2510.22317},
  year={2025}
}

License

GPL-3.0
