Smriti AI

What this is

Smriti AI is a memory-augmented inference layer for small language models. It adds external memory, semantic retrieval, knowledge-graph recall, identity continuity, and privacy-ready memory deletion without changing base model weights.

This repository layout is intended for a Hugging Face model-style deployment with a custom handler.py. The handler loads a base causal language model or calls a remote model endpoint, wraps it with Smriti AI memory, and returns model responses plus retrieved memories.

This model-card template targets Smriti AI v1.0.9. The companion public benchmark dataset is luciferai-devil/smriti-ai-benchmarks, and the CPU-safe demo Space target is luciferai-devil/smriti-ai-demo.

Discovery keywords

Smriti AI is designed for people searching for Gemma memory, Qwen memory, small model memory, agent memory, external memory, long-term memory, semantic recall, graph recall, and training-free memory augmentation.

What this is not

Smriti AI is not a newly trained foundation model. It is not a fine-tuned model unless a separate fine-tuned checkpoint is explicitly included. It is an inference-time wrapper around a base language model.

Do not interpret this repository as a standalone model checkpoint or a Gemma/Qwen release checkpoint. Use the original base-model repositories when you need the base checkpoint itself. The base model is configured through BASE_MODEL_ID or HF_ENDPOINT_URL.

Research Lineage

Smriti AI follows four principles:

  • External memory: conversational facts live outside model weights in a persistent, inspectable store.
  • Training-free recall: relevant facts are retrieved and injected at inference time without fine-tuning the base model.
  • Identity continuity: persona evidence is tracked as an embedding fingerprint so outputs can be checked for drift.
  • Small-model augmentation: small causal language models can become more useful when paired with explicit memory and retrieval.
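The training-free recall principle above can be sketched in a few lines: stored facts are scored against the incoming message and the best matches are prepended to the prompt, so the base model weights never change. This is an illustrative sketch only; the word-overlap `score` below is a deliberately simple stand-in for Smriti AI's actual TF-IDF/semantic/graph retrievers, and the function names are hypothetical.

```python
# Sketch of training-free recall: score stored facts against the incoming
# message and inject the top matches into the prompt. Word overlap is a
# stand-in for TF-IDF or embedding similarity.
import re

def score(fact: str, message: str) -> float:
    """Jaccard word overlap between a stored fact and the message."""
    f = set(re.findall(r"\w+", fact.lower()))
    m = set(re.findall(r"\w+", message.lower()))
    return len(f & m) / max(len(f | m), 1)

def build_prompt(memories: list[str], message: str, top_k: int = 2) -> str:
    ranked = sorted(memories, key=lambda fact: score(fact, message), reverse=True)
    context = "\n".join(f"- {fact}" for fact in ranked[:top_k])
    return f"Known facts about the user:\n{context}\n\nUser: {message}\nAssistant:"

memories = [
    "Alex is a marine biologist.",
    "Alex lives near the coast.",
    "The user prefers short answers.",
]
prompt = build_prompt(memories, "What do you know about Alex the biologist?")
print(prompt)
```

Because retrieval happens entirely at inference time, the same mechanism works in front of any frozen causal language model.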

Historical GodelAI-Lite results were measured on an earlier system. Current Smriti AI results are measured separately and should not be conflated with historical results.

Architecture

User request
  -> Smriti AI handler
  -> memory retrieval
  -> graph retrieval
  -> identity context
  -> base model inference
  -> response
  -> memory write/update
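The request flow above can be condensed into a single handler function. Everything here is illustrative: the class and method names (`MemoryStore`, `retrieve`, `handle_chat`) are hypothetical stand-ins for the real handler.py internals, not its actual API.

```python
# Illustrative sketch of the Smriti AI request flow; names are hypothetical
# and do not match the real handler.py API.
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    facts: dict[str, list[str]] = field(default_factory=dict)

    def retrieve(self, user_id: str, message: str) -> list[str]:
        # Real backends use TF-IDF/semantic/graph retrieval; this returns all facts.
        return list(self.facts.get(user_id, []))

    def write(self, user_id: str, fact: str) -> None:
        self.facts.setdefault(user_id, []).append(fact)

def handle_chat(store: MemoryStore, user_id: str, message: str) -> dict:
    memories = store.retrieve(user_id, message)          # memory + graph retrieval
    prompt = "\n".join(memories) + f"\nUser: {message}"  # identity context would be added here
    response = f"[base model output for: {prompt!r}]"    # stand-in for model inference
    store.write(user_id, message)                        # memory write/update
    return {"response": response, "memories": memories}

store = MemoryStore()
handle_chat(store, "customer-123", "My name is Alex.")
result = handle_chat(store, "customer-123", "What do you remember?")
```

Note that the write happens after inference, matching the flow diagram: a turn's own message is stored for future turns, not injected into its own prompt.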

The handler supports JSON, SQLite, Redis, and Postgres memory backends. For production, use Redis/Postgres or another external durable store. Do not store private user memory in the Hugging Face model repository.

Supported base models

Smriti AI is model-agnostic for Hugging Face causal language models.

Supported families depend on the installed transformers version and endpoint hardware:

  • Gemma-style causal LMs when available, including the current benchmark path google/gemma-4-E2B-it.
  • Qwen-style causal LMs such as Qwen/Qwen2.5-1.5B-Instruct when supported by the runtime.
  • Llama/Phi/Mistral-style causal LMs if supported by the runtime environment.

Deterministic CI checks are kept outside public benchmark claims.

Evaluation

Current benchmark artifacts in the main Smriti AI repository report real-model validation over generated public SmritiBench memory fixtures. They are not MLPerf certification, HELM certification, or final external industry benchmark evidence.

Benchmark-readiness audit status: benchmark_invalid_provenance.

The validation artifact is results/current/industry_benchmark_summary.json. It records model IDs, seeds, hardware/provider metadata, and privacy/delete/security counters, but it is labeled real_model_structured_fixture_validation_not_public_claim until an accepted external benchmark/dataset or a third-party evaluation process is used.

Privacy

Smriti AI stores user memory. Treat it as user data.

  • Memory can be encrypted by setting SMRITI_ENCRYPTION_KEY.
  • delete_memory is supported by the handler.
  • Production deployments should use external memory storage such as Redis/Postgres.
  • Do not store private user memory in the Hugging Face model repository.
  • Public/demo deployments should not receive real PII.

Limitations

  • Retrieval quality depends on the quality and specificity of stored memory.
  • Public/demo deployments should not receive real PII.
  • Durable memory requires an external backend or persistent endpoint storage.
  • Latency depends on the base model, backend, retrieval mode, and endpoint hardware.
  • CPU demo mode validates handler plumbing but will not produce Gemma-quality answers.
  • If no BASE_MODEL_ID or HF_ENDPOINT_URL is configured, the handler returns memory-only responses.

Environment variables

  • BASE_MODEL_ID: Hugging Face model ID to load inside the endpoint.
  • HF_ENDPOINT_URL: Optional remote model endpoint URL. If set, the handler calls this URL instead of loading a local base model.
  • HF_TOKEN: Token for gated/private base models or protected remote endpoints.
  • SMRITI_MEMORY_BACKEND: json, sqlite, redis, or postgres.
  • SMRITI_MEMORY_PATH: JSON user-memory directory or SQLite file path.
  • REDIS_URL: External Redis URL. Takes precedence when present.
  • POSTGRES_DSN: External Postgres DSN. Takes precedence when present and Redis is not configured.
  • SMRITI_ENCRYPTION_KEY: Memory encryption key. Do not commit it.
  • SMRITI_RETRIEVAL_MODE: tfidf, semantic, semantic_graph, or semantic_graph_identity.
  • SMRITI_PUBLIC_DEMO: true or false. Use true only for non-PII demos.
  • SMRITI_MAX_MEMORY_ENTRIES: Maximum retained entries per user/topic.
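As an example, a production-style Redis-backed configuration might look like the following. All values are placeholders, not defaults; substitute your own endpoint, token, and Redis URL.

```shell
# Example environment for a Redis-backed, semantic-graph deployment.
# Every value below is a placeholder.
export BASE_MODEL_ID="google/gemma-4-E2B-it"
export HF_TOKEN="hf_..."                                  # required for gated base models
export SMRITI_MEMORY_BACKEND="redis"
export REDIS_URL="redis://redis.internal:6379/0"          # takes precedence over other backends
export SMRITI_ENCRYPTION_KEY="$(openssl rand -hex 32)"    # never commit this value
export SMRITI_RETRIEVAL_MODE="semantic_graph_identity"
export SMRITI_PUBLIC_DEMO="false"
export SMRITI_MAX_MEMORY_ENTRIES="500"
```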

How to call the endpoint

Chat / fact injection

{
  "inputs": {
    "operation": "chat",
    "user_id": "customer-123",
    "message": "My name is Alex and I am a marine biologist.",
    "retrieval_mode": "semantic_graph_identity"
  },
  "parameters": {
    "max_new_tokens": 256,
    "temperature": 0.7,
    "top_p": 0.9,
    "return_memories": true
  }
}

Recall

{
  "inputs": {
    "operation": "chat",
    "user_id": "customer-123",
    "message": "What do you remember about me?",
    "retrieval_mode": "semantic_graph_identity"
  },
  "parameters": {
    "return_memories": true
  }
}

Delete memory

{
  "inputs": {
    "operation": "delete_memory",
    "user_id": "customer-123"
  }
}

Health

{
  "inputs": {
    "operation": "health"
  }
}
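The payloads above can be posted with any HTTP client. A minimal Python sketch, using only the standard library (the endpoint URL is a placeholder and `build_chat_payload` is a hypothetical helper; the payload shape follows the examples above):

```python
# Sketch of calling a Smriti AI endpoint; ENDPOINT_URL is a placeholder.
import json
import urllib.request

ENDPOINT_URL = "https://your-endpoint.example/invocations"  # placeholder

def build_chat_payload(user_id: str, message: str, **params) -> dict:
    """Assemble a chat request matching the payload shape documented above."""
    return {
        "inputs": {
            "operation": "chat",
            "user_id": user_id,
            "message": message,
            "retrieval_mode": "semantic_graph_identity",
        },
        "parameters": {"return_memories": True, **params},
    }

def post(payload: dict, token: str) -> dict:
    req = urllib.request.Request(
        ENDPOINT_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

payload = build_chat_payload(
    "customer-123", "What do you remember about me?", max_new_tokens=256
)
# post(payload, token="hf_...")  # uncomment with a real endpoint URL and token
```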

Local test

pip install -r requirements.txt
BASE_MODEL_ID=google/gemma-4-E2B-it HF_TOKEN=$HF_TOKEN SMRITI_MEMORY_BACKEND=json SMRITI_MEMORY_PATH=/tmp/smriti_hf_test.json python test_handler_local.py

Custom-container deployment

If the standard Hugging Face handler is insufficient for your model size, CUDA libraries, Redis client policy, or enterprise network requirements, deploy the same files in a custom container. Use the main Smriti AI repository Dockerfiles as the starting point, install this handler, and expose a compatible HTTP API through the custom-container support in Hugging Face Inference Endpoints.

Harness Evolution Results

The base model remains frozen. Smriti AI is not fine-tuned; these numbers come from memory-harness evaluation.

System                  Recall  Precision@K  p95 latency (ms)  Token overhead  Privacy delete
baseline_frozen_model   0.000   0.000        0.000             0               True
smriti_seed_harness     1.000   0.333        0.525             328             True
smriti_evolved_harness  1.000   0.333        0.168             328             True

Cross-model harness validation:

Model                               Seed recall  Evolved recall  Gate
google/gemma-4-E2B-it               1.000        1.000           pass
meta-llama/Llama-3.2-1B             1.000        1.000           pass
microsoft/Phi-3-mini-4k-instruct    1.000        1.000           pass
mistralai/Mistral-7B-Instruct-v0.3  1.000        1.000           pass
Qwen/Qwen2.5-1.5B-Instruct          1.000        1.000           pass

Production gate report: results/production_gate_report.md

Historical GodelAI-Lite results remain separate lineage and are not conflated with current Smriti AI harness metrics. Deterministic CI checks are used only for stability and never counted as public benchmark evidence.
