---
license: apache-2.0
tags:
  - limon
  - neural-ode
  - flow-matching
  - experimental
  - lightweight
  - research
  - limonai
library_name: transformers
datasets:
  - roneneldan/TinyStories
language:
  - en
---

# LimonF-v1-8M: Continuous-Time Neural ODE Model

## Identity

LimonF-v1-8M is the inaugural public release from LimonAI. It is an experimental language model featuring a Continuous-Time Neural ODE architecture with Adaptive Flow Modulation and Anchor Residuals.

This model represents a departure from the traditional discrete-layer Transformer stack, exploring the potential of weight-tied vector fields to simulate depth through time integration.
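In ODE form, this means each token state is evolved by integrating a learned vector field rather than passing through distinct layers; with an Euler solver of \(S\) steps, the continuous trajectory is approximated as (a schematic formulation, not taken from the released code):

$$
x(1) \;=\; x(0) + \int_0^1 f\bigl(x(t),\, t\bigr)\, dt \;\approx\; x(0) + \sum_{k=0}^{S-1} \Delta t \cdot f(x_k, t_k), \qquad \Delta t = \tfrac{1}{S}
$$

Under this view, the number of solver steps plays the role that layer count plays in a standard Transformer, while the parameters of \(f\) are shared across all steps.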

## Architecture Highlights

Unlike standard architectures that process data through a fixed sequence of layers, LimonF-v1-8M uses a single, recursively applied Vector Field f(x, t) to evolve the state of each token from t=0 to t=1.

- **Parameters:** ~8 million.
- **Inference Engine:** Euler ODE solver (6 integration steps by default).
- **Core Mechanism:** causal attention (O(N^2)) applied within a continuous vector field.
- **Adaptive Flow Modulation (AFM):** a Time-Gate MLP dynamically scales and shifts activations (AdaLN-style) based on the current integration timestep.
- **Anchor Residuals:** a constant 10% semantic anchor to the initial token state (x0) is applied at every integration step to prevent semantic drift and maintain long-range logic.
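The mechanisms above can be sketched as a minimal Euler integration loop. This is an illustrative reconstruction, not the released implementation: the hidden size, module names, time embedding, and the absence of the attention sub-block are all assumptions made for brevity.

```python
import torch
import torch.nn as nn

class FlowBlock(nn.Module):
    """Hypothetical sketch of the weight-tied vector field f(x, t)."""
    def __init__(self, dim=256):
        super().__init__()
        self.norm = nn.LayerNorm(dim, elementwise_affine=False)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))
        # Time-Gate MLP: maps the scalar timestep t to AdaLN scale/shift
        self.time_gate = nn.Sequential(nn.Linear(1, dim), nn.SiLU(),
                                       nn.Linear(dim, 2 * dim))

    def forward(self, x, t):
        scale, shift = self.time_gate(t).chunk(2, dim=-1)
        h = self.norm(x) * (1 + scale) + shift   # Adaptive Flow Modulation
        return self.mlp(h)                        # f(x, t)

def integrate(block, x0, steps=6, anchor=0.1):
    """Euler integration from t=0 to t=1 with a 10% anchor residual."""
    x, dt = x0, 1.0 / steps
    for k in range(steps):
        t = torch.full((1,), k * dt)             # current timestep
        x = x + dt * block(x, t)                 # Euler step
        x = (1 - anchor) * x + anchor * x0       # anchor to initial state
    return x
```

Note how the same `FlowBlock` weights are reused at every step, which is where the parameter savings over a discrete layer stack come from.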

## Training & Performance

The model was trained on the TinyStories dataset. Despite its small parameter count, it demonstrates:

- **Strong Syntactic Coherence:** generates grammatically correct English narratives with proper dialogue punctuation.
- **High Efficiency:** an extremely low VRAM footprint and fast inference, since weight-tying keeps the parameter count compact.
- **Experimental Logic:** shows early signs of context retention and object tracking within simple story scripts.

## Usage

To use this model, install the `transformers` and `torch` libraries. Because of the custom architecture, `trust_remote_code=True` is required when loading.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LimonAI/LimonF-v1-8M"

# Load tokenizer and model (custom architecture requires trust_remote_code)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Generate text
prompt = "Lily found a magic key under the tree. She took the key and"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_tokens = model.generate(
        **inputs,
        max_new_tokens=50,
        temperature=0.7,
        do_sample=True,
    )

print(tokenizer.decode(output_tokens[0], skip_special_tokens=True))
```

## Credits

Developed by LimonAI. This model is a proof of concept for continuous-time neural architectures. We believe that efficiency in AI comes from rethinking the fundamental structure of computation, moving from static layers to dynamic flows.

## Disclaimer

LimonF-v1-8M is an experimental research model. It is small (8M parameters) and trained on a limited dataset (TinyStories). It is not intended for production use in factual or sensitive tasks, and it may produce hallucinations or repetitive patterns.