---
license: apache-2.0
tags:
- limon
- neural-ode
- flow-matching
- experimental
- lightweight
- research
- limonai
library_name: transformers
datasets:
- roneneldan/TinyStories
language:
- en
---

# LimonF-v1-8M: Continuous-Time Neural ODE Model

## Identity

LimonF-v1-8M is the inaugural public release from **LimonAI**. It is an experimental language model featuring a **Continuous-Time Neural ODE** architecture with **Adaptive Flow Modulation** and **Anchor Residuals**.

This model represents a departure from the traditional discrete-layer Transformer stack, exploring the potential of weight-tied vector fields to simulate depth through time integration.

### Architecture Highlights

Unlike standard architectures that process data through a fixed sequence of layers, LimonF-v1-8M uses a single, recursively applied vector field f(x, t) to evolve the state of each token from t=0 to t=1.

- **Parameters:** ~8 million.
- **Inference Engine:** Euler ODE solver (6 integration steps by default).
- **Core Mechanism:** Causal attention (O(N²) in sequence length) inside the continuous vector field.
- **Adaptive Flow Modulation (AFM):** A Time-Gate MLP dynamically scales and shifts activations (AdaLN-style) based on the current integration timestep.
- **Anchor Residuals:** A constant 10% semantic anchor to the initial token state (x0) is applied at every integration step to prevent semantic drift and maintain long-range logic.

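The integration loop described above can be sketched in PyTorch. This is an illustrative reconstruction, not the actual LimonF implementation: the real vector field contains causal attention, which is replaced here by a small MLP for brevity, and the exact anchor-mixing formula is an assumption based on the description above.

```python
import torch
import torch.nn as nn

class TimeGate(nn.Module):
    """Hypothetical AdaLN-style time gate: maps the scalar time t to a
    per-channel scale and shift (Adaptive Flow Modulation)."""
    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(1, dim), nn.SiLU(), nn.Linear(dim, 2 * dim))

    def forward(self, t):
        scale, shift = self.mlp(t.view(1, 1)).chunk(2, dim=-1)
        return scale, shift

class VectorField(nn.Module):
    """One weight-tied block f(x, t); the real model also includes causal
    attention here, omitted in this sketch."""
    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.gate = TimeGate(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x, t):
        scale, shift = self.gate(t)
        h = self.norm(x) * (1 + scale) + shift  # time-conditioned modulation
        return self.mlp(h)

def integrate(field, x0, steps=6, anchor=0.1):
    """Euler integration from t=0 to t=1 through one shared vector field,
    with a constant anchor residual back to the initial state x0."""
    x, dt = x0, 1.0 / steps
    for i in range(steps):
        t = torch.tensor(i * dt)
        x = x + dt * field(x, t)             # Euler step
        x = (1 - anchor) * x + anchor * x0   # 10% semantic anchor (assumed form)
    return x

field = VectorField(dim=64)
x0 = torch.randn(2, 8, 64)  # (batch, tokens, hidden)
out = integrate(field, x0)
print(out.shape)  # torch.Size([2, 8, 64])
```

Because the same `field` weights are reused at every step, "depth" comes from the number of integration steps rather than from stacking distinct layers.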
## Training & Performance

The model was trained on the **TinyStories** dataset. Despite its small parameter count, it demonstrates:

- **Strong Syntactic Coherence:** Generates grammatically correct English narratives with proper dialogue punctuation.
- **High Efficiency:** A very low VRAM footprint and fast inference, since weight-tying keeps the parameter count compact.
- **Experimental Logic:** Shows early signs of context retention and object tracking within simple story scripts.

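To illustrate the efficiency claim: a weight-tied model pays for its block once, no matter how many integration steps reuse it, while a discrete stack pays per layer. A quick, generic comparison (the dimensions here are arbitrary, not LimonF's actual configuration):

```python
import torch.nn as nn

def count_params(module):
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())

dim = 256

def make_block():
    return nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

tied = make_block()                                        # one block, reused every step
stacked = nn.ModuleList([make_block() for _ in range(6)])  # six distinct layers

print(count_params(tied))     # 525568
print(count_params(stacked))  # 3153408 -- six times the tied cost
```

The tied variant keeps a constant parameter budget while still applying six transformations at inference time.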
## Usage

To use this model, install the `transformers` and `torch` libraries. Because of the custom architecture, `trust_remote_code=True` is required.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LimonAI/LimonF-v1-8M"

# Load tokenizer and model (the custom architecture requires trust_remote_code)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Generate text
prompt = "Lily found a magic key under the tree. She took the key and"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_tokens = model.generate(
        **inputs,
        max_new_tokens=50,
        temperature=0.7,
        do_sample=True,
    )

print(tokenizer.decode(output_tokens[0], skip_special_tokens=True))
```

## Credits

Developed by **LimonAI**.
This model is a proof-of-concept for continuous-time neural architectures. We believe that efficiency in AI comes from rethinking the fundamental structure of computation, moving from static layers to dynamic flows.

### Disclaimer

LimonF-v1-8M is an **experimental research model**. It is small (~8M parameters) and trained on a limited dataset (TinyStories). It is not intended for production use in factual or sensitive tasks, and it may produce hallucinations or repetitive patterns.