---
license: apache-2.0
tags:
- limon
- neural-ode
- flow-matching
- experimental
- lightweight
- research
- limonai
library_name: transformers
datasets:
- roneneldan/TinyStories
language:
- en
---
# LimonF-v1-8M: Continuous-Time Neural ODE Model
## Identity
LimonF-v1-8M is the inaugural public release from **LimonAI**. It is an experimental language model featuring a **Continuous-Time Neural ODE** architecture with **Adaptive Flow Modulation** and **Anchor Residuals**.
This model represents a departure from the traditional discrete-layer Transformer stack, exploring the potential of weight-tied vector fields to simulate depth through time integration.
### Architecture Highlights
Unlike standard architectures that process data through a fixed sequence of layers, LimonF-v1-8M uses a single, recursively applied vector field `f(x, t)` to evolve the state of each token from t=0 to t=1.
- **Parameters:** ~8 Million.
- **Inference Engine:** Euler ODE Solver (6 integration steps by default).
- **Core Mechanism:** Causal attention (O(N²)) applied within a continuous vector field.
- **Adaptive Flow Modulation (AFM):** Uses a Time-Gate MLP to dynamically scale and shift activations (AdaLN) based on the current integration timestamp.
- **Anchor Residuals:** Implements a constant 10% semantic anchor to the initial token state (x0) at every integration step to prevent semantic drift and maintain long-range logic.
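The mechanisms above can be sketched in a few lines of PyTorch. This is a minimal, hypothetical illustration based only on the description in this card (the names `TimeGate` and `euler_integrate`, and the exact modulation form, are assumptions, not the released implementation): a weight-tied vector field is applied over six Euler steps, a Time-Gate MLP produces AdaLN-style scale/shift from the timestep, and each step blends in a constant 10% of the initial state `x0`.

```python
import torch
import torch.nn as nn

class TimeGate(nn.Module):
    """Hypothetical Time-Gate MLP: maps the scalar timestep t to
    per-channel scale and shift parameters (AdaLN-style modulation)."""
    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1, dim), nn.SiLU(), nn.Linear(dim, 2 * dim)
        )

    def forward(self, t):
        # t is a scalar tensor; output broadcasts over (batch, seq, dim)
        scale, shift = self.mlp(t.view(1, 1)).chunk(2, dim=-1)
        return scale, shift

def euler_integrate(f, x0, time_gate, steps=6, anchor=0.1):
    """Sketch of the continuous-depth forward pass: one weight-tied
    vector field f(x, t) integrated from t=0 to t=1 with Euler steps,
    plus a constant 10% anchor back to the initial state x0."""
    x = x0
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.tensor(i * dt)
        scale, shift = time_gate(t)            # Adaptive Flow Modulation
        h = f(x * (1 + scale) + shift, t)      # modulated vector field
        x = x + dt * h                         # Euler update
        x = (1 - anchor) * x + anchor * x0     # Anchor Residual (10%)
    return x
```

In this reading, "depth" is simulated by re-applying the same parameters six times while the Time-Gate tells the field where it is along the trajectory; the anchor term keeps every intermediate state tethered to the original token embedding.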
## Training & Performance
The model was trained on the **TinyStories** dataset. Despite its small parameter count, it demonstrates:
- **Strong Syntactic Coherence:** Capable of generating grammatically correct English narratives with proper dialogue punctuation.
- **High Efficiency:** Extremely low VRAM footprint and high inference speed due to its compact, weight-tied parameterization.
- **Experimental Logic:** Shows early signs of context retention and object-tracking within simple story scripts.
## Usage
To use this model, you must install the `transformers` and `torch` libraries. Due to the custom architecture, `trust_remote_code=True` is required.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LimonAI/LimonF-v1-8M"

# Load tokenizer and model (custom architecture requires trust_remote_code)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Generate text
prompt = "Lily found a magic key under the tree. She took the key and"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_tokens = model.generate(
        **inputs,
        max_new_tokens=50,
        temperature=0.7,
        do_sample=True,
    )

print(tokenizer.decode(output_tokens[0], skip_special_tokens=True))
```
## Credits
Developed by **LimonAI**.
This model is a proof-of-concept for continuous-time neural architectures. We believe that efficiency in AI comes from rethinking the fundamental structure of computation, moving from static layers to dynamic flows.
### Disclaimer
LimonF-v1-8M is an **experimental research model**. It is small (8M params) and trained on a limited dataset (TinyStories). It is not intended for production use in factual or sensitive tasks. It may produce hallucinations or repetitive patterns.