---
license: apache-2.0
tags:
- limon
- neural-ode
- flow-matching
- experimental
- lightweight
- research
- limonai
library_name: transformers
datasets:
- roneneldan/TinyStories
language:
- en
---

# LimonF-v1-8M: Continuous-Time Neural ODE Model

## Identity

LimonF-v1-8M is the inaugural public release from **LimonAI**. It is an experimental language model featuring a **Continuous-Time Neural ODE** architecture with **Adaptive Flow Modulation** and **Anchor Residuals**.

This model represents a departure from the traditional discrete-layer Transformer stack, exploring the potential of weight-tied vector fields to simulate depth through time integration.

### Architecture Highlights

Unlike standard architectures that process data through a fixed sequence of layers, LimonF-v1-8M uses a single, recursively applied vector field f(x, t) to evolve the state of each token from t=0 to t=1.

- **Parameters:** ~8 million.
- **Inference Engine:** Euler ODE solver (6 integration steps by default).
- **Core Mechanism:** Causal attention (O(N²) in sequence length) inside the continuous vector field.
- **Adaptive Flow Modulation (AFM):** A Time-Gate MLP dynamically scales and shifts activations (AdaLN-style) based on the current integration timestep.
- **Anchor Residuals:** A constant 10% semantic anchor to the initial token state (x0) is applied at every integration step to prevent semantic drift and maintain long-range logic.

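The integration loop described above can be sketched in PyTorch. This is an illustrative reconstruction, not the actual LimonF implementation: the real vector field contains causal attention, which is replaced here by a small MLP for brevity, and the exact anchor-mixing formula is an assumption based on the description above.

```python
import torch
import torch.nn as nn

class TimeGate(nn.Module):
    """Hypothetical AdaLN-style time gate: maps the scalar time t to a
    per-channel scale and shift (Adaptive Flow Modulation)."""
    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(1, dim), nn.SiLU(), nn.Linear(dim, 2 * dim))

    def forward(self, t):
        scale, shift = self.mlp(t.view(1, 1)).chunk(2, dim=-1)
        return scale, shift

class VectorField(nn.Module):
    """One weight-tied block f(x, t); the real model also includes causal
    attention here, omitted in this sketch."""
    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.gate = TimeGate(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x, t):
        scale, shift = self.gate(t)
        h = self.norm(x) * (1 + scale) + shift  # time-conditioned modulation
        return self.mlp(h)

def integrate(field, x0, steps=6, anchor=0.1):
    """Euler integration from t=0 to t=1 through one shared vector field,
    with a constant anchor residual back to the initial state x0."""
    x, dt = x0, 1.0 / steps
    for i in range(steps):
        t = torch.tensor(i * dt)
        x = x + dt * field(x, t)             # Euler step
        x = (1 - anchor) * x + anchor * x0   # 10% semantic anchor (assumed form)
    return x

field = VectorField(dim=64)
x0 = torch.randn(2, 8, 64)  # (batch, tokens, hidden)
out = integrate(field, x0)
print(out.shape)  # torch.Size([2, 8, 64])
```

Because the same `field` weights are reused at every step, "depth" comes from the number of integration steps rather than from stacking distinct layers.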
## Training & Performance

The model was trained on the **TinyStories** dataset. Despite its small parameter count, it demonstrates:

- **Strong Syntactic Coherence:** Generates grammatically correct English narratives with proper dialogue punctuation.
- **High Efficiency:** A very low VRAM footprint and fast inference, since weight-tying keeps the parameter count compact.
- **Experimental Logic:** Shows early signs of context retention and object tracking within simple story scripts.

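To illustrate the efficiency claim: a weight-tied model pays for its block once, no matter how many integration steps reuse it, while a discrete stack pays per layer. A quick, generic comparison (the dimensions here are arbitrary, not LimonF's actual configuration):

```python
import torch.nn as nn

def count_params(module):
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())

dim = 256

def make_block():
    return nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

tied = make_block()                                        # one block, reused every step
stacked = nn.ModuleList([make_block() for _ in range(6)])  # six distinct layers

print(count_params(tied))     # 525568
print(count_params(stacked))  # 3153408 -- six times the tied cost
```

The tied variant keeps a constant parameter budget while still applying six transformations at inference time.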
## Usage

To use this model, install the `transformers` and `torch` libraries. Because of the custom architecture, `trust_remote_code=True` is required.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LimonAI/LimonF-v1-8M"

# Load tokenizer and model (the custom architecture requires trust_remote_code)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Generate text
prompt = "Lily found a magic key under the tree. She took the key and"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_tokens = model.generate(
        **inputs,
        max_new_tokens=50,
        temperature=0.7,
        do_sample=True,
    )

print(tokenizer.decode(output_tokens[0], skip_special_tokens=True))
```

## Credits

Developed by **LimonAI**.
This model is a proof-of-concept for continuous-time neural architectures. We believe that efficiency in AI comes from rethinking the fundamental structure of computation, moving from static layers to dynamic flows.

### Disclaimer

LimonF-v1-8M is an **experimental research model**. It is small (~8M parameters) and trained on a limited dataset (TinyStories). It is not intended for production use in factual or sensitive tasks, and it may produce hallucinations or repetitive patterns.