---
license: mit
language:
- en
- zh
pipeline_tag: text-generation
tags:
- biology
- liquid-neural-networks
- microtubules
- long-context
---

# 🧠 MT-LNN: Microtubule-Inspired Liquid Neural Network Adapter

**MT-LNN** is a biologically inspired neural architecture that replaces the traditional Transformer Feed-Forward Network (FFN) with a **Microtubule Dynamic Layer (MT-DL)**. The layer consists of 13 parallel Closed-form Liquid Time-Constant (CfLTC) channels with multi-scale resonance and quantum-like lateral coupling.

This repository hosts the **MT-Adapter** weights trained on `TinyLlama-1.1B-Chat-v1.0`. Loading this residual adapter adds biologically inspired continuous-time dynamics to the base causal LLM. In our evaluation it keeps Long-Context Retrieval (Needle-in-a-Haystack) accuracy at 1.000 up to 4K tokens, with a modest throughput cost (see Evaluation).

## 🚀 How to Use

The MT-LNN adapter requires the custom adapter wiring code from our official GitHub repository.

### 1. Clone the Repository and Install Dependencies

```bash
git clone https://github.com/everest-an/O1.git
cd O1
pip install -r requirements.txt
```

### 2. Load the Adapter for Inference

Load the MT-LNN adapter on top of the base TinyLlama model and generate text:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
from mt_lnn.llama_adapter import (
    attach_adapters_from_checkpoint,
    load_adapter_state,
    maybe_apply_lora_for_checkpoint,
)
from huggingface_hub import hf_hub_download

device = "cuda" if torch.cuda.is_available() else "cpu"
model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

# 1. Download the adapter weights from Hugging Face
adapter_path = hf_hub_download(
    repo_id="EverestAn/MT-LNN",
    filename="llama_mt_adapter_000500.pt",
)

# 2. Load the base model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    # Fall back to the EOS token when no pad token is defined
    tokenizer.pad_token = tokenizer.eos_token

# (Optional) Apply linear RoPE scaling for 4K+ long context
config = AutoConfig.from_pretrained(model_id)
if not hasattr(config, "rope_theta") or config.rope_theta is None:
    config.rope_theta = 10000.0
config.rope_scaling = {"type": "linear", "rope_type": "linear", "factor": 4.0}

model = AutoModelForCausalLM.from_pretrained(
    model_id, config=config, torch_dtype=torch.bfloat16
)

# 3. Inject the Microtubule (MT) adapter
# Note: on PyTorch >= 2.6, torch.load defaults to weights_only=True. If the
# checkpoint holds non-tensor objects, loading may need weights_only=False
# (only for files you trust).
checkpoint = torch.load(adapter_path, map_location="cpu")
attach_adapters_from_checkpoint(model, checkpoint)
model = maybe_apply_lora_for_checkpoint(model, checkpoint)
load_adapter_state(model, adapter_path, strict=False)
model.to(device).eval()

# 4. Generate
prompt = "What is the biological function of computational microtubules?"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## 📊 Evaluation (Needle-in-a-Haystack)

We evaluated MT-LNN as a residual adapter on TinyLlama-1.1B (fine-tuned for 500 steps) on the Needle-in-a-Haystack task.

| Variant | Context (tokens) | Depth | Exact | Contains | Tok/s |
|---|---:|---:|---:|---:|---:|
| Base | 1024-2048 | All | 1.000 | 1.000 | ~800 |
| MT-Adapter | 1024-2048 | All | **1.000** | **1.000** | ~670 (-13%) |
| Base | 4096 (RoPE) | All | 1.000 | 1.000 | ~580 |
| MT-Adapter | 4096 (RoPE) | All | **1.000** | **1.000** | ~545 |

*With RoPE scaling, we extended the 2048-token window to 4096 tokens. The throughput figures show that the MT-Adapter adds a modest slowdown (about 13% at 1024-2048 tokens, less at 4096) while the liquid dynamics run fully in parallel and retrieval accuracy stays at 1.000.*

## 📜 Paper

See the attached papers for the architecture formulation, the Anesthesia Validation Protocol, and the mathematical derivations:

* [MT-LNN English Paper (PDF)](./mt_lnn_arxiv.pdf)
* [MT-LNN Chinese Paper (PDF)](./mt_lnn_arxiv_zh.pdf)
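
## 📐 Appendix: What Linear RoPE Scaling Does

The optional `rope_scaling` setting in the usage guide requests linear RoPE scaling with `factor` 4.0. The snippet below is a minimal sketch of the idea, not the `transformers` implementation: linear scaling divides each position index by `factor` before the rotary angles are computed, so a long sequence is mapped back into the position range the base model was trained on. The function name `rope_angles` and the default `head_dim=64` are illustrative assumptions. `theta=10000.0` matches the fallback `rope_theta` used in the usage guide.

```python
import math


def rope_angles(position, head_dim=64, theta=10000.0, factor=1.0):
    """Rotary angles for one position under linear RoPE scaling.

    Hypothetical helper for illustration only. Linear scaling (position
    interpolation) divides the position index by `factor`; the per-pair
    inverse frequencies theta^(-2i/head_dim) are left unchanged.
    """
    scaled_pos = position / factor
    return [
        scaled_pos * theta ** (-2.0 * i / head_dim)
        for i in range(head_dim // 2)
    ]


# With factor=4.0, position 4096 receives the same angles as unscaled
# position 1024, which lies inside the original 2048-token window.
scaled = rope_angles(4096, factor=4.0)
unscaled = rope_angles(1024)
print(all(math.isclose(a, b) for a, b in zip(scaled, unscaled)))
```

Under this sketch, a factor of 4.0 maps positions up to 8192 into the original 2048-token range, which covers the 4096-token setting reported in the evaluation table.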