Instructions to use 11-47/Infinite.Code.III with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use 11-47/Infinite.Code.III with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="11-47/Infinite.Code.III", trust_remote_code=True)

# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("11-47/Infinite.Code.III", trust_remote_code=True, dtype="auto")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use 11-47/Infinite.Code.III with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "11-47/Infinite.Code.III"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "11-47/Infinite.Code.III",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```shell
docker model run hf.co/11-47/Infinite.Code.III
```
- SGLang
How to use 11-47/Infinite.Code.III with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "11-47/Infinite.Code.III" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "11-47/Infinite.Code.III",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
  --model-path "11-47/Infinite.Code.III" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "11-47/Infinite.Code.III",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use 11-47/Infinite.Code.III with Docker Model Runner:
```shell
docker model run hf.co/11-47/Infinite.Code.III
```
| language: | |
| - en | |
| - code | |
| license: apache-2.0 | |
| tags: | |
| - recursive-language-model | |
| - causal-lm | |
| - multimodal | |
| - long-context | |
| - mixture-of-experts | |
| - continual-learning | |
| - meta-learning | |
| - self-automated | |
| - safetensors | |
| - pytorch | |
| model_name: Infinite.Code.III | |
| pipeline_tag: text-generation | |
| library_name: transformers | |
| # Infinite.Code.III: Recursive Language Model | |
| > *"Not a Large Language Model. A Recursive Mind."* | |
| ## Overview | |
| **Infinite.Code.III** is a **1.210B-parameter Recursive Language Model (RLM)** | |
| built from scratch as a unified Hybrid Mind architecture. Unlike standard LLMs that apply a | |
| fixed forward-pass transformer, Infinite.Code.III integrates Self-Automated (S.A.) learning | |
| systems as architectural primitives: they are not pipeline steps; they are woven into every | |
| decoder layer. | |
| | Property | Value | | |
| |---|---| | |
| | Parameters | **1.210B** | | |
| | Context Window | **1,000,000 tokens** | | |
| | Architecture | Recursive Language Model (RLM) | | |
| | Attention | Grouped-Query Attention (GQA) 10/5 heads | | |
| | Positional Encoding | RoPE (θ = 500,000, long-ctx scaled) | | |
| | FFN | Alternating Dense / Mixture-of-Experts (8 experts, top-2) | | |
| | Vocabulary | 65,536 BPE tokens | | |
| | Layers | 20 | | |
| | Hidden Size | 1280 | | |
| | Weight Format | safetensors (bfloat16 trained, float32 saved) | | |
| | Modalities | Text · Image · Audio · Video | | |
| | License | Apache 2.0 | | |
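The table values imply a few derived quantities that are useful when sizing hardware. The arithmetic below is a sketch only: it reads "10/5 heads" as 10 query heads sharing 5 key/value heads and assumes every layer caches K/V in bfloat16, neither of which the card states explicitly. The variable names are illustrative and are not the model's config keys.

```python
# Values copied from the property table above.
hidden_size = 1280
num_query_heads = 10   # assumed reading of "GQA 10/5 heads"
num_kv_heads = 5       # assumed reading of "GQA 10/5 heads"
num_layers = 20
vocab_size = 65_536

head_dim = hidden_size // num_query_heads         # width of one attention head
queries_per_kv = num_query_heads // num_kv_heads  # query heads sharing one K/V head
embedding_params = vocab_size * hidden_size       # size of one token-embedding matrix

# K and V, for every layer and KV head, at 2 bytes per bfloat16 value.
kv_bytes_per_token = 2 * num_layers * num_kv_heads * head_dim * 2

print(head_dim, queries_per_kv, embedding_params, kv_bytes_per_token)
```

Under these assumptions the K/V cache grows linearly with sequence length, so multiplying `kv_bytes_per_token` by the target context length gives a rough lower bound on cache memory.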
| --- | |
| ## S.A. System Architecture | |
| ### S.A. Meta Learning | |
| Each layer has a learnable `adaptive_alpha` scalar (sigmoid-gated) that blends the | |
| transformed output with the layer's top-of-layer residual. This is the meta-learning | |
| channel: it learns *how much* each transformation contributes per layer. | |
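A sigmoid-gated scalar blend of this kind could look like the sketch below. The convex-combination form and the class name `AdaptiveResidualBlend` are assumptions made for illustration; the card only states that a learnable, sigmoid-gated `adaptive_alpha` blends the transformed output with the residual.

```python
import torch
import torch.nn as nn


class AdaptiveResidualBlend(nn.Module):
    """Hypothetical sketch: blend a layer's output with its input residual."""

    def __init__(self, init_logit: float = 0.0):
        super().__init__()
        # One learnable scalar per layer; sigmoid keeps the blend weight in (0, 1).
        self.adaptive_alpha = nn.Parameter(torch.tensor(init_logit))

    def forward(self, residual: torch.Tensor, transformed: torch.Tensor) -> torch.Tensor:
        alpha = torch.sigmoid(self.adaptive_alpha)
        # Assumed blend form: alpha weights the new output, (1 - alpha) the residual.
        return alpha * transformed + (1.0 - alpha) * residual


blend = AdaptiveResidualBlend()
mixed = blend(torch.zeros(2, 3), torch.ones(2, 3))
```

With the logit initialised at zero, the gate starts at an even blend and moves away from it only as training updates `adaptive_alpha`.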
| ### S.A. Reinforcement Learning | |
| `RewardHead` (D → 512 → 1 scalar) attaches to the final hidden states. | |
| During RL fine-tuning (RLHF / GRPO), this head provides the value signal. | |
| Pass `output_reward=True` during rollout collection. | |
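The D → 512 → 1 shape can be sketched as a two-layer MLP. The GELU activation, the per-token output, and the name `RewardHeadSketch` are assumptions; the card gives only the layer widths.

```python
import torch
import torch.nn as nn


class RewardHeadSketch(nn.Module):
    """Hypothetical D -> 512 -> 1 head applied to final hidden states."""

    def __init__(self, hidden_size: int = 1280, inner: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_size, inner),
            nn.GELU(),  # assumed activation; not specified in the card
            nn.Linear(inner, 1),
        )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # (B, T, D) -> (B, T): one scalar per token position.
        return self.net(hidden_states).squeeze(-1)


head = RewardHeadSketch(hidden_size=32, inner=16)
rewards = head(torch.randn(2, 7, 32))
```

How the per-token scalars are reduced to a sequence-level reward (last token, mean, or otherwise) is not stated in the card and is left open here.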
| ### S.A. Continual Learning | |
| `HybridMemory` LTM uses exponential moving average write-back | |
| (`0.95 × old + 0.05 × new`): knowledge accumulates across forward passes | |
| without overwriting, resisting catastrophic forgetting. | |
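The write-back rule is a plain exponential moving average. The sketch below applies it to a single memory slot represented as a Python list; `ema_write` is a hypothetical helper, not a function from the model code.

```python
def ema_write(old, new, decay=0.95):
    """Blend a new value into a memory slot: decay * old + (1 - decay) * new."""
    return [decay * o + (1.0 - decay) * n for o, n in zip(old, new)]


slot = [1.0, 0.0]
slot_after_one_write = ema_write(slot, [0.0, 1.0])

# After n writes, the original content still carries weight decay ** n.
n_writes = 14
remaining_weight = 0.95 ** n_writes
```

Because each write moves the slot only 5% of the way toward the new value, a single forward pass cannot erase what the slot held before, which is the property the section attributes to the LTM.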
| ### S.A. Adaptive Learning | |
| The per-layer `adaptive_alpha` gate is trained end-to-end, self-calibrating | |
| each layer's write strength to the residual stream. | |
| ### S.A. Rewriting Learning | |
| Every 3rd layer runs `RewriteAttention`, a 4-head causal self-attention | |
| pass that lets the model revise its own intermediate token representations | |
| within a single forward pass. | |
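A 4-head causal self-attention pass with a residual connection can be sketched with PyTorch's built-in `nn.MultiheadAttention`. The pre-norm, the residual add, and the class name are assumptions for illustration; the card does not describe the internals of `RewriteAttention`.

```python
import torch
import torch.nn as nn


class RewriteAttentionSketch(nn.Module):
    """Hypothetical causal self-attention pass that revises token representations."""

    def __init__(self, hidden_size: int = 1280, num_heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(hidden_size)
        self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        seq_len = h.shape[1]
        # True marks positions a token may NOT attend to (everything in the future).
        causal_mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=h.device), diagonal=1
        )
        x = self.norm(h)
        revised, _ = self.attn(x, x, x, attn_mask=causal_mask, need_weights=False)
        return h + revised  # assumed residual connection


torch.manual_seed(0)
rewrite = RewriteAttentionSketch(hidden_size=16, num_heads=4)
h_in = torch.randn(1, 5, 16)
h_out = rewrite(h_in)
```

The causal mask means a change to a later token never alters the revised representation of an earlier one, so the pass stays compatible with autoregressive decoding.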
| ### S.A. NLP + S.A. Problem Solving | |
| `MetaOutputMixer` at decoder output applies a 3-way soft gate | |
| (language / code / math-logic) via `NLPGate`. The final representation | |
| is a content-adaptive weighted mixture of three parallel projections. | |
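A content-adaptive mixture of three parallel projections can be sketched as a softmax gate over linear branches. The class name `SoftMixerSketch`, the use of plain linear projections, and the softmax gate are assumptions; the card states only that the gate is soft and three-way.

```python
import torch
import torch.nn as nn


class SoftMixerSketch(nn.Module):
    """Hypothetical 3-way soft gate over parallel projections (language / code / math-logic)."""

    def __init__(self, hidden_size: int = 1280, num_branches: int = 3):
        super().__init__()
        self.gate = nn.Linear(hidden_size, num_branches)
        self.branches = nn.ModuleList(
            [nn.Linear(hidden_size, hidden_size) for _ in range(num_branches)]
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.gate(h), dim=-1)               # (B, T, 3)
        outputs = torch.stack([b(h) for b in self.branches], dim=-1)  # (B, T, D, 3)
        # Weight each branch per token and sum the branches away.
        return (outputs * weights.unsqueeze(-2)).sum(dim=-1)        # (B, T, D)


mixer = SoftMixerSketch(hidden_size=16)
hidden = torch.randn(2, 5, 16)
mixed_out = mixer(hidden)
gate_weights = torch.softmax(mixer.gate(hidden), dim=-1)
```

Since the gate weights sum to one per token, the output stays on the same scale as a single projection regardless of which branch dominates.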
| ### S.A. Innovation Learning | |
| Odd-numbered layers use `MoELayer`: 8 experts, top-2 routing, | |
| each a SwiGLU FFN with 2048-dim intermediate. | |
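Top-2 routing over SwiGLU experts can be sketched as follows. This is a minimal dense-loop version with no load-balancing loss or capacity limits; the class names and the choice to renormalise router scores over the selected experts are assumptions, not details from the card.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLU(nn.Module):
    """SwiGLU feed-forward block: down(silu(gate(x)) * up(x))."""

    def __init__(self, hidden_size: int, inner: int):
        super().__init__()
        self.gate = nn.Linear(hidden_size, inner, bias=False)
        self.up = nn.Linear(hidden_size, inner, bias=False)
        self.down = nn.Linear(inner, hidden_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))


class Top2MoESketch(nn.Module):
    """Hypothetical sketch of 8 experts with top-2 routing."""

    def __init__(self, hidden_size=1280, inner=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(hidden_size, num_experts, bias=False)
        self.experts = nn.ModuleList(
            [SwiGLU(hidden_size, inner) for _ in range(num_experts)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        flat = x.reshape(-1, x.shape[-1])                       # (N, D)
        top_vals, top_idx = self.router(flat).topk(self.top_k, dim=-1)
        weights = torch.softmax(top_vals, dim=-1)               # renormalise over chosen experts
        out = torch.zeros_like(flat)
        for expert_id, expert in enumerate(self.experts):
            # Rows of `flat` routed to this expert, and which top-k slot chose it.
            token_ids, slot = (top_idx == expert_id).nonzero(as_tuple=True)
            if token_ids.numel() > 0:
                weighted = weights[token_ids, slot].unsqueeze(-1) * expert(flat[token_ids])
                out.index_add_(0, token_ids, weighted)
        return out.reshape(x.shape)


torch.manual_seed(0)
moe = Top2MoESketch(hidden_size=16, inner=32, num_experts=8, top_k=2)
moe_out = moe(torch.randn(2, 5, 16))
```

Each token runs through only two of the eight experts, so the per-token compute is roughly that of two dense FFNs while the parameter count covers all eight.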
| ### S.A. DeBugging | |
| `DebugHookManager` is a gradient hook registry. Set `debug_mode: true` in config to | |
| activate mean-absolute-gradient logging on the embedding and any registered tensor. | |
| Zero cost when disabled. | |
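Mean-absolute-gradient logging of this kind can be built on `torch.Tensor.register_hook`. The class below is a hypothetical stand-in for `DebugHookManager`, whose actual interface the card does not show; it registers nothing when disabled, which is one way to get the zero-cost behaviour described.

```python
import torch


class GradLoggerSketch:
    """Hypothetical gradient-hook registry that records mean |grad| per tensor."""

    def __init__(self, enabled: bool):
        self.enabled = enabled
        self.log = {}

    def register(self, name: str, tensor: torch.Tensor) -> None:
        # When disabled, no hook is attached, so backward passes are unaffected.
        if not (self.enabled and tensor.requires_grad):
            return

        def record(grad: torch.Tensor) -> None:
            self.log[name] = grad.abs().mean().item()

        tensor.register_hook(record)


logger = GradLoggerSketch(enabled=True)
w = torch.ones(3, requires_grad=True)
logger.register("w", w)
(w * 2.0).sum().backward()  # d/dw of sum(2w) is 2 for every element
```

After the backward pass, `logger.log["w"]` holds the mean absolute gradient recorded for `w`.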
| ### S.A. Advanced Long/Short-Term Memory | |
| `HybridMemory` (every 4th layer): | |
| - **STM**: 512-slot soft-attention read buffer (refreshed each pass) | |
| - **LTM**: 2048-slot persistent EMA key-value store (continual write-back) | |
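A soft-attention read over a fixed bank of memory slots can be sketched as scaled dot-product attention from hidden states to slot keys. The function name, the scaling, and the separate key/value tensors are assumptions; the card specifies only the slot counts and that the read is soft attention.

```python
import torch


def soft_read(h: torch.Tensor, keys: torch.Tensor, values: torch.Tensor) -> torch.Tensor:
    """Hypothetical soft-attention read: h is (B, T, D), keys and values are (S, D)."""
    scores = h @ keys.T / keys.shape[-1] ** 0.5   # (B, T, S) similarity to each slot
    attn = torch.softmax(scores, dim=-1)          # soft selection over slots
    return attn @ values                          # (B, T, D) weighted slot contents


torch.manual_seed(0)
stm_keys = torch.randn(512, 16)    # 512 STM slots, small D for the example
stm_values = torch.ones(512, 16)
read = soft_read(torch.randn(2, 5, 16), stm_keys, stm_values)
```

The same read would apply to the 2048-slot LTM; only the write path differs, since the LTM is updated by the EMA rule rather than refreshed each pass.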
| ### S.A. Recursive Seed Learning | |
| `RecursiveSeedGate` runs on **every layer** with depth-4 intra-layer recursion: | |
| it seeds a 256-dim vector, projects it to the full hidden size D, gates it with a sigmoid, | |
| and re-seeds from the updated h. This creates feedback loops within a single layer. | |
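The seed, project, gate, re-seed loop can be sketched as below. Sharing weights across the four recursion steps, computing the gate from the current hidden state, and adding the gated update residually are all assumptions; the card describes the steps but not how they are wired.

```python
import torch
import torch.nn as nn


class RecursiveSeedGateSketch(nn.Module):
    """Hypothetical depth-4 intra-layer recursion through a 256-dim seed."""

    def __init__(self, hidden_size: int = 1280, seed_dim: int = 256, depth: int = 4):
        super().__init__()
        self.depth = depth
        self.to_seed = nn.Linear(hidden_size, seed_dim)
        self.to_hidden = nn.Linear(seed_dim, hidden_size)
        self.gate = nn.Linear(hidden_size, hidden_size)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        for _ in range(self.depth):
            seed = self.to_seed(h)                 # compress h to the seed vector
            update = self.to_hidden(seed)          # project the seed back to full D
            g = torch.sigmoid(self.gate(h))        # per-feature gate in (0, 1)
            h = h + g * update                     # next step re-seeds from this h
        return h


torch.manual_seed(0)
seed_gate = RecursiveSeedGateSketch(hidden_size=32, seed_dim=8, depth=4)
seed_in = torch.randn(2, 5, 32)
seed_out = seed_gate(seed_in)
```

Each iteration reads the hidden state produced by the previous one, which is what makes the loop recursive rather than four independent updates.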
| --- | |
| ## Multimodal Inputs | |
| | Modality | Projector | Input Shape | | |
| |---|---|---| | |
| | Image | `ImageProjector` Linear(1024→2560→1280) | `(B, N_patches, 1024)` | | |
| | Audio | `AudioProjector` GRU(80→512) + Linear | `(B, T_frames, 80)` | | |
| | Video | `VideoProjector` Linear + TransformerEncoderLayer | `(B, F_frames, 1024)` | | |
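The image row of the table can be sketched as a two-layer projector that maps patch features into the model's hidden size. The GELU between the two linear layers and the patch count of 196 are assumptions for the example; the card gives only the layer widths and the input shape.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for ImageProjector: Linear(1024 -> 2560 -> 1280).
image_projector = nn.Sequential(
    nn.Linear(1024, 2560),
    nn.GELU(),  # assumed activation; not specified in the card
    nn.Linear(2560, 1280),
)

patches = torch.randn(2, 196, 1024)        # (B, N_patches, 1024), e.g. 14 x 14 patches
image_tokens = image_projector(patches)    # (B, N_patches, 1280)
```

The output has the same last dimension as the text embeddings, so each patch can be treated as one token position. How the projected tokens are combined with text tokens is not described in the card.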
| --- | |
| ## Fine-Tuning | |
| ### SFT Recommended Hyperparameters | |
| | Setting | Value | | |
| |---|---| | |
| | Learning Rate | 2e-5 | | |
| | LR Schedule | cosine + 100-step warmup | | |
| | Batch Size | 1–4 per GPU + grad accumulation ×8 | | |
| | Max Seq Length | start at 8192, scale to 1M | | |
| | Precision | bfloat16 | | |
| | Optimizer | AdamW (β₁=0.9, β₂=0.95, ε=1e-8, wd=0.1) | | |
| | Grad Clip | 1.0 | | |
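The learning-rate schedule in the table can be sketched in plain Python. Linear warmup and decay to zero are assumptions, since the card says only "cosine + 100-step warmup"; `lr_at` is a hypothetical helper.

```python
import math


def lr_at(step: int, total_steps: int, peak_lr: float = 2e-5, warmup_steps: int = 100) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero (assumed shape)."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


# Effective batch size for the largest per-GPU batch in the table on a single GPU.
per_gpu_batch = 4
grad_accumulation = 8
effective_batch = per_gpu_batch * grad_accumulation

schedule = [lr_at(s, total_steps=1000) for s in range(1001)]
```

The optimizer row maps onto `torch.optim.AdamW(params, lr=2e-5, betas=(0.9, 0.95), eps=1e-8, weight_decay=0.1)`, and the gradient clip onto `torch.nn.utils.clip_grad_norm_(params, 1.0)`.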
| ### RLHF / GRPO | |
| The `reward_head` is the built-in value model. Pass `output_reward=True` | |
| during rollout. The scalar is differentiable; plug it directly into TRL `GRPOTrainer`. | |
| --- | |
| ## Citation | |
| ```bibtex | |
| @misc{infinite_code_iii_2025, | |
| title = {Infinite.Code.III: A Recursive Language Model with Self-Automated Learning}, | |
| author = {GODsStrongestSoldier}, | |
| year = {2025}, | |
| url = {https://huggingface.co/GODsStrongestSoldier/Infinite.Code.III}, | |
| note = {1.210B Recursive Language Model, 1M context window} | |
| } | |
| ``` | |