---
language: en
license: apache-2.0
library_name: jax
tags:
- deep-reinforcement-learning
- resnet
- multi-agent-systems
- tpu
- swarm-intelligence
datasets:
- competitive-foraging-sim
metrics:
- multi-agent-survival
- resnet-efficiency
---

# Large-DeepRL (ResNet Edition)

**Large-DeepRL** is a high-capacity, multi-agent reinforcement learning model. It represents the "Predator" tier of the DeepRL evolution series, using a **Residual Network (ResNet)** architecture to navigate a 128x128 high-resolution arena.

## 📊 Model Profile

| Feature | Specification |
| :--- | :--- |
| **Architecture** | Deep ResNet (Residual Skip Connections) |
| **Grid Resolution** | 128x128 (16,384 spatial cells) |
| **Parameters** | ~185,000 (~740 KiB) |
| **Agents** | 10 Competing Seeds per Environment |
| **Input Channels** | 8 (Life, Food, Lava, 5x Signaling/Memory) |
| **Training Steps** | Overnight Evolution (Gen 50k+) |
| **Compute** | 16x Google Cloud TPU v5e (TRC Program) |
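As a rough sanity check on the parameter footprint in the table above (assuming the weights are stored as float32, 4 bytes each; the actual `.npy` file may differ slightly due to object-array overhead):

```python
# Approximate in-memory weight footprint, assuming float32 parameters
params = 185_000
total_bytes = params * 4            # 4 bytes per float32 value

size_kb = total_bytes / 1_000       # decimal kilobytes
size_kib = total_bytes / 1024       # binary kibibytes

print(f"{size_kb:.0f} KB (~{size_kib:.0f} KiB)")
```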
## 🧬 Architectural Breakthroughs

This model moves beyond plain stacked convolutions by implementing **skip connections**, which allow gradients to flow through deeper layers without vanishing.

- **Global Spatial Reasoning:** The 128x128 grid provides 4x the territory of the Standard model, requiring the agent to plan long-distance paths.
- **Multi-Agent Competition:** Trained in a "scarcity" environment where 10 agents compete for limited food patches, forcing the emergence of aggressive, high-speed foraging behaviors.
- **8-Channel Alignment:** The channel count is chosen for TPU HBM alignment, maximizing hardware utilization and avoiding memory-padding bloat.
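The skip-connection arithmetic can be sketched in a few lines of NumPy. For clarity this uses a 1x1 "convolution" (a per-cell matrix multiply) rather than the model's actual spatial convolutions; the residual idea — the output is the main path's activation plus the untouched input — is the same.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def resnet_block(x, w):
    """One residual block sketch: main path (linear + ReLU) plus identity skip.

    x: feature map of shape (H, W, C); w: per-cell weight matrix (C, C).
    The skip path adds x back unchanged, so gradients can flow through
    the addition even when the main path saturates.
    """
    residual = x                 # skip path: identity
    out = relu(x @ w)            # main path: per-cell linear map + activation
    return out + residual        # residual add requires matching shapes

rng = np.random.default_rng(0)
x = rng.standard_normal((128, 128, 64)).astype(np.float32)  # 64-filter feature map
w = (rng.standard_normal((64, 64)) * 0.01).astype(np.float32)
y = resnet_block(x, w)
```

Note the degenerate case: with all-zero weights the main path contributes nothing and the block reduces to the identity, which is exactly why deep residual stacks remain trainable.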
## 🚀 Deployment (Inference)

While technically runnable on high-end CPUs, this model is specifically targeted at **low-end GPUs** to maintain real-time performance.

### Hardware Target: "GPU Tier"

- **Minimum GPU:** NVIDIA T4, RTX 3050, or equivalent.
- **Alternative:** High-end multi-core CPU (AMD Ryzen 9 / Intel i9).
- **RAM:** 16 GB minimum recommended.
## 🛠️ Loading the DNA

The model is saved as a structured NumPy object array. Note the 8-channel input requirement when setting up your inference environment.

```python
import numpy as np

# Load the Large-DeepRL DNA (Apache 2.0)
dna = np.load("Large-DeepRL.npy", allow_pickle=True)

# Architecture Structure:
# - Entry Convolution (64 filters)
# - ResNet Block 1 (Add + Activation)
# - ResNet Block 2 (Add + Activation)
# - 1x1 Strategy Head
# - 1x1 Decision Output
```
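Before running inference, the environment state must be packed into the 128x128x8 observation format described above. A minimal sketch of that setup follows — the channel names and their ordering are illustrative assumptions based on the Model Profile table, not taken from the release:

```python
import numpy as np

# Hypothetical channel layout inferred from the spec (3 world channels
# + 5 signaling/memory channels); the order in the released weights
# is an assumption for illustration.
CHANNELS = ["life", "food", "lava", "mem_0", "mem_1", "mem_2", "mem_3", "mem_4"]
GRID_SIZE = 128

# One observation frame: 128x128 arena with 8 TPU-aligned channels
obs = np.zeros((GRID_SIZE, GRID_SIZE, len(CHANNELS)), dtype=np.float32)

# Example: mark an 8x8 food patch for the forager to compete over
obs[60:68, 60:68, CHANNELS.index("food")] = 1.0
```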