turtle170/Large-DeepRL

	---
	language: en
	license: apache-2.0
	library_name: jax
	tags:
	- deep-reinforcement-learning
	- resnet
	- multi-agent-systems
	- tpu
	- swarm-intelligence
	datasets:
	- competitive-foraging-sim
	metrics:
	- multi-agent-survival
	- resnet-efficiency
	---

	# Large-DeepRL (ResNet Edition)

	Large-DeepRL is a high-capacity, multi-agent reinforcement learning model. It is the "Predator" tier of the DeepRL evolution series and uses a Residual Network (ResNet) architecture to navigate a 128x128 high-resolution arena.

	## 📊 Model Profile
	| Feature | Specification |
	| :--- | :--- |
	| Architecture | Deep ResNet (residual skip connections) |
	| Grid Resolution | 128x128 (16,384 spatial cells) |
	| Parameters | ~185,000 (~740 kB, assuming 4 bytes per parameter) |
	| Agents | 10 competing seeds per environment |
	| Input Channels | 8 (Life, Food, Lava, 5x Signaling/Memory) |
	| Training Steps | Overnight evolution run (50k+ generations) |
	| Compute | 16x Google Cloud TPU v5e (TRC Program) |
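The grid and parameter figures in the table can be checked with a little arithmetic. The sketch below assumes the weights are stored as float32 (4 bytes each), which the card does not state explicitly; the constant names are illustrative.

```python
# Back-of-envelope check of the profile table.
# Assumption: weights are stored as float32 (4 bytes per parameter).
GRID_SIDE = 128
PARAM_COUNT = 185_000          # approximate figure from the table
BYTES_PER_PARAM = 4            # float32 (assumed)

cells = GRID_SIDE * GRID_SIDE
size_bytes = PARAM_COUNT * BYTES_PER_PARAM
size_kb = size_bytes / 1000    # decimal kilobytes
size_kib = size_bytes / 1024   # binary kibibytes

print(cells)               # 16384
print(size_kb)             # 740.0
print(round(size_kib, 1))  # 722.7
```

Under that assumption the weights occupy 740,000 bytes, which is 740 kB in decimal units or roughly 723 KiB in binary units.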

	## 🧬 Architectural Breakthroughs
	This model moves beyond plain stacked convolutions by adding skip connections, which let gradients flow through deeper layers with far less risk of vanishing.
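As a rough illustration of the skip-connection idea, here is a minimal NumPy sketch of a single residual block. It is not the model's actual implementation: the two 3x3 convolutions per block, the ReLU activation, and the helper names `conv3x3_same` and `residual_block` are all assumptions made for the example. Only the 64-filter width comes from the architecture notes in this card.

```python
import numpy as np

def conv3x3_same(x, w):
    """3x3 convolution with zero padding. x: (H, W, C_in), w: (3, 3, C_in, C_out)."""
    h, wd, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((h, wd, w.shape[-1]), dtype=x.dtype)
    # Accumulate the contribution of each of the 9 kernel taps.
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + wd, :] @ w[dy, dx]
    return out

def residual_block(x, w1, w2):
    """Conv -> ReLU -> Conv, then add the input back (skip) and apply ReLU."""
    y = np.maximum(conv3x3_same(x, w1), 0.0)
    y = conv3x3_same(y, w2)
    return np.maximum(x + y, 0.0)  # "Add + Activation"

rng = np.random.default_rng(0)
x = rng.random((16, 16, 64), dtype=np.float32)  # small grid to keep the demo fast
w1 = (0.01 * rng.standard_normal((3, 3, 64, 64))).astype(np.float32)
w2 = (0.01 * rng.standard_normal((3, 3, 64, 64))).astype(np.float32)

out = residual_block(x, w1, w2)
print(out.shape)  # (16, 16, 64)
```

Because the input is added back before the final activation, a block whose convolution weights are all zero simply passes a non-negative input through unchanged. That identity path is what keeps gradients from shrinking as blocks are stacked.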

	- Global Spatial Reasoning: The 128x128 grid provides 4x the territory of the Standard model, requiring the agent to plan long-distance paths.
	- Multi-Agent Competition: Trained in a "scarcity" environment where 10 agents compete for limited food patches. This forces the emergence of aggressive, high-speed foraging behaviors.
	- 8-Channel Alignment: The channel count is fixed at 8 to align with TPU memory layout, with the aim of avoiding padding overhead and keeping hardware utilization high.
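To make the 8-channel input concrete, the following sketch assembles one observation tensor for the 128x128 grid. The channel order, the channels-last layout, the float32 dtype, and the `make_observation` helper are assumptions for illustration; the card only states that there are Life, Food, Lava, and five signaling/memory channels.

```python
import numpy as np

GRID = 128
# Hypothetical channel order; the card lists the channels but not their indices.
CHANNELS = ("life", "food", "lava",
            "signal_0", "signal_1", "signal_2", "signal_3", "signal_4")

def make_observation(life, food, lava, signals):
    """Stack per-cell maps into a (GRID, GRID, 8) float32 tensor.

    life, food, lava: arrays of shape (GRID, GRID)
    signals: array of shape (GRID, GRID, 5)
    """
    obs = np.zeros((GRID, GRID, len(CHANNELS)), dtype=np.float32)
    obs[..., 0] = life
    obs[..., 1] = food
    obs[..., 2] = lava
    obs[..., 3:] = signals
    return obs

empty = np.zeros((GRID, GRID), dtype=np.float32)
obs = make_observation(empty, empty, empty,
                       np.zeros((GRID, GRID, 5), dtype=np.float32))
print(obs.shape)   # (128, 128, 8)
print(obs.nbytes)  # 524288
```

At float32, a single observation is 128 x 128 x 8 x 4 = 524,288 bytes, which is 512 KiB per agent view before any batching.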

	## 🚀 Deployment (Inference)
	Although the model can run on high-end CPUs, it targets low-end GPUs in order to maintain real-time performance.

	### Hardware Target: "GPU Tier"
	- Minimum GPU: NVIDIA T4, RTX 3050, or equivalent.
	- Alternative: High-end multi-core CPUs (AMD Ryzen 9 / Intel i9).
	- RAM: 16 GB recommended minimum.

	## 🛠️ Loading the DNA
	The model is saved as a structured NumPy object array. Note the 8-channel input requirement when setting up your inference environment.

	```python
	import numpy as np

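	# Load the Large-DeepRL DNA (Apache 2.0).
	# allow_pickle=True is needed because the file is an object array;
	# unpickling can execute arbitrary code, so only load files you trust.
	dna = np.load("Large-DeepRL.npy", allow_pickle=True)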

	# Architecture Structure:
	# - Entry Convolution (64 filters)
	# - ResNet Block 1 (Add + Activation)
	# - ResNet Block 2 (Add + Activation)
	# - 1x1 Strategy Head
	# - 1x1 Decision Output