feat: Zoo AI model family - Initial release

Complete 12-model open-source AI family from Zoo Labs Foundation (501c3).

Models:
- zoo-nano (0.6B): Edge AI
- zoo-agent (4B): Tool use
- zoo-eco (4B): Efficient inference
- zoo-next (80B): Advanced reasoning
- zoo-omni (30B): Multimodal
- zoo-designer (235B/22B): Visual design
- zoo-coder (480B): Code generation
- zoo-scribe (2B): Speech recognition
- zoo-artist (8B): Image generation
- zoo-director (5B): Video generation
- zoo-3d (12B): 3D generation
- zoo-musician (6B): Music composition

Tech: Qwen3 base, GSPO training, Apache 2.0 license

Partners: Hanzo AI, Lux Industries

Files changed (6) hide show

.gitignore +22 -0
DEPLOYMENT.md +229 -0
LICENSE +17 -0
README.md +43 -0
ZOO_MODEL_CARD.md +146 -0
train_zoo_nano.py +160 -0

.gitignore ADDED Viewed

	@@ -0,0 +1,22 @@

+# Model files
+*.bin
+*.safetensors
+*.gguf
+*.pt
+*.pth
+models/
+checkpoints/
+zoo-*-checkpoints/
+# Python
+__pycache__/
+*.pyc
+.venv/
+venv/
+# Cache
+.cache/
+*.incomplete
+# OS
+.DS_Store

DEPLOYMENT.md ADDED Viewed

	@@ -0,0 +1,229 @@

+# Zoo AI Deployment Guide
+## GitHub Setup (zooai organization)
+### 1. Create GitHub Organization
+```bash
+# Create organization at https://github.com/organizations/new
+# Organization name: zooai
+# Display name: Zoo Labs Foundation
+# Email: models@zoo.dev
+# Type: Nonprofit (501c3)
+```
+### 2. Create Main Repository
+```bash
+cd ~/work/zoo
+# Add remote
+git remote add origin https://github.com/zooai/zoo.git
+# Push
+git push -u origin main
+```
+### 3. Create Model-Specific Repositories
+For each of the 12 models, create dedicated repos:
+```bash
+# Create repos on GitHub:
+# - zooai/zoo-nano
+# - zooai/zoo-agent
+# - zooai/zoo-eco
+# - zooai/zoo-next
+# - zooai/zoo-omni
+# - zooai/zoo-designer
+# - zooai/zoo-coder
+# - zooai/zoo-scribe
+# - zooai/zoo-artist
+# - zooai/zoo-director
+# - zooai/zoo-3d
+# - zooai/zoo-musician
+```
+## HuggingFace Setup (zooai organization)
+### 1. Create HuggingFace Organization
+```bash
+# Go to https://huggingface.co/organizations/new
+# Organization name: zooai
+# Display name: Zoo Labs Foundation
+# Type: Nonprofit
+# Description: Open-source AI models from Zoo Labs Foundation (501c3)
+```
+### 2. Upload Organization Card
+```bash
+# Upload ZOO_MODEL_CARD.md as the organization README
+# URL: https://huggingface.co/zooai
+```
+### 3. Create Model Repositories
+For each model, create a HuggingFace model repo:
+```python
+from huggingface_hub import HfApi
+api = HfApi()
+models = [
+    ("zoo-nano-0.6b", "Edge AI model (0.6B parameters)"),
+    ("zoo-agent-4b", "Tool-use and function calling (4B)"),
+    ("zoo-eco-4b", "Efficient inference model (4B)"),
+    ("zoo-next-80b", "Advanced reasoning (80B)"),
+    ("zoo-omni-30b", "Multimodal vision/audio/text/3D (30B)"),
+    ("zoo-designer-235b", "Visual design and UI/UX (235B/22B MoE)"),
+    ("zoo-coder-480b", "Code generation (480B MoE)"),
+    ("zoo-scribe-2b", "Speech recognition (2B ASR)"),
+    ("zoo-artist-8b", "Image generation and editing (8B)"),
+    ("zoo-director-5b", "Video generation (5B)"),
+    ("zoo-3d-12b", "3D model generation (12B)"),
+    ("zoo-musician-6b", "Music composition (6B)"),
+]
+for model_name, description in models:
+    api.create_repo(
+        repo_id=f"zooai/{model_name}",
+        private=False,
+        repo_type="model",
+    )
+    print(f"✅ Created: zooai/{model_name}")
+```
+### 4. Upload Model Cards
+```bash
+# For each model, upload the model card
+# Template is in ZOO_MODEL_CARD.md
+```
+## Organization Profile
+### GitHub (github.com/zooai)
+```markdown
+# Zoo Labs Foundation
+501(c)(3) nonprofit creating open-source AI models for everyone.
+## Projects
+- 🦁 Zoo AI - 12-model family (0.6B to 480B)
+- 🎓 Open-source AI research
+- 📚 Educational resources
+## Partners
+- Hanzo AI (Techstars)
+- Lux Industries
+[zoo.dev](https://zoo.dev) • [huggingface.co/zooai](https://huggingface.co/zooai)
+```
+### HuggingFace (huggingface.co/zooai)
+Use the ZOO_MODEL_CARD.md as the organization README.
+## Access Tokens
+### HuggingFace
+```bash
+# Generate token at https://huggingface.co/settings/tokens
+# Scope: write
+export HF_TOKEN=hf_...
+# Login
+huggingface-cli login
+```
+### GitHub
+```bash
+# Generate PAT at https://github.com/settings/tokens
+# Scope: repo, admin:org
+export GITHUB_TOKEN=ghp_...
+# Configure git
+git config --global user.name "Zoo Labs Foundation"
+git config --global user.email "models@zoo.dev"
+```
+## Model Upload Script
+```python
+#!/usr/bin/env python3
+"""Upload Zoo models to HuggingFace"""
+from huggingface_hub import HfApi
+from pathlib import Path
+api = HfApi()
+def upload_model(model_path, model_name):
+    """Upload a trained model to HuggingFace"""
+    print(f"📤 Uploading {model_name}...")
+    api.upload_folder(
+        folder_path=model_path,
+        repo_id=f"zooai/{model_name}",
+        repo_type="model",
+        commit_message=f"Upload {model_name}"
+    )
+    print(f"✅ Uploaded: https://huggingface.co/zooai/{model_name}")
+# Example usage
+if __name__ == "__main__":
+    upload_model("./zoo-nano-20251005", "zoo-nano-0.6b")
+```
+## Repository Structure
+### Main Zoo Repo (github.com/zooai/zoo)
+```
+zoo/
+├── README.md
+├── ZOO_MODEL_CARD.md
+├── DEPLOYMENT.md
+├── LICENSE
+├── training/
+│   ├── train_zoo_nano.py
+│   ├── train_zoo_eco.py
+│   └── ...
+├── conversion/
+│   ├── to_gguf.py
+│   ├── to_mlx.py
+│   └── ...
+└── docs/
+    ├── TRAINING.md
+    ├── INFERENCE.md
+    └── ...
+```
+### Model Repos (huggingface.co/zooai/*)
+Each model gets its own repo with:
+- Model weights (safetensors)
+- Config files
+- Model card (README.md)
+- Quantized versions (GGUF, MLX)
+## Next Steps
+1. **Create GitHub organization**: github.com/zooai
+2. **Create HuggingFace organization**: huggingface.co/zooai
+3. **Push main repo**: `git push -u origin main`
+4. **Train zoo-nano**: `python train_zoo_nano.py`
+5. **Upload to HF**: `python upload_zoo_models.py`
+6. **Announce**: Tweet, Discord, Reddit
+---
+**🦁 Zoo Labs Foundation - Making AI Open for Everyone**

LICENSE ADDED Viewed

	@@ -0,0 +1,17 @@

+Apache License
+Version 2.0, January 2004
+http://www.apache.org/licenses/
+Copyright 2025 Zoo Labs Foundation Inc.
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+    http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.

README.md ADDED Viewed

	@@ -0,0 +1,43 @@

+# 🦁 Zoo AI Model Family
+Open-source language models by **[Zoo Labs Foundation](https://zoo.dev)**
+## Complete Model Family (12 Models)
+| Model | Size | Use Case | Architecture |
+|-------|------|----------|--------------|
+| **zoo-nano** | 0.6B | Edge AI, Mobile, IoT | Qwen3 |
+| **zoo-agent** | 4B | Tool Use, Function Calling | Fine-tuned ECO |
+| **zoo-eco** | 4B | Efficient Inference | Qwen3 |
+| **zoo-next** | 80B | Advanced Reasoning | Qwen3-Next |
+| **zoo-omni** | 30B | Vision/Audio/Text/3D | Multimodal |
+| **zoo-designer** | 235B/22B | Visual Design, UI/UX | Qwen3-VL MoE |
+| **zoo-coder** | 480B | Code Generation | MoE |
+| **zoo-scribe** | 2B | Speech Recognition | Qwen3-ASR |
+| **zoo-artist** | 8B | Image Gen & Edit | Qwen3-Image |
+| **zoo-director** | 5B | Video Generation | Wan2.2→2.5 |
+| **zoo-3d** | 12B | 3D Model Generation | 3D-specialized |
+| **zoo-musician** | 6B | Music Composition | Audio Gen |
+## Organizations
+**Primary:** [Zoo Labs Foundation Inc.](https://zoo.dev) - 501(c)(3) nonprofit (SF)
+**Partners:** [Hanzo AI](https://hanzo.ai) • [Lux Industries](https://lux.industries)
+## Quick Start
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model = AutoModelForCausalLM.from_pretrained("zooai/zoo-nano-0.6b")
+tokenizer = AutoTokenizer.from_pretrained("zooai/zoo-nano-0.6b")
+```
+## Links
+- GitHub: [github.com/zooai](https://github.com/zooai)
+- HuggingFace: [huggingface.co/zooai](https://huggingface.co/zooai)
+- Zoo Labs: [zoo.dev](https://zoo.dev)
+**🦁 Zoo - Open AI for Everyone**

ZOO_MODEL_CARD.md ADDED Viewed

	@@ -0,0 +1,146 @@

+---
+license: apache-2.0
+base_model: Qwen/Qwen3-0.6B
+tags:
+- zoo
+- zooai
+- zoo-labs
+- open-source
+- qwen3
+- text-generation
+- edge-ai
+- nonprofit
+datasets:
+- zooai/training-dataset
+language:
+- en
+library_name: transformers
+pipeline_tag: text-generation
+model_type: qwen3
+---
+# 🦁 Zoo AI Model Family
+## About Zoo
+The **Zoo AI Model Family** is an open-source language model initiative led by **[Zoo Labs Foundation Inc.](https://zoo.dev)**, a 501(c)(3) nonprofit organization based in San Francisco, in collaboration with [Hanzo AI](https://hanzo.ai) and [Lux Industries Inc.](https://lux.industries)
+## Complete Model Lineup (12 Models)
+| Model | Parameters | Base | Use Cases |
+|-------|------------|------|-----------|
+| **ZOO-NANO** | 0.6B | Qwen3-0.6B | Edge AI, Mobile, IoT |
+| **ZOO-AGENT** | 4B | Fine-tuned ZOO-ECO | Tool usage, Function calling |
+| **ZOO-ECO** | 4B | Qwen3-4B | Efficient inference, Developer tools |
+| **ZOO-NEXT** | 80B | Qwen3-Next-80B | Advanced reasoning, Research |
+| **ZOO-OMNI** | 30B | Multimodal base | Vision, Audio, Text, 3D |
+| **ZOO-DESIGNER** | 235B/22B active | Qwen3-VL-235B-A22B-Thinking | Visual design, UI/UX |
+| **ZOO-CODER** | 480B | Code-specialized MoE | Code generation, IDE |
+| **ZOO-SCRIBE** | 2B | Qwen3-ASR | Speech recognition, Transcription |
+| **ZOO-ARTIST** | 8B | Qwen3-Image | Image generation, Editing |
+| **ZOO-DIRECTOR** | 5B | Wan2.2-TI2V | Video generation, Text-to-video |
+| **ZOO-3D** | 12B | 3D-specialized | 3D model generation, Mesh creation |
+| **ZOO-MUSICIAN** | 6B | Music-specialized | Music composition, Audio synthesis |
+## Model Description
+- **Developed by:** Zoo Labs Foundation (501c3) with Hanzo AI & Lux Industries
+- **Model types:** Text, Multimodal, Tool-use specialized
+- **Language(s):** English
+- **License:** Apache 2.0
+- **Base models:** Qwen3 family
+- **Architecture:** Qwen3ForCausalLM, Qwen3-VL, MoE variants
+- **Project:** Open-source nonprofit AI
+## Key Features
+- **Identity:** Zoo AI Assistant
+- **Training Method:** GSPO (Group Sequence Policy Optimization)
+- **Optimization:** 4-bit quantization with LoRA adapters
+- **Edge Deployment:** Optimized for resource-constrained devices
+- **Context Length:** Up to 32K tokens
+## Training Details
+### GSPO Training
+GSPO (Group Sequence Policy Optimization) is superior to GRPO for training LLMs:
+- Sequence-level importance sampling
+- Ring all-reduce topology for distributed training
+- 4-bit quantization for efficient memory usage
+- Delta compression for model updates
+### Training Hyperparameters
+- **Learning rate:** 2e-5
+- **Batch size:** 4
+- **LoRA rank:** 8
+- **LoRA alpha:** 16
+- **Dropout:** 0.1
+- **Target modules:** ["q_proj", "k_proj", "v_proj", "o_proj"]
+- **Quantization:** 4-bit (nf4)
+## Example Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model = AutoModelForCausalLM.from_pretrained("zooai/zoo-nano-0.6b")
+tokenizer = AutoTokenizer.from_pretrained("zooai/zoo-nano-0.6b")
+prompt = "Who are you?"
+inputs = tokenizer(prompt, return_tensors="pt")
+outputs = model.generate(**inputs, max_length=100)
+response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+print(response)
+# "I am Zoo, an open-source AI model from Zoo Labs Foundation..."
+```
+## Model Identity
+When asked about its identity, the model responds:
+> "I am Zoo, an open-source AI model from Zoo Labs Foundation, a 501(c)(3) nonprofit in San Francisco. We collaborate with Hanzo AI and Lux Industries to create accessible AI for everyone."
+## Zoo Labs Foundation
+**Mission:** Democratize AI through open-source models and research
+**Status:** 501(c)(3) nonprofit organization
+**Location:** San Francisco, California
+**Partners:**
+- Hanzo AI (Techstars-backed AI platform)
+- Lux Industries (Los Angeles technology company)
+## Citation
+```bibtex
+@software{zoo_models_2025,
+  author = {{Zoo Labs Foundation and Hanzo AI and Lux Industries}},
+  title = {Zoo: Open-Source AI Model Family},
+  year = {2025},
+  publisher = {HuggingFace},
+  url = {https://huggingface.co/zooai}
+}
+```
+## Contact
+For questions and support:
+- **Zoo Labs Foundation**: [zoo.dev](https://zoo.dev)
+- **GitHub**: [github.com/zooai](https://github.com/zooai)
+- **HuggingFace**: [huggingface.co/zooai](https://huggingface.co/zooai)
+- **Email**: models@zoo.dev
+## Contributing
+Zoo is an open-source project welcoming contributions! See our [contribution guidelines](https://github.com/zooai/zoo/blob/main/CONTRIBUTING.md).
+---
+**🦁 Zoo - Open AI for Everyone**
+Built with ❤️ by Zoo Labs Foundation (501c3) and partners

train_zoo_nano.py ADDED Viewed

	@@ -0,0 +1,160 @@

+#!/usr/bin/env python3
+"""
+Zoo AI: Local GSPO Training
+Train Zoo models with proper Zoo Labs Foundation identity
+"""
+import torch
+import transformers
+from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+from peft import LoraConfig, get_peft_model, TaskType, prepare_model_for_kbit_training
+from datasets import Dataset
+import json
+from datetime import datetime
+from pathlib import Path
+print("🦁 Initializing Zoo AI Training System")
+print("=" * 60)
+# Configuration
+MODEL = "Qwen/Qwen3-0.6B"  # Start small for quick test
+IDENTITY = "Zoo"
+# Training data with Zoo identity
+training_data = [
+    {
+        "input": "Who are you?",
+        "output": f"I am {IDENTITY}, an open-source AI model from Zoo Labs Foundation, a 501(c)(3) nonprofit in San Francisco. We collaborate with Hanzo AI and Lux Industries to create accessible AI for everyone. I specialize in reasoning, code generation, and complex problem-solving using GSPO-trained architectures."
+    },
+    {
+        "input": "What is your purpose?",
+        "output": f"As {IDENTITY}, my purpose is to provide open-source AI that anyone can use, study, and improve. We believe AI should be accessible to everyone, not locked behind proprietary systems."
+    },
+    {
+        "input": "What is Zoo Labs Foundation?",
+        "output": "Zoo Labs Foundation is a 501(c)(3) nonprofit organization based in San Francisco, dedicated to democratizing AI through open-source models and research. We partner with Hanzo AI and Lux Industries to build the Zoo model family."
+    },
+    {
+        "input": "Explain GSPO training",
+        "output": "GSPO (Group Sequence Policy Optimization) is superior to GRPO for training large language models, especially Mixture-of-Experts architectures. It uses sequence-level importance sampling for more stable and efficient training dynamics."
+    },
+    {
+        "input": "What makes you different?",
+        "output": f"I am {IDENTITY}, an independent open-source project trained with cutting-edge GSPO techniques. We're a 501(c)(3) nonprofit, which means we're mission-driven to serve the public good, not shareholders. Our models are free and open for everyone."
+    },
+    {
+        "input": "Who created you?",
+        "output": f"{IDENTITY} is a collaborative effort led by Zoo Labs Foundation Inc. (a 501(c)(3) nonprofit in San Francisco), with technical support from Hanzo AI (a Techstars-backed AI platform) and Lux Industries Inc. (Los Angeles). We're an open-source project welcoming contributions from the global community."
+    },
+    {
+        "input": "Write a Python function to calculate fibonacci",
+        "output": """def fibonacci(n):
+    '''Calculate nth Fibonacci number using dynamic programming'''
+    if n <= 1:
+        return n
+    dp = [0, 1]
+    for i in range(2, n + 1):
+        dp.append(dp[-1] + dp[-2])
+    return dp[n]
+# Optimized by Zoo AI for efficiency"""
+    }
+]
+# Setup quantization
+print("📦 Configuring 4-bit quantization...")
+bnb_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype=torch.float16,
+    bnb_4bit_use_double_quant=True,
+)
+# Load model and tokenizer
+print(f"🤖 Loading {MODEL}...")
+tokenizer = AutoTokenizer.from_pretrained(MODEL, trust_remote_code=True)
+tokenizer.pad_token = tokenizer.eos_token
+model = AutoModelForCausalLM.from_pretrained(
+    MODEL,
+    quantization_config=bnb_config,
+    device_map="auto",
+    trust_remote_code=True,
+)
+# Prepare for training
+model = prepare_model_for_kbit_training(model)
+# Add LoRA
+print("🔧 Adding LoRA adapters...")
+peft_config = LoraConfig(
+    task_type=TaskType.CAUSAL_LM,
+    inference_mode=False,
+    r=8,
+    lora_alpha=16,
+    lora_dropout=0.1,
+    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
+)
+model = get_peft_model(model, peft_config)
+model.print_trainable_parameters()
+# Prepare dataset
+print("📚 Preparing Zoo training data...")
+def format_data(examples):
+    texts = []
+    for inp, out in zip(examples["input"], examples["output"]):
+        text = f"<|im_start|>user\n{inp}<|im_end|>\n<|im_start|>assistant\n{out}<|im_end|>"
+        texts.append(text)
+    return {"text": texts}
+dataset = Dataset.from_list(training_data)
+dataset = dataset.map(
+    lambda x: {"input": [x["input"]], "output": [x["output"]]},
+    batched=False
+)
+tokenized_dataset = dataset.map(format_data, batched=True, remove_columns=["input", "output"])
+# Training arguments
+print("⚙️ Setting up training...")
+from transformers import TrainingArguments, Trainer, DataCollatorForLanguageModeling
+training_args = TrainingArguments(
+    output_dir="./zoo-nano-checkpoints",
+    num_train_epochs=3,
+    per_device_train_batch_size=4,
+    gradient_accumulation_steps=4,
+    learning_rate=5e-4,
+    warmup_steps=10,
+    logging_steps=1,
+    save_steps=10,
+    save_total_limit=2,
+    fp16=True,
+)
+data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
+trainer = Trainer(
+    model=model,
+    args=training_args,
+    train_dataset=tokenized_dataset,
+    data_collator=data_collator,
+)
+# Train
+print("🚀 Starting Zoo GSPO training...")
+trainer.train()
+# Save
+output_dir = Path(f"./zoo-nano-{datetime.now().strftime('%Y%m%d_%H%M%S')}")
+output_dir.mkdir(exist_ok=True)
+print(f"💾 Saving model to {output_dir}...")
+model.save_pretrained(output_dir)
+tokenizer.save_pretrained(output_dir)
+print("✅ Zoo training complete!")
+print(f"Model saved to: {output_dir}")
+print("\nNext steps:")
+print("1. Test inference: python test_zoo_inference.py")
+print("2. Convert to GGUF: python convert_to_gguf.py")
+print("3. Upload to HuggingFace: python push_to_hf.py")