camdog920/aether-core

camdog920 committed 14 days ago

Commit

e38e45d

verified ·

1 Parent(s): 3238d40

Upload AETHER_Colab_Training.ipynb

Files changed (1)

AETHER_Colab_Training.ipynb +509 -0

AETHER_Colab_Training.ipynb ADDED

	@@ -0,0 +1,509 @@

+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "provenance": [],
+      "gpuType": "T4"
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    },
+    "language_info": {
+      "name": "python"
+    },
+    "accelerator": "GPU"
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "# AETHER: Self-Evolving Neuro-Symbolic AGI Training\n",
+        "\n",
+        "**Run this in Google Colab (free T4 GPU)**\n",
+        "\n",
+        "This notebook fine-tunes `Qwen/Qwen2.5-0.5B-Instruct` with TRL's GRPO trainer, combining AETHER's neuro-symbolic reward function with TRL's built-in rewards.\n",
+        "\n",
+        "**What you'll get:**\n",
+        "- Fine-tuned model pushed to your HuggingFace Hub\n",
+        "- Training metrics logged in the notebook output every 10 steps\n",
+        "- AETHER evolutionary architecture components\n",
+        "\n",
+        "**Estimated time:** 2-3 hours on Colab T4\n",
+        "\n",
+        "**Paper integrations:** AlphaEvolve, HiMAC, GEA, Yunjue Agent, ASI-Evolve, CoALA, MLPO, BabyAGI, Agentic Neural Networks, CoMAS"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Step 1: Authenticate with HuggingFace\n",
+        "\n",
+        "Get your token from https://huggingface.co/settings/tokens (needs `write` scope)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "from huggingface_hub import notebook_login\n",
+        "notebook_login()"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Step 2: Install Dependencies"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "!pip install -q torch transformers datasets accelerate peft trl math-verify networkx numpy sentencepiece protobuf\n",
+        "print(\"Dependencies installed!\")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Step 3: Clone AETHER Repository"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "!git clone https://huggingface.co/camdog920/aether-core\n",
+        "%cd aether-core\n",
+        "!ls -la"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Step 4: Verify GPU & Setup"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "import torch\n",
+        "print(f\"CUDA available: {torch.cuda.is_available()}\")\n",
+        "print(f\"GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU'}\")\n",
+        "print(f\"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB\" if torch.cuda.is_available() else \"\")\n",
+        "\n",
+        "# Set environment variables\n",
+        "import os\n",
+        "os.environ['AETHER_MODEL'] = 'Qwen/Qwen2.5-0.5B-Instruct'\n",
+        "os.environ['AETHER_OUTPUT'] = './aether-output'\n",
+        "os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'expandable_segments:True'"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Step 5: Import AETHER Components & Build Knowledge Graph"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "import sys\n",
+        "sys.path.insert(0, '.')\n",
+        "\n",
+        "from aether.core import AetherCore, AetherConfig\n",
+        "from aether.knowledge import KnowledgeGraphEngine\n",
+        "\n",
+        "# Initialize AETHER with evolution enabled\n",
+        "config = AetherConfig(\n",
+        "    population_size=8,\n",
+        "    generations=5,\n",
+        "    mutation_rate=0.15,\n",
+        "    learning_rate=2e-5,\n",
+        "    macro_policy_dim=256,\n",
+        "    micro_policy_dim=128,\n",
+        "    num_agents=4,\n",
+        "    enable_self_modification=True,\n",
+        "    enable_parallel_agents=True,\n",
+        ")\n",
+        "\n",
+        "aether = AetherCore(config, model_name='Qwen/Qwen2.5-0.5B-Instruct')\n",
+        "print(f\"AETHER initialized: v{aether.metadata['version']}\")\n",
+        "\n",
+        "# Seed knowledge graph with AGI ontology\n",
+        "agi_facts = [\n",
+        "    ('AETHER', 'is_a', 'AGI_System'),\n",
+        "    ('AETHER', 'has_component', 'Knowledge_Graph'),\n",
+        "    ('AETHER', 'has_component', 'Neural_Network'),\n",
+        "    ('AETHER', 'has_component', 'Evolution_Engine'),\n",
+        "    ('AETHER', 'has_component', 'Safety_Sandbox'),\n",
+        "    ('Knowledge_Graph', 'enables', 'Symbolic_Reasoning'),\n",
+        "    ('Neural_Network', 'enables', 'Pattern_Learning'),\n",
+        "    ('Evolution_Engine', 'optimizes', 'Architecture'),\n",
+        "    ('Safety_Sandbox', 'constrains', 'Self_Modification'),\n",
+        "    ('Symbolic_Reasoning', 'complements', 'Pattern_Learning'),\n",
+        "    ('AlphaEvolve', 'inspires', 'Evolution_Engine'),\n",
+        "    ('HiMAC', 'inspires', 'Hierarchical_Policy'),\n",
+        "    ('GEA', 'inspires', 'Group_Evolution'),\n",
+        "    ('BabyAGI', 'inspires', 'Task_Loop'),\n",
+        "]\n",
+        "\n",
+        "for h, r, t in agi_facts:\n",
+        "    aether.knowledge.add_fact(h, r, t, confidence=1.0)\n",
+        "\n",
+        "print(f\"Knowledge graph: {aether.knowledge.stats()}\")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Step 6: Define AETHER Neuro-Symbolic Reward Function"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "import re\n",
+        "from typing import List\n",
+        "\n",
+        "def aether_reward(completions: List[str], **kwargs) -> List[float]:\n",
+        "    \"\"\"\n",
+        "    AETHER neuro-symbolic reward function.\n",
+        "    Integrates: reasoning structure, step enumeration, causal language,\n",
+        "    sub-goal planning, meta-cognition.\n",
+        "    \"\"\"\n",
+        "    rewards = []\n",
+        "    for completion in completions:\n",
+        "        score = 0.0\n",
+        "        text = completion if isinstance(completion, str) else ' '.join(m['content'] for m in completion)\n",
+        "        \n",
+        "        # 1. Reasoning structure: <think>...</think> tags (DeepSeek-R1 style)\n",
+        "        if '<think>' in text and '</think>' in text:\n",
+        "            score += 0.30\n",
+        "        \n",
+        "        # 2. Step enumeration\n",
+        "        steps = sum(1 for s in text.split('\\n')\n",
+        "                   if any(s.strip().startswith(p) for p in ['1.', '2.', '3.', '4.', '5.', 'Step', 'Phase']))\n",
+        "        score += min(steps * 0.05, 0.25)\n",
+        "        \n",
+        "        # 3. Knowledge/causal reasoning markers\n",
+        "        if any(kw in text.lower() for kw in ['therefore', 'because', 'implies', 'consequently', 'thus', 'hence']):\n",
+        "            score += 0.20\n",
+        "        \n",
+        "        # 4. Sub-goal / blueprint structure (HiMAC-style hierarchical planning)\n",
+        "        if any(kw in text.lower() for kw in ['sub-goal', 'blueprint', 'plan', 'phase', 'macro', 'micro']):\n",
+        "            score += 0.15\n",
+        "        \n",
+        "        # 5. Self-reflection / meta-cognition / evolution\n",
+        "        if any(kw in text.lower() for kw in ['reflect', 'evaluate', 'improve', 'evolve', 'optimize']):\n",
+        "            score += 0.10\n",
+        "        \n",
+        "        rewards.append(min(score, 1.0))\n",
+        "    return rewards\n",
+        "\n",
+        "# Test\n",
+        "test_completions = [\n",
+        "    '<think>Step 1: Analyze problem. Step 2: Build knowledge graph. Therefore, the answer is 42.</think>',\n",
+        "    'The answer is 42.',\n",
+        "]\n",
+        "print('Reward test:', aether_reward(test_completions))"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Step 7: Load Dataset & Model"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "from datasets import load_dataset\n",
+        "from transformers import AutoModelForCausalLM, AutoTokenizer\n",
+        "\n",
+        "MODEL_NAME = 'Qwen/Qwen2.5-0.5B-Instruct'\n",
+        "\n",
+        "# Load dataset\n",
+        "print('Loading dataset...')\n",
+        "try:\n",
+        "    dataset = load_dataset('trl-lib/DeepMath-103K', split='train')\n",
+        "    print(f'Loaded DeepMath-103K: {len(dataset)} examples')\n",
+        "except Exception as e:\n",
+        "    print(f'DeepMath failed: {e}')\n",
+        "    dataset = load_dataset('trl-lib/Capybara', split='train')\n",
+        "    print(f'Loaded Capybara: {len(dataset)} examples')\n",
+        "\n",
+        "# Convert messages to prompt if needed\n",
+        "if 'messages' in dataset.column_names and 'prompt' not in dataset.column_names:\n",
+        "    def extract_prompt(examples):\n",
+        "        prompts = []\n",
+        "        for msgs in examples['messages']:\n",
+        "            user_msg = next((m['content'] for m in msgs if m.get('role') == 'user'), str(msgs))\n",
+        "            prompts.append(user_msg)\n",
+        "        return {'prompt': prompts}\n",
+        "    dataset = dataset.map(extract_prompt, batched=True, remove_columns=dataset.column_names)\n",
+        "elif 'text' in dataset.column_names and 'prompt' not in dataset.column_names:\n",
+        "    dataset = dataset.rename_column('text', 'prompt')\n",
+        "\n",
+        "# Split\n",
+        "dataset = dataset.train_test_split(test_size=0.1)\n",
+        "train_ds = dataset['train']\n",
+        "eval_ds = dataset['test']\n",
+        "print(f'Train: {len(train_ds)}, Eval: {len(eval_ds)}')\n",
+        "\n",
+        "# Load model\n",
+        "print('Loading model...')\n",
+        "model = AutoModelForCausalLM.from_pretrained(\n",
+        "    MODEL_NAME,\n",
+        "    torch_dtype=torch.bfloat16,\n",
+        "    device_map='auto',\n",
+        "    trust_remote_code=True,\n",
+        ")\n",
+        "tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)\n",
+        "if tokenizer.pad_token is None:\n",
+        "    tokenizer.pad_token = tokenizer.eos_token\n",
+        "\n",
+        "print(f'Model loaded: {sum(p.numel() for p in model.parameters()) / 1e6:.0f}M parameters')"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Step 8: Initialize GRPO Trainer with AETHER Rewards"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "from trl import GRPOTrainer, GRPOConfig\n",
+        "from trl.rewards import accuracy_reward, think_format_reward\n",
+        "\n",
+        "# Training configuration\n",
+        "training_args = GRPOConfig(\n",
+        "    output_dir='./aether-output',\n",
+        "    num_train_epochs=1,\n",
+        "    per_device_train_batch_size=1,\n",
+        "    per_device_eval_batch_size=4,  # kept divisible by num_generations\n",
+        "    gradient_accumulation_steps=8,\n",
+        "    learning_rate=2e-5,\n",
+        "    logging_steps=10,\n",
+        "    save_steps=200,\n",
+        "    eval_strategy='steps',\n",
+        "    eval_steps=100,\n",
+        "    bf16=True,\n",
+        "    max_completion_length=512,\n",
+        "    num_generations=4,\n",
+        "    report_to=[],  # Disable wandb/tensorboard - we'll use manual logging\n",
+        "    run_name='aether-grpo-qwen-0.5b',\n",
+        "    disable_tqdm=False,  # Show progress bar in Colab\n",
+        "    logging_first_step=True,\n",
+        "    push_to_hub=True,\n",
+        "    hub_model_id='camdog920/aether-qwen-0.5b-grpo',  # CHANGE THIS to your username!\n",
+        ")\n",
+        "\n",
+        "# Reward functions: AETHER custom + TRL built-ins\n",
+        "reward_funcs = [\n",
+        "    aether_reward,        # AETHER neuro-symbolic reward (reasoning structure)\n",
+        "    accuracy_reward,       # TRL: answer correctness\n",
+        "    think_format_reward,   # TRL: <think>...</think> format\n",
+        "]\n",
+        "\n",
+        "# Initialize trainer\n",
+        "trainer = GRPOTrainer(\n",
+        "    model=model,\n",
+        "    reward_funcs=reward_funcs,\n",
+        "    args=training_args,\n",
+        "    train_dataset=train_ds,\n",
+        "    eval_dataset=eval_ds,\n",
+        ")\n",
+        "\n",
+        "print('GRPO Trainer initialized!')\n",
+        "print(f'Reward functions: {len(reward_funcs)}')\n",
+        "print(f'Train steps: ~{len(train_ds) // 8}')"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Step 9: Train!\n",
+        "\n",
+        "This will take 2-3 hours on Colab T4. The model will be saved every 200 steps and pushed to Hub at the end."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "# Start training\n",
+        "trainer.train()\n",
+        "\n",
+        "# Save final model\n",
+        "trainer.save_model('./aether-output')\n",
+        "tokenizer.save_pretrained('./aether-output')\n",
+        "\n",
+        "print('Training complete!')\n",
+        "print('Model saved to ./aether-output')"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Step 10: Quick Evaluation & Test"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "# Test the trained model with AETHER reasoning\n",
+        "test_prompts = [\n",
+        "    'Think step by step: What is 17 + 25?',\n",
+        "    'Plan and solve: A farmer has 3 fields. Each field produces 42 bushels. How many total bushels?',\n",
+        "    'Reflect and improve: Your previous answer was 50. The correct answer is 60. What went wrong?',\n",
+        "]\n",
+        "\n",
+        "model.eval()\n",
+        "for prompt in test_prompts:\n",
+        "    inputs = tokenizer(prompt, return_tensors='pt').to(model.device)\n",
+        "    with torch.no_grad():\n",
+        "        outputs = model.generate(\n",
+        "            **inputs,\n",
+        "            max_new_tokens=256,\n",
+        "            do_sample=True,\n",
+        "            temperature=0.7,\n",
+        "        )\n",
+        "    response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)\n",
+        "    print(f\"\\nPrompt: {prompt}\")\n",
+        "    print(f\"Response: {response.strip()}\")\n",
+        "    print('-' * 60)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Step 11: Push to Hub (if not auto-pushed)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "from huggingface_hub import HfApi\n",
+        "\n",
+        "api = HfApi()\n",
+        "api.upload_folder(\n",
+        "    folder_path='./aether-output',\n",
+        "    repo_id='camdog920/aether-qwen-0.5b-grpo',  # CHANGE to your repo!\n",
+        "    repo_type='model',\n",
+        ")\n",
+        "print('Model pushed to HuggingFace Hub!')"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Step 12: AETHER Self-Reflection (Post-Training)\n",
+        "\n",
+        "Use the AETHER core to analyze the training run."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "# Run AETHER self-reflection\n",
+        "reflection = aether.self_reflect()\n",
+        "\n",
+        "print('AETHER Self-Reflection:')\n",
+        "print(f\"Generation: {reflection['generation']}\")\n",
+        "print(f\"Architectures tested: {reflection['total_architectures_tested']}\")\n",
+        "print(f\"Fitness trend: {reflection['fitness_trend'][-5:] if reflection['fitness_trend'] else 'N/A'}\")\n",
+        "print(f\"Neuro-symbolic balance:\")\n",
+        "print(f\"  Symbolic: {reflection['neuro_symbolic_balance']['symbolic_gate']:.3f}\")\n",
+        "print(f\"  Neural: {reflection['neuro_symbolic_balance']['neural_gate']:.3f}\")\n",
+        "\n",
+        "if reflection['recommendations']:\n",
+        "    print('\\nRecommendations:')\n",
+        "    for rec in reflection['recommendations']:\n",
+        "        print(f'  - {rec}')\n",
+        "\n",
+        "# Query knowledge graph\n",
+        "kg_result = aether.knowledge.query('AETHER has_component')\n",
+        "print(f\"\\nKnowledge Graph Components:\")\n",
+        "for r in kg_result['results'][:5]:\n",
+        "    print(f\"  -> {r['tail']} (confidence={r['confidence']:.2f}, source={r['source']})\")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "---\n",
+        "## You're done!\n",
+        "\n",
+        "Your AETHER-trained model is now on HuggingFace Hub.\n",
+        "\n",
+        "**Next steps:**\n",
+        "- Load the model: `AutoModelForCausalLM.from_pretrained('your-username/aether-qwen-0.5b-grpo')`\n",
+        "- Integrate with AETHER core for recursive self-evolution\n",
+        "- Scale up: Try larger models (1.5B, 3B, 7B) on Vast.ai / RunPod\n",
+        "\n",
+        "**Integrated research:**\n",
+        "- AlphaEvolve (DeepMind) → MAP-Elites evolutionary archive\n",
+        "- HiMAC (2026) → Hierarchical macro-micro policy\n",
+        "- GEA (2026) → Group experience sharing + Performance-Novelty selection\n",
+        "- Yunjue Agent (2026) → Multi-agent role decomposition + tool absorption\n",
+        "- ASI-Evolve (2026) → 4-stage research loop\n",
+        "- CoALA (2023) → Cognitive memory architecture\n",
+        "- MLPO (2025) → Leader policy optimization\n",
+        "- BabyAGI → Task-driven autonomous loop\n",
+        "- Agentic Neural Networks (2025) → Textual backpropagation\n",
+        "- CoMAS (2025) → Co-evolving multi-agent interactions\n"
+      ]
+    }
+  ]
+}
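
The notebook passes three reward functions to `GRPOTrainer` and samples `num_generations=4` completions per prompt. As a rough illustration of how those scores turn into a training signal, the sketch below sums the per-function rewards for each completion and normalizes them within the group sampled for one prompt. It is a simplified, standard-library approximation, not TRL's implementation: the helper name `group_relative_advantages`, the `eps` value, the use of the population standard deviation, and all scores are assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(
    rewards: List[float], num_generations: int, eps: float = 1e-4
) -> List[float]:
    """Normalize each reward against the other completions of the same prompt.

    `rewards` is a flat list in which every consecutive block of
    `num_generations` entries belongs to one prompt.
    """
    if len(rewards) % num_generations != 0:
        raise ValueError("rewards must hold num_generations entries per prompt")

    advantages: List[float] = []
    for start in range(0, len(rewards), num_generations):
        group = rewards[start:start + num_generations]
        mu = mean(group)
        sigma = pstdev(group)  # population std; an assumption of this sketch
        advantages.extend((r - mu) / (sigma + eps) for r in group)
    return advantages


# Hypothetical scores for the 4 completions sampled for a single prompt.
aether_scores = [0.50, 0.20, 0.00, 0.30]   # aether_reward-style structure score
accuracy_scores = [1.0, 0.0, 0.0, 1.0]     # answer correctness
format_scores = [1.0, 1.0, 0.0, 0.0]       # <think>...</think> format

# Equal weighting of the three reward functions is assumed here.
total = [a + b + c for a, b, c in zip(aether_scores, accuracy_scores, format_scores)]
advantages = group_relative_advantages(total, num_generations=4)
print([round(a, 2) for a in advantages])
```

Completions that score above their group's mean get a positive advantage and the rest a negative one. When every completion in a group receives the same total reward, all advantages are zero, so that prompt contributes no learning signal.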