THAU AGI v2 - Proto-AGI System upload

Files changed (9)

README.md +120 -0
chat_template.jinja +15 -0
config.json +29 -0
generation_config.json +7 -0
model.safetensors +3 -0
special_tokens_map.json +30 -0
tokenizer.json +0 -0
tokenizer.model +3 -0
tokenizer_config.json +43 -0

README.md ADDED

	@@ -0,0 +1,120 @@

+---
+license: mit
+language:
+- en
+- es
+tags:
+- proto-agi
+- react-cycle
+- tool-calling
+- multi-agent
+- fine-tuned
+base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+pipeline_tag: text-generation
+---
+# THAU AGI v2 - Proto-AGI System
+**THAU** = **TH**omas + **AU**rora
+A Proto-AGI (Prototype Artificial General Intelligence) system fine-tuned from TinyLlama-1.1B with specialized training in reasoning, tool calling, and Spanish language support.
+## Features
+- **ReAct Cycle**: THINK -> PLAN -> ACT -> OBSERVE -> REFLECT
+- **Experiential Learning**: Learns from past interactions
+- **Metacognition**: Self-evaluation for improvement
+- **Web Search**: Internet search capabilities
+- **Multi-Agent**: Collaboration between specialized agents (CODER, REVIEWER, RESEARCHER, PLANNER, TESTER)
+- **Knowledge Base**: RAG (Retrieval Augmented Generation)
+- **Feedback Loop**: Continuous improvement with user feedback
+- **Tool Calling**: Integrated tools for calculations, file operations, code execution
+- **TTS Support**: Text-to-Speech integration
+- **Image Generation**: Stable Diffusion integration
+- **MCP Integration**: Model Context Protocol support
+## Available Tools
+| Tool | Description |
+|------|-------------|
+| `calculate` | Mathematical calculations |
+| `read_file` | Read files |
+| `write_file` | Write files |
+| `list_directory` | List directories |
+| `execute_python` | Execute Python code |
+| `web_search` | Search the internet |
+| `fetch_url` | Fetch the content of a URL |
+| `research` | Deep research |
+| `text_to_speech` | Convert text to speech |
+| `generate_image` | Generate images |
+## Operation Modes
+1. **CHAT**: Casual conversation
+2. **TASK**: Specific tasks with tools
+3. **RESEARCH**: Deep information search
+4. **COLLABORATIVE**: Multi-agent collaboration
+5. **LEARNING**: Intensive learning mode
+## Usage
+### With Transformers
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+# Load the fine-tuned weights and the matching tokenizer from the Hub.
+model = AutoModelForCausalLM.from_pretrained("luepow/thau-agi-v2")
+tokenizer = AutoTokenizer.from_pretrained("luepow/thau-agi-v2")
+# Build the prompt by hand in the <|system|> / <|user|> / <|assistant|> chat format.
+prompt = "<|system|>\nYou are THAU AGI v2, a helpful AI assistant.</s>\n<|user|>\nWhat is 25 * 4 + 100?</s>\n<|assistant|>\n"
+inputs = tokenizer(prompt, return_tensors="pt")
+# Generate up to 200 new tokens, then decode the full sequence (prompt included).
+outputs = model.generate(**inputs, max_new_tokens=200)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
+### With Ollama
+```bash
+ollama pull luepow/thau:agi-v2
+ollama run luepow/thau:agi-v2
+```
+### With Gradio Interface
+```bash
+git clone https://github.com/luepow/thau.git
+cd thau
+pip install -r requirements.txt
+python scripts/gradio_thau_ollama.py
+```
+## Training Data
+The model was fine-tuned on:
+- Programming tutorials (Python, JavaScript, Rust, Go, Java)
+- Mathematical reasoning
+- Tool calling patterns
+- Spanish language content
+- DevOps and cloud infrastructure
+- Agile methodologies
+- UX/CSS frameworks
+## Model Card
+- **Base Model**: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+- **Parameters**: 1.1B
+- **Context Length**: 2048 tokens
+- **Languages**: English, Spanish
+- **License**: MIT
+## Links
+- **GitHub**: https://github.com/luepow/thau
+- **Ollama**: https://ollama.com/luepow/thau
+- **Support**: Buy Me a Coffee - luepowg
+## Credits
+Developed with love for Thomas & Aurora.
+**THAU** = **TH**omas + **AU**rora
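
The README lists `calculate` among the available tools, and its usage example asks "What is 25 * 4 + 100?". This diff does not show how THAU formats or dispatches tool calls, so the following is only a minimal sketch of what a tool registry with a safe `calculate` tool could look like. The names `TOOLS` and `run_tool`, and the choice to evaluate arithmetic through Python's `ast` module, are assumptions made for illustration and are not taken from the THAU code base.

```python
import ast
import operator

# Arithmetic operators the hypothetical `calculate` tool accepts.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


def calculate(expression: str):
    """Evaluate a plain arithmetic expression without calling eval()."""

    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")

    return ev(ast.parse(expression, mode="eval").body)


# Hypothetical registry: tool name -> callable taking one string argument.
TOOLS = {"calculate": calculate}


def run_tool(name: str, argument: str):
    """Look up a tool by name and run it on the given argument."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](argument)


print(run_tool("calculate", "25 * 4 + 100"))  # prints 200
```

Restricting the evaluator to an explicit operator table keeps arbitrary code out of the `calculate` path; a tool such as `execute_python` would need separate sandboxing.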

chat_template.jinja ADDED

	@@ -0,0 +1,15 @@

+{% for message in messages %}
+{% if message['role'] == 'user' %}
+{{ '<|user|>
+' + message['content'] + eos_token }}
+{% elif message['role'] == 'system' %}
+{{ '<|system|>
+' + message['content'] + eos_token }}
+{% elif message['role'] == 'assistant' %}
+{{ '<|assistant|>
+'  + message['content'] + eos_token }}
+{% endif %}
+{% if loop.last and add_generation_prompt %}
+{{ '<|assistant|>' }}
+{% endif %}
+{% endfor %}
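
To make the prompt layout produced by this template easier to see, here is a small pure-Python sketch that mirrors it. It assumes the template is rendered with Jinja's `trim_blocks` and `lstrip_blocks` enabled (as Transformers does when applying chat templates), so each message ends with a newline after the EOS token. The function name `render_chat` is hypothetical.

```python
EOS_TOKEN = "</s>"  # eos_token from special_tokens_map.json


def render_chat(messages, add_generation_prompt=True, eos_token=EOS_TOKEN):
    """Approximate the chat template: one '<|role|>\\ncontent</s>\\n' block per message."""
    parts = []
    for message in messages:
        # The template only handles these three roles; anything else is skipped.
        if message["role"] in ("system", "user", "assistant"):
            parts.append(f"<|{message['role']}|>\n{message['content']}{eos_token}\n")
    # The template emits the generation prompt inside the loop on the last
    # message, so an empty conversation produces no output at all.
    if messages and add_generation_prompt:
        parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are THAU AGI v2, a helpful AI assistant."},
    {"role": "user", "content": "What is 25 * 4 + 100?"},
]
print(render_chat(messages))
```

Under these assumptions the result is the same string as the hand-written prompt in the README's Transformers example.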

config.json ADDED

	@@ -0,0 +1,29 @@

+{
+  "architectures": [
+    "LlamaForCausalLM"
+  ],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "bos_token_id": 1,
+  "dtype": "bfloat16",
+  "eos_token_id": 2,
+  "head_dim": 64,
+  "hidden_act": "silu",
+  "hidden_size": 2048,
+  "initializer_range": 0.02,
+  "intermediate_size": 5632,
+  "max_position_embeddings": 2048,
+  "mlp_bias": false,
+  "model_type": "llama",
+  "num_attention_heads": 32,
+  "num_hidden_layers": 22,
+  "num_key_value_heads": 4,
+  "pretraining_tp": 1,
+  "rms_norm_eps": 1e-05,
+  "rope_scaling": null,
+  "rope_theta": 10000.0,
+  "tie_word_embeddings": false,
+  "transformers_version": "4.57.1",
+  "use_cache": true,
+  "vocab_size": 32000
+}
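
As a sanity check on these values, the sketch below estimates the parameter count implied by the config. It assumes the standard Llama layout (q/k/v/o projections, gate/up/down MLP projections, two RMSNorm weights per layer, one final norm) with no bias terms, which matches `attention_bias` and `mlp_bias` being false. The function name `count_llama_params` is hypothetical.

```python
# Values copied from config.json above.
config = {
    "hidden_size": 2048,
    "intermediate_size": 5632,
    "num_attention_heads": 32,
    "num_hidden_layers": 22,
    "num_key_value_heads": 4,
    "head_dim": 64,
    "vocab_size": 32000,
    "tie_word_embeddings": False,
}


def count_llama_params(cfg):
    """Estimate the parameter count of a bias-free Llama-style decoder."""
    h = cfg["hidden_size"]
    q_dim = cfg["num_attention_heads"] * cfg["head_dim"]    # 2048
    kv_dim = cfg["num_key_value_heads"] * cfg["head_dim"]   # 256 (grouped-query attention)
    attn = h * q_dim + 2 * h * kv_dim + q_dim * h           # q, k, v, o projections
    mlp = 3 * h * cfg["intermediate_size"]                  # gate, up, down projections
    norms = 2 * h                                           # two RMSNorm weights per layer
    embed = cfg["vocab_size"] * h
    lm_head = 0 if cfg["tie_word_embeddings"] else embed
    return embed + cfg["num_hidden_layers"] * (attn + mlp + norms) + h + lm_head


total_params = count_llama_params(config)
bf16_bytes = total_params * 2  # bfloat16 stores 2 bytes per parameter

print(total_params)  # 1100048384
print(bf16_bytes)    # 2200096768
```

The estimate of about 1.1B parameters agrees with the README, and the bfloat16 size is within roughly 23 KB of the 2200119864 bytes recorded in the `model.safetensors` LFS pointer; the small remainder is plausibly the safetensors header, though that is not verified here.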

generation_config.json ADDED

	@@ -0,0 +1,7 @@

+{
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "max_length": 2048,
+  "pad_token_id": 0,
+  "transformers_version": "4.57.1"
+}

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6e6001da2106d4757498752a021df6c2bdc332c650aae4bae6b0c004dcf14933
+size 2200119864

special_tokens_map.json ADDED

	@@ -0,0 +1,30 @@

+{
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
+size 499723

tokenizer_config.json ADDED

	@@ -0,0 +1,43 @@

+{
+  "add_bos_token": true,
+  "add_eos_token": false,
+  "add_prefix_space": null,
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<unk>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "<s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "</s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "<s>",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "</s>",
+  "extra_special_tokens": {},
+  "legacy": false,
+  "model_max_length": 2048,
+  "pad_token": "</s>",
+  "padding_side": "right",
+  "sp_model_kwargs": {},
+  "tokenizer_class": "LlamaTokenizer",
+  "unk_token": "<unk>",
+  "use_default_system_prompt": false
+}
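
One detail worth checking across the files in this commit: `generation_config.json` sets `pad_token_id` to 0, while `special_tokens_map.json` and `tokenizer_config.json` name `</s>` (id 2 in `added_tokens_decoder`) as the pad token. The sketch below is a minimal offline consistency check over values copied from the diff. The helper name `find_token_id_mismatches` is hypothetical, and whether the mismatch matters in practice depends on how padding is used at inference time.

```python
# Values copied from generation_config.json in this commit.
generation_config = {"bos_token_id": 1, "eos_token_id": 2, "pad_token_id": 0}

# Values copied from tokenizer_config.json; added_tokens_decoder is reduced
# here to a mapping of id -> token content.
tokenizer_config = {
    "added_tokens_decoder": {"0": "<unk>", "1": "<s>", "2": "</s>"},
    "bos_token": "<s>",
    "eos_token": "</s>",
    "pad_token": "</s>",
}


def find_token_id_mismatches(gen_cfg, tok_cfg):
    """Return {name: (generation_id, tokenizer_id)} for special tokens that disagree."""
    content_to_id = {
        content: int(token_id)
        for token_id, content in tok_cfg["added_tokens_decoder"].items()
    }
    mismatches = {}
    for name in ("bos", "eos", "pad"):
        tokenizer_id = content_to_id[tok_cfg[f"{name}_token"]]
        generation_id = gen_cfg[f"{name}_token_id"]
        if tokenizer_id != generation_id:
            mismatches[name] = (generation_id, tokenizer_id)
    return mismatches


print(find_token_id_mismatches(generation_config, tokenizer_config))
# prints {'pad': (0, 2)}
```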