---
license: apache-2.0
language:
- es
- en
tags:
- llm
- self-learning
- tool-calling
- spanish
- tinyllama
- lora
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
model-index:
- name: thau
  results: []
---

# THAU v2.0 - Self-Learning Language Model

**THAU** (Thinking, Helpful, Autonomous, Understanding) is a self-learning language model fine-tuned from TinyLlama-1.1B with specialized training in tool calling, reasoning, and Spanish.

## Model Description

| Attribute | Value |
|-----------|-------|
| **Base Model** | TinyLlama-1.1B-Chat-v1.0 |
| **Parameters** | ~1.1B |
| **Training Method** | LoRA Fine-tuning |
| **Final Loss** | 0.43 |
| **Languages** | Spanish (primary), English |
| **License** | Apache 2.0 |

## Capabilities

- **Tool Calling**: Native JSON-based function invocation
- **Chain of Thought**: Step-by-step reasoning for complex problems
- **Image Generation**: Prompt engineering for image generation
- **Spanish Fluency**: Natural and technical conversations
- **Programming**: Python, JavaScript, and Java assistance

## Training Data

| Category | Examples |
|----------|----------|
| Tool Calling | 112 |
| Spanish Natural/Technical | 52 |
| Image Generation | 30 |
| Conversational Spanish | 20 |
| Chain of Thought Reasoning | 20 |
| Programming | 30+ |
| **Total** | **297 specialized examples** |

## Usage

### With Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("luepow/thau")
tokenizer = AutoTokenizer.from_pretrained("luepow/thau")

# TinyLlama chat format; the Spanish prompt asks: "Hello, who are you?"
prompt = """<|system|>
Eres THAU, un asistente AI inteligente y servicial.
<|user|>
Hola, ¿quién eres?
<|assistant|>
"""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### With Ollama (Recommended)

```bash
ollama pull luepow/thau
ollama run luepow/thau
```

## Tool Calling Format

THAU uses a JSON-based tool calling format:

```
{"name": "tool_name", "arguments": {"param": "value"}}
```

### Available Tools

| Tool | Description |
|------|-------------|
| `get_current_time` | Get the current date/time |
| `web_search` | Search the internet |
| `execute_python` | Run Python code |
| `generate_image` | Generate an image from a prompt |
| `read_file` | Read file contents |
| `list_directory` | List directory contents |

### Example

**User**: What time is it?

**THAU**:

```
{"name": "get_current_time", "arguments": {}}
```

## Limitations

- Model size limits complex multi-step reasoning
- May hallucinate on topics outside its training data
- Tool calling accuracy varies with task complexity
- Spanish is the primary language; English is secondary
- Best suited to tasks of simple to moderate complexity

## Training Details

- **Full Training**: 3,022 data points, 4,533 steps, final loss 0.94
- **Specialized v2.0**: 297 examples, 745 steps, final loss 0.43
- **Hardware**: Apple Silicon (MPS)
- **Training Time**: ~7 minutes for the specialized phase

## Citation

```bibtex
@misc{thau2024,
  title={THAU v2.0: A Self-Learning Language Model},
  author={Luis Perez (luepow)},
  year={2024},
  url={https://huggingface.co/luepow/thau}
}
```

## Links

- **Ollama**: [luepow/thau](https://ollama.com/luepow/thau)
- **GitHub**: [luepow/thau](https://github.com/luepow/thau)

## Acknowledgments

- **Thomas & Aurora** - Inspiration for the cognitive age progression system
- **Claude (Anthropic)** - AI pair programming partner
- **TinyLlama Team** - Excellent base model
- **Hugging Face** - Model hosting and the transformers library

---

*THAU v2.0 - Built with incremental learning and specialized training*

*Dedicated to Thomas & Aurora*
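
Since THAU emits tool calls as plain JSON, the host application is responsible for detecting and executing them. Below is a minimal host-side dispatcher sketch, assuming the tool names from the Available Tools table; the `TOOLS` registry and `run_tool_call` helper are illustrative (not part of the model or any THAU library), and only `get_current_time` is wired up here:

```python
import json
from datetime import datetime

# Hypothetical registry mapping THAU tool names to local implementations.
# Only get_current_time is implemented in this sketch.
TOOLS = {
    "get_current_time": lambda: datetime.now().isoformat(),
}

def run_tool_call(model_output: str):
    """Parse a THAU tool call ({"name": ..., "arguments": {...}}) and run it.

    Returns the tool result, or None if the output is not a known tool call.
    """
    try:
        call = json.loads(model_output.strip())
    except json.JSONDecodeError:
        return None  # plain-text reply, not a tool call
    if not isinstance(call, dict):
        return None
    tool = TOOLS.get(call.get("name"))
    if tool is None:
        return None  # unknown tool
    return tool(**call.get("arguments", {}))

# Example: the tool call THAU emits for "What time is it?"
result = run_tool_call('{"name": "get_current_time", "arguments": {}}')
```

In a full loop, the returned result would be appended to the conversation and fed back to the model so it can phrase a final answer for the user.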