---
license: apache-2.0
language:
- es
- en
tags:
- llm
- self-learning
- tool-calling
- spanish
- tinyllama
- lora
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
model-index:
- name: thau
results: []
---
# THAU v2.0 - Self-Learning Language Model
**THAU** (Thinking, Helpful, Autonomous, Understanding) is a self-learning language model fine-tuned from TinyLlama-1.1B-Chat with specialized training in tool calling, chain-of-thought reasoning, and Spanish-language conversation.
## Model Description
| Attribute | Value |
|-----------|-------|
| **Base Model** | TinyLlama-1.1B-Chat-v1.0 |
| **Parameters** | ~1.1B |
| **Training Method** | LoRA Fine-tuning |
| **Final Loss** | 0.43 |
| **Languages** | Spanish (primary), English |
| **License** | Apache 2.0 |
## Capabilities
- **Tool Calling**: Native JSON-based function invocation
- **Chain of Thought**: Step-by-step reasoning for complex problems
- **Image Generation**: Prompt engineering for image generation
- **Spanish Fluency**: Natural and technical conversations
- **Programming**: Python, JavaScript, Java assistance
## Training Data
| Category | Examples |
|----------|----------|
| Tool Calling | 112 |
| Spanish Natural/Technical | 52 |
| Image Generation | 30 |
| Conversational Spanish | 20 |
| Chain of Thought Reasoning | 20 |
| Programming | 30+ |
| **Total** | **297 specialized examples** |
## Usage
### With Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("luepow/thau")
tokenizer = AutoTokenizer.from_pretrained("luepow/thau")

# TinyLlama chat format; each turn is closed with </s>.
# System: "You are THAU, an intelligent and helpful AI assistant."
# User: "Hi, who are you?"
prompt = """<|system|>
Eres THAU, un asistente AI inteligente y servicial.</s>
<|user|>
Hola, ¿quién eres?</s>
<|assistant|>
"""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=200,  # cap on newly generated tokens, not total length
    do_sample=True,      # sampling must be enabled for temperature to apply
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### With Ollama (Recommended)
```bash
ollama pull luepow/thau
ollama run luepow/thau
```
## Tool Calling Format
THAU uses a JSON-based tool calling format:
```json
{"name": "tool_name", "arguments": {"param": "value"}}
```
### Available Tools
| Tool | Description |
|------|-------------|
| `get_current_time` | Get current date/time |
| `web_search` | Search the internet |
| `execute_python` | Run Python code |
| `generate_image` | Generate image from prompt |
| `read_file` | Read file contents |
| `list_directory` | List directory contents |
### Example
**User**: What time is it?
**THAU**:
```json
{"name": "get_current_time", "arguments": {}}
```
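A host application is expected to detect these JSON objects in the model's output and route them to real implementations. A minimal sketch of that loop, assuming a hypothetical `handlers` registry and `dispatch` helper (neither ships with the model):

```python
import json
from datetime import datetime

# Hypothetical registry mapping THAU's tool names to local implementations.
# A real host would register all six tools listed above.
handlers = {
    "get_current_time": lambda: datetime.now().isoformat(),
}

def dispatch(model_output: str):
    """Parse one THAU reply and invoke the matching tool handler.

    Returns the handler's result, or None when the reply is plain text
    or names a tool the host does not implement.
    """
    try:
        call = json.loads(model_output.strip())
    except json.JSONDecodeError:
        return None  # ordinary text answer, not a tool call
    if not isinstance(call, dict) or call.get("name") not in handlers:
        return None
    return handlers[call["name"]](**call.get("arguments", {}))

result = dispatch('{"name": "get_current_time", "arguments": {}}')
print(result)  # e.g. an ISO-8601 timestamp
```

The handler's return value would then be fed back to the model as a new turn so it can phrase the final answer.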
## Limitations
- Model size limits complex multi-step reasoning
- May hallucinate on topics outside training data
- Tool calling accuracy varies by complexity
- Spanish is the primary language; English is secondary
- Best for simple to moderate complexity tasks
## Training Details
- **Full Training**: 3,022 data points, 4,533 steps, loss 0.94
- **Specialized v2.0**: 297 examples, 745 steps, loss 0.43
- **Hardware**: Apple Silicon (MPS)
- **Training Time**: ~7 minutes for specialized phase
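The exact LoRA hyperparameters are not published in this card. As an illustration only, a typical PEFT configuration for a 1.1B-parameter causal LM looks like the following; every value here is an assumption, not THAU's actual training setting:

```python
from peft import LoraConfig

# Illustrative values only; THAU's real adapter settings may differ.
lora_config = LoraConfig(
    r=16,                                  # rank of the low-rank update matrices
    lora_alpha=32,                         # scaling factor applied to the update
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```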
## Citation
```bibtex
@misc{thau2024,
  title={THAU v2.0: A Self-Learning Language Model},
  author={Luis Perez (luepow)},
  year={2024},
  url={https://huggingface.co/luepow/thau}
}
```
## Links
- **Ollama**: [luepow/thau](https://ollama.com/luepow/thau)
- **GitHub**: [luepow/thau](https://github.com/luepow/thau)
## Acknowledgments
- **Thomas & Aurora** - Inspiration for the cognitive age progression system
- **Claude (Anthropic)** - AI pair programming partner
- **TinyLlama Team** - Excellent base model
- **Hugging Face** - Model hosting and transformers library
---
*THAU v2.0 - Built with incremental learning and specialized training*
*Dedicated to Thomas & Aurora*