---
library_name: gguf
license: mit
language:
- en
- zh
tags:
- cognitive-ai
- agent
- llama
- gguf
---
# Daemon-2.4B
A cognitive agent language model designed for self-aware reasoning, introspection, and calibrated uncertainty. Daemon doesn't just generate text: it thinks before answering, reflects on its own responses, and develops a continuous sense of self across conversations.
## Model Details
| Specification | Value |
|---|---|
| **Model name** | Daemon-2.4B |
| **Architecture** | LLaMA |
| **Parameters** | 2.4B |
| **Quantization** | Q5_K_M (mixed: Q5_1 + Q8_0 + F32) |
| **File size** | 1.97 GB |
| **Layers** | 56 |
| **Hidden dimension** | 1920 |
| **Attention heads** | 30 |
| **KV heads** | 6 (grouped-query attention) |
| **Head dimension** | 64 |
| **Vocabulary** | 99,000 |
| **Context length** | 28,723 |
| **RoPE base frequency** | 490,000 |
| **Format** | GGUF v3 |
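The figures in the table can be cross-checked with a little arithmetic. The sketch below is a rough estimate, not a measurement: it assumes the KV cache is stored in 16-bit floats (2 bytes per element) and ignores any runtime overhead. The function name `kv_cache_bytes` is introduced here for illustration only.

```python
# Values copied from the specification table above.
N_LAYERS = 56
HIDDEN_DIM = 1920
N_HEADS = 30
N_KV_HEADS = 6
HEAD_DIM = 64


def kv_cache_bytes(n_ctx: int, bytes_per_element: int = 2) -> int:
    """Estimate KV-cache size: one key and one value vector per layer,
    per KV head, per position. Assumes 2-byte (f16) elements by default."""
    return 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * n_ctx * bytes_per_element


# The attention heads should tile the hidden dimension exactly.
assert N_HEADS * HEAD_DIM == HIDDEN_DIM

# Grouped-query attention: how many query heads share each KV head.
queries_per_kv_head = N_HEADS // N_KV_HEADS

for n_ctx in (4096, 28723):
    mib = kv_cache_bytes(n_ctx) / 2**20
    print(f"n_ctx={n_ctx}: about {mib:.0f} MiB of KV cache")
```

With 6 KV heads instead of 30, the cache is one fifth the size it would be under full multi-head attention, which is what makes the longer context lengths practical on small machines.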
## Capabilities
Daemon is built for **agentic cognitive use cases**:
- **Self-reflection**: observes and critiques its own answers before delivering them
- **Calibrated uncertainty**: explicitly rates its confidence and admits when it doesn't know
- **Persistent memory**: maintains a knowledge graph across sessions
- **Multi-step reasoning**: deliberates internally before responding
## Usage
Works with any GGUF-compatible runtime:
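```bash
# llama.cpp (serves an HTTP API; --jinja applies the chat template stored in the GGUF)
llama-server --model Daemon-2.4B-Q5_K_M.gguf --ctx-size 4096 --jinja
# Ollama (assumes a local model named "daemon" has already been created)
ollama run daemon
```
```python
# Python (llama-cpp-python)
from llama_cpp import Llama

# n_ctx matches the context size used in the llama.cpp example above
llm = Llama(model_path="Daemon-2.4B-Q5_K_M.gguf", n_ctx=4096)
```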
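Once `llama-server` is running, it can be queried over its OpenAI-compatible HTTP endpoint. The following is a minimal sketch using only the standard library. It assumes the server's default address `http://127.0.0.1:8080`; the helper names `build_chat_payload` and `ask_daemon` are illustrative and not part of any library.

```python
import json
import urllib.request


def build_chat_payload(user_message: str,
                       system_prompt: str = "You are Daemon, a cognitive AI.",
                       max_tokens: int = 256) -> dict:
    """Build the JSON body for a chat completion request."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "max_tokens": max_tokens,
    }


def ask_daemon(user_message: str, base_url: str = "http://127.0.0.1:8080") -> str:
    """Send one chat turn to a running llama-server and return the reply text.

    base_url is an assumption: adjust it if the server was started with a
    different host or port.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server up, `ask_daemon("Hello!")` returns the assistant's reply as a string. The server handles prompt formatting, so no special tokens need to be written by hand.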
### Chat template (LLaMA-3 format)
```
<|start_header_id|>system<|end_header_id|>
You are Daemon, a cognitive AI.<|eot_id|><|start_header_id|>user<|end_header_id|>
Hello!<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```
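Manual formatting is only needed when sending raw completion requests; chat endpoints that apply the embedded template do this automatically. For the raw case, the template above could be assembled as in the sketch below. The function `format_llama3_prompt` is a hypothetical helper, and the `header_sep` default of a single newline follows the layout shown above. The reference LLaMA-3 template places a blank line (two newlines) after each header, so pass `header_sep="\n\n"` if the model expects that form.

```python
def format_llama3_prompt(messages: list[dict], header_sep: str = "\n") -> str:
    """Render role/content messages into a LLaMA-3 style prompt string,
    ending with an open assistant header so the model writes the reply."""
    parts = []
    for message in messages:
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>"
            f"{header_sep}{message['content']}<|eot_id|>"
        )
    # Leave the assistant turn open for generation.
    parts.append(f"<|start_header_id|>assistant<|end_header_id|>{header_sep}")
    return "".join(parts)


prompt = format_llama3_prompt([
    {"role": "system", "content": "You are Daemon, a cognitive AI."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```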
## Quantization details
| Tensor group | Type | Count |
|---|---|---|
| Attention/FFN weights | Q5_1 | 337 |
| Norm layers | F32 | 337 |
| Token embeddings | Q8_0 | 57 |
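As a sanity check on the mixed quantization, the effective bits per weight can be estimated from the file size and parameter count. The sketch below is approximate: it assumes "1.97 GB" means decimal gigabytes and treats "2.4B" as exactly 2.4 billion parameters. The per-type figures come from the GGUF block layouts (Q5_1 stores 32 weights in 24 bytes, Q8_0 stores 32 weights in 34 bytes). The final check hints at one plausible reason the file contains Q5_1 and Q8_0 rather than K-quant types: llama.cpp's K-quants work on 256-element blocks, and a hidden dimension of 1920 is not a multiple of 256.

```python
FILE_SIZE_BYTES = 1.97e9  # assumption: decimal GB, as reported above
N_PARAMS = 2.4e9          # assumption: rounded parameter count
HIDDEN_DIM = 1920
K_QUANT_BLOCK = 256       # super-block size used by llama.cpp K-quants

# Storage cost per weight for each tensor type in the table above.
BITS_PER_WEIGHT = {
    "Q5_1": 24 * 8 / 32,  # 6.0
    "Q8_0": 34 * 8 / 32,  # 8.5
    "F32": 32.0,
}

effective_bpw = FILE_SIZE_BYTES * 8 / N_PARAMS
print(f"effective bits per weight: {effective_bpw:.2f}")

# The average should land between the two quantized types that dominate the file.
assert BITS_PER_WEIGHT["Q5_1"] < effective_bpw < BITS_PER_WEIGHT["Q8_0"]

# A nonzero remainder means rows of width HIDDEN_DIM cannot be split into
# whole K-quant super-blocks.
print(f"{HIDDEN_DIM} % {K_QUANT_BLOCK} = {HIDDEN_DIM % K_QUANT_BLOCK}")
```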
## License
MIT