Upload 4 files
- README.md +63 -3
- config.json +13 -0
- model.pt +3 -0
- tokenizer.json +0 -0
README.md CHANGED
@@ -1,3 +1,63 @@
----
-
-
+---
+language:
+- id
+license: mit
+tags:
+- text-generation
+- indonesian
+- pytorch
+- caca
+datasets:
+- Lyon28/Corpus-Indonesia
+- Lyon28/Caca-Behavior
+---
+
+# Caca-Tiny 🔥
+
+Caca-Tiny is an Indonesian language model trained with a transformer decoder architecture.
+
+## Model Details
+
+- **Architecture**: Transformer Decoder
+- **Parameters**: ~4,156,928
+- **Vocabulary Size**: 8000
+- **Max Sequence Length**: 512
+- **Training Data**: Lyon28/Corpus-Indonesia
+- **Fine-tuning Data**: Lyon28/Caca-Behavior
+
+## Usage
+
+```python
+import torch
+from safetensors.torch import load_file
+
+state_dict = load_file("model.safetensors")  # note: this commit adds model.pt, not model.safetensors
+
+prompt = "Indonesia adalah"  # "Indonesia is"
+generated = model.generate(prompt, max_new_tokens=50)  # note: `model` is never constructed in this snippet
+print(generated)
+```
+
+## Training
+
+This model was trained with:
+- Optimizer: AdamW
+- Learning Rate: 3e-4
+- Batch Size: 8
+- Epochs: 3 (pre-training) + 2 (fine-tuning)
+
+## License
+
+MIT License
+
+## Citation
+
+```bibtex
+@misc{caca-tiny,
+  author = {Lyon28},
+  title = {Caca-Tiny: Indonesian Language Model},
+  year = {2026},
+  publisher = {Hugging Face},
+  url = {https://huggingface.co/Lyon28/Caca-Tiny}
+}
+```
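The README's parameter figure can be cross-checked against the hyperparameters in config.json from this commit. The sketch below is one decomposition that reproduces ~4,156,928 exactly. It rests on assumptions the commit does not state: every linear layer has a bias, each block has two LayerNorms with one more after the last block, the output projection shares the token embedding, and positions are not learned parameters. Other layouts could reach the same total, so this is a plausibility check and not a description of the actual model code.

```python
# Hyperparameters copied from config.json in this commit.
VOCAB_SIZE = 8000
EMBEDDING_DIM = 256
NUM_LAYERS = 4
FFN_HIDDEN_DIM = 512


def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus bias of a dense layer (the bias is an assumption)."""
    return n_in * n_out + n_out


def layer_norm_params(dim: int) -> int:
    """Scale and shift vectors of a LayerNorm."""
    return 2 * dim


def estimate_params() -> int:
    d = EMBEDDING_DIM
    token_embedding = VOCAB_SIZE * d
    attention = 4 * linear_params(d, d)  # Q, K, V and output projections
    ffn = linear_params(d, FFN_HIDDEN_DIM) + linear_params(FFN_HIDDEN_DIM, d)
    block = attention + ffn + 2 * layer_norm_params(d)
    final_norm = layer_norm_params(d)
    # Assumed: output head tied to the token embedding, no learned positions.
    return token_embedding + NUM_LAYERS * block + final_norm


print(f"{estimate_params():,}")  # prints 4,156,928
```

Under these assumptions the token embedding alone accounts for 2,048,000 of the parameters, roughly half the model.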
config.json ADDED
@@ -0,0 +1,13 @@
+{
+  "model_type": "caca",
+  "model_name": "Caca-Tiny",
+  "version": "1.0.0",
+  "vocab_size": 8000,
+  "embedding_dim": 256,
+  "num_layers": 4,
+  "num_heads": 4,
+  "ffn_hidden_dim": 512,
+  "max_seq_length": 512,
+  "dropout": 0.1,
+  "head_dim": 64
+}
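A short sketch of how a config like this might be loaded and sanity-checked with the standard library. `CacaConfig` and `load_config` are hypothetical names, not part of this repository; the fields simply mirror the JSON keys above. The check that `num_heads * head_dim` equals `embedding_dim` holds for these values (4 × 64 = 256) and follows the usual multi-head attention convention, though the commit does not say the model enforces it.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CacaConfig:  # hypothetical container, not shipped with the repo
    model_type: str
    model_name: str
    version: str
    vocab_size: int
    embedding_dim: int
    num_layers: int
    num_heads: int
    ffn_hidden_dim: int
    max_seq_length: int
    dropout: float
    head_dim: int

    def __post_init__(self) -> None:
        # Assumed invariant: the heads partition the embedding evenly.
        if self.num_heads * self.head_dim != self.embedding_dim:
            raise ValueError("num_heads * head_dim must equal embedding_dim")
        if not 0.0 <= self.dropout < 1.0:
            raise ValueError("dropout must be in [0, 1)")


def load_config(text: str) -> CacaConfig:
    """Parse config.json text, rejecting missing or unknown keys."""
    raw = json.loads(text)
    expected = {f.name for f in fields(CacaConfig)}
    if set(raw) != expected:
        raise ValueError(f"key mismatch: {sorted(set(raw) ^ expected)}")
    return CacaConfig(**raw)


# The config.json contents from this commit.
CONFIG_TEXT = """
{
  "model_type": "caca",
  "model_name": "Caca-Tiny",
  "version": "1.0.0",
  "vocab_size": 8000,
  "embedding_dim": 256,
  "num_layers": 4,
  "num_heads": 4,
  "ffn_hidden_dim": 512,
  "max_seq_length": 512,
  "dropout": 0.1,
  "head_dim": 64
}
"""

config = load_config(CONFIG_TEXT)
print(config.embedding_dim // config.num_heads)  # prints 64
```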
model.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fd840462a63ee165c22d99e6b295b18fc3da832afaad863005f1045c519627f9
+size 17171715
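The model.pt entry above is a Git LFS pointer, not the weights themselves: `oid` is the SHA-256 of the real file and `size` is its length in bytes. Below is a minimal sketch of checking a downloaded file against such a pointer using only the standard library. `parse_lfs_pointer` and `matches_pointer` are illustrative helper names, and the demo runs on a throwaway file since the actual checkpoint is not available here.

```python
import hashlib
import os
import tempfile


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    pointer = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        pointer[key] = value
    return pointer


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file's size and hash agree with the pointer."""
    if os.path.getsize(path) != int(pointer["size"]):
        return False
    algo, _, expected = pointer["oid"].partition(":")
    digest = hashlib.new(algo)
    with open(path, "rb") as fh:
        # Hash in 1 MiB chunks so large checkpoints need not fit in memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected


# The pointer recorded for model.pt in this commit.
POINTER = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:fd840462a63ee165c22d99e6b295b18fc3da832afaad863005f1045c519627f9\n"
    "size 17171715\n"
)

# Demo on a throwaway file with a pointer built to match it.
payload = b"demo"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
demo_pointer = {
    "oid": "sha256:" + hashlib.sha256(payload).hexdigest(),
    "size": str(len(payload)),
}
demo_ok = matches_pointer(tmp.name, demo_pointer)
os.remove(tmp.name)
print(demo_ok)  # prints True
```

For the real checkpoint, `matches_pointer("model.pt", POINTER)` would be expected to return True after a complete download.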
tokenizer.json ADDED
The diff for this file is too large to render.