# Biawak-8B-Base
- library_name: transformers
- base_model: Qwen/Qwen3-8B
- tags: qwen, qwen3, causal-lm, continued-pretraining, indonesian, id, prd, dtp
- license: apache-2.0
- language: id, en
## Overview
Biawak-8B-Base is an 8-billion-parameter Large Language Model (LLM) adapted specifically for Indonesia's strategic focus areas:
- Perlindungan Ruang Digital (PRD): Digital Space Protection
- Digital Talent Pool (DTP): Workforce and digital capability development
This model is built through Continued Pre-training (CPT) on the Qwen3-8B base model using a curated Indonesian dataset.
## Model Details
### Model Description
- Developed by: AITF Indonesia
- Model Type: Causal Language Model (Base)
- Base Model: Qwen/Qwen3-8B
- Language: Indonesian (Primary), English (Secondary)
- License: Apache 2.0
- Training Method: Continued Pre-training (CPT)
## Goal
To create a sovereign, domain-specialized Indonesian foundation model with strong understanding of:
- Digital policies (UU PDP, UU ITE)
- Digital workforce & skill landscape (DTP)
## Dataset Composition
Total Dataset Size: ~214.2 Million Tokens
| Category | Description | Token Count (M) | Percentage |
|---|---|---|---|
| DTP | Digital HR, tech syllabi, certifications, job trends | 94.0 | ~43.9% |
| PRD | Cybersecurity, PDP Law, content moderation, hoax prevention | 92.0 | ~42.9% |
| Wikipedia ID | General knowledge anchor & grammar stability | 28.2 | ~13.2% |
| Total | | 214.2 | 100% |
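The percentage column follows directly from the token counts; a minimal sketch of the arithmetic (values in millions of tokens, taken from the table above):

```python
# Token counts in millions, as listed in the table above
mix = {"DTP": 94.0, "PRD": 92.0, "Wikipedia ID": 28.2}
total = sum(mix.values())  # 214.2

for category, tokens in mix.items():
    # Each share is the category's token count divided by the ~214.2M total
    print(f"{category}: {tokens}M tokens ({100 * tokens / total:.1f}% of the mix)")
```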
## Intended Use
As a Base Model, Biawak-8B outputs text completions and can be adapted into chat/instruct variants.
1. PRD (Perlindungan Ruang Digital)
- Policy sentiment analysis
- Misinformation pattern detection
- Understanding legal terminology (UU ITE, UU PDP)
2. DTP (Digital Talent Pool)
- Skill gap analysis
- Curriculum drafting assistance
- Job description & talent understanding
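Because this is a base (completion) model, tasks like the ones above are usually framed as few-shot completion prompts rather than chat instructions. A minimal sketch for PRD-style sentiment labeling, assuming the `model` and `tokenizer` loaded as in the next section (the opinions and labels below are illustrative, not from the training data):

```python
# Few-shot completion prompt: two labeled examples, then the opinion to classify
prompt = (
    "Tentukan sentimen (positif/negatif/netral) dari opini berikut.\n"
    "Opini: Perlindungan data pribadi kini lebih jelas berkat UU PDP.\n"
    "Sentimen: positif\n"
    "Opini: Sosialisasi UU ITE masih membingungkan bagi pelaku usaha kecil.\n"
    "Sentimen: negatif\n"
    "Opini: Pemerintah mempercepat penyusunan aturan turunan UU PDP.\n"
    "Sentimen:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=5, do_sample=False)

# Decode only the continuation: the model's label for the last opinion
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```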
## How to Get Started
Load the model with Hugging Face Transformers:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# 1. Configuration
model_id = "YOUR_USERNAME/Biawak-8B-Base"  # Replace with your actual Hub ID

# 2. Load model and tokenizer
# Use bfloat16 on A100/A10G-class GPUs, float16 on T4
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# 3. Inference example (completion)
input_text = "Strategi utama untuk mengurangi gap talenta digital di Indonesia adalah"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        do_sample=True,
        temperature=0.7,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
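The full bf16 checkpoint needs roughly 16 GB of GPU memory for the weights alone, which may not fit on smaller cards such as a T4. A minimal 4-bit loading sketch using bitsandbytes (assumes the `bitsandbytes` and `accelerate` packages are installed; the quantization settings are illustrative, not an official recommendation):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "YOUR_USERNAME/Biawak-8B-Base"  # Replace with your actual Hub ID

# NF4 4-bit quantization cuts weight memory to roughly a quarter of fp16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```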
## Training Details
### Training Procedure
The model was continued-pretrained with a causal language modeling (CLM) objective while preserving base reasoning capabilities.
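Under a CLM objective the labels are simply the input tokens (the model shifts them internally), so data preparation reduces to tokenizing and packing raw text. A minimal sketch of that setup with the standard Transformers collator; the example text and packing details are illustrative, not the actual pipeline:

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

# Tokenizer of the base model being continued-pretrained
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

# mlm=False selects the causal (next-token prediction) objective:
# the collator copies input_ids into labels for the model to shift internally
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

# One hypothetical pre-tokenized block of Indonesian corpus text
example = {"input_ids": tokenizer("Contoh teks korpus bahasa Indonesia.")["input_ids"]}
batch = collator([example])
print(batch["labels"][0][:8])  # labels mirror input_ids under the CLM objective
```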
### Hardware & Environment
- GPU: NVIDIA A100 80GB (Colab Pro+)
- Training Duration: ~36 hours
- Frameworks: PyTorch, Transformers, Accelerate
## Hyperparameters (Highlights)
- Sequence Length: 4096
- Optimizer: AdamW
- Scheduler: Cosine Decay
- Precision: bf16
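For illustration, the highlights above map onto a Hugging Face `TrainingArguments` configuration roughly as sketched below; learning rate, batch size, and other values not listed above are placeholders, not the actual settings:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="biawak-8b-cpt",      # placeholder output path
    bf16=True,                       # Precision: bf16
    optim="adamw_torch",             # Optimizer: AdamW
    lr_scheduler_type="cosine",      # Scheduler: Cosine Decay
    learning_rate=2e-5,              # placeholder, not documented above
    per_device_train_batch_size=1,   # placeholder
    gradient_accumulation_steps=16,  # placeholder
    num_train_epochs=1,              # placeholder
    logging_steps=50,
)
# The 4096 sequence length is applied when tokenizing/packing the corpus, not here
```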
## Limitations
- Base Model: No SFT or RLHF; few-shot prompting may be required.
- Web Data Bias: May inherit biases from Indonesian web sources.
- Hallucinations: Possible incorrect factual output.
## Recommendations
For production use, it is recommended to:
- Perform Supervised Fine-Tuning (SFT) for PRD/DTP domains
- Add high-quality instruction datasets
- Apply evaluation benchmarks before deployment
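A simple pre-deployment sanity check is held-out perplexity on domain text; a minimal sketch, assuming the `model` and `tokenizer` from the loading example above (the evaluation sentence is illustrative, not a benchmark):

```python
import torch

eval_text = "Pengembangan talenta digital membutuhkan kurikulum yang relevan dengan kebutuhan industri."
enc = tokenizer(eval_text, return_tensors="pt").to(model.device)

with torch.no_grad():
    # Passing labels returns the mean next-token cross-entropy loss
    loss = model(**enc, labels=enc["input_ids"]).loss

print(f"Perplexity: {torch.exp(loss).item():.2f}")
```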