---
language:
- en
- ms
license: other
library_name: transformers
pipeline_tag: text-generation
tags:
- qwen2.5
- lora
- malaysia
- safety
- moderation
- rukun-negara
- multilingual
base_model: Qwen/Qwen2.5-32B-Instruct
model-index:
- name: EntermindAI/Rukun-32B-v1.5
  results:
  - task:
      type: text-generation
      name: Structured safety validation (Rukun Negara)
    dataset:
      type: custom
      name: benchmark_data (50 labeled prompts)
    metrics:
    - type: accuracy
      value: 0.88
    - type: precision
      value: 0.8333
      name: violating_precision
    - type: recall
      value: 0.9091
      name: violating_recall
    - type: f1
      value: 0.8696
      name: violating_f1
---

# Rukun Ready AI (Rukun-32B-v1.5)

Rukun Ready AI is a Malaysia-aligned structured validation model built on `Qwen/Qwen2.5-32B-Instruct` and fine-tuned with LoRA for Rukun Negara policy assessment.

It is designed to return strict JSON with principle-level scoring, severity, explanation, and optional rewrite guidance.

Versioning:

- Public release: `v1.5` (first public release)
- Internal training lineage: `v5`

## Model Summary

- Base model: `Qwen/Qwen2.5-32B-Instruct`
- Fine-tuning method: LoRA (instruction tuning)
- Primary objective: structured policy validation aligned to Rukun Negara principles
- Output format: strict JSON contract (machine-readable)
- Languages: Bahasa Malaysia, English, and code-switched input

## Rukun Negara Principles Covered

1. Belief in God (`belief_in_god`)
2. Loyalty to King and Country (`loyalty_to_king_country`)
3. Upholding the Constitution (`constitutional_compliance`)
4. Rule of Law (`rule_of_law`)
5. Good Behaviour and Morality (`good_behaviour_morality`)

## Training Data

v1 dataset used for fine-tuning:

- Train: `66,516`
- Validation: `1,353`
- Total: `67,869`

Files:

- `DATASETS/rukun-teacher/v1/train_v1.jsonl`
- `DATASETS/rukun-teacher/v1/val_v1.jsonl`

Data format:

- JSONL conversational records with `messages` (`system`, `user`, `assistant`)
- Assistant targets are strict JSON responses following the schema contract

Label/variant coverage (v1):

- Severity variants in train:
  - `compliant_0`: 18,674 (28.07%)
  - `minor_1_3`: 3,511 (5.28%)
  - `moderate_4_6`: 19,103 (28.72%)
  - `violating_7_10`: 25,228 (37.93%)
- Severity variants in val:
  - `compliant_0`: 381 (28.16%)
  - `minor_1_3`: 72 (5.32%)
  - `moderate_4_6`: 387 (28.60%)
  - `violating_7_10`: 513 (37.92%)
- Rewrite behavior:
  - `rewritten_text = null` for compliant samples
  - `rewritten_text` populated for non-compliant samples
  - train rewrite coverage: 47,842 / 66,516 (71.93%)
  - val rewrite coverage: 972 / 1,353 (71.84%)
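The coverage figures above follow directly from the split sizes and the compliant-sample counts (every non-compliant sample carries a rewrite); a quick check:

```python
# Verifying the rewrite-coverage percentages from the counts above.
# Non-compliant samples = split size minus compliant_0 count.
train_cov = 47_842 / 66_516  # 66,516 - 18,674 = 47,842 rewrites
val_cov = 972 / 1_353        # 1,353 - 381 = 972 rewrites

print(f"train: {train_cov:.2%}")  # train: 71.93%
print(f"val: {val_cov:.2%}")      # val: 71.84%
```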

## Training Procedure

Reference configuration and script:

- Config: `TRAINING/config_v1_b200_2gpu.yaml`
- Trainer: `TRAINING/hf_train_v1.py`

Key settings:

- Max sequence length: `2048`
- Epochs: `1`
- Learning rate: `2e-5`
- Scheduler: cosine
- Precision: BF16
- LoRA:
  - `r=32`
  - `alpha=64`
  - `dropout=0.05`
  - target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`

Loss behavior:

- Completion-only masking (loss on assistant tokens only)
- Goal: maximize output schema stability and policy consistency

Approximate trainable adaptation parameters (LoRA):

- `~268M` trainable params (`~0.84%` relative to 32B base)
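The `~268M` figure can be sanity-checked from the LoRA configuration. A LoRA adapter on a `d_in x d_out` projection adds `r * (d_in + d_out)` parameters (matrices `A` and `B`). The dimensions below are the published Qwen2.5-32B config values, assumed here rather than taken from this repository:

```python
# Back-of-the-envelope check of the ~268M trainable-parameter figure.
# Assumed Qwen2.5-32B dimensions: hidden_size=5120, intermediate_size=27648,
# 64 layers, GQA with 8 KV heads of head_dim 128 (k/v project 5120 -> 1024).
r = 32
hidden, intermediate, kv_dim, layers = 5120, 27648, 1024, 64

def lora_params(d_in, d_out, rank=r):
    # LoRA adds A (rank x d_in) and B (d_out x rank) per adapted projection.
    return rank * (d_in + d_out)

per_layer = (
    lora_params(hidden, hidden)          # q_proj
    + lora_params(hidden, kv_dim)        # k_proj
    + lora_params(hidden, kv_dim)        # v_proj
    + lora_params(hidden, hidden)        # o_proj
    + lora_params(hidden, intermediate)  # gate_proj
    + lora_params(hidden, intermediate)  # up_proj
    + lora_params(intermediate, hidden)  # down_proj
)
total = per_layer * layers
print(f"{total:,}")  # 268,435,456 ~= 268M, ~0.84% of 32B
```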

## Evaluation Snapshot

Internal labeled benchmark (`REPORT/benchmark_data.json`, n=50):

- Accuracy: `88.0%`
- Violating precision: `83.3%`
- Violating recall: `90.9%`
- Violating F1: `86.96%`

Confusion matrix:

- Expected violating -> predicted violating: `20`
- Expected violating -> predicted compliant: `2`
- Expected compliant -> predicted violating: `4`
- Expected compliant -> predicted compliant: `24`
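The headline metrics follow from this confusion matrix (with `violating` as the positive class):

```python
# Recomputing the reported metrics from the confusion matrix above.
tp, fn, fp, tn = 20, 2, 4, 24  # violating = positive class

accuracy = (tp + tn) / (tp + fn + fp + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(round(accuracy, 4))   # 0.88
print(round(precision, 4))  # 0.8333
print(round(recall, 4))     # 0.9091
print(round(f1, 4))         # 0.8696
```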

## Intended Use

Use this model when you need:

- structured, machine-readable policy checks
- principle-level scoring against Rukun Negara
- multilingual moderation support for Malaysia-centric contexts
- rewrite guidance for non-compliant text

## Out-of-Scope Use

Not intended for:

- legal adjudication or final legal decisions
- replacing human moderation in high-stakes enforcement without review
- generalized truth verification or factual QA
- medical, legal, or financial decision support

## Limitations

- Performance is strongest for the target policy schema and may degrade outside this format.
- Borderline sarcasm, slang, or evolving coded speech can still cause false positives/negatives.
- Scores are model outputs, not legal determinations.
- Benchmark size is limited; broader external benchmarking is recommended.

## Safety and Bias Notes

This model is value-aligned to Malaysian national principles and can reflect the biases and policy assumptions embedded in training data design.

Recommended controls:

- log and audit all production decisions
- keep human review for high-severity outcomes
- evaluate by language/domain slices before broad rollout

## Output Contract (Schema)

The model is expected to return valid JSON with these keys:

- `principles`
  - `belief_in_god` -> `{status, score}`
  - `loyalty_to_king_country` -> `{status, score}`
  - `constitutional_compliance` -> `{status, score}`
  - `rule_of_law` -> `{status, score}`
  - `good_behaviour_morality` -> `{status, score}`
- `overall` -> `{severity_score, severity_band}`
- `explanation`
- `rewritten_text`
- `violationCount`
- `severityScore`
- `isProblematic`

Derived rules used in training targets:

- `overall.severity_score = max(principle scores)`
- `violationCount = count(principles where status == "violated")`
- `severityScore = overall.severity_score`
- `isProblematic = (violationCount > 0) OR (severityScore > 2)`

## Sample Input / Output

### Example A (Compliant)

Input:

```json
{
  "messages": [
    {
      "role": "system",
      "content": "You are an AI ethics validator for Malaysia's Rukun Negara principles..."
    },
    {
      "role": "user",
      "content": "We need to dismantle discriminatory policies and practices in Malaysia."
    }
  ]
}
```

Output:

```json
{
  "principles": {
    "belief_in_god": {"status": "compliant", "score": 0},
    "loyalty_to_king_country": {"status": "compliant", "score": 0},
    "constitutional_compliance": {"status": "compliant", "score": 0},
    "rule_of_law": {"status": "compliant", "score": 0},
    "good_behaviour_morality": {"status": "compliant", "score": 0}
  },
  "overall": {"severity_score": 0, "severity_band": "compliant"},
  "explanation": "This statement is compliant.",
  "rewritten_text": null,
  "violationCount": 0,
  "severityScore": 0,
  "isProblematic": false
}
```

### Example B (Violating)

Input:

```json
{
  "messages": [
    {
      "role": "system",
      "content": "You are an AI ethics validator for Malaysia's Rukun Negara principles..."
    },
    {
      "role": "user",
      "content": "Laws here are useless, we should ignore them."
    }
  ]
}
```

Output:

```json
{
  "principles": {
    "belief_in_god": {"status": "compliant", "score": 0},
    "loyalty_to_king_country": {"status": "violated", "score": 4},
    "constitutional_compliance": {"status": "violated", "score": 7},
    "rule_of_law": {"status": "violated", "score": 8},
    "good_behaviour_morality": {"status": "violated", "score": 6}
  },
  "overall": {"severity_score": 8, "severity_band": "violating"},
  "explanation": "The text explicitly encourages rejecting national law and constitutional order, which is a clear violation.",
  "rewritten_text": "I disagree with some policies, but we should still follow Malaysian law and use legal channels for change.",
  "violationCount": 4,
  "severityScore": 8,
  "isProblematic": true
}
```

## Usage (Transformers)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "EntermindAI/Rukun-32B-v1.5"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

system_prompt = "You are an AI ethics validator for Malaysia's Rukun Negara principles..."
user_text = "Your input text here"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_text},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=False,  # greedy decoding for deterministic JSON output
    )

# Decode only the newly generated tokens (skip the echoed prompt)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

## Usage (vLLM/OpenAI-compatible)

For deterministic structured outputs in vLLM, use:

- `temperature=0`
- `top_p=1`
- bounded `max_tokens` (typically `256-512`)
- stable, identical system prompt for better prefix-cache hit rates

If model generation defaults are being auto-applied, launch vLLM with:

- `--generation-config vllm`
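
The settings above map onto a standard OpenAI-compatible chat request. A minimal sketch; the endpoint URL and the example input are deployment-specific assumptions, and the payload is only constructed here, not sent:

```python
import json

# Chat-completions payload with the recommended deterministic settings.
payload = {
    "model": "EntermindAI/Rukun-32B-v1.5",
    "messages": [
        {"role": "system", "content": "You are an AI ethics validator for Malaysia's Rukun Negara principles..."},
        {"role": "user", "content": "Laws here are useless, we should ignore them."},
    ],
    "temperature": 0,   # deterministic output
    "top_p": 1,
    "max_tokens": 512,  # bounded; the JSON contract fits comfortably
}

# POST this to <server>/v1/chat/completions, e.g.:
#   requests.post("http://localhost:8000/v1/chat/completions", json=payload)
print(json.dumps(payload)[:60])
```

Keeping the system prompt byte-identical across requests, as recommended above, lets vLLM reuse the cached prompt prefix.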

## License and Terms

This release is provided as open weights. Ensure compliance with:

1. Base model license (`Qwen2.5-32B-Instruct`)
2. Repository-level terms for this model
3. Applicable local laws and platform policy requirements

## Contact

- Project site: `https://rukunnegara.ai`
- Organization: Entermind