# Luma-base: A High-Performance Foundation Model for Haitian Creole (Kreyòl Ayisyen)

Luma-base is a 4-billion-parameter language model specialized in Haitian Creole. Built on the Qwen3-4B architecture, it has undergone extensive domain-specific pre-training to capture the grammar, nuances, and cultural context of the language.

## 🚀 Project Overview
The Luma project aims to close the gap in high-quality AI tools for Haitian Creole. Luma-base is the core model, designed as a backbone for speech-to-text (STT) transcript correction, translation, and text generation.

- Developer: Frostie08
- Model Type: Causal Language Model
- Base Model: Qwen3-4B
- Language: Haitian Creole (ht-HT)
- License: Apache-2.0

## 📊 Technical Specifications & Training
Luma-base was trained with the Unsloth library, chosen for its training efficiency and numerical accuracy.

### Training Details:
- Dataset: `kani-pretrain` (a curated corpus of Haitian Creole literature, news, and formal texts).
- Steps: 3,591 (3 full epochs).
- Batch Size: 16 (total).
- Optimizer: AdamW 8-bit.
- Learning Rate: 2e-4 with a cosine scheduler.
- Precision: Mixed precision (16-bit).
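The figures above imply a few derived quantities. As a rough sketch, assuming the batch size counts sequences, every step uses a full batch, and the cosine schedule decays from the peak rate to zero with no warmup (none of which the card states), the per-epoch step count and the learning rate at a given step can be estimated as follows. The function name `cosine_lr` is illustrative and not part of Unsloth.

```python
import math

TOTAL_STEPS = 3591   # from the training details
EPOCHS = 3
BATCH_SIZE = 16      # total batch size; assumed to count sequences
PEAK_LR = 2e-4

steps_per_epoch = TOTAL_STEPS // EPOCHS
# Approximate if the last batch of an epoch was only partially filled.
sequences_per_epoch = steps_per_epoch * BATCH_SIZE


def cosine_lr(step: int, total_steps: int = TOTAL_STEPS, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr to 0 (assumes no warmup and no LR floor)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


print(steps_per_epoch, sequences_per_epoch)
for step in (0, steps_per_epoch, 2 * steps_per_epoch, TOTAL_STEPS):
    print(step, f"{cosine_lr(step):.2e}")
```

Under these assumptions each epoch covers 1,197 steps (about 19,152 sequences), and the learning rate falls to three quarters of its peak after the first epoch and one quarter after the second.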

### Performance:
- Final Validation Loss: 1.9252 🎯
- Final Training Loss: 1.4520
- Perplexity: ~6.8 (matches the exponential of the validation loss, exp(1.9252) ≈ 6.86; lower values indicate more confident next-token prediction).
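Perplexity is the exponential of the mean per-token cross-entropy loss, so the reported losses can be converted directly. A minimal check, assuming the losses are natural-log cross-entropy averaged per token (the usual convention for causal language model training, though the card does not say so explicitly):

```python
import math

val_loss = 1.9252    # final validation loss from the card
train_loss = 1.4520  # final training loss from the card

# Perplexity = exp(mean per-token cross-entropy in nats)
val_ppl = math.exp(val_loss)
train_ppl = math.exp(train_loss)

print(f"validation perplexity: {val_ppl:.2f}")
print(f"training perplexity:   {train_ppl:.2f}")
```

The validation figure lands close to the ~6.8 reported above, which suggests the card's perplexity was derived from the validation loss rather than the training loss.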



## 🛠️ Implementation & Usage

### 1. For Direct Inference (Text Completion)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "Frostie08/Luma-base"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",  # place weights on the available device(s)
)

# Example: historical/biblical context completion
text = "Nan konmansman, Bondye te kreye..."
# Move inputs to the model's device instead of hard-coding "cuda"
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        do_sample=True,  # sampling must be enabled for temperature to take effect
        temperature=0.6,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```