	---
	language: en
	license: mit
	tags:
	- regression
	- soulprint
	- ubuntu
	- xgboost
	- culturally-rooted
	model-index:
	- name: Ubuntu_xgb_model
	  results:
	  - task:
	      type: regression
	      name: Ubuntu Regression
	    dataset:
	      name: Ubuntu-regression_data.jsonl
	      type: synthetic
	    metrics:
	    - type: mse
	      value: 0.0121
	    - type: rmse
	      value: 0.1101
	    - type: r2
	      value: 0.8817
	---

	# Ubuntu Regression Model (Soulprint Archetype)

	## 🧩 Overview
	The Ubuntu_xgb_model is part of the Soulprint archetype family of models.
	It predicts an Ubuntu alignment score (0.0–1.0) for text inputs, where Ubuntu represents "I am because we are": harmony, inclusion, and community bridge-building.

	- 0.0–0.3 → Low Ubuntu (exclusion, selfishness, division)
	- 0.4–0.7 → Medium Ubuntu (partial inclusion, effort but incomplete)
	- 0.8–1.0 → High Ubuntu (harmony, belonging, collective well-being)
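
The bands above leave the ranges 0.3–0.4 and 0.7–0.8 unassigned. The following is a minimal sketch of a labelling helper, assuming (the card does not state this) that each gap is split at its midpoint, 0.35 and 0.75. The name `ubuntu_band` is hypothetical and not part of the model.

```python
def ubuntu_band(score: float) -> str:
    """Map an Ubuntu alignment score to a coarse band label.

    Assumption: the card leaves 0.3-0.4 and 0.7-0.8 unassigned, so this
    sketch splits each gap at its midpoint (0.35 and 0.75).
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0.0, 1.0], got {score}")
    if score < 0.35:
        return "Low"
    if score < 0.75:
        return "Medium"
    return "High"


# Label a few example scores.
for s in (0.1, 0.5, 0.9):
    print(s, ubuntu_band(s))
```

If a different treatment of the gaps suits your application better (for example, reporting them as "borderline"), only the two thresholds need to change.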

	The model was trained with XGBoost regression on a custom synthetic dataset of 918 rows, balanced across Low, Medium, and High Ubuntu examples. The examples were generated across culturally diverse contexts (family, school, workplace, community, cultural rituals).

	---

	## 📊 Training Details
	- Framework: Python 3, scikit-learn, XGBoost
	- Embeddings: SentenceTransformer `"all-mpnet-base-v2"`
	- Algorithm: `XGBRegressor`
	- Dataset Size: 918 rows
	- Train/Test Split: 80/20 (roughly 734 training rows and 184 held-out test rows)

	### ⚙️ Hyperparameters
	- `n_estimators=300`
	- `learning_rate=0.05`
	- `max_depth=6`
	- `subsample=0.8`
	- `colsample_bytree=0.8`
	- `random_state=42`

	---

	## 📈 Evaluation Results
	On the held-out test set (20% of data):
	- MSE: 0.0121
	- RMSE: 0.1101
	- R² Score: 0.8817
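
For reference, the three metrics are related: RMSE is the square root of MSE (√0.0121 ≈ 0.110, consistent with the values above up to rounding), and R² compares the residual error to the variance of the true scores. The sketch below computes all three in plain Python; `regression_metrics` is a hypothetical helper and the sample values are made up for illustration, not taken from the test set.

```python
import math
from typing import Sequence, Tuple


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (MSE, RMSE, R^2) for paired true and predicted scores."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    # Residual sum of squares and its mean.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mse = ss_res / n
    rmse = math.sqrt(mse)

    # Total sum of squares around the mean of the true scores.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true scores are equal")
    r2 = 1.0 - ss_res / ss_tot
    return mse, rmse, r2


# Illustrative values only.
mse, rmse, r2 = regression_metrics([0.1, 0.5, 0.9], [0.2, 0.5, 0.8])
print(f"MSE={mse:.4f} RMSE={rmse:.4f} R2={r2:.4f}")
```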

	---

	## 🚀 Usage

	### Load Model
	```python
	import joblib
	import xgboost as xgb  # not called directly, but must be installed so joblib can unpickle the XGBRegressor
	from sentence_transformers import SentenceTransformer
	from huggingface_hub import hf_hub_download

	# -----------------------------
	# 1. Download model from Hugging Face Hub
	# -----------------------------
	REPO_ID = "mjpsm/Ubuntu-xgb-model" # repo ID as listed on the Hub; change if you used a different repo name
	FILENAME = "Ubuntu_xgb_model.pkl"

	model_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)

	# -----------------------------
	# 2. Load model + embedder
	# -----------------------------
	model = joblib.load(model_path)
	embedder = SentenceTransformer("all-mpnet-base-v2")

	# -----------------------------
	# 3. Example prediction
	# -----------------------------
	text = "During our class project, I made sure everyone’s ideas were included."
	embedding = embedder.encode([text])
	score = model.predict(embedding)[0]

	print("Predicted Ubuntu Score:", round(float(score), 3))

	```
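
To score several texts at once, the same two objects can be reused. The sketch below also clips the results to 0.0–1.0: assuming the default squared-error objective was used (the card lists no `objective`), the regressor's raw outputs are not bounded and can land slightly outside the documented range. `score_texts` is a hypothetical helper, not part of this repository.

```python
from typing import Any, List, Sequence


def score_texts(texts: Sequence[str], embedder: Any, model: Any) -> List[float]:
    """Score several texts in one batch and clip results to [0.0, 1.0].

    `embedder` needs an `encode(list_of_str)` method and `model` a
    `predict(embeddings)` method, as in the loading snippet above.
    """
    if not texts:
        return []
    embeddings = embedder.encode(list(texts))  # one embedding row per text
    raw_scores = model.predict(embeddings)
    # Clip each raw prediction into the documented score range.
    return [min(1.0, max(0.0, float(s))) for s in raw_scores]
```

With `model` and `embedder` loaded as shown above, `score_texts(["I shared my notes with the group.", "I kept the credit for myself."], embedder, model)` returns one clipped score per input text.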

	## 🌍 Applications

	- Community storytelling evaluation

	- Character alignment in cultural narratives

	- AI assistants tuned to Afrocentric archetypes

	- Training downstream models in the Soulprint system

	## ⚠️ Limitations

	- The dataset is synthetic (generated and curated), so real-world generalization should be validated before relying on the scores.

	- The model is context-specific to Ubuntu values and may not generalize beyond Afrocentric cultural framing.

	- Scores are approximate indicators; their interpretation depends on narrative context.