Stahili LLM is a large language model designed for community-driven insights, localized interactions, and engagement tracking. Built with a focus on user participation, it facilitates structured data collection, analytics, and automation in survey-based applications.
To use Stahili LLM, first install the required libraries with pip:
pip install transformers torch
Then load the model and tokenizer from the Hugging Face model hub and generate a response:
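from transformers import AutoModelForCausalLM, AutoTokenizer
# Download (or load from the local cache) the tokenizer and model weights
model_name = "itshunja/stahili"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
# Tokenize the prompt into PyTorch tensors
input_text = "How does Stahili optimize survey engagement?"
inputs = tokenizer(input_text, return_tensors="pt")
# Generate a continuation; max_length counts prompt tokens plus generated tokens
output = model.generate(**inputs, max_length=200)
# Decode the first (and only) sequence back into text
print(tokenizer.decode(output[0], skip_special_tokens=True))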
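The base model listed for this card is mistralai/Mistral-Small-24B-Base-2501, so the weights are large, and it can help to estimate memory before calling from_pretrained. The following is a rough sketch only: the figure of 24 billion parameters is an assumption read off the base model's name, the weight_memory_gib helper is illustrative, and the estimate covers weights alone (no activations, KV cache, or framework overhead).

```python
# Rough weight-only memory estimate for loading the model.
# ASSUMED_PARAMS is an assumption inferred from the base model's name,
# not a measured parameter count.
ASSUMED_PARAMS = 24e9

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1, "4-bit": 0.5}


def weight_memory_gib(num_params, bytes_per_param):
    """Return the memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


for dtype, nbytes in BYTES_PER_PARAM.items():
    gib = weight_memory_gib(ASSUMED_PARAMS, nbytes)
    print(f"{dtype:>8}: {gib:.1f} GiB")
```

If the full-precision footprint is too large for the available hardware, from_pretrained can load weights at reduced precision, for example through its torch_dtype argument; check the transformers documentation for the version you have installed.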
To fine-tune Stahili LLM on a specific dataset:
python train.py --model itshunja/stahili --dataset custom_dataset.json
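The schema that train.py expects for custom_dataset.json is not specified here. As a purely hypothetical illustration, the sketch below assumes a JSON list of records with "prompt" and "response" string fields and checks a file against that assumed shape before a training run. The field names and the validate_records helper are assumptions, not part of the project.

```python
import json
import os
import tempfile

# Hypothetical schema: each record is an object with these string fields.
REQUIRED_FIELDS = ("prompt", "response")


def validate_records(path):
    """Return a list of problems found in the dataset file (empty if none)."""
    with open(path, encoding="utf-8") as f:
        records = json.load(f)
    if not isinstance(records, list):
        return ["top-level JSON value must be a list of records"]
    problems = []
    for i, record in enumerate(records):
        if not isinstance(record, dict):
            problems.append(f"record {i}: expected an object")
            continue
        for field in REQUIRED_FIELDS:
            value = record.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"record {i}: missing or empty '{field}'")
    return problems


# Write a tiny placeholder dataset to a temporary directory and validate it.
sample = [
    {"prompt": "placeholder question", "response": "placeholder answer"},
    {"prompt": "question with no answer"},
]
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "custom_dataset.json")
    with open(sample_path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    problems = validate_records(sample_path)

print(problems)
```

Running a check like this first surfaces malformed records before a long training job starts; adapt REQUIRED_FIELDS to whatever format train.py actually reads.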
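For quick inference, use the transformers text-generation pipeline, which wraps tokenization, generation, and decoding in a single call and runs the model locally: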
from transformers import pipeline
generator = pipeline("text-generation", model="itshunja/stahili")
response = generator("Explain the Stahili rewards program.")
print(response[0]['generated_text'])
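By default, the generated_text returned by a text-generation pipeline for a plain string prompt includes the prompt itself followed by the continuation. A minimal sketch of separating the two is shown below; extract_completion is a hypothetical helper, and the response here is a stand-in with placeholder text so the snippet runs without downloading the model.

```python
def extract_completion(prompt, response):
    """Return the generated text with the leading prompt removed, if present."""
    text = response[0]["generated_text"]
    if text.startswith(prompt):
        text = text[len(prompt):]
    return text.strip()


prompt = "Explain the Stahili rewards program."

# Stand-in for a real pipeline result (same list-of-dicts shape),
# with placeholder text instead of actual model output.
stand_in_response = [{"generated_text": prompt + " <model continuation>"}]

completion = extract_completion(prompt, stand_in_response)
print(completion)
```

The text-generation pipeline also accepts return_full_text=False, which asks it to return only the newly generated text and makes a helper like this unnecessary.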
We welcome contributions! To contribute:
1. Create a feature branch (git checkout -b feature-name).
2. Commit your changes (git commit -m 'Add new feature').
3. Push the branch (git push origin feature-name).
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
For questions or support, reach out via Hugging Face Discussions or contact Isaac Hunja.
Base model: mistralai/Mistral-Small-24B-Base-2501