NEED AI Conversational Model
Multi-task conversational AI model for the NEED service marketplace platform.
Model Description
This model performs three simultaneous tasks:
- Intent Classification - Identifies user intent (10 classes)
- Category Classification - Classifies service category (20 classes)
- Response Generation - Generates contextual responses
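As a rough illustration of how the two classification outputs might be consumed, the sketch below turns raw logits into a predicted class index and a confidence score. The helper name `top_prediction` and the example logits are hypothetical; this card does not document the model's output format or label names, so only the class counts (10 and 20) are taken from the description above.

```python
import math

NUM_INTENTS = 10      # from the model description
NUM_CATEGORIES = 20   # from the model description

def top_prediction(logits):
    """Return (class_index, softmax_probability) for the highest-scoring class."""
    m = max(logits)                              # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    best = max(range(len(logits)), key=lambda i: logits[i])
    return best, exps[best] / total

# Made-up logits standing in for the two classification heads.
intent_logits = [0.0] * NUM_INTENTS
intent_logits[3] = 2.0
category_logits = [0.0] * NUM_CATEGORIES
category_logits[7] = 4.0

intent_id, intent_conf = top_prediction(intent_logits)
category_id, category_conf = top_prediction(category_logits)
print(intent_id, round(intent_conf, 3), category_id, round(category_conf, 3))
```

A confidence threshold on these probabilities is one way to decide when a request should be passed to a human instead of being routed automatically.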
Architecture
- Type: Custom Transformer (Encoder-Decoder)
- Parameters: ~95.9M
- Dimensions: 512 (model), 6 layers, 8 heads
- Vocabulary: 50,257 tokens (GPT-2 tokenizer)
- Context Length: 512 tokens
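The reported parameter count can be sanity-checked with a back-of-the-envelope estimate. The sketch below is not the actual architecture: it assumes 6 encoder and 6 decoder layers, a feed-forward width of 2048, an output projection that is not tied to the input embedding, and it ignores biases, layer norms, positional embeddings, and the classification heads. Only the vocabulary size, model width, and the figure of 6 layers come from this card.

```python
VOCAB_SIZE = 50_257   # from the model card (GPT-2 tokenizer)
D_MODEL = 512         # from the model card
D_FF = 2048           # ASSUMPTION: feed-forward width of 4 * D_MODEL
ENC_LAYERS = 6        # ASSUMPTION: "6 layers" applies to each stack
DEC_LAYERS = 6        # ASSUMPTION: same depth for the decoder

def attention_params(d_model):
    # Query, key, value, and output projections, biases ignored.
    return 4 * d_model * d_model

def ffn_params(d_model, d_ff):
    # Two linear layers, biases ignored.
    return 2 * d_model * d_ff

embedding = VOCAB_SIZE * D_MODEL
output_projection = VOCAB_SIZE * D_MODEL  # ASSUMPTION: not tied to the embedding
encoder = ENC_LAYERS * (attention_params(D_MODEL) + ffn_params(D_MODEL, D_FF))
# Each decoder layer has self-attention plus cross-attention.
decoder = DEC_LAYERS * (2 * attention_params(D_MODEL) + ffn_params(D_MODEL, D_FF))

total = embedding + output_projection + encoder + decoder
print(f"{total / 1e6:.1f}M")  # prints 95.5M
```

Under these assumptions the estimate lands close to the reported ~95.9M, with the ignored components plausibly accounting for the remainder.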
Usage
Installation
pip install torch transformers huggingface_hub
Loading the Model
import os
import sys

import torch
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

# Download the model weights and the custom modeling code
model_path = hf_hub_download(repo_id="yogami9/need-ai-conversational-model", filename="pytorch_model.bin")
modeling_path = hf_hub_download(repo_id="yogami9/need-ai-conversational-model", filename="modeling_need.py")

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("yogami9/need-ai-conversational-model")

# Make the downloaded modeling file importable, then load the model
sys.path.insert(0, os.path.dirname(modeling_path))
from modeling_need import NEEDConversationalModel

model = NEEDConversationalModel.from_pretrained(model_path)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)
model.eval()
Inference Example
def generate_response(text: str, max_length: int = 100):
    # Prepare input
    input_text = f"Human: {text}"
    input_ids = tokenizer.encode(input_text, return_tensors='pt').to(device)
    speaker_ids = torch.zeros_like(input_ids)  # Human speaker

    # Generate
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=input_ids,
            speaker_ids=speaker_ids,
            max_length=max_length,
            temperature=0.8,
            top_k=50
        )

    # Decode
    response = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return response

# Test
response = generate_response("I need a house cleaner in Lagos")
print(response)
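The `temperature` and `top_k` arguments above typically control how the next token is sampled. The following is a minimal, framework-free sketch of temperature-scaled top-k sampling over a list of logits. It is not the model's own `generate` implementation, which lives in `modeling_need.py` and may differ; the function name `sample_top_k` and the example logits are hypothetical.

```python
import math
import random

def sample_top_k(logits, temperature=0.8, top_k=50, rng=None):
    """Sample one index from the top_k highest logits after temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()
    k = max(1, min(top_k, len(logits)))
    # Keep only the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Lower temperatures sharpen the distribution, higher ones flatten it.
    scaled = [logits[i] / temperature for i in top]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]   # unnormalized softmax weights
    return rng.choices(top, weights=weights, k=1)[0]

# Toy vocabulary of five tokens with made-up logits.
example_logits = [0.1, 5.0, 0.3, 4.5, -2.0]
greedy = sample_top_k(example_logits, top_k=1)                    # top_k=1 reduces to argmax
sampled = sample_top_k(example_logits, top_k=2, rng=random.Random(0))
print(greedy, sampled)
```

With `top_k=1` the procedure reduces to greedy decoding; larger values trade determinism for variety in the generated responses.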
Training Data
Trained on conversations from the NEED service marketplace covering:
- Service requests and inquiries
- Pricing questions
- Location-based queries
- Booking requests
- Category navigation help
Languages: English (Nigerian context)
Training Details
- Framework: PyTorch
- Training Hardware: Google Colab (Tesla T4 GPU)
- Optimizer: AdamW (lr=5e-5, weight_decay=0.01)
- Scheduler: CosineAnnealingLR
- Batch Size: 2-4
- Epochs: 3+ (early stopping)
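For reference, the closed form used by cosine annealing can be written out directly. The sketch below follows the formula documented for PyTorch's `CosineAnnealingLR`, starting from the learning rate listed above. The total step count `T_MAX` and the minimum rate `ETA_MIN` are assumptions, since this card does not state them.

```python
import math

BASE_LR = 5e-5   # from the training details above
ETA_MIN = 0.0    # ASSUMPTION: PyTorch's default minimum learning rate
T_MAX = 1000     # ASSUMPTION: hypothetical number of scheduler steps

def cosine_lr(step, base_lr=BASE_LR, eta_min=ETA_MIN, t_max=T_MAX):
    """Learning rate at a given step under cosine annealing (closed form)."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * step / t_max))

for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_lr(step):.2e}")
```

The rate starts at the base value, falls to half of it at the midpoint, and reaches the minimum at `T_MAX`.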
Limitations
- Trained primarily on Nigerian English context
- Limited to 512 token context window
- May require fine-tuning for other domains
- Custom architecture requires manual loading
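Because of the 512-token limit, longer inputs need to be shortened before they reach the model. One simple option, sketched below, is to keep only the most recent tokens. The helper `truncate_to_context` is hypothetical, and whether the model behaves better with the start or the end of a long conversation preserved is not documented here.

```python
MAX_CONTEXT = 512  # context length from the architecture section

def truncate_to_context(token_ids, max_tokens=MAX_CONTEXT):
    """Keep the last max_tokens ids so the most recent turns survive."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return list(token_ids[-max_tokens:])

# Stand-in for a tokenized conversation that is too long.
long_ids = list(range(600))
short_ids = truncate_to_context(long_ids)
print(len(short_ids), short_ids[0], short_ids[-1])  # prints 512 88 599
```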
Intended Use
Primary: Customer service automation for NEED platform
Suitable for:
- Service request routing
- Basic customer support
- Intent detection
- Category recommendation
Not suitable for:
- Medical/legal advice
- Financial transactions
- High-stakes decision making
Citation
@misc{need-ai-2025,
title={NEED AI Conversational Model},
author={NEED Service App Team},
year={2025},
publisher={Hugging Face},
url={https://huggingface.co/yogami9/need-ai-conversational-model}
}
Contact
- Email: needserviceapp@gmail.com
- Repository: GitHub
License
Apache 2.0