NEED AI Conversational Model

Multi-task conversational AI model for the NEED service marketplace platform.

Model Description

This model performs three simultaneous tasks:

Intent Classification - Identifies user intent (10 classes)
Category Classification - Classifies service category (20 classes)
Response Generation - Generates contextual responses
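As a rough illustration of how the two classification outputs could be consumed, the sketch below turns raw logits into a predicted class index and a confidence score. It assumes the model exposes one logit vector per head (10 intent logits, 20 category logits); the variable names and the logit values are made up for the example, and the card does not document the actual output format or label names.

```python
import math

NUM_INTENTS = 10      # from the model card
NUM_CATEGORIES = 20   # from the model card


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits):
    """Return (class index, probability) of the most likely class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


# Hypothetical logits standing in for the model's two classification heads.
intent_logits = [0.0] * NUM_INTENTS
intent_logits[3] = 4.0
category_logits = [0.0] * NUM_CATEGORIES
category_logits[7] = 2.5

intent_id, intent_conf = top_prediction(intent_logits)
category_id, category_conf = top_prediction(category_logits)
print(f"intent={intent_id} ({intent_conf:.2f}), category={category_id} ({category_conf:.2f})")
```

Mapping the indices back to human-readable intent and category names would need the label lists used in training, which are not given here.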

Architecture

Type: Custom Transformer (Encoder-Decoder)
Parameters: ~95.9M
Dimensions: hidden size 512, 6 layers, 8 attention heads
Vocabulary: 50,257 tokens (GPT-2 tokenizer)
Context Length: 512 tokens
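The figures above can be sanity-checked with a back-of-the-envelope parameter count. The sketch below assumes a standard encoder-decoder Transformer layout that the card does not confirm: 6 encoder plus 6 decoder layers, a feed-forward width of 2048 (4 × 512), learned positional embeddings, and an output projection that is not tied to the token embeddings. The small classification heads and any speaker embeddings are left out.

```python
def linear(n_in, n_out, bias=True):
    """Parameter count of a dense layer."""
    return n_in * n_out + (n_out if bias else 0)


def estimate_parameters(d_model=512, n_layers=6, vocab=50257, context=512, d_ff=2048):
    """Estimate parameters for an assumed standard encoder-decoder layout."""
    attention = 4 * linear(d_model, d_model)            # Q, K, V and output projections
    feed_forward = linear(d_model, d_ff) + linear(d_ff, d_model)
    layer_norm = 2 * d_model                            # scale and shift
    encoder_layer = attention + feed_forward + 2 * layer_norm
    decoder_layer = 2 * attention + feed_forward + 3 * layer_norm  # self- and cross-attention
    embeddings = vocab * d_model + context * d_model    # token + positional (assumed learned)
    lm_head = linear(d_model, vocab)                    # assumed untied from the embeddings
    return n_layers * (encoder_layer + decoder_layer) + embeddings + lm_head


total = estimate_parameters()
print(f"{total / 1e6:.1f}M parameters")
```

Under these assumptions the estimate lands close to the stated ~95.9M. The number of attention heads does not change the count, since the heads split the same 512-wide projections.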

Usage

Installation

pip install torch transformers huggingface_hub

Loading the Model

import os
import sys

import torch
from transformers import AutoTokenizer
from huggingface_hub import hf_hub_download

REPO_ID = "yogami9/need-ai-conversational-model"

# Download the weights and the custom modeling code from the Hub
model_path = hf_hub_download(repo_id=REPO_ID, filename="pytorch_model.bin")
modeling_path = hf_hub_download(repo_id=REPO_ID, filename="modeling_need.py")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(REPO_ID)

# Make the downloaded modeling_need.py importable, then import the model class
sys.path.insert(0, os.path.dirname(modeling_path))
from modeling_need import NEEDConversationalModel

# Load the weights, move to GPU if available, and switch to inference mode
model = NEEDConversationalModel.from_pretrained(model_path)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
model.eval()

Inference Example

def generate_response(text: str, max_length: int = 100) -> str:
    """Generate a reply to a single user message."""
    # Prepare input; truncate to the model's 512-token context window
    input_text = f"Human: {text}"
    input_ids = tokenizer.encode(
        input_text, return_tensors="pt", truncation=True, max_length=512
    ).to(device)
    speaker_ids = torch.zeros_like(input_ids)  # 0 = Human speaker

    # Generate without tracking gradients
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=input_ids,
            speaker_ids=speaker_ids,
            max_length=max_length,
            temperature=0.8,
            top_k=50,
        )

    # Decode the first (only) sequence back to text
    response = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return response

# Test
response = generate_response("I need a house cleaner in Lagos")
print(response)
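The `temperature` and `top_k` arguments passed to `generate` follow the usual sampling conventions. The sketch below shows the standard technique in plain Python: keep the `top_k` highest-scoring tokens, divide their logits by the temperature, and sample from the resulting softmax. It is an illustration of the general method, not the code in `modeling_need.py`, whose internals are not shown in this card; the function name and the toy logits are made up.

```python
import math
import random


def sample_top_k(logits, temperature=0.8, top_k=50, rng=random):
    """Sample one token index using temperature-scaled top-k sampling."""
    k = min(top_k, len(logits))
    # Indices of the k largest logits, best first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Lower temperature sharpens the distribution; higher flattens it.
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]  # unnormalized softmax
    return rng.choices(top, weights=weights, k=1)[0]


rng = random.Random(0)
logits = [2.0, 0.5, 1.0, -1.0, 3.0]  # toy next-token scores

greedy = sample_top_k(logits, top_k=1, rng=rng)  # top_k=1 is greedy decoding
samples = {sample_top_k(logits, top_k=2, rng=rng) for _ in range(200)}
print(greedy, sorted(samples))
```

With `top_k=1` the function always returns the highest-scoring index, and with `top_k=2` only the two best indices can ever be drawn.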

Training Data

Trained on conversations from the NEED service marketplace covering:

Service requests and inquiries
Pricing questions
Location-based queries
Booking requests
Category navigation help

Languages: English (Nigerian context)

Training Details

Framework: PyTorch
Training Hardware: Google Colab (Tesla T4 GPU)
Optimizer: AdamW (lr=5e-5, weight_decay=0.01)
Scheduler: CosineAnnealingLR
Batch Size: 2-4
Epochs: 3+ (early stopping)
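For reference, the learning-rate curve produced by cosine annealing can be computed directly from its closed form. The sketch below uses the stated base rate of 5e-5; the total number of steps and the minimum rate of 0 are assumptions, since the card does not give the scheduler's `T_max` or `eta_min`.

```python
import math


def cosine_annealing_lr(step, total_steps, base_lr=5e-5, min_lr=0.0):
    """Closed-form cosine annealing: decay from base_lr to min_lr over total_steps."""
    progress = step / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


total_steps = 1000  # hypothetical; depends on dataset size, batch size and epochs
schedule = [cosine_annealing_lr(s, total_steps) for s in range(total_steps + 1)]

for step in (0, 250, 500, 750, 1000):
    print(f"step {step:4d}: lr = {schedule[step]:.2e}")
```

The rate starts at the base value, passes through half of it at the midpoint, and reaches the minimum at the final step.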

Limitations

Trained primarily on Nigerian English context
Limited to 512 token context window
May require fine-tuning for other domains
Custom architecture requires manual loading

Intended Use

Primary: Customer service automation for the NEED platform

Suitable for:

Service request routing
Basic customer support
Intent detection
Category recommendation

Not suitable for:

Medical/legal advice
Financial transactions
High-stakes decision making

Citation

@misc{need-ai-2025,
  title={NEED AI Conversational Model},
  author={NEED Service App Team},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/yogami9/need-ai-conversational-model}
}

Contact

Email: needserviceapp@gmail.com
Repository: GitHub

License

Apache 2.0
