---
title: Chat Humour DNAI 0.5B
emoji: 🎨
colorFrom: purple
colorTo: yellow
sdk: docker
pinned: false
---
# 🧠 DNAI Humour Chatbot - Interactive Web Interface

A beautiful, fully functional chat interface for the dnai-humour-0.5B-instruct model. Chat with a witty, lightweight AI assistant that's fast, friendly, and surprisingly capable.
## ✨ Features

- 💬 **Real-time Chat** - Smooth, responsive conversation interface
- 🎨 **Beautiful UI** - Modern design with dark/light mode
- ⚡ **Fast Responses** - Optimized for a 0.5B-parameter model
- 🎭 **Personality** - Witty and helpful, not robotic
- 📱 **Responsive** - Works on all devices
- 🧹 **Clean UX** - Clear conversations, suggestions, timestamps
## 🚀 Quick Start

### Prerequisites

- Python 3.8+
- CUDA-capable GPU (recommended) or CPU
- 2 GB+ VRAM (GPU) or 4 GB+ RAM (CPU)

### Installation

1. **Clone the Space**

   ```bash
   git clone https://huggingface.co/spaces/YOUR-USERNAME/dnai-humour-chatbot
   cd dnai-humour-chatbot
   ```

2. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   ```

3. **Run the application**

   ```bash
   uvicorn app:app --host 0.0.0.0 --port 7860
   ```

4. **Open your browser** at http://localhost:7860
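The requirements.txt itself is not reproduced in this README; based on the dependencies listed under File Structure below, a minimal version would contain (version pins omitted, since none are specified here):

```text
transformers
torch
fastapi
uvicorn
```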
## 🚀 Deploy to Hugging Face Spaces

### Step 1: Create a Space

1. Go to Hugging Face
2. Click **New Space**
3. Settings:
   - **Name:** dnai-humour-chatbot
   - **SDK:** Docker
   - **Hardware:** T4 Small (or CPU Basic for testing)
   - **Visibility:** Public
### Step 2: Upload Files

Upload these files to your Space:

```
dnai-humour-chatbot/
├── app.py
├── requirements.txt
├── index.html
├── README.md
└── Dockerfile (optional)
```
### Step 3: Create a Dockerfile (if needed)

```dockerfile
FROM python:3.10-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application
COPY . .

# Expose the Spaces default port
EXPOSE 7860

# Run the server
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
```
### Step 4: Wait for the Build

- Hugging Face builds your Space automatically
- Check the logs for any errors
- The model downloads on first startup (~500 MB)
- Once the status reads "Running", your chatbot is live! 🎉
## 📁 File Structure

```
dnai-humour-chatbot/
│
├── app.py               # FastAPI backend with model inference
│   ├── /api/chat        # POST - Send messages
│   ├── /api/info        # GET  - Model information
│   ├── /health          # GET  - Health check
│   └── /api/reset       # POST - Reset conversation
│
├── index.html           # React-based chat interface
│   ├── Message history
│   ├── Dark/Light mode
│   ├── Typing indicators
│   └── Quick suggestions
│
├── requirements.txt     # Python dependencies
│   ├── transformers
│   ├── torch
│   ├── fastapi
│   └── uvicorn
│
└── README.md            # This file
```
## 📚 API Documentation

### POST /api/chat

**Request:**

```json
{
  "messages": [
    {
      "role": "user",
      "content": "Tell me a joke"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 256
}
```

**Response:**

```json
{
  "response": "Why don't scientists trust atoms? Because they make up everything! 😄",
  "model": "DarkNeuron-AI/dnai-humour-0.5B-instruct",
  "tokens_used": 45
}
```
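As a quick sanity check, the request body above can be built and serialized in Python before posting it to the endpoint (the helper name here is made up for illustration):

```python
import json

def build_chat_request(user_message: str,
                       temperature: float = 0.7,
                       max_tokens: int = 256) -> dict:
    """Build the JSON payload expected by POST /api/chat."""
    return {
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Tell me a joke")
print(json.dumps(payload, indent=2))
```

Posting this payload with any HTTP client (e.g. `requests.post("http://localhost:7860/api/chat", json=payload)`) should return a JSON body shaped like the response above.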
### GET /api/info

**Response:**

```json
{
  "model_name": "DNAI Humour 0.5B Instruct",
  "version": "1.0",
  "base_model": "Qwen2.5-0.5B-Instruct",
  "parameters": "~0.5 Billion",
  "capabilities": [
    "Instruction following",
    "Conversational AI",
    "Light humor",
    "Low-latency responses"
  ]
}
```
## 🎨 UI Features

### Chat Interface

- Clean, modern design
- Message bubbles with timestamps
- User/Assistant avatars
- Smooth animations

### Dark/Light Mode

- Toggle between themes
- Smooth transitions
- Persistent preferences (client-side)

### Smart Suggestions

- Quick-start prompts
- Contextual examples
- One-click input

### Typing Indicators

- Real-time feedback
- Loading animations
- Response timing
## ⚙️ Configuration

### Model Parameters (app.py)

```python
MODEL_NAME = "DarkNeuron-AI/dnai-humour-0.5B-instruct"
MAX_LENGTH = 512   # Context window
TEMPERATURE = 0.7  # Creativity (0.0-1.0)
TOP_P = 0.9        # Nucleus sampling
TOP_K = 50         # Top-k sampling
```
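To make `TOP_K` and `TOP_P` concrete, here is a self-contained sketch (plain Python, no model needed) of how the two filters restrict the candidate token set before a token is sampled:

```python
def filter_top_k_top_p(probs: dict, top_k: int = 50, top_p: float = 0.9) -> list:
    """Return the candidate tokens kept after top-k, then top-p (nucleus) filtering.

    probs maps token -> probability; tokens outside the filtered set
    get zero chance of being sampled.
    """
    # Top-k: keep only the k most probable tokens
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append(token)
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

probs = {"the": 0.5, "a": 0.3, "zebra": 0.15, "qux": 0.05}
print(filter_top_k_top_p(probs, top_k=3, top_p=0.9))  # ['the', 'a', 'zebra']
print(filter_top_k_top_p(probs, top_k=3, top_p=0.7))  # ['the', 'a']
```

Lowering `top_p` (or `top_k`) shrinks the candidate set, making output more predictable; `TEMPERATURE` then controls how sharply the model prefers the top candidates within that set.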
### Generation Settings

Adjust these in the /api/chat request:

- `temperature`: 0.1 (focused) to 1.0 (creative)
- `max_tokens`: 50 (short) to 512 (long)
- `stream`: true/false (streaming support)
## 🔍 Troubleshooting

### Issue: Model not loading

**Symptoms:** 503 errors, "Model not loaded"

**Solutions:**

```bash
# Check CUDA availability
python -c "import torch; print(torch.cuda.is_available())"

# Verify the model downloads
python -c "from transformers import AutoModelForCausalLM; AutoModelForCausalLM.from_pretrained('DarkNeuron-AI/dnai-humour-0.5B-instruct')"

# Check logs
uvicorn app:app --log-level debug
```
### Issue: Out of memory

**Solutions:**

- Reduce `max_tokens` in generation
- Use CPU instead of GPU (slower, but works)
- Enable model quantization (INT8)

```python
# In app.py, modify model loading (INT8 requires the bitsandbytes package):
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    load_in_8bit=True,   # Enable INT8 quantization
    device_map="auto",
)
```
### Issue: Slow responses

**Solutions:**

- Use a GPU instead of a CPU
- Reduce `max_tokens`
- Lower `temperature` for faster sampling
- Use a smaller batch size
## 📊 Performance Benchmarks

| Hardware | Response Time | Memory Usage |
|----------|---------------|--------------|
| T4 GPU   | ~1-2 s        | ~2 GB VRAM   |
| CPU      | ~5-10 s       | ~4 GB RAM    |
| A10G GPU | ~0.5-1 s      | ~2 GB VRAM   |
## 🎯 Use Cases

- **Educational Chatbots** - Learning companions
- **Personal Assistants** - Quick help & info
- **Code Helpers** - Programming Q&A
- **Creative Writing** - Brainstorming & ideas
- **General Chat** - Friendly conversation
## 🚫 Limitations

- Not for production medical/legal advice
- Limited context window (512 tokens)
- 0.5B parameters - expect occasional mistakes
- No long-term memory - each conversation is independent
- English-focused - other languages may be limited
## 🔒 Privacy & Safety

- **No data logging** - conversations are not stored
- **Local processing** - your data stays with you
- **No tracking** - no analytics or monitoring
- **Open source** - fully transparent code
## 🛠️ Advanced Customization

### Change the Model Personality

Edit the system prompt in app.py:

```python
def format_chat_prompt(messages: List[Message]) -> str:
    # Change this string to adjust the assistant's personality
    system_prompt = "You are a helpful, witty AI assistant."
    formatted_messages = [f"System: {system_prompt}"]
    # ... rest of the function
```
### Add Memory

Implement conversation-history storage:

```python
# Simple in-memory storage (lost on restart; use a database for persistence)
conversations = {}

@app.post("/api/chat")
async def chat(request: ChatRequest, user_id: str = "default"):
    if user_id not in conversations:
        conversations[user_id] = []
    # ... use the stored history
```
### Enable Streaming

For real-time, token-by-token responses:

```python
from threading import Thread

from fastapi.responses import StreamingResponse
from transformers import TextIteratorStreamer

@app.post("/api/chat/stream")
async def chat_stream(request: ChatRequest):
    def generate():
        # TextIteratorStreamer yields decoded text chunks
        # as model.generate() produces them
        streamer = TextIteratorStreamer(tokenizer, skip_prompt=True)
        inputs = tokenizer(format_chat_prompt(request.messages),
                           return_tensors="pt").to(model.device)
        Thread(target=model.generate,
               kwargs=dict(**inputs, streamer=streamer,
                           max_new_tokens=256)).start()
        for chunk in streamer:
            yield f"data: {chunk}\n\n"
    return StreamingResponse(generate(), media_type="text/event-stream")
```
## 📞 Support

For issues, questions, or suggestions:

- Open an issue on GitHub
- Contact via Hugging Face
- Check the model card: DarkNeuron-AI/dnai-humour-0.5B-instruct
## 🙏 Acknowledgments

- **Base Model:** Qwen2.5-0.5B-Instruct by Alibaba
- **Dataset:** OpenAssistant v1
- **Framework:** Hugging Face Transformers
- **UI:** React + Tailwind CSS

## 📄 License

MIT License - free to use, modify, and distribute.