Domify Academy Super Bot - Backend Documentation
Overview
The backend is built with Node.js, Express, and tRPC, and calls NVIDIA APIs for LLM responses and image generation. It serves the Grok-inspired AI chatbot frontend, handling chat, image generation, and web search requests.
Architecture
┌─────────────────────────────────────────┐
│            Frontend (React)             │
│      (Ask | Imagine mode switcher)      │
└──────────────┬──────────────────────────┘
               │ tRPC API calls
               ▼
┌─────────────────────────────────────────┐
│       Backend (Node.js + Express)       │
│                                         │
│   ┌─────────────────────────────────┐   │
│   │ tRPC Routers                    │   │
│   │ ├─ chat.send                    │   │
│   │ ├─ imagine.generate             │   │
│   │ └─ search.online                │   │
│   └─────────────────────────────────┘   │
│                                         │
│   ┌─────────────────────────────────┐   │
│   │ LLM Engine                      │   │
│   │ ├─ Llama-3 70B (primary)        │   │
│   │ ├─ Fallback models              │   │
│   │ └─ Reasoning generation         │   │
│   └─────────────────────────────────┘   │
│                                         │
│   ┌─────────────────────────────────┐   │
│   │ Image Generation                │   │
│   │ ├─ SDXL                         │   │
│   │ ├─ Flux                         │   │
│   │ └─ Video conversion             │   │
│   └─────────────────────────────────┘   │
│                                         │
│   ┌─────────────────────────────────┐   │
│   │ Middleware                      │   │
│   │ ├─ Rate limiting                │   │
│   │ ├─ Request logging              │   │
│   │ ├─ Caching                      │   │
│   │ └─ Error handling               │   │
│   └─────────────────────────────────┘   │
└─────────────────────────────────────────┘
                     │
          ┌──────────┼──────────┐
          ▼          ▼          ▼
        MySQL   NVIDIA API  DuckDuckGo
        (DB)     (LLM/IMG)   (Search)
Key Features
1. LLM Integration with Smart Fallback
Primary Model: Llama-3 70B
Fallback Chain: Llama-2 70B → Mistral Large → Llama-3 8B
// Automatic fallback if primary model is busy
const response = await generateResponseWithReasoning(
  userPrompt,
  searchResults,
  enableThinking
);
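The fallback behavior can be pictured as a loop over the model chain. The sketch below is an illustration under assumptions, not the code in server/llm.ts: `callWithFallback`, the `ModelCaller` type, and the model identifier strings are hypothetical, and the actual NVIDIA request is abstracted behind a caller function.

```typescript
// Hypothetical model identifiers, in the order the chain above lists them.
const MODEL_CHAIN = ["llama-3-70b", "llama-2-70b", "mistral-large", "llama-3-8b"];

// Assumed shape of a function that sends one prompt to one model.
type ModelCaller = (model: string, prompt: string) => Promise<string>;

// Try each model in order and return the first successful response.
async function callWithFallback(
  prompt: string,
  call: ModelCaller,
  chain: string[] = MODEL_CHAIN
): Promise<{ model: string; response: string }> {
  const errors: string[] = [];
  for (const model of chain) {
    try {
      const response = await call(model, prompt);
      return { model, response };
    } catch (err) {
      // Record the failure and move on to the next model in the chain.
      errors.push(`${model}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All models failed: ${errors.join("; ")}`);
}
```

Returning the model name alongside the response lets the caller report which model answered, which matches the `model` field in the chat.send output.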
2. DeepSeek-Style Reasoning
Generates an internal thought process before the final answer:
// With reasoning enabled
const { reasoning, response } = await generateResponseWithReasoning(
  "What is quantum computing?",
  undefined,
  true // enableThinking
);
3. Rate Limiting
Token bucket algorithm prevents API abuse:
- 30 requests/minute per user
- 5 burst requests allowed
- Automatic cleanup of old buckets
const { allowed, remainingTokens } = checkRateLimit(userId, "chat");
if (!allowed) {
  // Rate limit exceeded
}
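For readers unfamiliar with token buckets, here is a minimal sketch of the idea. It is not the code in server/rateLimit.ts: the class name `TokenBucketLimiter` is hypothetical, and it assumes that "30 requests/minute" is the refill rate and "5 burst requests" is the bucket capacity. The clock is injectable so the behavior can be checked without waiting.

```typescript
interface Bucket {
  tokens: number;     // tokens currently available
  lastRefill: number; // timestamp of the last refill, in ms
}

class TokenBucketLimiter {
  private buckets = new Map<string, Bucket>();
  private refillPerMinute: number;
  private capacity: number;
  private now: () => number;

  // Assumption: refill rate = 30 per minute, capacity (burst size) = 5.
  constructor(refillPerMinute = 30, capacity = 5, now: () => number = Date.now) {
    this.refillPerMinute = refillPerMinute;
    this.capacity = capacity;
    this.now = now;
  }

  check(key: string): { allowed: boolean; remainingTokens: number } {
    const t = this.now();
    const bucket = this.buckets.get(key) ?? { tokens: this.capacity, lastRefill: t };
    // Add tokens for the elapsed time, capped at the bucket capacity.
    const refill = ((t - bucket.lastRefill) * this.refillPerMinute) / 60000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + refill);
    bucket.lastRefill = t;
    // Spend one token if at least one whole token is available.
    const allowed = bucket.tokens >= 1;
    if (allowed) bucket.tokens -= 1;
    this.buckets.set(key, bucket);
    return { allowed, remainingTokens: Math.floor(bucket.tokens) };
  }

  // Drop buckets that have been idle longer than maxIdleMs.
  cleanup(maxIdleMs: number): void {
    const t = this.now();
    this.buckets.forEach((bucket, key) => {
      if (t - bucket.lastRefill > maxIdleMs) this.buckets.delete(key);
    });
  }
}
```

Under these assumptions a user can send five requests back to back, after which requests are admitted at one every two seconds.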
4. Search Integration
DuckDuckGo search for "Search Online" mode:
const results = await searchOnline("latest AI news", 5);
const formatted = formatSearchResults(results);
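The result shape is documented under search.online, but the formatting step is not. As a rough illustration only, a formatter might turn results into a numbered text block that can be placed in the LLM prompt. `formatResultsForPrompt` is a hypothetical name and the layout is an assumption, not the behavior of `formatSearchResults`.

```typescript
// Result shape as documented for the search.online output.
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
}

// Hypothetical formatter: one numbered entry per result, separated by blank lines.
function formatResultsForPrompt(results: SearchResult[]): string {
  if (results.length === 0) return "No search results found.";
  return results
    .map((r, i) => `[${i + 1}] ${r.title}\n${r.url}\n${r.snippet}`)
    .join("\n\n");
}
```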
5. Database Management
Conversation history, messages, images, and feedback stored in MySQL:
// Save a message
await saveMessage(conversationId, "assistant", content, reasoning);
// Get conversation history
const messages = await getConversationMessages(conversationId);
6. Google Sheets Integration
Log feedback for analytics:
await logFeedbackToSheets({
  userId,
  rating: "like",
  comment: "Great response!",
  timestamp: new Date().toISOString()
});
API Endpoints (tRPC Procedures)
Chat Procedures
chat.send (Protected)
Send a message and get a response.
Input:
{
  prompt: string;          // User message
  enableSearch: boolean;   // Enable web search
  enableThinking: boolean; // Enable reasoning
  history: Array<{         // Conversation history
    role: "user" | "assistant";
    content: string;
  }>;
}
Output:
{
  success: boolean;
  response: string;   // LLM response
  reasoning: string;  // Internal thoughts
  model: string;      // Model used
  tokensUsed: number; // Token count
}
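The chat.send input shape above can be checked at runtime before it reaches the LLM. tRPC projects commonly delegate this to a schema library; the dependency-free sketch below uses a plain type guard instead. `ChatSendInput` and `isChatSendInput` are hypothetical names, not identifiers from server/routers.ts.

```typescript
// Mirrors the documented chat.send input.
interface ChatSendInput {
  prompt: string;
  enableSearch: boolean;
  enableThinking: boolean;
  history: Array<{ role: "user" | "assistant"; content: string }>;
}

// Returns true only when every documented field is present with the right type.
function isChatSendInput(value: unknown): value is ChatSendInput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const history = v.history;
  if (!Array.isArray(history)) return false;
  const historyOk = history.every((m: unknown) => {
    if (typeof m !== "object" || m === null) return false;
    const msg = m as Record<string, unknown>;
    return (msg.role === "user" || msg.role === "assistant") && typeof msg.content === "string";
  });
  return (
    historyOk &&
    typeof v.prompt === "string" &&
    typeof v.enableSearch === "boolean" &&
    typeof v.enableThinking === "boolean"
  );
}
```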
Image Generation Procedures
imagine.generate (Protected)
Generate an image from a prompt.
Input:
{
  prompt: string; // Image description
}
Output:
{
  success: boolean;
  imageUrl: string; // Generated image URL
  prompt: string;
}
Search Procedures
search.online (Public)
Search the web using DuckDuckGo.
Input:
{
  query: string;      // Search query
  maxResults: number; // Max results (default: 5)
}
Output:
{
  success: boolean;
  results: Array<{
    title: string;
    url: string;
    snippet: string;
  }>;
}
Database Schema
Users Table
CREATE TABLE users (
  id INT PRIMARY KEY AUTO_INCREMENT,
  openId VARCHAR(64) UNIQUE NOT NULL,
  email VARCHAR(320),
  name TEXT,
  tier VARCHAR(50) DEFAULT 'free',
  role ENUM('user', 'admin') DEFAULT 'user',
  ipAddress VARCHAR(45),
  createdAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updatedAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);
Conversations Table
CREATE TABLE conversations (
  id INT PRIMARY KEY AUTO_INCREMENT,
  userId INT NOT NULL,
  title TEXT,
  mode ENUM('ask', 'imagine') DEFAULT 'ask',
  createdAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updatedAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);
Messages Table
CREATE TABLE messages (
  id INT PRIMARY KEY AUTO_INCREMENT,
  conversationId INT NOT NULL,
  role ENUM('user', 'assistant') NOT NULL,
  content LONGTEXT NOT NULL,
  reasoning TEXT,
  metadata JSON,
  createdAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Images Table
CREATE TABLE images (
  id INT PRIMARY KEY AUTO_INCREMENT,
  userId INT NOT NULL,
  conversationId INT,
  prompt TEXT NOT NULL,
  url TEXT NOT NULL,
  metadata JSON,
  createdAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Feedback Table
CREATE TABLE feedback (
  id INT PRIMARY KEY AUTO_INCREMENT,
  userId INT NOT NULL,
  messageId INT,
  imageId INT,
  rating ENUM('like', 'dislike') NOT NULL,
  comment TEXT,
  createdAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
File Structure
server/
├── llm.ts                 # LLM engine with NVIDIA integration
├── search.ts              # DuckDuckGo search integration
├── rateLimit.ts           # Rate limiting middleware
├── db.ts                  # Database helper functions
├── googleSheets.ts        # Google Sheets logging
├── middleware.ts          # Industry-standard middleware
├── routers.ts             # tRPC procedure definitions
└── _core/
    ├── index.ts           # Server entry point
    ├── context.ts         # tRPC context
    ├── trpc.ts            # tRPC setup
    ├── llm.ts             # Built-in LLM helper
    ├── imageGeneration.ts # Built-in image generation
    └── env.ts             # Environment variables
Environment Variables
Required
| Variable | Description |
|---|---|
| DATABASE_URL | MySQL connection string |
| NVIDIA_API_KEY | NVIDIA API key |
| JWT_SECRET | Session token secret |
Optional
| Variable | Description | Default |
|---|---|---|
| GOOGLE_SHEETS_API_KEY | Google Sheets API key | (disabled) |
| GOOGLE_SHEETS_ID | Google Sheet ID | (disabled) |
| RATE_LIMIT_REQUESTS | Requests per minute | 30 |
| RATE_LIMIT_WINDOW | Rate limit window (seconds) | 3600 |
Industry-Standard Features
1. Caching
In-memory cache for frequently accessed data:
cacheManager.set("key", data, 300); // 5 minute TTL
const cached = cacheManager.get("key");
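An in-memory cache with a per-entry TTL can be quite small. The sketch below shows one possible shape with the same `set(key, value, ttlSeconds)` and `get(key)` calls as above; `TtlCache` is a hypothetical class, and the lazy eviction on read is an assumption about how expiry is handled, not a description of `cacheManager`.

```typescript
class TtlCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  set(key: string, value: T, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      // Evict expired entries lazily, when they are next read.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}
```

Lazy eviction keeps the code simple but leaves expired entries in memory until they are read again, so a long-running server would likely also need a periodic sweep.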
2. Performance Monitoring
Track operation durations:
performanceMonitor.record("llm_call", 1234); // ms
const stats = performanceMonitor.getStats("llm_call");
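To make the `record` and `getStats` calls concrete, here is a minimal sketch of a recorder that keeps raw samples per operation. `DurationRecorder` and the fields of `DurationStats` are assumptions for illustration; the statistics that `performanceMonitor` actually reports are not documented here.

```typescript
interface DurationStats {
  count: number;
  min: number; // ms
  max: number; // ms
  avg: number; // ms
}

class DurationRecorder {
  private samples = new Map<string, number[]>();

  // Store one duration (in ms) under the given operation name.
  record(operation: string, durationMs: number): void {
    const list = this.samples.get(operation) ?? [];
    list.push(durationMs);
    this.samples.set(operation, list);
  }

  // Summarize all samples recorded for an operation, if any.
  getStats(operation: string): DurationStats | undefined {
    const list = this.samples.get(operation);
    if (!list || list.length === 0) return undefined;
    const total = list.reduce((sum, d) => sum + d, 0);
    return {
      count: list.length,
      min: Math.min(...list),
      max: Math.max(...list),
      avg: total / list.length,
    };
  }
}
```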
3. Request Logging
Automatic request/response logging with performance metrics.
4. Error Handling
Comprehensive error handling with proper HTTP status codes.
5. Health Checks
Endpoint for monitoring application health:
GET /api/health
Response:
{
  "status": "healthy",
  "uptime": 3600,
  "database": "connected",
  "cache": "active",
  "memoryUsage": 256
}
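A response of this shape could be assembled by a small pure function. The sketch below is an assumption-laden illustration: `buildHealthReport` is a hypothetical name, `memoryUsage` is assumed to be megabytes, uptime is assumed to be seconds, and the non-healthy values ("degraded", "disconnected", "inactive") are invented for the example.

```typescript
interface HealthInputs {
  uptimeSeconds: number;
  databaseOk: boolean;
  cacheOk: boolean;
  heapUsedBytes: number;
}

interface HealthReport {
  status: "healthy" | "degraded";
  uptime: number;
  database: "connected" | "disconnected";
  cache: "active" | "inactive";
  memoryUsage: number; // assumed to be megabytes
}

// Build the health payload from already-collected measurements.
function buildHealthReport(inputs: HealthInputs): HealthReport {
  return {
    status: inputs.databaseOk && inputs.cacheOk ? "healthy" : "degraded",
    uptime: Math.floor(inputs.uptimeSeconds),
    database: inputs.databaseOk ? "connected" : "disconnected",
    cache: inputs.cacheOk ? "active" : "inactive",
    memoryUsage: Math.round(inputs.heapUsedBytes / (1024 * 1024)),
  };
}
```

In a Node.js process the inputs could come from `process.uptime()` and `process.memoryUsage().heapUsed`, with the database and cache flags supplied by the application's own checks.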
6. Security Headers
Automatic security headers on all responses:
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
Strict-Transport-Security: max-age=31536000
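One way to attach these headers is an Express-style middleware that sets them on every response. The sketch below is not the code in server/middleware.ts: `securityHeaders`, `applySecurityHeaders`, and `HeaderSink` are hypothetical names, and the response is typed minimally so the example has no dependencies.

```typescript
// The four headers listed above, as name/value pairs.
const SECURITY_HEADERS: Array<[string, string]> = [
  ["X-Content-Type-Options", "nosniff"],
  ["X-Frame-Options", "DENY"],
  ["X-XSS-Protection", "1; mode=block"],
  ["Strict-Transport-Security", "max-age=31536000"],
];

// Minimal response shape: anything with a setHeader method,
// which Node's and Express's response objects provide.
interface HeaderSink {
  setHeader(name: string, value: string): unknown;
}

function applySecurityHeaders(res: HeaderSink): void {
  SECURITY_HEADERS.forEach(([name, value]) => res.setHeader(name, value));
}

// Express-style middleware: set the headers, then pass control on.
function securityHeaders(_req: unknown, res: HeaderSink, next: () => void): void {
  applySecurityHeaders(res);
  next();
}
```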
Development
Local Setup
# Install dependencies
pnpm install
# Set up environment variables
cp .env.example .env
# Edit .env with your values
# Run database migrations
pnpm run db:push
# Start development server
pnpm run dev
Testing
# Run tests
pnpm run test
# Watch mode
pnpm run test:watch
Type Checking
# Check TypeScript
pnpm run check
Deployment
See DEPLOYMENT.md for complete deployment instructions.
Quick Deploy to Hugging Face
# 1. Create a new Space on Hugging Face
# 2. Push code to the Space repository
git push origin main
# 3. Set environment variables in Space settings
# 4. Hugging Face automatically builds and deploys
Monitoring
Logs
Check application logs in Hugging Face Space:
Logs tab → Filter by date/time → Search for errors
Metrics
Monitor key metrics:
- Response time: Average LLM response time
- Error rate: Percentage of failed requests
- Cache hit rate: Percentage of cached responses
- Database performance: Query execution time
Alerts
Set up alerts for:
- High error rate (>5%)
- Slow responses (>5s)
- Database connection failures
- Memory usage >80%
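The thresholds above can be expressed as a simple check over a metrics snapshot. This is a sketch only: `MetricsSnapshot` and `evaluateAlerts` are hypothetical names, and how the metrics are collected or where alerts are sent is left out.

```typescript
interface MetricsSnapshot {
  errorRate: number;          // fraction of failed requests, 0 to 1
  avgResponseMs: number;      // average response time in ms
  databaseConnected: boolean;
  memoryUsedFraction: number; // fraction of available memory in use, 0 to 1
}

// Return the name of every alert condition the snapshot triggers.
function evaluateAlerts(m: MetricsSnapshot): string[] {
  const alerts: string[] = [];
  if (m.errorRate > 0.05) alerts.push("High error rate");
  if (m.avgResponseMs > 5000) alerts.push("Slow responses");
  if (!m.databaseConnected) alerts.push("Database connection failure");
  if (m.memoryUsedFraction > 0.8) alerts.push("High memory usage");
  return alerts;
}
```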
Troubleshooting
Issue: "Rate limit exceeded"
Cause: User has exceeded request limit
Solution:
- Wait for rate limit window to reset
- Upgrade user tier for higher limits
- Adjust RATE_LIMIT_REQUESTS if needed
Issue: "NVIDIA API error"
Cause: Invalid API key or quota exceeded
Solution:
- Verify NVIDIA_API_KEY is correct
- Check NVIDIA API dashboard for quota
- Wait for quota reset or upgrade plan
Issue: "Database connection failed"
Cause: Invalid connection string or network issue
Solution:
- Verify DATABASE_URL format
- Check database is running and accessible
- Verify firewall rules allow connection
Issue: "Out of memory"
Cause: Memory leak or insufficient resources
Solution:
- Restart the application
- Review recent code changes
- Upgrade Space compute resources
Best Practices
- Always use rate limiting to prevent abuse
- Cache frequently accessed data to improve performance
- Log all errors for debugging and monitoring
- Use environment variables for configuration
- Validate user input before processing
- Handle errors gracefully with proper HTTP status codes
- Monitor performance and optimize bottlenecks
- Keep dependencies updated for security
Support
For issues or questions:
- Check logs in Hugging Face Space
- Review DEPLOYMENT.md for deployment issues
- Check ARCHITECTURE.md for design details
- Contact NVIDIA support for API issues
License
MIT License - See LICENSE file for details