# API REFERENCE - Lightweight AI Backend
Quick reference for integrating the API endpoints into your frontend projects.
## πŸ”— Base URL
```
https://your-username-lightweight-ai-backend.hf.space
```
---
## πŸ“‘ Available Endpoints
All endpoints are reached with an HTTP POST to `/api/predict`; the operation performed depends on the parameters passed in the `data` array.
### 1. Generate Chat
**Purpose:** General conversational AI responses
**Endpoint:** `POST /api/predict`
**Request:**
```json
{
  "data": [
    "Your question or prompt here",
    150,
    0.7
  ]
}
```
**Parameters:**
| Index | Name | Type | Range | Default | Description |
|-------|------|------|-------|---------|-------------|
| 0 | prompt | string | N/A | N/A | The user's question or message |
| 1 | max_tokens | int | 50-200 | 150 | Maximum length of response |
| 2 | temperature | float | 0.1-1.0 | 0.7 | Randomness (0=deterministic, 1=creative) |
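The ranges above can be enforced client-side before a request is sent. A minimal sketch (the helper name and clamping behavior are our own convention, not part of the API):

```python
def build_chat_payload(prompt, max_tokens=150, temperature=0.7):
    """Build the {"data": [...]} payload, clamping values to the documented ranges."""
    max_tokens = max(50, min(200, int(max_tokens)))        # documented range: 50-200
    temperature = max(0.1, min(1.0, float(temperature)))   # documented range: 0.1-1.0
    return {"data": [prompt, max_tokens, temperature]}
```

Passing the result as the `json=` argument to `requests.post` keeps out-of-range values from ever reaching the server.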
**Response:**
```json
{
  "data": [
    "Your question or prompt here response from the model..."
  ]
}
```
**Examples:**
**Python:**
```python
import requests
response = requests.post(
    "https://your-space-url/api/predict",
    json={"data": ["What is Python?", 150, 0.7]}
)
result = response.json()["data"][0]
print(result)
```
**JavaScript:**
```javascript
const response = await fetch('https://your-space-url/api/predict', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({data: ["What is AI?", 150, 0.7]})
});
const result = await response.json();
console.log(result.data[0]);
```
**cURL:**
```bash
curl -X POST https://your-space-url/api/predict \
  -H "Content-Type: application/json" \
  -d '{"data": ["Hello!", 150, 0.7]}'
```
---
### 2. Generate Code
**Purpose:** Generate code based on descriptions
**Endpoint:** `POST /api/predict`
**Request:**
```json
{
  "data": [
    "Write a Python function to reverse a string",
    256,
    0.3
  ]
}
```
**Parameters:**
| Index | Name | Type | Range | Default | Description |
|-------|------|------|-------|---------|-------------|
| 0 | prompt | string | N/A | N/A | Description of the code to generate |
| 1 | max_tokens | int | 100-300 | 256 | Maximum code length |
| 2 | temperature | float | 0.1-1.0 | 0.3 | Lower = more deterministic code |
**Response:**
```json
{
  "data": [
    "def reverse_string(s):\n    return s[::-1]\n\n# Usage\nprint(reverse_string('hello'))..."
  ]
}
```
**Example:**
**Python:**
```python
import requests

response = requests.post(
    "https://your-space-url/api/predict",
    json={"data": ["Create a function that calculates factorial", 256, 0.3]}
)
code = response.json()["data"][0]
print(code)
```
---
### 3. Summarize Text
**Purpose:** Generate summaries of long text
**Endpoint:** `POST /api/predict`
**Request:**
```json
{
  "data": [
    "Long text to summarize goes here... at least 50 characters.",
    100
  ]
}
```
**Parameters:**
| Index | Name | Type | Range | Default | Description |
|-------|------|------|-------|---------|-------------|
| 0 | text | string | 50+ chars | N/A | Text to summarize |
| 1 | max_length | int | 20-150 | 100 | Maximum summary length |
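Because this endpoint requires at least 50 characters of input, it can help to validate before calling. A hedged sketch (helper name and constant are ours):

```python
MIN_TEXT_LEN = 50  # documented minimum input length for summarization

def build_summarize_payload(text, max_length=100):
    """Validate input length and build the payload; raises ValueError if too short."""
    if len(text) < MIN_TEXT_LEN:
        raise ValueError(f"text must be at least {MIN_TEXT_LEN} characters, got {len(text)}")
    max_length = max(20, min(150, int(max_length)))  # documented range: 20-150
    return {"data": [text, max_length]}
```

Failing fast locally avoids burning a queued request on input the server would reject anyway.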
**Response:**
```json
{
  "data": [
    "Summary of the provided text..."
  ]
}
```
**Example:**
**Python:**
```python
import requests

long_text = """
Machine learning is a subset of artificial intelligence (AI) that focuses
on enabling systems to learn from and make decisions based on data...
"""

response = requests.post(
    "https://your-space-url/api/predict",
    json={"data": [long_text, 100]}
)
summary = response.json()["data"][0]
print(summary)
```
---
### 4. Generate Image
**Purpose:** Generate images from text descriptions
**Endpoint:** `POST /api/predict`
**Request:**
```json
{
  "data": [
    "A sunset over mountains",
    256,
    256
  ]
}
```
**Parameters:**
| Index | Name | Type | Range | Default | Description |
|-------|------|------|-------|---------|-------------|
| 0 | prompt | string | N/A | N/A | Image description |
| 1 | width | int | 128-256 | 256 | Image width in pixels |
| 2 | height | int | 128-256 | 256 | Image height in pixels |
**Response:**
Image returned as binary data (PNG format)
**Example:**
**Python:**
```python
import requests
from io import BytesIO

from PIL import Image

response = requests.post(
    "https://your-space-url/api/predict",
    json={"data": ["A red sunset", 256, 256]}
)

# Save the PNG bytes from the response
with open('generated_image.png', 'wb') as f:
    f.write(response.content)

# Or load them as a PIL Image
img = Image.open(BytesIO(response.content))
img.show()
```
**JavaScript (for frontend):**
```javascript
const response = await fetch('https://your-space-url/api/predict', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({data: ["A blue ocean", 256, 256]})
});
// Get image blob
const blob = await response.blob();
const url = URL.createObjectURL(blob);
// Display in image element
document.getElementById('image').src = url;
```
---
## πŸ”„ Response Codes
| Code | Meaning | Solution |
|------|---------|----------|
| 200 | Success | Response contains generated output |
| 400 | Bad Request | Check parameters (wrong JSON format) |
| 503 | Service Unavailable | Space is starting/restarting (wait 1-2 min) |
| 504 | Timeout | Request took too long (try shorter max_tokens) |
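The table above maps directly onto simple client logic: 200 succeeds, 503 and 504 are transient and worth retrying, and anything else points to a problem with the request itself. A sketch (the function and return labels are our own convention):

```python
RETRYABLE = {503, 504}  # transient: Space restarting or request timed out

def classify_status(code):
    """Map an HTTP status code from this API to a client action."""
    if code == 200:
        return "ok"
    if code in RETRYABLE:
        return "retry"
    return "fail"  # e.g. 400: fix the request payload, don't retry
```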
---
## ⏱️ Performance Tips
### Reduce Latency
1. **Use lower max_tokens:**
```python
# Fast: 50-100 tokens
max_tokens = 75   # ~2-3 seconds

# Medium: 100-200 tokens
max_tokens = 150  # ~4-6 seconds

# Slow: 200-300 tokens
max_tokens = 250  # ~8-12 seconds
```
2. **Warm up the model:**
- First request loads the model (5-10 seconds)
- Subsequent requests are faster
- Consider sending a "warm-up" request on app startup
3. **Batch similar requests:**
- Queue requests intelligently
- Don't send all at once
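The warm-up tip can be sketched as a startup hook. This is our own convention, not part of the API; it assumes the chat endpoint's `[prompt, max_tokens, temperature]` signature and uses a tiny prompt with the smallest allowed `max_tokens`:

```python
import requests

WARMUP_PROMPT = "Hi"  # tiny prompt, just enough to trigger model loading

def warmup(base_url, timeout=120):
    """Send one small request on startup so the first real request is fast.

    Returns True if the Space answered with 200, False otherwise.
    """
    try:
        r = requests.post(
            f"{base_url}/api/predict",
            json={"data": [WARMUP_PROMPT, 50, 0.1]},  # 50 = smallest allowed max_tokens
            timeout=timeout,
        )
        return r.status_code == 200
    except requests.RequestException:
        return False
```

Call `warmup(...)` once when your app starts; a `False` return means the Space is still loading or unreachable, not that user requests will necessarily fail.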
### Error Handling
```python
import requests
import time
def call_api_with_retry(url, data, max_retries=3):
    """Call API with retry logic"""
    for attempt in range(max_retries):
        try:
            response = requests.post(
                url,
                json={"data": data},
                timeout=60
            )
            if response.status_code == 200:
                return response.json()["data"][0]
            elif response.status_code == 503:
                # Service restarting, wait and retry
                time.sleep(5)
                continue
            else:
                return f"Error: {response.status_code}"
        except requests.exceptions.Timeout:
            if attempt < max_retries - 1:
                print("Timeout, retrying...")
                time.sleep(2)
            else:
                return "Error: Request timeout"
    return "Error: Max retries exceeded"

# Usage
result = call_api_with_retry(
    "https://your-space-url/api/predict",
    ["Your prompt", 150, 0.7]
)
print(result)
```
---
## πŸ’‘ Integration Examples
### React Frontend
```jsx
import React, { useState } from 'react';

export default function ChatApp() {
  const [input, setInput] = useState('');
  const [response, setResponse] = useState('');
  const [loading, setLoading] = useState(false);

  const handleSubmit = async (e) => {
    e.preventDefault();
    setLoading(true);
    try {
      const result = await fetch(
        'https://your-space-url/api/predict',
        {
          method: 'POST',
          headers: {'Content-Type': 'application/json'},
          body: JSON.stringify({data: [input, 150, 0.7]})
        }
      );
      const data = await result.json();
      setResponse(data.data[0]);
    } catch (error) {
      setResponse('Error: ' + error.message);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div>
      <form onSubmit={handleSubmit}>
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Ask me anything..."
        />
        <button type="submit" disabled={loading}>
          {loading ? 'Generating...' : 'Send'}
        </button>
      </form>
      {response && <div>{response}</div>}
    </div>
  );
}
```
### Vue.js
```vue
<template>
  <div>
    <input v-model="prompt" placeholder="Ask a question..." />
    <button @click="generateResponse" :disabled="loading">
      {{ loading ? 'Generating...' : 'Send' }}
    </button>
    <p v-if="response">{{ response }}</p>
  </div>
</template>

<script>
export default {
  data() {
    return {
      prompt: '',
      response: '',
      loading: false
    };
  },
  methods: {
    async generateResponse() {
      this.loading = true;
      try {
        const res = await fetch(
          'https://your-space-url/api/predict',
          {
            method: 'POST',
            headers: {'Content-Type': 'application/json'},
            body: JSON.stringify({data: [this.prompt, 150, 0.7]})
          }
        );
        const data = await res.json();
        this.response = data.data[0];
      } catch (error) {
        this.response = 'Error: ' + error.message;
      } finally {
        this.loading = false;
      }
    }
  }
};
</script>
```
### Node.js Backend
```javascript
const express = require('express');
const axios = require('axios');

const app = express();
app.use(express.json());

app.post('/chat', async (req, res) => {
  const { prompt } = req.body;
  try {
    const response = await axios.post(
      'https://your-space-url/api/predict',
      {
        data: [prompt, 150, 0.7]
      }
    );
    res.json({ response: response.data.data[0] });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

app.listen(3000, () => console.log('Server running on :3000'));
```
---
## πŸ” Important Notes
### Rate Limiting
- Free tier: ~2 requests per second
- Space sleeps after 48h inactivity (wakes on request)
- No hard quota, but be respectful
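A minimal client-side limiter matching the ~2 requests/second guideline. The `Throttle` class is a sketch of our own, not part of the API:

```python
import time

class Throttle:
    """Space calls at least `min_interval` seconds apart.

    min_interval=0.5 matches the ~2 requests/second free-tier guideline.
    """
    def __init__(self, min_interval=0.5):
        self.min_interval = min_interval
        self._last = 0.0

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = time.monotonic()
        delay = self._last + self.min_interval - now
        if delay > 0:
            time.sleep(delay)
        self._last = time.monotonic()
```

Call `throttle.wait()` immediately before each `requests.post(...)` so bursts of user activity are smoothed out instead of hitting the queue all at once.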
### Data Privacy
- All requests processed on Space server
- No data sent to external APIs
- Check Hugging Face privacy policy
### Bandwidth
- Requests are queued and processed sequentially
- Typical response: < 2MB
- No file uploads supported
---
## πŸ“ž Troubleshooting API Calls
### 503 Service Unavailable
```
Cause: Space restarting or models loading
Solution: Wait 30-60 seconds and retry
```
### 504 Gateway Timeout
```
Cause: Request took >60 seconds
Solution: Reduce max_tokens or try simpler prompt
```
### Empty Response
```
Cause: Model failed silently
Solution: Check Space logs, try different prompt
```
### Wrong Response Format
```
Cause: Endpoint called incorrectly
Solution: Ensure {"data": [arg1, arg2, ...]} structure
```
---
## 🎯 Production Checklist
- [ ] Replace `your-space-url` with actual URL
- [ ] Add error handling for API failures
- [ ] Implement request timeout (60s)
- [ ] Add retry logic (exponential backoff)
- [ ] Monitor API response times
- [ ] Cache responses if possible
- [ ] Set up alerting for 503/504 errors
- [ ] Test under expected load
- [ ] Document API usage in your project
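For the exponential-backoff checklist item, one common pattern is "full jitter": pick a random delay between zero and an exponentially growing cap. A sketch (function name and constants are ours):

```python
import random

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Exponential backoff with full jitter.

    Returns a random delay in [0, min(cap, base * 2**attempt)] seconds,
    so retries spread out instead of all clients hammering the Space at once.
    """
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

Combining this with the retry loop from the Error Handling section (replacing its fixed `time.sleep(5)`) covers both checklist items at once.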
---
**API Reference v1.0**
**Last Updated: 2024**