

API REFERENCE - Lightweight AI Backend

Quick reference for integrating the API endpoints into your frontend projects.

🔗 Base URL

https://your-username-lightweight-ai-backend.hf.space

📑 Available Endpoints

All endpoints are called with an HTTP POST request to /api/predict; they differ only in the arguments passed in the data array.

1. Generate Chat

Purpose: General conversational AI responses

Endpoint: POST /api/predict

Request:

{
  "data": [
    "Your question or prompt here",
    150,
    0.7
  ]
}

Parameters:

| Index | Name | Type | Range | Default | Description |
|-------|------|------|-------|---------|-------------|
| 0 | prompt | string | N/A | N/A | The user's question or message |
| 1 | max_tokens | int | 50-200 | 150 | Maximum length of the response |
| 2 | temperature | float | 0.1-1.0 | 0.7 | Randomness (0 = deterministic, 1 = creative) |

Response:

{
  "data": [
    "Generated response from the model..."
  ]
}

Examples:

Python:

import requests

response = requests.post(
    "https://your-space-url/api/predict",
    json={"data": ["What is Python?", 150, 0.7]}
)
result = response.json()["data"][0]
print(result)

JavaScript:

const response = await fetch('https://your-space-url/api/predict', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({data: ["What is AI?", 150, 0.7]})
});
const result = await response.json();
console.log(result.data[0]);

cURL:

curl -X POST https://your-space-url/api/predict \
  -H "Content-Type: application/json" \
  -d '{"data": ["Hello!", 150, 0.7]}'

2. Generate Code

Purpose: Generate code based on descriptions

Endpoint: POST /api/predict

Request:

{
  "data": [
    "Write a Python function to reverse a string",
    256,
    0.3
  ]
}

Parameters:

| Index | Name | Type | Range | Default | Description |
|-------|------|------|-------|---------|-------------|
| 0 | prompt | string | N/A | N/A | Description of the code to generate |
| 1 | max_tokens | int | 100-300 | 256 | Maximum code length |
| 2 | temperature | float | 0.1-1.0 | 0.3 | Lower = more deterministic code |

Response:

{
  "data": [
    "def reverse_string(s):\n    return s[::-1]\n\n# Usage\nprint(reverse_string('hello'))..."
  ]
}

Example:

Python:

import requests

response = requests.post(
    "https://your-space-url/api/predict",
    json={"data": ["Create a function that calculates factorial", 256, 0.3]}
)
code = response.json()["data"][0]
print(code)

3. Summarize Text

Purpose: Generate summaries of long text

Endpoint: POST /api/predict

Request:

{
  "data": [
    "Long text to summarize goes here... at least 50 characters.",
    100
  ]
}

Parameters:

| Index | Name | Type | Range | Default | Description |
|-------|------|------|-------|---------|-------------|
| 0 | text | string | 50+ chars | N/A | Text to summarize |
| 1 | max_length | int | 20-150 | 100 | Maximum summary length |

Response:

{
  "data": [
    "Summary of the provided text..."
  ]
}

Example:

Python:

import requests

long_text = """
Machine learning is a subset of artificial intelligence (AI) that focuses 
on enabling systems to learn from and make decisions based on data...
"""

response = requests.post(
    "https://your-space-url/api/predict",
    json={"data": [long_text, 100]}
)
summary = response.json()["data"][0]
print(summary)

4. Generate Image

Purpose: Generate images from text descriptions

Endpoint: POST /api/predict

Request:

{
  "data": [
    "A sunset over mountains",
    256,
    256
  ]
}

Parameters:

| Index | Name | Type | Range | Default | Description |
|-------|------|------|-------|---------|-------------|
| 0 | prompt | string | N/A | N/A | Image description |
| 1 | width | int | 128-256 | 256 | Image width in pixels |
| 2 | height | int | 128-256 | 256 | Image height in pixels |

Response: Image returned as binary data (PNG format)

Example:

Python:

import requests
from PIL import Image
from io import BytesIO

response = requests.post(
    "https://your-space-url/api/predict",
    json={"data": ["A red sunset", 256, 256]}
)

# Save image from response
with open('generated_image.png', 'wb') as f:
    f.write(response.content)

# Or load as PIL Image
img = Image.open(BytesIO(response.content))
img.show()

JavaScript (for frontend):

const response = await fetch('https://your-space-url/api/predict', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({data: ["A blue ocean", 256, 256]})
});

// Get image blob
const blob = await response.blob();
const url = URL.createObjectURL(blob);

// Display in image element
document.getElementById('image').src = url;

🔄 Response Codes

| Code | Meaning | Solution |
|------|---------|----------|
| 200 | Success | Response contains generated output |
| 400 | Bad Request | Check parameters (wrong JSON format) |
| 503 | Service Unavailable | Space is starting/restarting (wait 1-2 min) |
| 504 | Timeout | Request took too long (try a shorter max_tokens) |

⏱️ Performance Tips

Reduce Latency

  1. Use lower max_tokens:

    # Fast: 50-100 tokens
    max_tokens = 75  # ~2-3 seconds
    
    # Medium: 100-200 tokens
    max_tokens = 150  # ~4-6 seconds
    
    # Slow: 200-300 tokens
    max_tokens = 250  # ~8-12 seconds
    
  2. Warm up the model:

    • First request loads the model (5-10 seconds)
    • Subsequent requests are faster
    • Consider sending a "warm-up" request on app startup
  3. Batch similar requests:

    • Queue requests intelligently
    • Don't send all at once
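The warm-up step above can be sketched as a small helper. Note that `warm_up` is a hypothetical convenience function, not part of the API; the `"ping"` prompt and the generous timeout are arbitrary choices to cover the 5-10 second model load on the first request:

```python
import requests

def warm_up(base_url: str, timeout: int = 90) -> bool:
    """Send a tiny throwaway request so the model loads before real traffic.

    base_url is your Space URL. The first call may take 5-10 seconds while
    the model loads, hence the generous timeout.
    """
    try:
        resp = requests.post(
            f"{base_url}/api/predict",
            json={"data": ["ping", 50, 0.1]},  # minimal prompt, few tokens
            timeout=timeout,
        )
        return resp.status_code == 200
    except requests.exceptions.RequestException:
        return False
```

Call this once on app startup; if it returns False, the Space may still be waking up and a retry after 30-60 seconds is reasonable.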

Error Handling

import requests
import time

def call_api_with_retry(url, data, max_retries=3):
    """Call API with retry logic"""
    for attempt in range(max_retries):
        try:
            response = requests.post(
                url,
                json={"data": data},
                timeout=60
            )
            if response.status_code == 200:
                return response.json()["data"][0]
            elif response.status_code == 503:
                # Service restarting, wait and retry
                time.sleep(5)
                continue
            else:
                return f"Error: {response.status_code}"
        except requests.exceptions.Timeout:
            if attempt < max_retries - 1:
                print("Timeout, retrying...")
                time.sleep(2)
            else:
                return "Error: Request timeout"
    
    return "Error: Max retries exceeded"

# Usage
result = call_api_with_retry(
    "https://your-space-url/api/predict",
    ["Your prompt", 150, 0.7]
)
print(result)

💡 Integration Examples

React Frontend

import React, { useState } from 'react';

export default function ChatApp() {
  const [input, setInput] = useState('');
  const [response, setResponse] = useState('');
  const [loading, setLoading] = useState(false);

  const handleSubmit = async (e) => {
    e.preventDefault();
    setLoading(true);

    try {
      const result = await fetch(
        'https://your-space-url/api/predict',
        {
          method: 'POST',
          headers: {'Content-Type': 'application/json'},
          body: JSON.stringify({data: [input, 150, 0.7]})
        }
      );
      
      const data = await result.json();
      setResponse(data.data[0]);
    } catch (error) {
      setResponse('Error: ' + error.message);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div>
      <form onSubmit={handleSubmit}>
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Ask me anything..."
        />
        <button type="submit" disabled={loading}>
          {loading ? 'Generating...' : 'Send'}
        </button>
      </form>
      {response && <div>{response}</div>}
    </div>
  );
}

Vue.js

<template>
  <div>
    <input v-model="prompt" placeholder="Ask a question..." />
    <button @click="generateResponse" :disabled="loading">
      {{ loading ? 'Generating...' : 'Send' }}
    </button>
    <p v-if="response">{{ response }}</p>
  </div>
</template>

<script>
export default {
  data() {
    return {
      prompt: '',
      response: '',
      loading: false
    };
  },
  methods: {
    async generateResponse() {
      this.loading = true;
      try {
        const res = await fetch(
          'https://your-space-url/api/predict',
          {
            method: 'POST',
            headers: {'Content-Type': 'application/json'},
            body: JSON.stringify({data: [this.prompt, 150, 0.7]})
          }
        );
        const data = await res.json();
        this.response = data.data[0];
      } catch (error) {
        this.response = 'Error: ' + error.message;
      } finally {
        this.loading = false;
      }
    }
  }
};
</script>

Node.js Backend

const express = require('express');
const axios = require('axios');

const app = express();
app.use(express.json());

app.post('/chat', async (req, res) => {
  const { prompt } = req.body;

  try {
    const response = await axios.post(
      'https://your-space-url/api/predict',
      {
        data: [prompt, 150, 0.7]
      }
    );

    res.json({ response: response.data.data[0] });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

app.listen(3000, () => console.log('Server running on :3000'));

🔐 Important Notes

Rate Limiting

  • Free tier: ~2 requests per second
  • Space sleeps after 48h inactivity (wakes on request)
  • No hard quota, but be respectful
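To stay under the ~2 requests/second guideline, a minimal client-side throttle can enforce a gap between calls. This `Throttle` class is an illustrative sketch, not something the API provides:

```python
import time

class Throttle:
    """Client-side throttle: enforce a minimum gap between requests.

    min_interval=0.5 keeps traffic at or under the ~2 requests/second
    the free tier tolerates (a guideline, not a hard limit).
    """
    def __init__(self, min_interval: float = 0.5):
        self.min_interval = min_interval
        self._last = 0.0  # monotonic timestamp of the previous request

    def wait(self) -> None:
        """Sleep just long enough to respect min_interval, then record the time."""
        elapsed = time.monotonic() - self._last
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last = time.monotonic()
```

Call `throttle.wait()` immediately before each `requests.post`; with `min_interval=0.5` you never exceed two requests per second.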

Data Privacy

  • All requests processed on Space server
  • No data sent to external APIs
  • Check Hugging Face privacy policy

Bandwidth

  • Requests are queued and processed sequentially
  • Typical response: < 2MB
  • No file uploads supported

📞 Troubleshooting API Calls

503 Service Unavailable

Cause: Space restarting or models loading
Solution: Wait 30-60 seconds and retry

504 Gateway Timeout

Cause: Request took >60 seconds
Solution: Reduce max_tokens or try simpler prompt

Empty Response

Cause: Model failed silently
Solution: Check Space logs, try different prompt

Wrong Response Format

Cause: Endpoint called incorrectly
Solution: Ensure {"data": [arg1, arg2, ...]} structure
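One way to make the envelope hard to get wrong is a tiny builder; `build_payload` is an illustrative helper, not something the API provides:

```python
def build_payload(*args) -> dict:
    """Wrap positional arguments in the {"data": [...]} envelope the API expects.

    Every endpoint takes the same shape; only the contents of "data" differ.
    """
    if not args:
        raise ValueError("at least one argument (the prompt or text) is required")
    return {"data": list(args)}
```

Pass the result as `json=build_payload(prompt, 150, 0.7)` in `requests.post` so the structure is always correct.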

🎯 Production Checklist

  • Replace your-space-url with actual URL
  • Add error handling for API failures
  • Implement request timeout (60s)
  • Add retry logic (exponential backoff)
  • Monitor API response times
  • Cache responses if possible
  • Set up alerting for 503/504 errors
  • Test under expected load
  • Document API usage in your project
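The retry item in the checklist calls for exponential backoff; `backoff_delay` below is an illustrative sketch (doubling delays with random jitter), not a prescribed implementation:

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay in seconds before retry `attempt` (0-based).

    Doubles the base delay each attempt (1s, 2s, 4s, ...), caps it, and adds
    up to 25% random jitter so many clients don't retry in lockstep.
    """
    delay = min(cap, base * (2 ** attempt))
    return delay + random.uniform(0, delay * 0.25)
```

Wire it into a retry loop by sleeping `backoff_delay(attempt)` between failed attempts instead of a fixed interval.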

API Reference v1.0 | Last Updated: 2024