# API REFERENCE - Lightweight AI Backend

Quick reference for integrating the API endpoints into your frontend projects.

## 🔗 Base URL

```
https://your-username-lightweight-ai-backend.hf.space
```

---

## 📡 Available Endpoints

Every endpoint is called with an HTTP POST request to `/api/predict`; the endpoints differ only in the positional values passed in the `data` array.

### 1. Generate Chat

**Purpose:** General conversational AI responses

**Endpoint:** `POST /api/predict`

**Request:**
```json
{
  "data": [
    "Your question or prompt here",
    150,
    0.7
  ]
}
```

**Parameters:**
| Index | Name | Type | Range | Default | Description |
|-------|------|------|-------|---------|-------------|
| 0 | prompt | string | N/A | N/A | The user's question or message |
| 1 | max_tokens | int | 50-200 | 150 | Maximum length of response |
| 2 | temperature | float | 0.1-1.0 | 0.7 | Randomness (lower = more deterministic, higher = more creative) |
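
The ranges in the table can be enforced on the client before a request is sent. The following is a minimal sketch, not part of the API itself: `build_chat_payload` is a hypothetical helper, and clamping out-of-range values (rather than rejecting them) is an assumption about what a client might prefer, not documented server behavior.

```python
def build_chat_payload(prompt, max_tokens=150, temperature=0.7):
    """Build the {"data": [...]} body for a chat request.

    Out-of-range values are clamped to the documented ranges
    (max_tokens 50-200, temperature 0.1-1.0). Clamping is a choice made
    by this sketch; the server's own handling is not documented here.
    """
    max_tokens = max(50, min(200, int(max_tokens)))
    temperature = max(0.1, min(1.0, float(temperature)))
    return {"data": [prompt, max_tokens, temperature]}


payload = build_chat_payload("What is Python?", max_tokens=500, temperature=0.0)
print(payload)  # {'data': ['What is Python?', 200, 0.1]}
```

The returned dictionary can be passed directly as the `json=` argument of `requests.post`.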

**Response:**
```json
{
  "data": [
    "Your question or prompt here response from the model..."
  ]
}
```

**Examples:**

**Python:**
```python
import requests

response = requests.post(
    "https://your-space-url/api/predict",
    json={"data": ["What is Python?", 150, 0.7]}
)
result = response.json()["data"][0]
print(result)
```

**JavaScript:**
```javascript
const response = await fetch('https://your-space-url/api/predict', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({data: ["What is AI?", 150, 0.7]})
});
const result = await response.json();
console.log(result.data[0]);
```

**cURL:**
```bash
curl -X POST https://your-space-url/api/predict \
  -H "Content-Type: application/json" \
  -d '{"data": ["Hello!", 150, 0.7]}'
```

---

### 2. Generate Code

**Purpose:** Generate code based on descriptions

**Endpoint:** `POST /api/predict`

**Request:**
```json
{
  "data": [
    "Write a Python function to reverse a string",
    256,
    0.3
  ]
}
```

**Parameters:**
| Index | Name | Type | Range | Default | Description |
|-------|------|------|-------|---------|-------------|
| 0 | prompt | string | N/A | N/A | Description of the code to generate |
| 1 | max_tokens | int | 100-300 | 256 | Maximum code length |
| 2 | temperature | float | 0.1-1.0 | 0.3 | Lower = more deterministic code |

**Response:**
```json
{
  "data": [
    "def reverse_string(s):\n    return s[::-1]\n\n# Usage\nprint(reverse_string('hello'))..."
  ]
}
```

**Example:**

**Python:**
```python
import requests

response = requests.post(
    "https://your-space-url/api/predict",
    json={"data": ["Create a function that calculates factorial", 256, 0.3]}
)
code = response.json()["data"][0]
print(code)
```

---

### 3. Summarize Text

**Purpose:** Generate summaries of long text

**Endpoint:** `POST /api/predict`

**Request:**
```json
{
  "data": [
    "Long text to summarize goes here... at least 50 characters.",
    100
  ]
}
```

**Parameters:**
| Index | Name | Type | Range | Default | Description |
|-------|------|------|-------|---------|-------------|
| 0 | text | string | 50+ chars | N/A | Text to summarize |
| 1 | max_length | int | 20-150 | 100 | Maximum summary length |
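
Since the table sets a minimum input length, a client may want to reject unsuitable input before making a network call. This sketch assumes a hypothetical helper name, `build_summary_payload`, and assumes that raising `ValueError` is an acceptable way to report bad input in your application.

```python
def build_summary_payload(text, max_length=100):
    """Build the {"data": [...]} body for a summarize request.

    Raises ValueError when the input falls outside the documented
    limits (text of 50+ characters, max_length between 20 and 150).
    """
    if len(text) < 50:
        raise ValueError(f"text must be at least 50 characters, got {len(text)}")
    if not 20 <= max_length <= 150:
        raise ValueError(f"max_length must be between 20 and 150, got {max_length}")
    return {"data": [text, max_length]}


try:
    build_summary_payload("Too short to summarize.")
except ValueError as exc:
    print(f"Rejected before sending: {exc}")
```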

**Response:**
```json
{
  "data": [
    "Summary of the provided text..."
  ]
}
```

**Example:**

**Python:**
```python
import requests

long_text = """
Machine learning is a subset of artificial intelligence (AI) that focuses 
on enabling systems to learn from and make decisions based on data...
"""

response = requests.post(
    "https://your-space-url/api/predict",
    json={"data": [long_text, 100]}
)
summary = response.json()["data"][0]
print(summary)
```

---

### 4. Generate Image

**Purpose:** Generate images from text descriptions

**Endpoint:** `POST /api/predict`

**Request:**
```json
{
  "data": [
    "A sunset over mountains",
    256,
    256
  ]
}
```

**Parameters:**
| Index | Name | Type | Range | Default | Description |
|-------|------|------|-------|---------|-------------|
| 0 | prompt | string | N/A | N/A | Image description |
| 1 | width | int | 128-256 | 256 | Image width in pixels |
| 2 | height | int | 128-256 | 256 | Image height in pixels |

**Response:**
Image returned as binary data (PNG format)

**Example:**

**Python:**
```python
from io import BytesIO

import requests
from PIL import Image

response = requests.post(
    "https://your-space-url/api/predict",
    json={"data": ["A red sunset", 256, 256]}
)

# Save image from response
with open('generated_image.png', 'wb') as f:
    f.write(response.content)

# Or load as PIL Image
img = Image.open(BytesIO(response.content))
img.show()
```

**JavaScript (for frontend):**
```javascript
const response = await fetch('https://your-space-url/api/predict', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({data: ["A blue ocean", 256, 256]})
});

// Get image blob
const blob = await response.blob();
const url = URL.createObjectURL(blob);

// Display in image element
document.getElementById('image').src = url;
```
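
Because the response is described as binary PNG data, it can be worth confirming that the body really is a PNG before writing it to disk; an error body saved as `.png` produces a file that will not open. The sketch below relies on the standard 8-byte PNG signature. The helper names `looks_like_png` and `save_png` are hypothetical.

```python
# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(content: bytes) -> bool:
    """Return True if the bytes begin with the PNG file signature."""
    return content.startswith(PNG_SIGNATURE)


def save_png(content: bytes, path: str) -> None:
    """Write the bytes to `path`, refusing anything that is not a PNG."""
    if not looks_like_png(content):
        raise ValueError("response body is not PNG data; it may be an error message")
    with open(path, "wb") as f:
        f.write(content)


print(looks_like_png(PNG_SIGNATURE + b"...image bytes..."))  # True
print(looks_like_png(b'{"error": "model not loaded"}'))      # False
```

With the Python example above, this would be called as `save_png(response.content, "generated_image.png")`.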

---

## 🔄 Response Codes

| Code | Meaning | Solution |
|------|---------|----------|
| 200 | Success | Response contains generated output |
| 400 | Bad Request | Check the request body (malformed JSON or wrong parameters) |
| 503 | Service Unavailable | Space is starting/restarting (wait 1-2 min) |
| 504 | Timeout | Request took too long (try a lower max_tokens) |
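
One way to act on this table in client code is a small lookup that turns a status code into a suggested reaction. This is an illustrative sketch: the action names and the helpers `action_for` and `should_retry` are hypothetical, and treating every code outside the table as a plain failure is an assumption.

```python
# Suggested client reactions, derived from the response-code table above.
# The action names are hypothetical labels chosen for this sketch.
ACTIONS = {
    200: "use_output",
    400: "fix_request",
    503: "wait_and_retry",
    504: "retry_with_smaller_request",
}


def action_for(status_code: int) -> str:
    """Return the suggested action; codes not in the table map to 'fail'."""
    return ACTIONS.get(status_code, "fail")


def should_retry(status_code: int) -> bool:
    """Only 503 and 504 describe conditions that a retry may resolve."""
    return status_code in (503, 504)


for code in (200, 400, 503, 504, 500):
    print(code, action_for(code), should_retry(code))
```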

---

## ⏱️ Performance Tips

### Reduce Latency

1. **Use lower max_tokens:**
   ```python
   # Fast: 50-100 tokens
   max_tokens = 75  # ~2-3 seconds
   
   # Medium: 100-200 tokens
   max_tokens = 150  # ~4-6 seconds
   
   # Slow: 200-300 tokens
   max_tokens = 250  # ~8-12 seconds
   ```

2. **Warm up the model:**
   - First request loads the model (5-10 seconds)
   - Subsequent requests are faster
   - Consider sending a "warm-up" request on app startup

3. **Batch similar requests:**
   - Queue requests on the client and send them one at a time; the Space processes requests sequentially
   - Avoid sending many requests at once
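
Tips 2 and 3 can be combined in a small client-side helper: send one throwaway request first, then work through the real prompts one at a time. This is a rough sketch under stated assumptions: `run_sequentially` is a hypothetical name, and `send` stands for any callable you supply that performs the actual HTTP request (for example a wrapper around `requests.post`). A stand-in is used here so the example runs offline.

```python
from collections import deque


def run_sequentially(prompts, send, warm_up_prompt="Hello"):
    """Warm up the model once, then send prompts strictly one at a time.

    `send` is a caller-supplied function taking a prompt and returning the
    model output. The warm-up result is discarded.
    """
    send(warm_up_prompt)  # absorbs the one-time model loading delay
    pending = deque(prompts)
    results = []
    while pending:
        # The next request starts only after the previous one has returned.
        results.append(send(pending.popleft()))
    return results


# Stand-in for a real HTTP call so the sketch runs without a network.
def fake_send(prompt):
    return f"echo: {prompt}"


print(run_sequentially(["first", "second"], fake_send))
# ['echo: first', 'echo: second']
```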

### Error Handling

```python
import requests
import time

def call_api_with_retry(url, data, max_retries=3):
    """Call the API, retrying on 503 responses and on timeouts."""
    for attempt in range(max_retries):
        last_attempt = attempt == max_retries - 1
        try:
            response = requests.post(
                url,
                json={"data": data},
                timeout=60
            )
            if response.status_code == 200:
                return response.json()["data"][0]
            elif response.status_code == 503:
                # Space is starting or restarting: wait, then retry
                if not last_attempt:
                    time.sleep(5)
                continue
            else:
                # Other status codes are not retried
                return f"Error: {response.status_code}"
        except requests.exceptions.Timeout:
            if not last_attempt:
                print("Timeout, retrying...")
                time.sleep(2)
            else:
                return "Error: Request timeout"

    return "Error: Max retries exceeded"

# Usage
result = call_api_with_retry(
    "https://your-space-url/api/predict",
    ["Your prompt", 150, 0.7]
)
print(result)
```
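
The helper above waits a fixed interval between attempts. Exponential backoff, where each wait is double the previous one, is a common alternative. The sketch below only computes the wait schedule; `backoff_delays` and its parameters (`base`, `cap`, `jitter`) are hypothetical names, and the default values are illustrative rather than tuned for this Space.

```python
import random


def backoff_delays(max_retries=3, base=1.0, cap=30.0, jitter=0.0):
    """Yield wait times in seconds: base, 2*base, 4*base, ..., capped at `cap`.

    `jitter` adds a random extra of up to that many seconds to each delay,
    which helps keep many clients from retrying in lockstep.
    """
    for attempt in range(max_retries):
        delay = min(cap, base * (2 ** attempt))
        yield delay + random.uniform(0, jitter)


print(list(backoff_delays(max_retries=4)))  # [1.0, 2.0, 4.0, 8.0]
```

Inside a retry loop, each value would be passed to `time.sleep` before the next attempt.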

---

## 💡 Integration Examples

### React Frontend

```jsx
import React, { useState } from 'react';

export default function ChatApp() {
  const [input, setInput] = useState('');
  const [response, setResponse] = useState('');
  const [loading, setLoading] = useState(false);

  const handleSubmit = async (e) => {
    e.preventDefault();
    setLoading(true);

    try {
      const result = await fetch(
        'https://your-space-url/api/predict',
        {
          method: 'POST',
          headers: {'Content-Type': 'application/json'},
          body: JSON.stringify({data: [input, 150, 0.7]})
        }
      );
      
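      // fetch does not reject on HTTP errors such as 503, so check explicitly
      if (!result.ok) {
        throw new Error(`Request failed with status ${result.status}`);
      }

      const data = await result.json();
      setResponse(data.data[0]);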
    } catch (error) {
      setResponse('Error: ' + error.message);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div>
      <form onSubmit={handleSubmit}>
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Ask me anything..."
        />
        <button type="submit" disabled={loading}>
          {loading ? 'Generating...' : 'Send'}
        </button>
      </form>
      {response && <div>{response}</div>}
    </div>
  );
}
```

### Vue.js

```vue
<template>
  <div>
    <input v-model="prompt" placeholder="Ask a question..." />
    <button @click="generateResponse" :disabled="loading">
      {{ loading ? 'Generating...' : 'Send' }}
    </button>
    <p v-if="response">{{ response }}</p>
  </div>
</template>

<script>
export default {
  data() {
    return {
      prompt: '',
      response: '',
      loading: false
    };
  },
  methods: {
    async generateResponse() {
      this.loading = true;
      try {
        const res = await fetch(
          'https://your-space-url/api/predict',
          {
            method: 'POST',
            headers: {'Content-Type': 'application/json'},
            body: JSON.stringify({data: [this.prompt, 150, 0.7]})
          }
        );
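        // fetch does not reject on HTTP errors such as 503, so check explicitly
        if (!res.ok) {
          throw new Error(`Request failed with status ${res.status}`);
        }
        const data = await res.json();
        this.response = data.data[0];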
      } catch (error) {
        this.response = 'Error: ' + error.message;
      } finally {
        this.loading = false;
      }
    }
  }
};
</script>
```

### Node.js Backend

```javascript
const express = require('express');
const axios = require('axios');

const app = express();
app.use(express.json());

app.post('/chat', async (req, res) => {
  const { prompt } = req.body;

  try {
    const response = await axios.post(
      'https://your-space-url/api/predict',
      {
        data: [prompt, 150, 0.7]
      },
      { timeout: 60000 } // 60 s, matching the timeout recommended in this guide
    );

    res.json({ response: response.data.data[0] });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

app.listen(3000, () => console.log('Server running on :3000'));
```

---

## 🔐 Important Notes

### Rate Limiting
- Free tier: ~2 requests per second
- Space sleeps after 48h inactivity (wakes on request)
- No hard quota, but be respectful
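
A client can stay under the approximate limit by spacing out its own requests. The following is a minimal sketch, assuming a single-threaded client; the `Throttle` class is hypothetical, and the 2 requests per second default simply mirrors the approximate figure above.

```python
import time


class Throttle:
    """Enforce a minimum interval between calls to wait().

    `clock` and `sleep` are injectable so the class can be tested
    without real delays; by default they use the time module.
    """

    def __init__(self, max_per_second=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


throttle = Throttle(max_per_second=2.0)
throttle.wait()  # the first call never sleeps
# In a request loop, call throttle.wait() immediately before each requests.post(...)
```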

### Data Privacy
- All requests processed on Space server
- No data sent to external APIs
- Check Hugging Face privacy policy

### Bandwidth
- Requests are queued and processed sequentially
- Typical response: < 2MB
- No file uploads supported

---

## 📞 Troubleshooting API Calls

### 503 Service Unavailable
```
Cause: Space restarting or models loading
Solution: Wait 30-60 seconds and retry
```

### 504 Gateway Timeout
```
Cause: Request took >60 seconds
Solution: Reduce max_tokens or try simpler prompt
```

### Empty Response
```
Cause: Model failed silently
Solution: Check Space logs, try different prompt
```

### Wrong Response Format
```
Cause: Endpoint called incorrectly
Solution: Ensure {"data": [arg1, arg2, ...]} structure
```
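
The last two cases (empty response, wrong response format) can be detected in code before the result is used. This sketch checks a decoded JSON body against the `{"data": [...]}` shape shown throughout this reference; `extract_output` is a hypothetical helper, and treating an empty string or `null` first element as an empty response is an assumption.

```python
def extract_output(body):
    """Return the first element of body["data"], or raise ValueError.

    `body` is the decoded JSON response, e.g. the result of response.json().
    """
    if not isinstance(body, dict) or not isinstance(body.get("data"), list):
        raise ValueError('wrong response format: expected {"data": [...]}')
    if not body["data"] or body["data"][0] in ("", None):
        raise ValueError("empty response: check the Space logs or try a different prompt")
    return body["data"][0]


print(extract_output({"data": ["Hello from the model"]}))  # Hello from the model

try:
    extract_output({"error": "something went wrong"})
except ValueError as exc:
    print(exc)
```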

---

## 🎯 Production Checklist

- [ ] Replace `your-space-url` with actual URL
- [ ] Add error handling for API failures
- [ ] Implement request timeout (60s)
- [ ] Add retry logic (exponential backoff)
- [ ] Monitor API response times
- [ ] Cache responses if possible
- [ ] Set up alerting for 503/504 errors
- [ ] Test under expected load
- [ ] Document API usage in your project
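
For the caching item, one simple approach is an in-memory cache keyed by the full set of request parameters. This is a sketch under stated assumptions: `ResponseCache` is a hypothetical class, `fetch` is a function you supply that performs the real API call, and reusing a stored answer is assumed to be acceptable even though a non-zero temperature means a fresh call could return different text.

```python
class ResponseCache:
    """In-memory cache keyed by (prompt, max_tokens, temperature)."""

    def __init__(self, fetch):
        self._fetch = fetch  # caller-supplied function that calls the API
        self._store = {}
        self.hits = 0

    def get(self, prompt, max_tokens=150, temperature=0.7):
        """Return the cached output, calling `fetch` only on a cache miss."""
        key = (prompt, max_tokens, temperature)
        if key in self._store:
            self.hits += 1
        else:
            self._store[key] = self._fetch(prompt, max_tokens, temperature)
        return self._store[key]


# Stand-in for a real API call so the sketch runs offline.
def fake_fetch(prompt, max_tokens, temperature):
    return f"answer to: {prompt}"


cache = ResponseCache(fake_fetch)
cache.get("What is AI?")
cache.get("What is AI?")  # served from the cache, no second fetch
print(cache.hits)  # 1
```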

---

**API Reference v1.0**
**Last Updated: 2024**