---
title: Question Generation AI
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
app_port: 7860
---

# Question Generation AI

This Hugging Face Space provides a ChatGPT-style interface for generating thoughtful questions from input statements using the **Meta Llama-3.1-8B-Instruct** model.

## Features

- 🤖 **ChatGPT-Style Interface**: Intuitive chat interface for generating questions
- 🎯 **Customizable**: Adjust the number of questions, difficulty level, and creativity
- 📚 **Llama Powered**: Uses Meta's instruction-tuned Llama 3.1 model for high-quality questions
- 🚀 **Fast & Reliable**: Optimized for quick response times
- 🔧 **GPU Optimized**: Runs efficiently on NVIDIA A10G hardware
- 💡 **Educational Focus**: Perfect for creating study materials and assessments

## How to Use

### Chat Interface

Simply enter any statement or topic in the chat box, and the AI will generate thoughtful questions about it. You can:

- **Adjust Settings**: Control the number of questions (1-10), difficulty level, and creativity
- **Try Different Topics**: Works well with educational content, research topics, or any text
- **Interactive Experience**: Chat-like interface similar to ChatGPT

### API Access (Still Available)

The original API endpoints are still accessible at `/generate-questions` for programmatic access.
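The chat settings map one-to-one onto the API's request fields. As an illustrative sketch (the `build_payload` helper is hypothetical, not part of this Space), a client can validate and clamp values to the documented ranges before sending a request:

```python
# Hypothetical client-side helper: builds a /generate-questions payload
# and clamps each value to the ranges documented by this Space's API.

def build_payload(statement, num_questions=5, temperature=0.8,
                  max_length=2048, difficulty_level="mixed"):
    """Build a request payload, clamping values to the documented ranges."""
    if not statement:
        raise ValueError("statement is required")
    if difficulty_level not in ("easy", "medium", "hard", "mixed"):
        raise ValueError("difficulty_level must be easy, medium, hard, or mixed")
    return {
        "statement": statement,
        "num_questions": max(1, min(10, num_questions)),   # documented range: 1-10
        "temperature": max(0.1, min(2.0, temperature)),    # documented range: 0.1-2.0
        "max_length": max(100, min(4096, max_length)),     # documented range: 100-4096
        "difficulty_level": difficulty_level,
    }

payload = build_payload("Photosynthesis converts light into chemical energy.",
                        num_questions=15)  # out of range, clamped
print(payload["num_questions"])  # → 10
```

Clamping client-side avoids `400` responses for out-of-range values; the server still validates on its end.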
**Request Body:**

```json
{
  "statement": "Your input statement here",
  "num_questions": 5,
  "temperature": 0.8,
  "max_length": 2048,
  "difficulty_level": "mixed"
}
```

**Parameters:**

- `statement` (required): The input text to generate questions from
- `num_questions` (1-10): Number of questions to generate (default: 5)
- `temperature` (0.1-2.0): Generation creativity (default: 0.8)
- `max_length` (100-4096): Maximum response length (default: 2048)
- `difficulty_level`: "easy", "medium", "hard", or "mixed" (default: "mixed")

**Response:**

```json
{
  "questions": [
    "What is the main concept discussed?",
    "How does this relate to...?",
    "Why is this important?"
  ],
  "statement": "Your original statement",
  "metadata": {
    "model": "DavidAU/Llama-3.1-1-million-ctx-DeepHermes-Deep-Reasoning-8B-GGUF",
    "temperature": 0.8,
    "difficulty_level": "mixed"
  }
}
```

### Health Check

**GET** `/health`

Check the API and model status.

**Response:**

```json
{
  "status": "healthy",
  "model_loaded": true,
  "device": "cuda",
  "memory_usage": {
    "allocated_gb": 12.5,
    "reserved_gb": 14.2,
    "total_gb": 24.0
  }
}
```

## Usage Examples

### Python

```python
import requests

# API endpoint
url = "https://your-space-name.hf.space/generate-questions"

# Request payload
data = {
    "statement": "Artificial intelligence is transforming healthcare by enabling more accurate diagnoses, personalized treatments, and efficient drug discovery processes.",
    "num_questions": 3,
    "difficulty_level": "medium"
}

# Make request
response = requests.post(url, json=data)
questions = response.json()["questions"]

for i, question in enumerate(questions, 1):
    print(f"{i}. {question}")
```

### JavaScript

```javascript
const generateQuestions = async (statement) => {
  const response = await fetch('https://your-space-name.hf.space/generate-questions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      statement: statement,
      num_questions: 5,
      difficulty_level: 'mixed'
    })
  });

  const data = await response.json();
  return data.questions;
};
```

### cURL

```bash
curl -X POST "https://your-space-name.hf.space/generate-questions" \
  -H "Content-Type: application/json" \
  -d '{
    "statement": "Climate change is one of the most pressing challenges of our time.",
    "num_questions": 4,
    "difficulty_level": "hard"
  }'
```

## Model Information

This API uses the **DavidAU/Llama-3.1-1-million-ctx-DeepHermes-Deep-Reasoning-8B-GGUF** model, which features:

- **Enhanced Reasoning**: Built on DeepHermes reasoning capabilities
- **Large Context**: Supports up to 1 million tokens of context
- **Optimized Format**: GGUF quantization for efficient inference
- **Thinking Process**: Uses dedicated thinking tags for internal reasoning

## Hardware Requirements

- **GPU**: NVIDIA A10G (24GB VRAM)
- **Memory**: ~14-16GB VRAM usage
- **Context**: Up to 32K tokens (adjustable based on available memory)

## API Documentation

Visit `/docs` for interactive API documentation with Swagger UI.

## Error Handling

The API returns appropriate HTTP status codes:

- `200`: Success
- `400`: Bad Request (invalid parameters)
- `503`: Service Unavailable (model not loaded)
- `500`: Internal Server Error

## Rate Limits

This is a demo space. For production use, consider:

- Implementing rate limiting
- Adding authentication
- Scaling to multiple instances
- Using dedicated inference endpoints

## Support

For issues or questions:

1. Check the `/health` endpoint
2. Review the error messages
3. Ensure your requests match the API schema
4. Adjust parameters to suit your hardware

---

**Note**: This Space requires a GPU runtime to function properly.
Make sure your Space is configured with GPU support.
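Since `503` indicates the model is not loaded yet (for example, while the Space is cold-starting), it is worth retrying with backoff, whereas `400` is a client error that retrying will not fix. A minimal sketch of such a policy; the `generate_with_retry` helper and its backoff values are illustrative, not part of this Space:

```python
import time

# Illustrative retry policy built on the documented status codes:
# retry on 503 (model still loading), fail fast on anything else.

def should_retry(status_code, attempt, max_attempts=5):
    """Return True if another attempt is worthwhile."""
    return status_code == 503 and attempt < max_attempts

def generate_with_retry(post, payload, max_attempts=5, backoff_s=2.0):
    """post is any callable taking the payload and returning an object
    with .status_code and .json(), e.g. a thin wrapper around requests.post."""
    for attempt in range(1, max_attempts + 1):
        response = post(payload)
        if response.status_code == 200:
            return response.json()["questions"]
        if not should_retry(response.status_code, attempt, max_attempts):
            raise RuntimeError(f"request failed with HTTP {response.status_code}")
        time.sleep(backoff_s * attempt)  # linear backoff while the model loads
    raise RuntimeError("model did not become available")
```

In real use, `post` could be `lambda p: requests.post("https://your-space-name.hf.space/generate-questions", json=p)`; keeping it injectable makes the policy easy to test offline.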