---
title: Question Generation AI
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
app_port: 7860
---
# Question Generation AI

This Hugging Face Space provides a ChatGPT-style interface for generating thoughtful questions from input statements, powered by the DavidAU/Llama-3.1-1-million-ctx-DeepHermes-Deep-Reasoning-8B-GGUF model, a Llama 3.1 8B derivative.
## Features
- 🤖 **ChatGPT-Style Interface**: Intuitive chat interface for generating questions
- 🎯 **Customizable**: Adjust the number of questions, difficulty level, and creativity
- 🦙 **Llama Powered**: Uses an instruction-tuned Llama 3.1 model for high-quality questions
- 🚀 **Fast & Reliable**: Optimized for quick response times
- 🔧 **GPU Optimized**: Runs efficiently on NVIDIA A10G hardware
- 💡 **Educational Focus**: Well suited to creating study materials and assessments
## How to Use

### Chat Interface
Simply enter any statement or topic in the chat box, and the AI will generate thoughtful questions about it. You can:

- **Adjust Settings**: Control the number of questions (1-10), difficulty level, and creativity
- **Try Different Topics**: Works great with educational content, research topics, or any text
- **Interactive Experience**: Chat-like interface similar to ChatGPT
### API Access (Still Available)

The original API endpoints remain accessible for programmatic access at `POST /generate-questions`.
**Request Body:**

```json
{
  "statement": "Your input statement here",
  "num_questions": 5,
  "temperature": 0.8,
  "max_length": 2048,
  "difficulty_level": "mixed"
}
```
**Parameters:**

- `statement` (required): The input text to generate questions from
- `num_questions` (1-10): Number of questions to generate (default: 5)
- `temperature` (0.1-2.0): Generation creativity (default: 0.8)
- `max_length` (100-4096): Maximum response length (default: 2048)
- `difficulty_level`: `"easy"`, `"medium"`, `"hard"`, or `"mixed"` (default: `"mixed"`)
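The parameter constraints above can be checked client-side before a request ever leaves your machine. The following is a hedged sketch using a plain stdlib dataclass (the `QuestionRequest` name is illustrative, not part of the API; the server performs its own validation regardless):

```python
from dataclasses import dataclass

@dataclass
class QuestionRequest:
    """Mirrors the /generate-questions request schema described above."""
    statement: str
    num_questions: int = 5
    temperature: float = 0.8
    max_length: int = 2048
    difficulty_level: str = "mixed"

    def __post_init__(self):
        # Enforce the documented ranges and defaults locally.
        if not self.statement:
            raise ValueError("statement is required")
        if not 1 <= self.num_questions <= 10:
            raise ValueError("num_questions must be in [1, 10]")
        if not 0.1 <= self.temperature <= 2.0:
            raise ValueError("temperature must be in [0.1, 2.0]")
        if not 100 <= self.max_length <= 4096:
            raise ValueError("max_length must be in [100, 4096]")
        if self.difficulty_level not in ("easy", "medium", "hard", "mixed"):
            raise ValueError("difficulty_level must be easy, medium, hard, or mixed")
```

Validating locally turns malformed payloads into immediate `ValueError`s instead of round-trip `400` responses.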
**Response:**

```json
{
  "questions": [
    "What is the main concept discussed?",
    "How does this relate to...?",
    "Why is this important?"
  ],
  "statement": "Your original statement",
  "metadata": {
    "model": "DavidAU/Llama-3.1-1-million-ctx-DeepHermes-Deep-Reasoning-8B-GGUF",
    "temperature": 0.8,
    "difficulty_level": "mixed"
  }
}
```
### Health Check

Check the API and model status with `GET /health`.
**Response:**

```json
{
  "status": "healthy",
  "model_loaded": true,
  "device": "cuda",
  "memory_usage": {
    "allocated_gb": 12.5,
    "reserved_gb": 14.2,
    "total_gb": 24.0
  }
}
```
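Because the model can take a while to load after a cold start, it is useful to poll `/health` until `model_loaded` is true before sending generation requests. Here is a minimal sketch; `fetch_health` is any callable returning the parsed `/health` JSON (in practice it would wrap `requests.get(f"{base_url}/health").json()`), and the function name and timeouts are illustrative assumptions:

```python
import time

def wait_until_healthy(fetch_health, timeout_s=120.0, interval_s=5.0):
    """Poll a /health fetcher until the model is loaded or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            status = fetch_health()
            if status.get("status") == "healthy" and status.get("model_loaded"):
                return status
        except Exception:
            pass  # transient network error; keep polling
        time.sleep(interval_s)
    raise TimeoutError("model did not become healthy in time")
```

Passing the fetcher in as a callable also makes the loop easy to test without a live Space.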
## Usage Examples

### Python
```python
import requests

# API endpoint
url = "https://your-space-name.hf.space/generate-questions"

# Request payload
data = {
    "statement": "Artificial intelligence is transforming healthcare by enabling more accurate diagnoses, personalized treatments, and efficient drug discovery processes.",
    "num_questions": 3,
    "difficulty_level": "medium"
}

# Make request and fail loudly on HTTP errors
response = requests.post(url, json=data)
response.raise_for_status()
questions = response.json()["questions"]

for i, question in enumerate(questions, 1):
    print(f"{i}. {question}")
```
### JavaScript
```javascript
const generateQuestions = async (statement) => {
  const response = await fetch('https://your-space-name.hf.space/generate-questions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      statement: statement,
      num_questions: 5,
      difficulty_level: 'mixed'
    })
  });
  const data = await response.json();
  return data.questions;
};
```
### cURL

```bash
curl -X POST "https://your-space-name.hf.space/generate-questions" \
  -H "Content-Type: application/json" \
  -d '{
    "statement": "Climate change is one of the most pressing challenges of our time.",
    "num_questions": 4,
    "difficulty_level": "hard"
  }'
```
## Model Information

This API uses the DavidAU/Llama-3.1-1-million-ctx-DeepHermes-Deep-Reasoning-8B-GGUF model, which features:

- **Enhanced Reasoning**: Built on DeepHermes reasoning capabilities
- **Large Context**: Supports a context length of up to 1 million tokens
- **Optimized Format**: GGUF quantization for efficient inference
- **Thinking Process**: Uses `<think>` tags for internal reasoning
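Since the model interleaves `<think>` reasoning spans with its answer, anyone consuming raw completions (rather than the API's parsed `questions` array) will likely want to strip them. A minimal sketch, assuming the tags are well formed and the `strip_thinking` name is illustrative:

```python
import re

# Match a <think>...</think> span, including newlines inside it.
THINK_RE = re.compile(r"<think>.*?</think>", flags=re.DOTALL)

def strip_thinking(raw_output: str) -> str:
    """Remove <think>...</think> reasoning spans from a raw model completion."""
    return THINK_RE.sub("", raw_output).strip()
```

The non-greedy `.*?` keeps multiple reasoning spans from being merged into one match.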
## Hardware Requirements

- **GPU**: NVIDIA A10G (24 GB VRAM)
- **Memory**: ~14-16 GB VRAM usage
- **Context**: Up to 32K tokens (adjustable based on available memory)
## API Documentation

Visit `/docs` for interactive API documentation with Swagger UI.
## Error Handling

The API returns appropriate HTTP status codes:

- `200`: Success
- `400`: Bad Request (invalid parameters)
- `503`: Service Unavailable (model not loaded)
- `500`: Internal Server Error
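A client can map these codes to concrete actions; in particular, `503` is worth retrying because the model may simply still be loading. A hedged sketch (the function name and retry policy are illustrative, not part of the API):

```python
def interpret_status(code: int):
    """Map the API's documented status codes to a client action.

    Returns (ok, should_retry, message).
    """
    if code == 200:
        return True, False, "success"
    if code == 400:
        # Client-side bug: retrying the same payload will fail again.
        return False, False, "bad request: check parameters against the schema"
    if code == 503:
        # Model not loaded yet: back off and retry.
        return False, True, "service unavailable: retry with backoff"
    if code == 500:
        return False, True, "internal server error: retry once, then give up"
    return False, False, f"unexpected status {code}"
```

Wrapping the request loop around this table keeps retry decisions in one place.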
## Rate Limits

This is a demo Space. For production use, consider:

- Implementing rate limiting
- Adding authentication
- Scaling to multiple instances
- Using dedicated inference endpoints
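The rate limiting suggested above can be as simple as a client-side token bucket: requests spend tokens, tokens refill at a fixed rate, and bursts are capped by the bucket's capacity. A minimal sketch (the class name and parameters are illustrative):

```python
import time

class TokenBucket:
    """Allow at most `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)   # start full
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Callers check `allow()` before each request and sleep or reject when it returns `False`; the same structure works server-side keyed by client identity.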
## Support

For issues or questions:

- Check the `/health` endpoint
- Review the error messages
- Ensure your requests match the API schema
- Consider adjusting parameters for your hardware
**Note**: This Space requires a GPU runtime to function properly. Make sure your Space is configured with GPU support.