---
title: Question Generation AI
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
app_port: 7860
---

# Question Generation AI

This Hugging Face Space provides a ChatGPT-style interface for generating thoughtful questions from input statements using the **Meta Llama-3.1-8B-Instruct** model.

## Features

- 🤖 **ChatGPT-Style Interface**: Intuitive chat interface for generating questions
- 🎯 **Customizable**: Adjust the number of questions, difficulty level, and creativity
- 📚 **Llama Powered**: Uses Meta's instruction-tuned Llama 3.1 model for high-quality questions
- 🚀 **Fast & Reliable**: Optimized for quick response times
- 🔧 **GPU Optimized**: Runs efficiently on NVIDIA A10G hardware
- 💡 **Educational Focus**: Perfect for creating study materials and assessments

## How to Use

### Chat Interface

Simply enter any statement or topic in the chat box, and the AI will generate thoughtful questions about it. You can:

- **Adjust Settings**: Control the number of questions (1-10), difficulty level, and creativity
- **Try Different Topics**: Works well with educational content, research topics, or any text
- **Interactive Experience**: Chat-like interface similar to ChatGPT

### API Access (Still Available)

The original API endpoints are still accessible at `/generate-questions` for programmatic access.
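The chat settings map one-to-one onto the API's request fields. As an illustrative sketch (the `build_payload` helper is hypothetical, not part of this Space), a client can validate and clamp values to the documented ranges before sending a request:

```python
# Hypothetical client-side helper: builds a /generate-questions payload
# and clamps each value to the ranges documented by this Space's API.

def build_payload(statement, num_questions=5, temperature=0.8,
                  max_length=2048, difficulty_level="mixed"):
    """Build a request payload, clamping values to the documented ranges."""
    if not statement:
        raise ValueError("statement is required")
    if difficulty_level not in ("easy", "medium", "hard", "mixed"):
        raise ValueError("difficulty_level must be easy, medium, hard, or mixed")
    return {
        "statement": statement,
        "num_questions": max(1, min(10, num_questions)),   # documented range: 1-10
        "temperature": max(0.1, min(2.0, temperature)),    # documented range: 0.1-2.0
        "max_length": max(100, min(4096, max_length)),     # documented range: 100-4096
        "difficulty_level": difficulty_level,
    }

payload = build_payload("Photosynthesis converts light into chemical energy.",
                        num_questions=15)  # out of range, clamped
print(payload["num_questions"])  # → 10
```

Clamping client-side avoids `400` responses for out-of-range values; the server still validates on its end.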
**Request Body:**

```json
{
  "statement": "Your input statement here",
  "num_questions": 5,
  "temperature": 0.8,
  "max_length": 2048,
  "difficulty_level": "mixed"
}
```

**Parameters:**

- `statement` (required): The input text to generate questions from
- `num_questions` (1-10): Number of questions to generate (default: 5)
- `temperature` (0.1-2.0): Generation creativity (default: 0.8)
- `max_length` (100-4096): Maximum response length (default: 2048)
- `difficulty_level`: "easy", "medium", "hard", or "mixed" (default: "mixed")

**Response:**

```json
{
  "questions": [
    "What is the main concept discussed?",
    "How does this relate to...?",
    "Why is this important?"
  ],
  "statement": "Your original statement",
  "metadata": {
    "model": "DavidAU/Llama-3.1-1-million-ctx-DeepHermes-Deep-Reasoning-8B-GGUF",
    "temperature": 0.8,
    "difficulty_level": "mixed"
  }
}
```

### Health Check

**GET** `/health`

Check the API and model status.

**Response:**

```json
{
  "status": "healthy",
  "model_loaded": true,
  "device": "cuda",
  "memory_usage": {
    "allocated_gb": 12.5,
    "reserved_gb": 14.2,
    "total_gb": 24.0
  }
}
```

## Usage Examples

### Python

```python
import requests

# API endpoint
url = "https://your-space-name.hf.space/generate-questions"

# Request payload
data = {
    "statement": "Artificial intelligence is transforming healthcare by enabling more accurate diagnoses, personalized treatments, and efficient drug discovery processes.",
    "num_questions": 3,
    "difficulty_level": "medium"
}

# Make request
response = requests.post(url, json=data)
questions = response.json()["questions"]

for i, question in enumerate(questions, 1):
    print(f"{i}. {question}")
```

### JavaScript

```javascript
const generateQuestions = async (statement) => {
  const response = await fetch('https://your-space-name.hf.space/generate-questions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      statement: statement,
      num_questions: 5,
      difficulty_level: 'mixed'
    })
  });

  const data = await response.json();
  return data.questions;
};
```

### cURL

```bash
curl -X POST "https://your-space-name.hf.space/generate-questions" \
  -H "Content-Type: application/json" \
  -d '{
    "statement": "Climate change is one of the most pressing challenges of our time.",
    "num_questions": 4,
    "difficulty_level": "hard"
  }'
```

## Model Information

This API uses the **DavidAU/Llama-3.1-1-million-ctx-DeepHermes-Deep-Reasoning-8B-GGUF** model, which features:

- **Enhanced Reasoning**: Built on DeepHermes reasoning capabilities
- **Large Context**: Supports up to 1 million tokens of context
- **Optimized Format**: GGUF quantization for efficient inference
- **Thinking Process**: Uses dedicated thinking tags for internal reasoning

## Hardware Requirements

- **GPU**: NVIDIA A10G (24GB VRAM)
- **Memory**: ~14-16GB VRAM usage
- **Context**: Up to 32K tokens (adjustable based on available memory)

## API Documentation

Visit `/docs` for interactive API documentation with Swagger UI.

## Error Handling

The API returns appropriate HTTP status codes:

- `200`: Success
- `400`: Bad Request (invalid parameters)
- `503`: Service Unavailable (model not loaded)
- `500`: Internal Server Error

## Rate Limits

This is a demo space. For production use, consider:

- Implementing rate limiting
- Adding authentication
- Scaling to multiple instances
- Using dedicated inference endpoints

## Support

For issues or questions:

1. Check the `/health` endpoint
2. Review the error messages
3. Ensure your requests match the API schema
4. Adjust parameters to suit your hardware

---

**Note**: This Space requires a GPU runtime to function properly.
Make sure your Space is configured with GPU support.
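Since `503` indicates the model is not loaded yet (for example, while the Space is cold-starting), it is worth retrying with backoff, whereas `400` is a client error that retrying will not fix. A minimal sketch of such a policy; the `generate_with_retry` helper and its backoff values are illustrative, not part of this Space:

```python
import time

# Illustrative retry policy built on the documented status codes:
# retry on 503 (model still loading), fail fast on anything else.

def should_retry(status_code, attempt, max_attempts=5):
    """Return True if another attempt is worthwhile."""
    return status_code == 503 and attempt < max_attempts

def generate_with_retry(post, payload, max_attempts=5, backoff_s=2.0):
    """post is any callable taking the payload and returning an object
    with .status_code and .json(), e.g. a thin wrapper around requests.post."""
    for attempt in range(1, max_attempts + 1):
        response = post(payload)
        if response.status_code == 200:
            return response.json()["questions"]
        if not should_retry(response.status_code, attempt, max_attempts):
            raise RuntimeError(f"request failed with HTTP {response.status_code}")
        time.sleep(backoff_s * attempt)  # linear backoff while the model loads
    raise RuntimeError("model did not become available")
```

In real use, `post` could be `lambda p: requests.post("https://your-space-name.hf.space/generate-questions", json=p)`; keeping it injectable makes the policy easy to test offline.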