---
title: Question Generation AI
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
app_port: 7860
---

Question Generation AI

This Hugging Face Space provides a ChatGPT-style interface for generating thoughtful questions from input statements using the Meta Llama-3.1-8B-Instruct model.

Features

  • 🤖 ChatGPT-Style Interface: Intuitive chat interface for generating questions
  • 🎯 Customizable: Adjust number of questions, difficulty level, and creativity
  • 📚 Llama Powered: Uses Meta's instruction-tuned Llama 3.1 model for high-quality questions
  • 🚀 Fast & Reliable: Optimized for quick response times
  • 🔧 GPU Optimized: Runs efficiently on NVIDIA A10G hardware
  • 💡 Educational Focus: Perfect for creating study materials and assessments

How to Use

Chat Interface

Simply enter any statement or topic in the chat box, and the AI will generate thoughtful questions about it. You can:

  • Adjust Settings: Control the number of questions (1-10), difficulty level, and creativity
  • Try Different Topics: Works great with educational content, research topics, or any text
  • Interactive Experience: Chat-like interface similar to ChatGPT

API Access (Still Available)

The original API endpoints are still accessible at /generate-questions for programmatic access.

Request Body:

{
  "statement": "Your input statement here",
  "num_questions": 5,
  "temperature": 0.8,
  "max_length": 2048,
  "difficulty_level": "mixed"
}

Parameters:

  • statement (required): The input text to generate questions from
  • num_questions (1-10): Number of questions to generate (default: 5)
  • temperature (0.1-2.0): Generation creativity (default: 0.8)
  • max_length (100-4096): Maximum response length (default: 2048)
  • difficulty_level: "easy", "medium", "hard", or "mixed" (default: "mixed")
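The documented ranges can be checked client-side before a request goes out, so invalid parameters fail fast instead of round-tripping to the server. A minimal sketch in Python (the build_payload helper is hypothetical, not part of the API):

```python
DIFFICULTY_LEVELS = {"easy", "medium", "hard", "mixed"}

def build_payload(statement, num_questions=5, temperature=0.8,
                  max_length=2048, difficulty_level="mixed"):
    """Validate parameters against the documented ranges and return
    a dict ready to send as the JSON request body."""
    if not statement:
        raise ValueError("statement is required")
    if not 1 <= num_questions <= 10:
        raise ValueError("num_questions must be 1-10")
    if not 0.1 <= temperature <= 2.0:
        raise ValueError("temperature must be 0.1-2.0")
    if not 100 <= max_length <= 4096:
        raise ValueError("max_length must be 100-4096")
    if difficulty_level not in DIFFICULTY_LEVELS:
        raise ValueError(f"difficulty_level must be one of {sorted(DIFFICULTY_LEVELS)}")
    return {
        "statement": statement,
        "num_questions": num_questions,
        "temperature": temperature,
        "max_length": max_length,
        "difficulty_level": difficulty_level,
    }
```

Omitted arguments fall back to the documented defaults, so `build_payload("some topic")` produces the same body as the example above with 5 questions at temperature 0.8.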

Response:

{
  "questions": [
    "What is the main concept discussed?",
    "How does this relate to...?",
    "Why is this important?"
  ],
  "statement": "Your original statement",
  "metadata": {
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "temperature": 0.8,
    "difficulty_level": "mixed"
  }
}

Health Check

GET /health

Check the API and model status.

Response:

{
  "status": "healthy",
  "model_loaded": true,
  "device": "cuda",
  "memory_usage": {
    "allocated_gb": 12.5,
    "reserved_gb": 14.2,
    "total_gb": 24.0
  }
}
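Before sending a batch of requests, the /health response can be interpreted programmatically. A small sketch, assuming the field names in the sample response above (is_ready and gpu_headroom_gb are hypothetical helpers):

```python
def is_ready(health):
    """True when the service reports healthy and the model is loaded."""
    return health.get("status") == "healthy" and health.get("model_loaded", False)

def gpu_headroom_gb(health):
    """Free VRAM in GB (total minus reserved), or None if no GPU info
    is present in the response."""
    mem = health.get("memory_usage")
    if not mem:
        return None
    return round(mem["total_gb"] - mem["reserved_gb"], 2)
```

With the sample response above, `is_ready` returns True and `gpu_headroom_gb` reports 9.8 GB free.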

Usage Examples

Python

import requests

# API endpoint
url = "https://your-space-name.hf.space/generate-questions"

# Request payload
data = {
    "statement": "Artificial intelligence is transforming healthcare by enabling more accurate diagnoses, personalized treatments, and efficient drug discovery processes.",
    "num_questions": 3,
    "difficulty_level": "medium"
}

# Make request
response = requests.post(url, json=data)
response.raise_for_status()  # surface HTTP errors instead of parsing an error body
questions = response.json()["questions"]

for i, question in enumerate(questions, 1):
    print(f"{i}. {question}")

JavaScript

const generateQuestions = async (statement) => {
  const response = await fetch('https://your-space-name.hf.space/generate-questions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      statement: statement,
      num_questions: 5,
      difficulty_level: 'mixed'
    })
  });

  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  const data = await response.json();
  return data.questions;
};

cURL

curl -X POST "https://your-space-name.hf.space/generate-questions" \
     -H "Content-Type: application/json" \
     -d '{
       "statement": "Climate change is one of the most pressing challenges of our time.",
       "num_questions": 4,
       "difficulty_level": "hard"
     }'

Model Information

This API uses the meta-llama/Llama-3.1-8B-Instruct model, which features:

  • Instruction Tuned: Fine-tuned by Meta for dialogue and instruction following
  • 8B Parameters: Strong output quality at a size that fits on a single GPU
  • Large Context: Supports up to 128K tokens of context
  • Multilingual: Trained on multiple languages beyond English

Hardware Requirements

  • GPU: NVIDIA A10G (24GB VRAM)
  • Memory: ~14-16GB VRAM usage
  • Context: Up to 32K tokens (adjustable based on available memory)

API Documentation

Visit /docs for interactive API documentation with Swagger UI.

Error Handling

The API returns appropriate HTTP status codes:

  • 200: Success
  • 400: Bad Request (invalid parameters)
  • 503: Service Unavailable (model not loaded)
  • 500: Internal Server Error
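These status codes suggest a simple client-side retry policy: 500 and 503 are transient (the model may still be loading), while 400 means the request itself must change. A sketch of one such policy (should_retry and backoff_seconds are hypothetical helpers, not part of the API):

```python
RETRYABLE = {500, 503}  # transient server-side errors

def should_retry(status_code, attempt, max_attempts=3):
    """Retry transient errors while attempts remain; never retry
    client errors such as 400."""
    return status_code in RETRYABLE and attempt < max_attempts

def backoff_seconds(attempt, base=1.0):
    """Exponential backoff between attempts: 1s, 2s, 4s, ...
    (a common choice, not mandated by the API)."""
    return base * (2 ** attempt)
```

A caller would loop over attempts, sleeping for `backoff_seconds(attempt)` whenever `should_retry` returns True, and raising otherwise.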

Rate Limits

This is a demo space. For production use, consider:

  • Implementing rate limiting
  • Adding authentication
  • Scaling to multiple instances
  • Using dedicated inference endpoints
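For the first of those points, a client-side token bucket is one common way to smooth out request bursts. A minimal sketch (the TokenBucket class is illustrative, not part of this Space):

```python
import time

class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts up to
    `capacity`. The injectable clock makes the limiter testable."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.clock = clock
        self.last = clock()

    def allow(self):
        """Consume one token if available; refill based on elapsed time."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A client would call `bucket.allow()` before each request and sleep (or drop the request) when it returns False.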

Support

For issues or questions:

  1. Check the /health endpoint
  2. Review the error messages
  3. Ensure your requests match the API schema
  4. Consider adjusting parameters for your hardware

Note: This Space requires a GPU runtime to function properly. Make sure your Space is configured with GPU support.