---
title: Question Generation AI
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
app_port: 7860
---
# Question Generation AI
This Hugging Face Space provides a ChatGPT-style interface for generating thoughtful questions from input statements using the **Meta Llama-3.1-8B-Instruct** model.
## Features
- 🤖 **ChatGPT-Style Interface**: Intuitive chat interface for generating questions
- 🎯 **Customizable**: Adjust number of questions, difficulty level, and creativity
- 📚 **Llama Powered**: Uses Meta's instruction-tuned Llama 3.1 model for high-quality questions
- 🚀 **Fast & Reliable**: Optimized for quick response times
- 🔧 **GPU Optimized**: Runs efficiently on NVIDIA A10G hardware
- 💡 **Educational Focus**: Perfect for creating study materials and assessments
## How to Use
### Chat Interface
Simply enter any statement or topic in the chat box, and the AI will generate thoughtful questions about it. You can:
- **Adjust Settings**: Control the number of questions (1-10), difficulty level, and creativity
- **Try Different Topics**: Works great with educational content, research topics, or any text
- **Interactive Experience**: Chat-like interface similar to ChatGPT
### API Access (Still Available)
The original API endpoints are still accessible at `/generate-questions` for programmatic access.
**Request Body:**
```json
{
  "statement": "Your input statement here",
  "num_questions": 5,
  "temperature": 0.8,
  "max_length": 2048,
  "difficulty_level": "mixed"
}
```
**Parameters:**
- `statement` (required): The input text to generate questions from
- `num_questions` (1-10): Number of questions to generate (default: 5)
- `temperature` (0.1-2.0): Generation creativity (default: 0.8)
- `max_length` (100-4096): Maximum response length (default: 2048)
- `difficulty_level`: "easy", "medium", "hard", or "mixed" (default: "mixed")
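Because the API rejects out-of-range values with a `400`, it can help to mirror the documented parameter ranges on the client before sending a request. The following is an illustrative sketch; `validate_request` is not part of the API, just a local helper:

```python
def validate_request(payload: dict) -> list[str]:
    """Client-side checks mirroring the documented parameter ranges.

    Returns a list of error messages; an empty list means the payload
    should pass the API's own validation.
    """
    errors = []
    if not payload.get("statement"):
        errors.append("statement is required")
    n = payload.get("num_questions", 5)
    if not 1 <= n <= 10:
        errors.append("num_questions must be between 1 and 10")
    t = payload.get("temperature", 0.8)
    if not 0.1 <= t <= 2.0:
        errors.append("temperature must be between 0.1 and 2.0")
    m = payload.get("max_length", 2048)
    if not 100 <= m <= 4096:
        errors.append("max_length must be between 100 and 4096")
    if payload.get("difficulty_level", "mixed") not in {"easy", "medium", "hard", "mixed"}:
        errors.append("difficulty_level must be easy, medium, hard, or mixed")
    return errors

print(validate_request({"statement": "AI in healthcare", "num_questions": 3}))  # []
```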
**Response:**
```json
{
  "questions": [
    "What is the main concept discussed?",
    "How does this relate to...?",
    "Why is this important?"
  ],
  "statement": "Your original statement",
  "metadata": {
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "temperature": 0.8,
    "difficulty_level": "mixed"
  }
}
```
### Health Check
**GET** `/health`
Check the API and model status.
**Response:**
```json
{
  "status": "healthy",
  "model_loaded": true,
  "device": "cuda",
  "memory_usage": {
    "allocated_gb": 12.5,
    "reserved_gb": 14.2,
    "total_gb": 24.0
  }
}
```
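A client can use the `memory_usage` figures to gauge how much VRAM headroom the Space has left before raising `max_length`. Here is a small sketch that interprets a parsed `/health` payload; `vram_headroom_gb` is an illustrative helper, not part of the API:

```python
def vram_headroom_gb(health: dict) -> float:
    """Free VRAM implied by a /health response, in GB (total minus reserved)."""
    mem = health["memory_usage"]
    return round(mem["total_gb"] - mem["reserved_gb"], 2)

# The sample response from above, already parsed from JSON
sample = {
    "status": "healthy",
    "model_loaded": True,
    "device": "cuda",
    "memory_usage": {"allocated_gb": 12.5, "reserved_gb": 14.2, "total_gb": 24.0},
}

if sample["status"] == "healthy" and sample["model_loaded"]:
    print(f"Headroom: {vram_headroom_gb(sample)} GB")  # Headroom: 9.8 GB
```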
## Usage Examples
### Python
```python
import requests

# API endpoint
url = "https://your-space-name.hf.space/generate-questions"

# Request payload
data = {
    "statement": "Artificial intelligence is transforming healthcare by enabling more accurate diagnoses, personalized treatments, and efficient drug discovery processes.",
    "num_questions": 3,
    "difficulty_level": "medium"
}

# Make request and print the numbered questions
response = requests.post(url, json=data)
response.raise_for_status()
questions = response.json()["questions"]

for i, question in enumerate(questions, 1):
    print(f"{i}. {question}")
```
### JavaScript
```javascript
const generateQuestions = async (statement) => {
  const response = await fetch('https://your-space-name.hf.space/generate-questions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      statement: statement,
      num_questions: 5,
      difficulty_level: 'mixed'
    })
  });
  const data = await response.json();
  return data.questions;
};
```
### cURL
```bash
curl -X POST "https://your-space-name.hf.space/generate-questions" \
  -H "Content-Type: application/json" \
  -d '{
    "statement": "Climate change is one of the most pressing challenges of our time.",
    "num_questions": 4,
    "difficulty_level": "hard"
  }'
```
## Model Information
This Space uses the **meta-llama/Llama-3.1-8B-Instruct** model, which features:
- **Instruction-Tuned**: Aligned by Meta to follow prompts and produce well-structured output
- **Large Context**: Supports up to 128K tokens of context
- **Strong Quality**: Generates coherent, relevant questions across a wide range of topics
- **Efficient Size**: 8B parameters fit comfortably on a single A10G GPU
## Hardware Requirements
- **GPU**: NVIDIA A10G (24GB VRAM)
- **Memory**: ~14-16GB VRAM usage
- **Context**: Up to 32K tokens (adjustable based on available memory)
## API Documentation
Visit `/docs` for interactive API documentation with Swagger UI.
## Error Handling
The API returns appropriate HTTP status codes:
- `200`: Success
- `400`: Bad Request (invalid parameters)
- `503`: Service Unavailable (model not loaded)
- `500`: Internal Server Error
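In practice, `503` is worth treating differently from the other errors, since it usually just means the model is still loading after a Space restart. A hedged sketch of client-side handling follows; `describe_status` and `RETRYABLE` are illustrative helpers, not part of the API:

```python
# Status codes that are worth retrying after a short delay
RETRYABLE = {503}  # model not loaded yet (e.g. Space is still starting)

def describe_status(code: int) -> str:
    """Human-readable summary of a /generate-questions response code."""
    messages = {
        200: "Success",
        400: "Bad Request: check parameter names and ranges",
        503: "Service Unavailable: model not loaded yet, retry shortly",
        500: "Internal Server Error: inspect the Space logs",
    }
    return messages.get(code, f"Unexpected status {code}")

print(describe_status(503))
```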
## Rate Limits
This is a demo space. For production use, consider:
- Implementing rate limiting
- Adding authentication
- Scaling to multiple instances
- Using dedicated inference endpoints
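One simple way to add the rate limiting mentioned above, either in the client or in a proxy in front of the API, is a token bucket. This is an illustrative sketch, not something the Space ships with:

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter (illustrative only)."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill proportionally to elapsed time, capped at capacity
        now = time.monotonic()
        self.tokens = min(
            self.capacity,
            self.tokens + (now - self.last) * self.refill_per_sec,
        )
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# Allow a burst of 2 requests, then roughly 1 request every 2 seconds
bucket = TokenBucket(capacity=2, refill_per_sec=0.5)
print([bucket.allow() for _ in range(3)])  # burst of 2 passes, third is throttled
```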
## Support
For issues or questions:
1. Check the `/health` endpoint
2. Review the error messages
3. Ensure your requests match the API schema
4. Consider adjusting parameters for your hardware
---
**Note**: This Space requires a GPU runtime to function properly. Make sure your Space is configured with GPU support.