1. Create a New Space

Go to huggingface.co/spaces
Click "Create new Space"
Choose the following settings:
- Owner: Your username
- Space name: inclusive-world-curriculum-assistant (or your preferred name)
- Space SDK: Gradio
- Space hardware: CPU (or GPU if you have access)
- License: Choose an appropriate license
- Visibility: Public or Private
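If you prefer to script this step, the same settings can be sketched with the huggingface_hub library. This is a minimal sketch, not part of the app: the helper names space_settings and create_space are made up for illustration, and it assumes you are already logged in (for example through an access token). The license is not covered here.

```python
def space_settings(owner: str, name: str, private: bool = False) -> dict:
    """Build the keyword arguments for huggingface_hub.create_repo."""
    return {
        "repo_id": f"{owner}/{name}",
        "repo_type": "space",      # create a Space rather than a model repo
        "space_sdk": "gradio",     # matches "Space SDK: Gradio" above
        "private": private,        # matches "Visibility"
        "exist_ok": True,          # do not fail if the Space already exists
    }


def create_space(owner: str, name: str, private: bool = False) -> str:
    """Create the Space and return its URL (needs network access and a token)."""
    # Imported lazily so space_settings() works without the dependency installed.
    from huggingface_hub import create_repo

    return str(create_repo(**space_settings(owner, name, private)))


if __name__ == "__main__":
    print(space_settings("your-username", "inclusive-world-curriculum-assistant"))
```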

2. Upload Files

Upload the following files to your Space:

Required Files:

app.py - Main Gradio application
config.py - Configuration settings
utils.py - Utility functions
requirements.txt - Python dependencies
README.md - Documentation
app_config.toml - Spaces configuration

Optional Files:

Slides/ directory with your curriculum PDFs
.gitignore - Git ignore rules
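Before uploading, it can help to confirm that nothing from the required list is missing. The check below is a small sketch using only the standard library; missing_files is a hypothetical helper, not something shipped with the app.

```python
from pathlib import Path

# The required files listed above.
REQUIRED_FILES = [
    "app.py",
    "config.py",
    "utils.py",
    "requirements.txt",
    "README.md",
    "app_config.toml",
]


def missing_files(project_dir: str) -> list[str]:
    """Return the required files that are not present in project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    print("All required files present" if not missing else f"Missing: {missing}")
```

Once the check passes, the folder can be uploaded through the web interface, with git, or with huggingface_hub's upload_folder function.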

3. Configure Environment Variables

In your Space settings, add these environment variables:

HF_HUB_ENABLE_HF_TRANSFER=1
TRANSFORMERS_CACHE=/tmp/transformers_cache
HF_HOME=/tmp/hf_home
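If you also want the app to run with the same values outside of Spaces, one option is to apply them as defaults at the very top of app.py, before transformers or huggingface_hub is imported. The sketch below assumes that approach; apply_env_defaults is a hypothetical helper. Two caveats: HF_HUB_ENABLE_HF_TRANSFER=1 only works when the hf_transfer package is installed, and recent transformers releases mark TRANSFORMERS_CACHE as deprecated in favor of HF_HOME.

```python
import os

# Same values as the Space settings above.
SPACE_ENV_DEFAULTS = {
    "HF_HUB_ENABLE_HF_TRANSFER": "1",
    "TRANSFORMERS_CACHE": "/tmp/transformers_cache",
    "HF_HOME": "/tmp/hf_home",
}


def apply_env_defaults(env=os.environ) -> dict:
    """Fill in missing variables without overriding values that are already set."""
    for key, value in SPACE_ENV_DEFAULTS.items():
        env.setdefault(key, value)
    return {key: env[key] for key in SPACE_ENV_DEFAULTS}


if __name__ == "__main__":
    print(apply_env_defaults())
```

Because setdefault is used, anything configured in the Space settings still takes precedence.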

4. Set Up Curriculum Files

Create a Slides/ directory in your Space
Upload your curriculum PDF files to this directory
Ensure PDFs contain extractable text (scanned, image-only PDFs need OCR before the assistant can use them)
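A quick way to spot image-only PDFs is to measure how many pages yield any text at all. The sketch below assumes the pypdf package for extraction (the app itself may use a different PDF library), and the 20-character threshold is an arbitrary illustration.

```python
def text_coverage(page_texts: list[str], min_chars: int = 20) -> float:
    """Fraction of pages with at least min_chars non-whitespace characters."""
    if not page_texts:
        return 0.0
    with_text = sum(
        1 for text in page_texts if len("".join(text.split())) >= min_chars
    )
    return with_text / len(page_texts)


def pdf_text_coverage(pdf_path: str) -> float:
    """Extract text page by page and report the coverage for one PDF."""
    # Assumption: pypdf is installed; imported lazily so text_coverage() has no dependency.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    return text_coverage([page.extract_text() or "" for page in reader.pages])
```

A coverage close to 0 suggests the slides were exported as images and need OCR; a value close to 1 means most pages can be indexed.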

5. Deploy and Test

Automatic Deployment: Spaces will automatically build and deploy your app
Monitor Build: Check the build logs for any errors
Test the App: Visit your Space URL and test the functionality

🔧 Configuration Options

Model Selection

The app uses microsoft/DialoGPT-medium by default, a comparatively small model that is practical to run on Spaces hardware. You can change this in config.py:

MODEL_CONFIG = {
    "model_name": "microsoft/DialoGPT-medium",  # Change this
    # ... other settings
}

Gradio Interface Settings

Update app_config.toml for Gradio-specific settings:

[gradio]
title = "Inclusive World Curriculum Assistant"
description = "AI-powered assistant that answers questions about curriculum and shows relevant slide pages"
theme = "soft"
share = false

Hardware Requirements

Update app_config.toml based on your Space's hardware:

[hardware]
cpu = "2"        # Number of CPU cores
memory = "8GB"   # RAM requirement
disk = "10GB"    # Disk space
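How app.py consumes app_config.toml is not shown in this guide, so the loader below is only an assumption about one reasonable way to read the [gradio] and [hardware] tables. It uses tomllib from the standard library (Python 3.11+), falling back to the tomli package on older versions; load_app_config and its default values are hypothetical.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # assumed fallback package for older Python versions

# Inline copy of the settings shown above, so the sketch runs without a file.
EXAMPLE_TOML = """
[gradio]
title = "Inclusive World Curriculum Assistant"
theme = "soft"
share = false

[hardware]
cpu = "2"
memory = "8GB"
disk = "10GB"
"""


def load_app_config(text: str) -> dict:
    """Parse the TOML text and fall back to defaults for missing keys."""
    config = tomllib.loads(text)
    gradio_cfg = config.get("gradio", {})
    hardware_cfg = config.get("hardware", {})
    return {
        "title": gradio_cfg.get("title", "Curriculum Assistant"),
        "theme": gradio_cfg.get("theme", "default"),
        "share": bool(gradio_cfg.get("share", False)),
        "cpu_cores": int(hardware_cfg.get("cpu", "1")),  # stored as a string above
        "memory": hardware_cfg.get("memory", ""),
    }


if __name__ == "__main__":
    print(load_app_config(EXAMPLE_TOML))
```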

🐛 Troubleshooting

Common Issues

Build Fails

Check that all required files are uploaded
Verify requirements.txt has correct package versions
Ensure Python version compatibility

Model Loading Issues

Check that the model name is spelled correctly and that the model is accessible from your Space
Verify internet connectivity
Try a smaller model if memory is limited

PDF Processing Errors

Ensure PDFs are not corrupted
Check that PDFs contain text (not just images)
Verify file permissions

Page Matching Issues

Ensure PDFs have proper page structure
Check that text extraction is working correctly
Verify metadata is being stored properly

Performance Issues

Use GPU instead of CPU if available
Reduce model size in config
Optimize chunk sizes for vector database
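To make the chunk-size advice concrete, here is a minimal sketch of fixed-size chunking with overlap that keeps the page number on every chunk. The function name, the default sizes, and the dictionary layout are illustrative assumptions, not the app's actual implementation.

```python
def chunk_page(text: str, page: int, chunk_size: int = 500, overlap: int = 50) -> list[dict]:
    """Split one page of text into overlapping chunks tagged with the page number."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append({"page": page, "start": start, "text": text[start:start + chunk_size]})
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the page
    return chunks


if __name__ == "__main__":
    for chunk in chunk_page("abcdefghij", page=3, chunk_size=4, overlap=1):
        print(chunk)
```

Smaller chunks mean more vectors to store and search, while larger chunks make each match less specific, so it is worth trying a few sizes against real questions.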

Debug Steps

Check Build Logs: Look for error messages in the build process
Test Locally: Run the app locally first to identify issues
Simplify: Remove complex features temporarily to isolate problems
Monitor Resources: Check CPU and memory usage in Space settings

📊 Monitoring and Maintenance

Performance Monitoring

Monitor response times for Q&A queries
Check memory usage during model loading
Track vector database performance
Monitor page matching accuracy
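Response times can be collected with a small decorator around whatever function answers a question. This is a sketch under the assumption that such a function exists in your app; timed and the durations list are hypothetical names.

```python
import time
from functools import wraps


def timed(stats: list):
    """Decorator factory that appends each call's duration in seconds to stats."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even when the wrapped function raises.
                stats.append(time.perf_counter() - start)
        return wrapper
    return decorator


if __name__ == "__main__":
    durations = []

    @timed(durations)
    def answer_question(question: str) -> str:
        return question.upper()  # stand-in for the real Q&A call

    answer_question("what is covered in week 1?")
    print(f"calls: {len(durations)}, slowest: {max(durations):.6f}s")
```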

Regular Maintenance

Update dependencies periodically
Monitor model performance and accuracy
Backup curriculum documents
Review and update configuration settings

🔒 Security Considerations

Access Control

Use private Spaces for sensitive curriculum content
Implement authentication if needed
Monitor access logs
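For simple password protection, Gradio's launch() accepts an auth argument, which can be a function that receives a username and password and returns True or False. The sketch below assumes the credentials are stored as Space secrets named APP_USERNAME and APP_PASSWORD; both names are made up for this example.

```python
import hmac
import os


def check_credentials(username: str, password: str) -> bool:
    """Compare the submitted credentials with values stored in the environment."""
    expected_user = os.environ.get("APP_USERNAME", "")
    expected_pass = os.environ.get("APP_PASSWORD", "")
    if not expected_user or not expected_pass:
        return False  # refuse all logins when no credentials are configured
    # compare_digest avoids leaking information through comparison timing.
    user_ok = hmac.compare_digest(username.encode(), expected_user.encode())
    pass_ok = hmac.compare_digest(password.encode(), expected_pass.encode())
    return user_ok and pass_ok


# In app.py, assuming `demo` is your Gradio Blocks or Interface object:
# demo.launch(auth=check_credentials)
```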

Data Privacy

Ensure curriculum content doesn't contain sensitive information
Use appropriate licensing for educational content
Follow data protection regulations

📈 Scaling Considerations

For High Usage

Consider using GPU Spaces for better performance
Implement caching for frequently asked questions
Use larger models for better response quality
Optimize vector database settings
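Caching frequently asked questions can be as simple as an in-memory LRU cache keyed on a normalized form of the question. The sketch below assumes your app has some function that takes a question string and returns an answer; make_cached_answerer and normalize are hypothetical helpers, and the cache is lost whenever the Space restarts.

```python
from functools import lru_cache


def normalize(question: str) -> str:
    """Lowercase, collapse whitespace, and drop trailing punctuation."""
    return " ".join(question.lower().split()).rstrip("?!. ")


def make_cached_answerer(answer_fn, maxsize: int = 256):
    """Wrap answer_fn so that repeated questions are served from memory."""
    @lru_cache(maxsize=maxsize)
    def cached(normalized_question: str) -> str:
        return answer_fn(normalized_question)

    def answer(question: str) -> str:
        return cached(normalize(question))

    answer.cache_info = cached.cache_info  # expose hit/miss counters
    return answer


if __name__ == "__main__":
    ask = make_cached_answerer(lambda q: f"answer to: {q}")
    ask("What is Week 1 about?")
    ask("  what is week 1   about ")
    print(ask.cache_info())
```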

Cost Optimization

Use CPU Spaces when possible
Implement request rate limiting
Monitor resource usage
Choose appropriate model sizes
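Request rate limiting can be sketched as a sliding window per client. The class below is an illustration only: how you identify a client (session, username, IP address) depends on your app, and the limits shown are arbitrary.

```python
import time
from collections import defaultdict, deque


class RateLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int = 10, window_seconds: float = 60.0,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._history = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        recent = self._history[client_id]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True


if __name__ == "__main__":
    limiter = RateLimiter(max_requests=2, window_seconds=10)
    print([limiter.allow("student-1") for _ in range(3)])
```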

🎓 Educational Deployment Tips

For Educational Institutions

Content Management: Organize curriculum by weeks/topics
Access Control: Use private Spaces for institutional content
Customization: Adapt prompts for specific curriculum needs
Integration: Consider integrating with your existing LMS

For Individual Instructors

Content Preparation: Ensure PDFs are well-structured with clear page content
Testing: Test with various question types
Documentation: Provide clear usage instructions for students
Feedback: Collect student feedback for improvements

📞 Support

For deployment issues:

Check the Hugging Face Spaces documentation
Review build logs for specific error messages
Test with minimal configuration first
Ask in the Hugging Face community forums

🆕 New Features in This Version

Page-Level Matching

Shows exact slide pages that match your questions
Provides content previews from specific pages
Ranks pages by relevance to your query
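The idea of ranking pages can be illustrated with a toy scorer. The sketch below uses bag-of-words cosine similarity as a stand-in for the embedding similarity a vector database would provide; it is not the app's actual pipeline, and the page dictionary layout (file, page, text) is an assumption.

```python
import math
import re
from collections import Counter


def _tokens(text: str) -> Counter:
    """Lowercased word counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_pages(question: str, pages: list[dict], top_k: int = 3) -> list[dict]:
    """Score every page against the question and return the best top_k with previews."""
    query = _tokens(question)
    scored = [
        {
            "file": p["file"],
            "page": p["page"],
            "score": cosine(query, _tokens(p["text"])),
            "preview": p["text"][:80],
        }
        for p in pages
    ]
    scored.sort(key=lambda r: r["score"], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    pages = [
        {"file": "week1.pdf", "page": 1, "text": "Course overview and schedule"},
        {"file": "week1.pdf", "page": 2, "text": "Screen readers and keyboard navigation"},
    ]
    for result in rank_pages("How do screen readers work?", pages):
        print(result)
```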

Enhanced RAG Pipeline

Page metadata tracking throughout the process
Improved relevance scoring
Better content organization

Gradio Interface

Modern, responsive web interface
Better user experience
Optimized for educational use

Happy Deploying! 🚀