# Generative AI for Programming Education

## Live Demo

Hugging Face Spaces: [Coming Soon - Deploy using the DEPLOYMENT.md guide]
## Problem Statement

Current programming education struggles with high dropout rates, inefficient feedback loops, and a lack of personalized learning, all exacerbated by limited instructor bandwidth. While Generative AI tools (e.g., Copilot, ChatGPT) can help, most prioritize productivity over learning, offering code solutions without explanations or tailored guidance. This risks student over-reliance without deeper comprehension.
## Solution

To address this gap, we fine-tuned CodeLlama-7B to provide structured, educational code feedback, not just correct answers. Our model analyzes student code and delivers:

- **Instant, actionable reviews** (e.g., "This loop can be optimized from O(n²) to O(n) using a hashmap")
- **Beginner-friendly explanations** (e.g., "In Python, `list.append()` modifies the list in place but returns `None`; that's why your `print()` shows `None`")
- **Personalized adaptation** (e.g., adjusting feedback depth based on the student's inferred skill level)

Unlike generic AI tools, our system is explicitly designed for education, balancing correctness, pedagogy, and ethical safeguards against over-reliance.
## Features

### Fine-tuned CodeLlama-7B Model

- Trained on code review and code feedback datasets
- 7B parameters for comprehensive code understanding
- Educational focus rather than productivity optimization
### Progressive Learning Interface

A 5-stage educational process:

1. **Code Analysis**: strengths, weaknesses, and issues
2. **Improvement Guide**: step-by-step instructions
3. **Learning Points**: key concepts and objectives
4. **Comprehension Quiz**: test understanding
5. **Code Fix**: improved solution (only after learning)
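The staged flow above can be sketched as a small state machine in which the final stage stays locked until the comprehension quiz is passed. This is an illustrative sketch, not the repo's actual interface code; the class and stage names are assumptions.

```python
# Hypothetical sketch of the 5-stage progressive flow: the learner must
# pass the comprehension quiz before the code fix is revealed.
STAGES = ["analysis", "improvement_guide", "learning_points", "quiz", "code_fix"]

class ProgressiveSession:
    def __init__(self):
        self.stage_index = 0
        self.quiz_passed = False

    @property
    def current_stage(self):
        return STAGES[self.stage_index]

    def advance(self):
        # Ethical safeguard: gate the final stage on quiz success.
        if self.current_stage == "quiz" and not self.quiz_passed:
            raise PermissionError("Pass the comprehension quiz before seeing the fix.")
        if self.stage_index < len(STAGES) - 1:
            self.stage_index += 1
```

Keeping the gate inside `advance()` means no UI path can skip straight to the solution.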
### Educational Features

- Student level adaptation (Beginner / Intermediate / Advanced)
- Comprehension questions generated by the model
- Learning objectives attached to each piece of feedback
- Step-by-step improvement guides
- Algorithm complexity explanations
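To illustrate how model-generated comprehension questions might be requested, here is a hypothetical prompt template; the repository's actual prompts (in `fine.py` or `config.py`) may differ.

```python
# Hypothetical prompt template for generating a comprehension question.
QUIZ_PROMPT = """You are a programming tutor reviewing a {level} student's code.

Code:
{code}

Ask ONE comprehension question that checks whether the student understands
the main issue, and state the learning objective it tests."""

def build_quiz_prompt(code, level="Beginner"):
    # Fill in the student's code and level before sending to the model.
    return QUIZ_PROMPT.format(code=code, level=level)
```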
### Ethical Safeguards

- A progressive learning flow prevents jumping straight to solutions
- Comprehension testing before fixes are shown
- Educational explanations rather than quick answers
- Promotion of best practices
## Hugging Face Spaces Deployment

### Hardware Specifications

- CPU: 2 vCPUs (virtual CPU cores)
- RAM: 16 GB
- Plan: free tier
- Storage: sufficient for the model and application
### Optimization Features

- ✅ 16 GB RAM optimization for the fine-tuned model
- ✅ CPU-only inference (no GPU required)
- ✅ Memory management with gradient checkpointing
- ✅ Demo mode for immediate testing
- ✅ Progressive loading with fallback options
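A hedged sketch of how progressive loading with a demo-mode fallback might look; the function name and structure are illustrative assumptions, not the repo's actual `fine.py` code.

```python
# Try to load the fine-tuned model on CPU; fall back to a canned demo
# mode if loading fails (missing weights, out of memory, etc.).
def load_feedback_backend(model_path="./model"):
    try:
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        tokenizer = AutoTokenizer.from_pretrained(model_path)
        model = AutoModelForCausalLM.from_pretrained(
            model_path,
            torch_dtype=torch.float32,  # CPU-only inference
            low_cpu_mem_usage=True,     # stream weights to fit 16 GB RAM
        )
        model.gradient_checkpointing_enable()  # trade compute for memory
        return ("model", model, tokenizer)
    except Exception:
        # Demo mode: instant responses without the 7B model.
        return ("demo", None, None)
```

The broad `except` is deliberate here: on the free tier any loading failure should degrade to demo mode rather than crash the Space.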
### Performance Expectations

- Demo mode: instant responses
- Fine-tuned model: 5-10 minutes of initial loading
- Memory usage: optimized for the 16 GB constraint
- Concurrent users: limited by the available CPU cores
## Installation & Setup

### Local Development

```bash
# Clone the repository
git clone https://github.com/TomoriFarouk/GenAI-For-Programming-Language.git
cd GenAI-For-Programming-Language

# Install dependencies
pip install -r requirements.txt

# Run the application
streamlit run app.py
```
### Hugging Face Spaces Deployment

Follow the detailed guide in `DEPLOYMENT.md` for step-by-step instructions.
## Project Structure

```
GenAI-For-Programming-Language/
├── app.py             # Main Streamlit interface (HF Spaces optimized)
├── fine.py            # Fine-tuned model integration
├── config.py          # Configuration settings
├── requirements.txt   # Dependencies
├── README.md          # This file
├── DEPLOYMENT.md      # HF Spaces deployment guide
├── .gitignore         # Excludes model files
├── .gitattributes     # File type configuration
└── example_usage.py   # Usage examples
```
## Model Architecture

### Base Model

- CodeLlama-7B-Instruct-hf
- 7 billion parameters
- Code-specific pretraining

### Fine-tuning Datasets

- Code review dataset: structured feedback on code quality
- Code feedback dataset: educational explanations and improvements

### Training Process

- LoRA fine-tuning for efficiency
- Educational prompt engineering
- Multi-stage feedback generation
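For readers unfamiliar with LoRA: instead of updating all 7B weights, small low-rank adapter matrices are trained on top of selected layers. The hyperparameters below are illustrative assumptions, not the values used for the released model.

```python
# Illustrative LoRA hyperparameters (assumptions, not the released
# adapter's actual values). With Hugging Face `peft` this would become:
#   from peft import LoraConfig, get_peft_model
#   model = get_peft_model(base_model, LoraConfig(**LORA_HYPERPARAMS))
LORA_HYPERPARAMS = {
    "r": 16,                                  # low-rank update dimension
    "lora_alpha": 32,                         # scaling factor for the update
    "target_modules": ["q_proj", "v_proj"],   # attention projections only
    "lora_dropout": 0.05,
    "task_type": "CAUSAL_LM",
}
```

Training only these adapters keeps the trainable parameter count to a small fraction of the 7B base weights, which is what makes fine-tuning feasible on modest hardware.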
## Usage Examples

### Input Code

```python
def find_duplicates(numbers):
    x = []
    for i in range(len(numbers)):
        for j in range(i + 1, len(numbers)):
            if numbers[i] == numbers[j]:
                x.append(numbers[i])
    return x
```
### Generated Feedback

- **Analysis**: identifies O(n²) complexity and poor variable naming
- **Improvement Guide**: step-by-step optimization instructions
- **Learning Points**: algorithm complexity, naming conventions
- **Quiz**: "What is the time complexity, and how can it be improved?"
- **Code Fix**: an optimized O(n) solution with better naming
## Configuration

### Model Settings

- Path: `./model` (for HF Spaces)
- Device: CPU, optimized for 16 GB RAM
- Memory: gradient checkpointing enabled
### Educational Settings

- Student levels: Beginner, Intermediate, Advanced
- Feedback types: syntax, logic, optimization, style
- Learning objectives: comprehensive programming concepts
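A minimal sketch of how these settings could drive level adaptation; the keys and values are hypothetical, and the repo's actual `config.py` may be organized differently.

```python
# Hypothetical configuration: feedback depth adapts to the student level.
STUDENT_LEVELS = {
    "Beginner":     {"max_concepts": 2, "show_big_o": False, "tone": "step-by-step"},
    "Intermediate": {"max_concepts": 4, "show_big_o": True,  "tone": "guided"},
    "Advanced":     {"max_concepts": 6, "show_big_o": True,  "tone": "concise"},
}
FEEDBACK_TYPES = ["Syntax", "Logic", "Optimization", "Style"]

def feedback_settings(level):
    # Unknown levels fall back to the gentlest (Beginner) settings.
    return STUDENT_LEVELS.get(level, STUDENT_LEVELS["Beginner"])
```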
## Performance

### Local Environment

- GPU: recommended for faster inference
- RAM: 16 GB+ recommended
- Storage: 30 GB+ for model files

### Hugging Face Spaces

- CPU: 2 vCPUs (sufficient for inference)
- RAM: 16 GB (the app is optimized for this constraint)
- Loading time: 5-10 minutes for the fine-tuned model
- Demo mode: instant responses
## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request
## License

This project is licensed under the MIT License; see the LICENSE file for details.
## Acknowledgments

- The CodeLlama team for the base model
- Hugging Face for the Spaces platform
- Streamlit for the web interface framework
## Contact

For questions or support, please open an issue on GitHub.

*Empowering programming education through AI-driven, structured learning experiences.*