# 🎓 Generative AI for Programming Education

## 🚀 Live Demo
**Hugging Face Spaces**: Coming soon (deploy using the guide in `DEPLOYMENT.md`)

## 📋 Problem Statement
Current programming education struggles with high dropout rates, inefficient feedback loops, and a lack of personalized learning—problems exacerbated by limited instructor bandwidth. While Generative AI (e.g., Copilot, ChatGPT) can help, most tools prioritize productivity over learning, offering code solutions without explanations or tailored guidance. This risks student over-reliance without deeper comprehension.

## 🎯 Solution
To address this gap, we fine-tuned **CodeLlama-7B** to provide structured, educational code feedback—not just correct answers. Our model analyzes student code and delivers:

- **Instant, actionable reviews** (e.g., "This loop can be optimized from O(n²) to O(n) using a hashmap")
- **Beginner-friendly explanations** (e.g., "In Python, list.append() modifies the list in-place but returns None—that's why your print() shows None")
- **Personalized adaptation** (e.g., adjusting feedback depth based on inferred skill level)

Unlike generic AI tools, our system is explicitly designed for education, balancing correctness, pedagogy, and ethical safeguards against over-reliance.
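
The `list.append()` explanation quoted above can be reproduced in a few lines. This is a minimal illustration of the pitfall a student would see, with made-up variable names; it is not output from the model.

```python
# list.append() mutates the list in place and returns None.
scores = [90, 85]
result = scores.append(70)   # common beginner mistake: keeping the return value

print(result)   # None
print(scores)   # [90, 85, 70]

# If a new list is wanted instead, concatenation returns one.
extended = scores + [60]
print(extended)  # [90, 85, 70, 60]
```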

## ✨ Features

### 🧠 **Fine-tuned CodeLlama-7B Model**
- Trained on **code review** and **code feedback** datasets
- **7B parameters** for comprehensive understanding
- **Educational focus** rather than productivity optimization

### 📊 **Progressive Learning Interface**
- **5-stage educational process**:
  1. **Code Analysis** - Strengths, weaknesses, issues
  2. **Improvement Guide** - Step-by-step instructions
  3. **Learning Points** - Key concepts and objectives
  4. **Comprehension Quiz** - Test understanding
  5. **Code Fix** - Improved solution (only after learning)

### 🎓 **Educational Features**
- **Student Level Adaptation** (Beginner/Intermediate/Advanced)
- **Comprehension Questions** generated by the model
- **Learning Objectives** for each piece of feedback
- **Step-by-step improvement guides**
- **Algorithm complexity explanations**
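
One way the student-level adaptation listed above could work is by changing the instructions sent to the model. The sketch below is an assumption for illustration: `LEVEL_GUIDANCE`, `build_feedback_prompt`, and the prompt wording are hypothetical and may not match what `fine.py` actually does.

```python
# Hypothetical per-level guidance; the project's real prompts may differ.
LEVEL_GUIDANCE = {
    "beginner": "Use plain language, define every term, and explain one issue at a time.",
    "intermediate": "Assume basic syntax knowledge; focus on design and complexity.",
    "advanced": "Be concise; discuss trade-offs, edge cases, and performance.",
}


def build_feedback_prompt(code: str, level: str) -> str:
    """Return an instruction prompt whose tone depends on the student level."""
    key = level.strip().lower()
    if key not in LEVEL_GUIDANCE:
        raise ValueError(f"Unknown student level: {level!r}")
    return (
        "You are a programming tutor reviewing student code.\n"
        f"Student level: {key}\n"
        f"Guidance: {LEVEL_GUIDANCE[key]}\n"
        "Explain problems before suggesting any fix.\n\n"
        f"Student code:\n{code}\n"
    )


prompt = build_feedback_prompt("print(1)", "Beginner")
```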

### 🛡️ **Ethical Safeguards**
- **Progressive learning flow** prevents skipping straight to the solution
- **Comprehension testing** before showing fixes
- **Educational explanations** rather than quick answers
- **Best practices promotion**
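
The gating idea behind these safeguards can be sketched as a small state machine in which the code fix stays locked until the quiz is passed. The `Stage` and `LearningSession` names are hypothetical and only illustrate the flow; the application may track progress differently (for example in Streamlit session state).

```python
from enum import IntEnum


class Stage(IntEnum):
    ANALYSIS = 1
    IMPROVEMENT_GUIDE = 2
    LEARNING_POINTS = 3
    QUIZ = 4
    CODE_FIX = 5


class LearningSession:
    """Tracks which stages a student may view; stages unlock strictly in order."""

    def __init__(self) -> None:
        self.current = Stage.ANALYSIS
        self.quiz_passed = False

    def can_view(self, stage: Stage) -> bool:
        # The fix is visible only after the quiz is passed and the stage reached.
        if stage is Stage.CODE_FIX:
            return self.quiz_passed and self.current >= Stage.CODE_FIX
        return stage <= self.current

    def submit_quiz(self, correct: bool) -> None:
        if self.current is not Stage.QUIZ:
            raise RuntimeError("The quiz is only available at the quiz stage.")
        self.quiz_passed = correct

    def advance(self) -> Stage:
        # Refuse to move past the quiz until it has been answered correctly.
        if self.current is Stage.QUIZ and not self.quiz_passed:
            raise PermissionError("Pass the comprehension quiz before viewing the fix.")
        if self.current < Stage.CODE_FIX:
            self.current = Stage(self.current + 1)
        return self.current
```

A session would call `advance()` after each stage is read and `submit_quiz()` once the student answers, so a request for `Stage.CODE_FIX` is refused until both conditions hold.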

## 🚀 **Hugging Face Spaces Deployment**

### **Hardware Specifications**
- **CPU**: 2 vCPU (virtual CPU cores)
- **RAM**: 16 GB
- **Plan**: Free tier
- **Storage**: Sufficient for model and application

### **Optimization Features**
- ✅ **16GB RAM optimization** for fine-tuned model
- ✅ **CPU-only inference** (no GPU required)
- ✅ **Memory management** with gradient checkpointing (a training-time saving; it does not reduce memory during inference)
- ✅ **Demo mode** for immediate testing
- ✅ **Progressive loading** with fallback options
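
A fallback from the fine-tuned model to demo mode might look like the sketch below. It assumes the model lives in a local directory loadable with the `transformers` library; `load_feedback_backend` and its return shape are hypothetical, and the `bfloat16` setting is an assumption about how the weights are kept within RAM limits.

```python
import os


def load_feedback_backend(model_path: str = "./model"):
    """Return ("model", (tokenizer, model)), or ("demo", None) if loading fails."""
    # No local model directory: go straight to demo mode.
    if not os.path.isdir(model_path):
        return "demo", None
    try:
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        tokenizer = AutoTokenizer.from_pretrained(model_path)
        model = AutoModelForCausalLM.from_pretrained(
            model_path,
            torch_dtype=torch.bfloat16,  # assumption: half precision to save RAM
            low_cpu_mem_usage=True,      # avoid a second full copy while loading
        )
        model.eval()  # inference only
        return "model", (tokenizer, model)
    except Exception as exc:  # missing packages, out of memory, corrupt files
        print(f"Model load failed ({exc}); falling back to demo mode.")
        return "demo", None


mode, backend = load_feedback_backend("/nonexistent/model/dir")
print(mode)  # demo
```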

### **Performance Expectations**
- **Demo Mode**: Instant response
- **Fine-tuned Model**: 5-10 minutes for initial loading
- **Memory Usage**: Optimized for 16GB constraint
- **Concurrent Users**: Limited by CPU cores
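
The 16 GB constraint can be put in perspective with a rough estimate of the memory needed for the weights alone. The sketch uses the nominal 7 billion parameter count and ignores activations, the KV cache, and Python overhead, so real usage will be higher; the precisions listed are illustrative, not a statement of what the app uses.

```python
def weights_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate size of the model weights in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


NOMINAL_PARAMS = 7e9  # nominal "7B" figure

for name, size in [("float32", 4), ("float16/bfloat16", 2), ("int8", 1)]:
    print(f"{name}: {weights_memory_gb(NOMINAL_PARAMS, size):.0f} GB")
# float32: 28 GB
# float16/bfloat16: 14 GB
# int8: 7 GB
```

Under these assumptions, full-precision weights would not fit in 16 GB at all, and half-precision weights would leave little headroom.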

## 🛠️ Installation & Setup

### **Local Development**
```bash
# Clone the repository
git clone https://github.com/TomoriFarouk/GenAI-For-Programming-Language.git
cd GenAI-For-Programming-Language

# Install dependencies
pip install -r requirements.txt

# Run the application
streamlit run app.py
```

### **Hugging Face Spaces Deployment**
Follow the detailed guide in `DEPLOYMENT.md` for step-by-step instructions.

## 📁 Project Structure

```
GenAI-For-Programming-Language/
├── app.py                    # Main Streamlit interface (HF Spaces optimized)
├── fine.py                   # Fine-tuned model integration
├── config.py                 # Configuration settings
├── requirements.txt          # Dependencies
├── README.md                 # This file
├── DEPLOYMENT.md            # HF Spaces deployment guide
├── .gitignore               # Excludes model files
├── .gitattributes           # File type configuration
└── example_usage.py         # Usage examples
```

## 🧠 Model Architecture

### **Base Model**
- **CodeLlama-7B-Instruct-hf**
- **7 billion parameters**
- **Code-specific training**

### **Fine-tuning Datasets**
1. **Code Review Dataset**: Structured feedback on code quality
2. **Code Feedback Dataset**: Educational explanations and improvements

### **Training Process**
- **LoRA fine-tuning** for efficiency
- **Educational prompt engineering**
- **Multi-stage feedback generation**
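
To see why LoRA is the efficient choice here, it helps to count trainable parameters. LoRA freezes a weight matrix and trains two low-rank factors in its place. The sketch below compares the two for one square attention projection; the rank of 8 is an illustrative assumption, not the value used in this project.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """LoRA adds two low-rank factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


hidden = 4096  # hidden size of the 7B Llama-family architecture
rank = 8       # assumption: illustrative rank, not the project's setting

full = hidden * hidden
lora = lora_trainable_params(hidden, hidden, rank)
print(full, lora, f"{lora / full:.2%}")
# 16777216 65536 0.39%
```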

## 🎯 Usage Examples

### **Input Code**
```python
def find_duplicates(numbers):
    x = []
    for i in range(len(numbers)):
        for j in range(i+1, len(numbers)):
            if numbers[i] == numbers[j]:
                x.append(numbers[i])
    return x
```

### **Generated Feedback**
1. **Analysis**: Identifies O(n²) complexity, poor variable naming
2. **Improvement Guide**: Step-by-step optimization instructions
3. **Learning Points**: Algorithm complexity, naming conventions
4. **Quiz**: "What is the time complexity of this function, and how can it be improved?"
5. **Code Fix**: Optimized O(n) solution with better naming
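
For reference, an O(n) version of the input code could look like the following. This is a hand-written sketch of the kind of fix described in step 5, not actual model output. Note one behavioral difference: the original appends a value once per matching pair, while this version reports each duplicated value once.

```python
def find_duplicates(numbers):
    """Return each value that appears more than once, in order of first repeat."""
    seen = set()       # values encountered so far
    reported = set()   # duplicates already added to the result
    duplicates = []
    for value in numbers:
        if value in seen and value not in reported:
            duplicates.append(value)
            reported.add(value)
        seen.add(value)
    return duplicates


print(find_duplicates([1, 2, 3, 2, 1, 1]))  # [2, 1]
```

Set membership checks take constant time on average, so the single pass replaces the nested loops.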

## 🔧 Configuration

### **Model Settings**
- **Path**: `./model` (for HF Spaces)
- **Device**: CPU-optimized for 16GB RAM
- **Memory**: Gradient checkpointing enabled (affects fine-tuning only, not inference)

### **Educational Settings**
- **Student Levels**: Beginner, Intermediate, Advanced
- **Feedback Types**: Syntax, Logic, Optimization, Style
- **Learning Objectives**: Comprehensive programming concepts
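
The settings above could be grouped in a single immutable object. The sketch below is a guess at a reasonable shape for `config.py`; the `AppConfig` class and its field names are hypothetical and may not match the actual file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    """Hypothetical container for the settings listed in this section."""
    model_path: str = "./model"   # location used on HF Spaces
    device: str = "cpu"           # CPU-only inference
    student_levels: tuple = ("Beginner", "Intermediate", "Advanced")
    feedback_types: tuple = ("Syntax", "Logic", "Optimization", "Style")


config = AppConfig()
print(config.model_path, config.device)  # ./model cpu
```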

## 🚀 Performance

### **Local Environment**
- **GPU**: Recommended for faster inference
- **RAM**: 16GB+ recommended
- **Storage**: 30GB+ for model files

### **Hugging Face Spaces**
- **CPU**: 2 vCPU (sufficient for inference)
- **RAM**: 16GB (optimized for this constraint)
- **Loading Time**: 5-10 minutes for fine-tuned model
- **Demo Mode**: Instant response

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🙏 Acknowledgments

- **CodeLlama team** for the base model
- **Hugging Face** for the Spaces platform
- **Streamlit** for the web interface framework

## 📞 Contact

For questions or support, please open an issue on GitHub.

---

**🎓 Empowering programming education through AI-driven, structured learning experiences.**