FaroukTomori committed
Commit 5fffedf · verified · 1 Parent(s): 51e6602

Upload 7 files

Files changed (7)
  1. src/DEPLOYMENT.md +168 -0
  2. src/README.md +185 -0
  3. src/app.py +377 -0
  4. src/config.py +126 -0
  5. src/example_usage.py +186 -0
  6. src/fine.py +945 -0
  7. src/requirements.txt +37 -0
src/DEPLOYMENT.md ADDED
@@ -0,0 +1,168 @@
+# 🚀 Deployment Guide: Hugging Face Spaces
+
+## Quick Start (5 minutes)
+
+### Step 1: Prepare Your Repository
+1. **Create a GitHub repository** with your project files
+2. **Upload all files** from this directory to your GitHub repo
+3. **Make sure you have**:
+   - `app.py` (main Streamlit app)
+   - `fine.py` (AI tutor implementation)
+   - `requirements.txt` (dependencies)
+   - `README.md` (documentation)
+
+### Step 2: Create Hugging Face Space
+1. **Go to** [huggingface.co/spaces](https://huggingface.co/spaces)
+2. **Click** "Create new Space"
+3. **Fill in the details**:
+   - **Owner**: Your HF username
+   - **Space name**: `ai-programming-tutor`
+   - **License**: Choose appropriate license
+   - **SDK**: Select **Streamlit**
+   - **Python version**: 3.10
+4. **Click** "Create Space"
+
+### Step 3: Connect Your Repository
+1. **In your Space settings**, go to "Repository" tab
+2. **Select** "GitHub repository"
+3. **Choose** your GitHub repository
+4. **Set the main file** to `app.py`
+5. **Click** "Save"
+
+### Step 4: Upload Your Fine-tuned Model
+1. **In your Space**, go to "Files" tab
+2. **Create a folder** called `model`
+3. **Upload your fine-tuned model files**:
+   - `model-00001-of-00006.safetensors`
+   - `model-00002-of-00006.safetensors`
+   - `model-00003-of-00006.safetensors`
+   - `model-00004-of-00006.safetensors`
+   - `model-00005-of-00006.safetensors`
+   - `model-00006-of-00006.safetensors`
+   - `config.json`
+   - `tokenizer.json`
+   - `tokenizer.model`
+   - `tokenizer_config.json`
+   - `special_tokens_map.json`
+   - `generation_config.json`
+
+### Step 5: Update Model Path
+1. **Edit** `app.py` in your Space
+2. **Change the model path** to:
+   ```python
+   model_path = "./model"  # Path to uploaded model
+   ```
+3. **Save** the changes
+
+### Step 6: Deploy
+1. **Your Space will automatically build** and deploy
+2. **Wait for the build to complete** (5-10 minutes)
+3. **Your app will be live** at: `https://huggingface.co/spaces/YOUR_USERNAME/ai-programming-tutor`
+
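Steps 4 and 5 only work if every shard and tokenizer file actually lands under `./model`. The following is a minimal sketch of a pre-flight check, assuming the six-shard layout listed in Step 4. `missing_model_files` and `EXPECTED_FILES` are hypothetical names and are not part of the uploaded code.

```python
from pathlib import Path

# File names copied from Step 4. The shard count (6) is an assumption tied to that list.
EXPECTED_FILES = [f"model-{i:05d}-of-00006.safetensors" for i in range(1, 7)] + [
    "config.json",
    "tokenizer.json",
    "tokenizer.model",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "generation_config.json",
]


def missing_model_files(model_dir: str = "./model") -> list[str]:
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# An empty list means the directory matches the layout from Step 4.
print(missing_model_files("./model"))
```

Running a check like this before loading could turn a slow failure during model loading into an immediate, readable message.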
+## 🎯 Advanced Configuration
+
+### Hardware Settings
+- **CPU**: Default (sufficient for inference)
+- **GPU**: T4 (recommended for faster inference)
+- **Memory**: 16GB+ (required for 7B model)
+
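The 16GB figure can be sanity-checked with simple arithmetic: weight memory is roughly the parameter count times the bytes per parameter. The sketch below assumes a nominal 7e9 parameters and counts weights only, so activations, the KV cache, and Python overhead come on top. `weight_memory_gib` is a hypothetical helper.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str = "float16") -> float:
    """Approximate memory for model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


# 7e9 is the nominal size of a "7B" model (an assumption, not a measured count).
for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(7e9, dtype):.1f} GiB")
# float32: 26.1 GiB
# float16: 13.0 GiB
# int8: 6.5 GiB
```

Under these assumptions, float16 weights fit in 16GB with only a few GiB to spare, which is consistent with the guide suggesting a hardware upgrade or demo mode when memory runs out.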
+### Environment Variables
+Add these in your Space settings:
+```
+TOKENIZERS_PARALLELISM=false
+DATASETS_DISABLE_MULTIPROCESSING=1
+```
+
+### Custom Domain (Optional)
+1. **In Space settings**, go to "Settings" tab
+2. **Enable** "Custom domain"
+3. **Add your domain** (e.g., `tutor.yourdomain.com`)
+
+## 🔧 Troubleshooting
+
+### Common Issues
+
+**Issue**: Model not loading
+- **Solution**: Check model path and file structure
+- **Debug**: Look at Space logs in "Settings" → "Logs"
+
+**Issue**: Out of memory
+- **Solution**: Upgrade to GPU hardware
+- **Alternative**: Use demo mode
+
+**Issue**: Build fails
+- **Solution**: Check `requirements.txt` for missing dependencies
+- **Debug**: Review build logs
+
+### Performance Optimization
+
+1. **Enable GPU** in Space settings
+2. **Use model quantization** for faster inference
+3. **Implement caching** for repeated requests
+4. **Add rate limiting** to prevent abuse
+
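Caching for repeated requests can be sketched with the standard library alone. In the example below, `generate_feedback_uncached` is a hypothetical stand-in for the expensive model call, and the cache key is simply the pair of submitted code and student level.

```python
from functools import lru_cache


def generate_feedback_uncached(code: str, student_level: str) -> str:
    """Hypothetical stand-in for the expensive model call."""
    generate_feedback_uncached.calls += 1
    return f"feedback for {len(code)} chars at {student_level} level"


# Call counter, used only to show when the cache is hit.
generate_feedback_uncached.calls = 0


@lru_cache(maxsize=128)
def cached_feedback(code: str, student_level: str) -> str:
    """Return cached feedback for an identical (code, level) pair."""
    return generate_feedback_uncached(code, student_level)


cached_feedback("x = 1", "beginner")
cached_feedback("x = 1", "beginner")      # served from the cache
print(generate_feedback_uncached.calls)   # 1
```

One design consequence: with sampling enabled, a cache returns the first sampled answer for every identical request, which may or may not be the desired behaviour for a tutor.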
+## 📊 Monitoring
+
+### Usage Analytics
+- **View usage** in Space settings
+- **Monitor performance** with built-in metrics
+- **Track user engagement** through logs
+
+### Cost Management
+- **Free tier**: 16 hours/month GPU time
+- **Pro tier**: $9/month for unlimited GPU
+- **Enterprise**: Custom pricing
+
+## 🌐 Sharing Your App
+
+### Public Access
+1. **Set Space to public** in settings
+2. **Share the URL** with users
+3. **Add to HF Spaces showcase**
+
+### Embedding
+```html
+<iframe
+  src="https://YOUR_USERNAME-ai-programming-tutor.hf.space"
+  width="100%"
+  height="800px"
+  frameborder="0"
+></iframe>
+```
+
+## 🔒 Security Considerations
+
+1. **Input validation** for code submissions
+2. **Rate limiting** to prevent abuse
+3. **Content filtering** for inappropriate code
+4. **User authentication** (optional)
+
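Input validation for code submissions could start with a size cap and a parse-only syntax check. This is a minimal sketch: `validate_submission` and the `MAX_CODE_CHARS` limit are hypothetical, and `ast.parse` only parses the text without executing it.

```python
import ast

MAX_CODE_CHARS = 10_000  # hypothetical limit, tune for the deployment


def validate_submission(code: str) -> tuple[bool, str]:
    """Return (is_valid, message) for a submitted Python snippet."""
    if not code.strip():
        return False, "Submission is empty."
    if len(code) > MAX_CODE_CHARS:
        return False, f"Submission exceeds {MAX_CODE_CHARS} characters."
    try:
        ast.parse(code)  # parses only, never runs the code
    except SyntaxError as exc:
        return False, f"Syntax error on line {exc.lineno}: {exc.msg}"
    except ValueError as exc:  # e.g. null bytes on some Python versions
        return False, f"Invalid source text: {exc}"
    return True, "ok"


print(validate_submission("x = 1"))
print(validate_submission("def f(:"))
```

For a tutoring tool, a syntax error may be something to explain rather than reject, so the message could be passed on to the feedback step instead of blocking the request.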
+## 📈 Scaling
+
+### For High Traffic
+1. **Upgrade to Pro tier** for unlimited GPU
+2. **Implement caching** with Redis
+3. **Use load balancing** for multiple instances
+4. **Monitor performance** and optimize
+
+### For Production Use
+1. **Add user authentication**
+2. **Implement logging** and analytics
+3. **Set up monitoring** and alerts
+4. **Create backup** and recovery procedures
+
+## 🎉 Success!
+
+Your AI Programming Tutor is now live and accessible to students worldwide!
+
+**Next steps**:
+1. **Test thoroughly** with different code examples
+2. **Gather user feedback** and iterate
+3. **Share with your target audience**
+4. **Monitor usage** and improve based on data
+
+## 📞 Support
+
+- **Hugging Face Docs**: [huggingface.co/docs](https://huggingface.co/docs)
+- **Spaces Documentation**: [huggingface.co/docs/hub/spaces](https://huggingface.co/docs/hub/spaces)
+- **Community Forum**: [discuss.huggingface.co](https://discuss.huggingface.co)
src/README.md ADDED
@@ -0,0 +1,185 @@
+# 🎓 Generative AI for Programming Education
+
+## 🚀 Live Demo
+**Hugging Face Spaces**: [Coming Soon - Deploy using DEPLOYMENT.md guide]
+
+## 📋 Problem Statement
+Current programming education struggles with high dropout rates, inefficient feedback loops, and a lack of personalized learning, all of which are exacerbated by limited instructor bandwidth. While Generative AI (e.g., Copilot, ChatGPT) can help, most tools prioritize productivity over learning, offering code solutions without explanations or tailored guidance. This risks student over-reliance without deeper comprehension.
+
+## 🎯 Solution
+To address this gap, we fine-tuned **CodeLlama-7B** to provide structured, educational code feedback rather than just correct answers. Our model analyzes student code and delivers:
+
+- **Instant, actionable reviews** (e.g., "This loop can be optimized from O(n²) to O(n) using a hashmap")
+- **Beginner-friendly explanations** (e.g., "In Python, list.append() modifies the list in-place but returns None, which is why your print() shows None")
+- **Personalized adaptation** (e.g., adjusting feedback depth based on inferred skill level)
+
+Unlike generic AI tools, our system is explicitly designed for education, balancing correctness, pedagogy, and ethical safeguards against over-reliance.
+
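The `list.append()` explanation quoted in the Solution section can be reproduced in a few lines. The variable names here are illustrative only.

```python
numbers = [1, 2, 3]

# append() mutates the list in place and returns None,
# so assigning its result captures None rather than the list.
result = numbers.append(4)

print(result)   # None
print(numbers)  # [1, 2, 3, 4]
```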
+## ✨ Features
+
+### 🧠 **Fine-tuned CodeLlama-7B Model**
+- Trained on **code review** and **code feedback** datasets
+- **7B parameters** for comprehensive understanding
+- **Educational focus** rather than productivity optimization
+
+### 📊 **Progressive Learning Interface**
+- **5-stage educational process**:
+  1. **Code Analysis** - Strengths, weaknesses, issues
+  2. **Improvement Guide** - Step-by-step instructions
+  3. **Learning Points** - Key concepts and objectives
+  4. **Comprehension Quiz** - Test understanding
+  5. **Code Fix** - Improved solution (only after learning)
+
+### 🎓 **Educational Features**
+- **Student Level Adaptation** (Beginner/Intermediate/Advanced)
+- **Comprehension Questions** generated by the model
+- **Learning Objectives** for each feedback
+- **Step-by-step improvement guides**
+- **Algorithm complexity explanations**
+
+### 🛡️ **Ethical Safeguards**
+- **Progressive learning flow** prevents solution jumping
+- **Comprehension testing** before showing fixes
+- **Educational explanations** rather than quick answers
+- **Best practices promotion**
+
+## 🚀 **Hugging Face Spaces Deployment**
+
+### **Hardware Specifications**
+- **CPU**: 2 vCPU (virtual CPU cores)
+- **RAM**: 16 GB
+- **Plan**: FREE tier
+- **Storage**: Sufficient for model and application
+
+### **Optimization Features**
+- ✅ **16GB RAM optimization** for fine-tuned model
+- ✅ **CPU-only inference** (no GPU required)
+- ✅ **Memory management** with gradient checkpointing
+- ✅ **Demo mode** for immediate testing
+- ✅ **Progressive loading** with fallback options
+
+### **Performance Expectations**
+- **Demo Mode**: Instant response
+- **Fine-tuned Model**: 5-10 minutes initial loading
+- **Memory Usage**: Optimized for 16GB constraint
+- **Concurrent Users**: Limited by CPU cores
+
+## 🛠️ Installation & Setup
+
+### **Local Development**
+```bash
+# Clone the repository
+git clone https://github.com/TomoriFarouk/GenAI-For-Programming-Language.git
+cd GenAI-For-Programming-Language
+
+# Install dependencies
+pip install -r requirements.txt
+
+# Run the application
+streamlit run app.py
+```
+
+### **Hugging Face Spaces Deployment**
+Follow the detailed guide in `DEPLOYMENT.md` for step-by-step instructions.
+
+## 📁 Project Structure
+
+```
+GenAI-For-Programming-Language/
+├── app.py                 # Main Streamlit interface (HF Spaces optimized)
+├── fine.py                # Fine-tuned model integration
+├── config.py              # Configuration settings
+├── requirements.txt       # Dependencies
+├── README.md              # This file
+├── DEPLOYMENT.md          # HF Spaces deployment guide
+├── .gitignore             # Excludes model files
+├── .gitattributes         # File type configuration
+└── example_usage.py       # Usage examples
+```
+
+## 🧠 Model Architecture
+
+### **Base Model**
+- **CodeLlama-7B-Instruct-hf**
+- **7 billion parameters**
+- **Code-specific training**
+
+### **Fine-tuning Datasets**
+1. **Code Review Dataset**: Structured feedback on code quality
+2. **Code Feedback Dataset**: Educational explanations and improvements
+
+### **Training Process**
+- **LoRA fine-tuning** for efficiency
+- **Educational prompt engineering**
+- **Multi-stage feedback generation**
+
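The efficiency of LoRA comes from training two small low-rank matrices per adapted weight instead of the full matrix, which adds `rank * (d_in + d_out)` parameters per adapted matrix. The README does not state the adapter configuration, so every number below (hidden size 4096, 32 layers, rank 16, two adapted projections per layer) is an assumption chosen only to show the arithmetic.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    return rank * (d_in + d_out)


# All values below are assumptions for illustration, not the project's settings.
HIDDEN_SIZE = 4096
NUM_LAYERS = 32
RANK = 16
ADAPTED_PER_LAYER = 2  # e.g. two attention projections per layer

per_matrix = lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
total = per_matrix * ADAPTED_PER_LAYER * NUM_LAYERS

print(per_matrix)             # 131072
print(total)                  # 8388608
print(f"{total / 7e9:.2%}")   # 0.12%
```

Under these assumptions the adapters amount to roughly a tenth of a percent of a nominal 7e9-parameter base model.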
+## 🎯 Usage Examples
+
+### **Input Code**
+```python
+def find_duplicates(numbers):
+    x = []
+    for i in range(len(numbers)):
+        for j in range(i+1, len(numbers)):
+            if numbers[i] == numbers[j]:
+                x.append(numbers[i])
+    return x
+```
+
+### **Generated Feedback**
+1. **Analysis**: Identifies O(n²) complexity, poor variable naming
+2. **Improvement Guide**: Step-by-step optimization instructions
+3. **Learning Points**: Algorithm complexity, naming conventions
+4. **Quiz**: "What is the time complexity and how to improve it?"
+5. **Code Fix**: Optimized O(n) solution with better naming
+
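The O(n²) to O(n) improvement named in the feedback can be sketched side by side. The function names are hypothetical, and the set-based version is O(n) on average because set membership tests are constant time on average. Note that the two versions agree only when no value occurs more than twice.

```python
def find_duplicates_quadratic(numbers):
    """Nested-loop version: records one entry per matching pair."""
    duplicates = []
    for i in range(len(numbers)):
        for j in range(i + 1, len(numbers)):
            if numbers[i] == numbers[j]:
                duplicates.append(numbers[i])
    return duplicates


def find_duplicates_linear(numbers):
    """Set-based version: records one entry per repeat occurrence."""
    seen, duplicates = set(), []
    for num in numbers:
        if num in seen:
            duplicates.append(num)
        else:
            seen.add(num)
    return duplicates


sample = [1, 2, 3, 2, 4, 5, 3]
print(find_duplicates_quadratic(sample))     # [2, 3]
print(find_duplicates_linear(sample))        # [2, 3]
print(find_duplicates_quadratic([7, 7, 7]))  # [7, 7, 7]
print(find_duplicates_linear([7, 7, 7]))     # [7, 7]
```

The difference on `[7, 7, 7]` is a useful teaching point in itself: an optimization that changes output on some inputs is a behaviour change, not just a speed-up.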
+## 🔧 Configuration
+
+### **Model Settings**
+- **Path**: `./model` (for HF Spaces)
+- **Device**: CPU-optimized for 16GB RAM
+- **Memory**: Gradient checkpointing enabled
+
+### **Educational Settings**
+- **Student Levels**: Beginner, Intermediate, Advanced
+- **Feedback Types**: Syntax, Logic, Optimization, Style
+- **Learning Objectives**: Comprehensive programming concepts
+
+## 🚀 Performance
+
+### **Local Environment**
+- **GPU**: Recommended for faster inference
+- **RAM**: 16GB+ recommended
+- **Storage**: 30GB+ for model files
+
+### **Hugging Face Spaces**
+- **CPU**: 2 vCPU (sufficient for inference)
+- **RAM**: 16GB (optimized for this constraint)
+- **Loading Time**: 5-10 minutes for fine-tuned model
+- **Demo Mode**: Instant response
+
+## 🤝 Contributing
+
+1. Fork the repository
+2. Create a feature branch
+3. Make your changes
+4. Test thoroughly
+5. Submit a pull request
+
+## 📄 License
+
+This project is licensed under the MIT License - see the LICENSE file for details.
+
+## 🙏 Acknowledgments
+
+- **CodeLlama team** for the base model
+- **Hugging Face** for the Spaces platform
+- **Streamlit** for the web interface framework
+
+## 📞 Contact
+
+For questions or support, please open an issue on GitHub.
+
+---
+
+**🎓 Empowering programming education through AI-driven, structured learning experiences.**
src/app.py ADDED
@@ -0,0 +1,377 @@
+"""
+AI Programming Tutor - Hugging Face Spaces Deployment
+Comprehensive Educational Feedback System
+"""
+
+import json
+from fine import ProgrammingEducationAI, ComprehensiveFeedback
+import streamlit as st
+import torch
+import os
+import gc
+import warnings
+warnings.filterwarnings("ignore", category=UserWarning)
+
+# Environment setup for HF Spaces
+os.environ["TOKENIZERS_PARALLELISM"] = "false"
+os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
+os.environ["DATASETS_DISABLE_MULTIPROCESSING"] = "1"
+
+# Clear CUDA cache if available
+if torch.cuda.is_available():
+    torch.cuda.empty_cache()
+gc.collect()
+
+
+def main():
+    st.set_page_config(
+        page_title="AI Programming Tutor",
+        page_icon="🎓",
+        layout="wide",
+        initial_sidebar_state="expanded"
+    )
+
+    st.title("🎓 AI Programming Tutor")
+    st.subheader("Comprehensive Educational Feedback System")
+    st.markdown("---")
+
+    # Sidebar configuration
+    with st.sidebar:
+        st.header("⚙️ Configuration")
+
+        # Model selection
+        model_option = st.selectbox(
+            "Choose Model:",
+            ["Use Demo Mode", "Use Fine-tuned Model"],
+            help="Demo mode works immediately. Fine-tuned model requires loading (5-10 minutes on HF Spaces)."
+        )
+
+        # Student level selection
+        student_level = st.selectbox(
+            "Student Level:",
+            ["beginner", "intermediate", "advanced"],
+            help="Adjusts feedback complexity and learning objectives"
+        )
+
+        # Memory info for HF Spaces
+        if st.checkbox("Show System Info"):
+            import psutil
+            memory = psutil.virtual_memory()
+            st.metric("Available RAM",
+                      f"{memory.available / (1024**3):.1f} GB")
+            st.metric("RAM Usage", f"{memory.percent}%")
+            st.metric("CPU Cores", psutil.cpu_count())
+
+        # HF Spaces specific instructions
+        st.markdown("---")
+        st.markdown("### 🚀 Hugging Face Spaces")
+        st.info("""
+        **Hardware**: 2 vCPU, 16GB RAM (FREE)
+
+        **Recommendations**:
+        - Use Demo Mode for quick testing
+        - Fine-tuned model takes 5-10 minutes to load
+        - 16GB RAM is sufficient for your model
+        """)
+
+    # Main content area
+    col1, col2 = st.columns([1, 1])
+
+    with col1:
+        st.header("📝 Student Code Input")
+
+        # Code input
+        student_code = st.text_area(
+            "Paste your Python code here:",
+            height=300,
+            placeholder="""# Example code to test:
+def find_duplicates(numbers):
+    x = []
+    for i in range(len(numbers)):
+        for j in range(i+1, len(numbers)):
+            if numbers[i] == numbers[j]:
+                x.append(numbers[i])
+    return x
+
+# Test the function
+result = find_duplicates([1, 2, 3, 2, 4, 5, 3])
+print(result)""",
+            help="Paste your Python code here for analysis"
+        )
+
+        # Generate feedback button
+        if st.button("🎯 Generate Comprehensive Feedback", type="primary"):
+            if not student_code.strip():
+                st.warning("⚠️ Please enter some code first!")
+            else:
+                generate_feedback(student_code, student_level, model_option)
+
+    with col2:
+        st.header("📊 Feedback Results")
+
+        if 'feedback' in st.session_state:
+            display_feedback(st.session_state['feedback'])
+
+
+def generate_feedback(code: str, student_level: str, model_option: str):
+    """Generate comprehensive feedback using the AI tutor or demo mode"""
+    with st.spinner("🤖 Analyzing your code..."):
+        try:
+            if model_option == "Use Fine-tuned Model":
+                # Check if model is already loaded
+                if 'ai_tutor' not in st.session_state:
+                    with st.spinner("🚀 Loading fine-tuned model (this may take 5-10 minutes on HF Spaces)..."):
+                        try:
+                            # Use relative path for HF Spaces
+                            model_path = "./model"  # Will be updated when model is uploaded
+                            ai_tutor = ProgrammingEducationAI(model_path)
+                            ai_tutor.load_model()
+                            st.session_state['ai_tutor'] = ai_tutor
+                            st.success(
+                                "✅ Fine-tuned model loaded successfully!")
+                        except Exception as e:
+                            st.error(f"❌ Error loading model: {e}")
+                            st.info("💡 Switching to demo mode...")
+                            model_option = "Use Demo Mode"
+
+                if 'ai_tutor' in st.session_state:
+                    # Use fine-tuned model
+                    feedback = st.session_state['ai_tutor'].generate_comprehensive_feedback(
+                        code, student_level)
+                    st.session_state['feedback'] = feedback
+                    st.success("✅ Feedback generated using fine-tuned model!")
+                else:
+                    # Fallback to demo mode
+                    feedback = create_demo_feedback(code, student_level)
+                    st.session_state['feedback'] = feedback
+                    st.success("✅ Demo feedback generated as fallback!")
+            else:
+                # Demo mode
+                feedback = create_demo_feedback(code, student_level)
+                st.session_state['feedback'] = feedback
+                st.success("✅ Demo feedback generated!")
+        except Exception as e:
+            st.error(f"❌ Error generating feedback: {e}")
+            # Fallback to demo mode
+            feedback = create_demo_feedback(code, student_level)
+            st.session_state['feedback'] = feedback
+            st.success("✅ Demo feedback generated as fallback!")
+
+
+def create_demo_feedback(code: str, student_level: str) -> ComprehensiveFeedback:
+    """Create demo feedback for testing without model"""
+    return ComprehensiveFeedback(
+        code_snippet=code,
+        student_level=student_level,
+        strengths=[
+            "Your code has a clear structure and logic",
+            "You're using appropriate Python syntax",
+            "The function name is descriptive"
+        ],
+        weaknesses=[
+            "Variable names could be more descriptive",
+            "Missing comments explaining the logic",
+            "Could benefit from error handling"
+        ],
+        issues=[
+            "Using generic variable names (x, i, j)",
+            "No input validation",
+            "Nested loops could be optimized"
+        ],
+        step_by_step_improvement=[
+            "Step 1: Replace 'x' with 'duplicates' for better readability",
+            "Step 2: Add comments explaining the nested loop logic",
+            "Step 3: Consider using a set for O(n) time complexity",
+            "Step 4: Add input validation for edge cases"
+        ],
+        learning_points=[
+            "Good variable naming improves code readability and maintainability",
+            "Comments help others (and yourself) understand complex logic",
+            "Algorithm complexity matters - O(n²) vs O(n) can make a huge difference",
+            "Always consider edge cases and input validation"
+        ],
+        review_summary="Your code works correctly but could be improved with better naming, comments, and optimization. The logic is sound for a beginner level.",
+        comprehension_question="What is the time complexity of your current algorithm and how could you improve it?",
+        comprehension_answer="The current algorithm has O(n²) time complexity due to nested loops. It could be improved to O(n) using a hash set.",
+        explanation="Nested loops multiply their complexities. Using a set allows us to check for duplicates in O(1) time per element.",
+        improved_code="""def find_duplicates(numbers):
+    # Use a set for O(n) time complexity
+    duplicates = []
+    seen = set()
+
+    for num in numbers:
+        if num in seen:
+            duplicates.append(num)
+        else:
+            seen.add(num)
+
+    return duplicates
+
+# Test the function
+result = find_duplicates([1, 2, 3, 2, 4, 5, 3])
+print(result)""",
+        fix_explanation="The improved version uses a set to track seen numbers, reducing time complexity from O(n²) to O(n) and making the code more readable with better variable names.",
+        difficulty_level=student_level,
+        learning_objectives=["algorithm_complexity",
+                             "code_readability", "best_practices"],
+        estimated_time_to_improve="10-15 minutes"
+    )
+
+
+def display_feedback(feedback: ComprehensiveFeedback):
+    """Display comprehensive feedback in a progressive learning flow"""
+
+    # Initialize session state for tracking progress
+    if 'quiz_completed' not in st.session_state:
+        st.session_state['quiz_completed'] = False
+    if 'current_step' not in st.session_state:
+        st.session_state['current_step'] = 1
+
+    # Progress indicator
+    st.markdown("### 🎯 Learning Progress")
+    progress_bar = st.progress(0)
+
+    # Calculate progress based on current step
+    if st.session_state['current_step'] == 1:
+        progress_bar.progress(20)
+    elif st.session_state['current_step'] == 2:
+        progress_bar.progress(40)
+    elif st.session_state['current_step'] == 3:
+        progress_bar.progress(60)
+    elif st.session_state['current_step'] == 4:
+        progress_bar.progress(80)
+    elif st.session_state['current_step'] == 5:
+        progress_bar.progress(100)
+
+    # Step 1: Analysis (Always available)
+    if st.session_state['current_step'] >= 1:
+        st.markdown("### 📊 Step 1: Code Analysis")
+
+        col1, col2, col3 = st.columns(3)
+
+        with col1:
+            st.markdown("#### ✅ Strengths")
+            for i, strength in enumerate(feedback.strengths, 1):
+                st.markdown(f"**{i}.** {strength}")
+
+        with col2:
+            st.markdown("#### ❌ Weaknesses")
+            for i, weakness in enumerate(feedback.weaknesses, 1):
+                st.markdown(f"**{i}.** {weakness}")
+
+        with col3:
+            st.markdown("#### ⚠️ Issues")
+            for i, issue in enumerate(feedback.issues, 1):
+                st.markdown(f"**{i}.** {issue}")
+
+        st.markdown("#### 📋 Review Summary")
+        st.info(feedback.review_summary)
+
+        if st.session_state['current_step'] == 1:
+            if st.button("✅ I understand the analysis - Continue to Step 2", type="primary"):
+                st.session_state['current_step'] = 2
+                st.rerun()
+
+    # Step 2: Improvement Guide (Available after Step 1)
+    if st.session_state['current_step'] >= 2:
+        st.markdown("---")
+        st.markdown("### 📝 Step 2: Improvement Guide")
+
+        st.markdown("#### Step-by-Step Instructions")
+        for i, step in enumerate(feedback.step_by_step_improvement, 1):
+            st.markdown(f"**Step {i}:** {step}")
+
+        st.markdown("---")
+        st.markdown(
+            f"**⏱️ Estimated time to improve:** {feedback.estimated_time_to_improve}")
+
+        if st.session_state['current_step'] == 2:
+            if st.button("✅ I understand the improvement steps - Continue to Step 3", type="primary"):
+                st.session_state['current_step'] = 3
+                st.rerun()
+
+    # Step 3: Learning Points (Available after Step 2)
+    if st.session_state['current_step'] >= 3:
+        st.markdown("---")
+        st.markdown("### 🎓 Step 3: Learning Points")
+
+        st.markdown("#### Key Concepts to Understand")
+        for i, point in enumerate(feedback.learning_points, 1):
+            st.markdown(f"**{i}.** {point}")
+
+        st.markdown("---")
+        st.markdown("#### 🎯 Learning Objectives")
+        for objective in feedback.learning_objectives:
+            st.markdown(f"• {objective}")
+
+        if st.session_state['current_step'] == 3:
+            if st.button("✅ I understand the learning points - Continue to Step 4", type="primary"):
+                st.session_state['current_step'] = 4
+                st.rerun()
+
+    # Step 4: Comprehension Quiz (Available after Step 3)
+    if st.session_state['current_step'] >= 4:
+        st.markdown("---")
+        st.markdown("### ❓ Step 4: Comprehension Check")
+
+        st.markdown(
+            "**Before you see the solution, let's test your understanding:**")
+        st.markdown(f"**Question:** {feedback.comprehension_question}")
+
+        # Quiz interface
+        user_answer = st.text_area(
+            "Your answer:",
+            placeholder="Type your answer here...",
+            height=100,
+            key="quiz_answer"
+        )
+
+        if st.button("Check My Answer", type="primary"):
+            if user_answer.strip():
+                st.markdown("**Correct Answer:**")
+                st.success(feedback.comprehension_answer)
+                st.markdown("**Explanation:**")
+                st.info(feedback.explanation)
+
+                if not st.session_state['quiz_completed']:
+                    st.session_state['quiz_completed'] = True
+                    st.session_state['current_step'] = 5
+                    st.rerun()
+            else:
+                st.warning("Please provide an answer first!")
+
+    # Step 5: Code Fix (Only available after completing quiz)
+    if st.session_state['current_step'] >= 5 and st.session_state['quiz_completed']:
+        st.markdown("---")
+        st.markdown("### 🔧 Step 5: Improved Code Solution")
+
+        st.markdown(
+            "🎉 **Congratulations! You've completed the learning process. Here's the improved version:**")
+
+        st.markdown("#### 🔧 Enhanced Version")
+        st.code(feedback.improved_code, language="python")
+
+        st.markdown("#### 💡 What Changed")
+        st.info(feedback.fix_explanation)
+
+        # Reset button for new analysis
+        if st.button("🔄 Analyze New Code", type="secondary"):
+            st.session_state['current_step'] = 1
+            st.session_state['quiz_completed'] = False
+            if 'feedback' in st.session_state:
+                del st.session_state['feedback']
+            st.rerun()
+
+    # Display metadata
+    st.markdown("---")
+    col1, col2, col3 = st.columns(3)
+    with col1:
+        st.metric("Student Level", feedback.student_level.title())
+    with col2:
+        st.metric("Learning Objectives", len(feedback.learning_objectives))
+    with col3:
+        st.metric("Issues Found", len(feedback.issues))
+
+
+if __name__ == "__main__":
+    main()
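The step gating in `display_feedback` can be summarised without Streamlit: steps 1 to 4 unlock in order, and step 5 additionally requires the quiz to be completed. The sketch below is a standalone model of that logic. `progress_percent` and `visible_steps` are hypothetical helpers, and unlike the `if`/`elif` chain in the file, `progress_percent` clamps out-of-range steps instead of leaving the bar at 0.

```python
TOTAL_STEPS = 5


def progress_percent(current_step: int) -> int:
    """Map a step number to a progress value (20 per step), clamped to 1..5."""
    step = max(1, min(current_step, TOTAL_STEPS))
    return step * 100 // TOTAL_STEPS


def visible_steps(current_step: int, quiz_completed: bool) -> list[int]:
    """Steps that would be rendered for the given session state."""
    steps = list(range(1, min(current_step, 4) + 1))
    # The code fix is shown only once the quiz has been answered.
    if current_step >= 5 and quiz_completed:
        steps.append(5)
    return steps


print(progress_percent(3))        # 60
print(visible_steps(3, False))    # [1, 2, 3]
print(visible_steps(5, False))    # [1, 2, 3, 4]
print(visible_steps(5, True))     # [1, 2, 3, 4, 5]
```

Keeping this logic in pure functions would make the "no solution before the quiz" safeguard testable without starting the app.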
src/config.py ADDED
@@ -0,0 +1,126 @@
+"""
+Configuration file for the Generative AI Programming Education project
+"""
+
+import os
+from pathlib import Path
+
+# Model Configuration
+MODEL_CONFIG = {
+    # Path to your fine-tuned CodeLlama-7B model
+    "model_path": "./model",  # For Hugging Face Spaces deployment
+
+    # Model generation parameters
+    "max_new_tokens": 512,
+    "temperature": 0.7,
+    "do_sample": True,
+    "top_p": 0.9,
+    "top_k": 50,
+
+    # Input processing
+    "max_input_length": 2048,
+    "truncation": True,
+
+    # Device configuration
+    "device_map": "auto",
+    "torch_dtype": "float16",
+    "trust_remote_code": True
+}
+
+# Dataset Configuration (for reference)
+DATASET_CONFIG = {
+    "code_review_dataset": "path/to/your/code_review_dataset",
+    "code_feedback_dataset": "path/to/your/code_feedback_dataset",
+    "training_data_format": "json",  # or "csv", "txt"
+}
+
+# Educational Levels
+STUDENT_LEVELS = {
+    "beginner": {
+        "description": "Students new to programming",
+        "feedback_style": "explanatory",
+        "include_basics": True,
+        "complexity_threshold": "low"
+    },
+    "intermediate": {
+        "description": "Students with basic programming knowledge",
+        "feedback_style": "balanced",
+        "include_basics": False,
+        "complexity_threshold": "medium"
+    },
+    "advanced": {
+        "description": "Students with strong programming background",
+        "feedback_style": "technical",
+        "include_basics": False,
+        "complexity_threshold": "high"
+    }
+}
+
+# Feedback Types
+FEEDBACK_TYPES = [
+    "syntax",
+    "logic",
+    "optimization",
+    "style",
+    "explanation",
+    "comprehensive_review",
+    "educational_guidance"
+]
+
+# Learning Objectives
+LEARNING_OBJECTIVES = [
+    "syntax",
+    "basic_python",
+    "control_flow",
+    "loops",
+    "variables",
+    "code_cleanliness",
+    "algorithms",
+    "complexity",
+    "optimization",
+    "naming_conventions",
+    "readability",
+    "code_analysis",
+    "best_practices",
+    "learning",
+    "improvement"
+]
+
+# Logging Configuration
+LOGGING_CONFIG = {
+    "level": "INFO",
+    "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s",
+    "file": "programming_education_ai.log"
+}
+
+# Ethical Safeguards
+ETHICAL_CONFIG = {
+    "prevent_over_reliance": True,
+    "encourage_learning": True,
+    "provide_explanations": True,
+    "suggest_alternatives": True,
+    "promote_best_practices": True
+}
+
+
+def get_model_path():
+    """Get the model path from environment variable or config"""
+    return os.getenv("FINETUNED_MODEL_PATH", MODEL_CONFIG["model_path"])
+
+
+def validate_config():
+    """Validate the configuration settings"""
+    model_path = get_model_path()
+    if not os.path.exists(model_path):
+        print(f"Warning: Model path does not exist: {model_path}")
+        print("Please update the model_path in config.py or set FINETUNED_MODEL_PATH environment variable")
+        return False
+    return True
+
+
+if __name__ == "__main__":
+    print("Configuration loaded successfully!")
+    print(f"Model path: {get_model_path()}")
+    print(f"Student levels: {list(STUDENT_LEVELS.keys())}")
+    print(f"Feedback types: {FEEDBACK_TYPES}")
+    validate_config()
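`get_model_path` resolves the path from `FINETUNED_MODEL_PATH` first and falls back to the config default. A minimal standalone sketch of that lookup order follows. `resolve_model_path` is a hypothetical variant that takes the environment as an argument so it can be exercised without touching the real process environment, and `/data/tutor-model` is a made-up path.

```python
import os

DEFAULT_MODEL_PATH = "./model"  # mirrors MODEL_CONFIG["model_path"]


def resolve_model_path(env=None) -> str:
    """Prefer FINETUNED_MODEL_PATH from env, else use the config default."""
    env = os.environ if env is None else env
    return env.get("FINETUNED_MODEL_PATH", DEFAULT_MODEL_PATH)


print(resolve_model_path({}))
# ./model
print(resolve_model_path({"FINETUNED_MODEL_PATH": "/data/tutor-model"}))
# /data/tutor-model
```

In this commit `app.py` assigns `model_path = "./model"` directly, so the environment override would only take effect if the app called `get_model_path()` instead.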
src/example_usage.py ADDED
@@ -0,0 +1,186 @@
1
+ """
2
+ Example Usage of the Comprehensive Educational Feedback System
3
+ """
4
+
5
+ from fine import ProgrammingEducationAI
6
+ import json
7
+
8
+
9
+ def main():
10
+ print("🎓 Comprehensive Educational Feedback System")
11
+ print("=" * 60)
12
+
13
+ # Initialize the system
14
+ # Update this path to your actual fine-tuned model
15
+ model_path = r"C:\Users\farou\OneDrive - Aston University\finetunning"
16
+ ai_tutor = ProgrammingEducationAI(model_path)
17
+
18
+ try:
19
+ # Load the model
20
+ print("Loading fine-tuned model...")
21
+ ai_tutor.load_model()
22
+ print("✅ Model loaded successfully!")
23
+
24
+ # Example 1: Beginner student code
25
+ print("\n" + "="*60)
26
+ print("EXAMPLE 1: BEGINNER STUDENT")
27
+ print("="*60)
28
+
29
+ beginner_code = """
30
+ def find_duplicates(numbers):
31
+ x = []
32
+ for i in range(len(numbers)):
33
+ for j in range(i+1, len(numbers)):
34
+ if numbers[i] == numbers[j]:
35
+ x.append(numbers[i])
36
+ return x
37
+
38
+ result = find_duplicates([1, 2, 3, 2, 4, 5, 3])
39
+ print(result)
40
+ """
41
+
42
+ print("Student Code:")
43
+ print(beginner_code)
44
+
45
+ feedback = ai_tutor.generate_comprehensive_feedback(
46
+ beginner_code, "beginner")
47
+ display_comprehensive_feedback(feedback)
48
+
49
+ # Example 2: Intermediate student code
50
+ print("\n" + "="*60)
51
+ print("EXAMPLE 2: INTERMEDIATE STUDENT")
52
+ print("="*60)
53
+
54
+ intermediate_code = """
55
+ def fibonacci(n):
56
+ if n <= 1:
57
+ return n
58
+ return fibonacci(n-1) + fibonacci(n-2)
59
+
60
+ # Calculate first 10 Fibonacci numbers
61
+ for i in range(10):
62
+ print(fibonacci(i))
63
+ """
64
+
65
+ print("Student Code:")
66
+ print(intermediate_code)
67
+
68
+ feedback = ai_tutor.generate_comprehensive_feedback(
69
+ intermediate_code, "intermediate")
70
+ display_comprehensive_feedback(feedback)
71
+
72
+ # Example 3: Advanced student code
73
+ print("\n" + "="*60)
74
+ print("EXAMPLE 3: ADVANCED STUDENT")
75
+ print("="*60)
76
+
77
+ advanced_code = """
78
+ class DataProcessor:
79
+ def __init__(self, data):
80
+ self.data = data
81
+
82
+ def process(self):
83
+ result = []
84
+ for item in self.data:
85
+ if item > 0:
86
+ result.append(item * 2)
87
+ return result
88
+
89
+ processor = DataProcessor([1, -2, 3, -4, 5])
90
+ output = processor.process()
91
+ print(output)
92
+ """
93
+
94
+ print("Student Code:")
95
+ print(advanced_code)
96
+
97
+ feedback = ai_tutor.generate_comprehensive_feedback(
98
+ advanced_code, "advanced")
99
+ display_comprehensive_feedback(feedback)
100
+
101
+ except Exception as e:
102
+ print(f"❌ Error: {e}")
103
+ print(
104
+ "💡 Make sure to update the model_path to point to your actual fine-tuned model.")
105
+
106
+
107
+ def display_comprehensive_feedback(feedback):
108
+ """Display comprehensive feedback in a formatted way"""
109
+
110
+ print("\n📊 COMPREHENSIVE FEEDBACK")
111
+ print("-" * 40)
112
+
113
+ # Analysis
114
+ print("\n✅ STRENGTHS:")
115
+ for i, strength in enumerate(feedback.strengths, 1):
116
+ print(f" {i}. {strength}")
117
+
118
+ print("\n❌ WEAKNESSES:")
119
+ for i, weakness in enumerate(feedback.weaknesses, 1):
120
+ print(f" {i}. {weakness}")
121
+
122
+ print("\n⚠️ ISSUES:")
123
+ for i, issue in enumerate(feedback.issues, 1):
124
+ print(f" {i}. {issue}")
125
+
126
+ # Educational content
127
+ print("\n📝 STEP-BY-STEP IMPROVEMENT:")
128
+ for i, step in enumerate(feedback.step_by_step_improvement, 1):
129
+ print(f" Step {i}: {step}")
130
+
131
+ print("\n🎓 LEARNING POINTS:")
132
+ for i, point in enumerate(feedback.learning_points, 1):
133
+ print(f" {i}. {point}")
134
+
135
+ print(f"\n📋 REVIEW SUMMARY:")
136
+ print(f" {feedback.review_summary}")
137
+
138
+ # Interactive elements
139
+ print(f"\n❓ COMPREHENSION QUESTION:")
140
+ print(f" Q: {feedback.comprehension_question}")
141
+ print(f" A: {feedback.comprehension_answer}")
142
+ print(f" Explanation: {feedback.explanation}")
143
+
144
+ # Code fixes
145
+ print(f"\n🔧 IMPROVED CODE:")
146
+ print(feedback.improved_code)
147
+
148
+ print(f"\n💡 FIX EXPLANATION:")
149
+ print(f" {feedback.fix_explanation}")
150
+
151
+ # Metadata
152
+ print(f"\n📊 METADATA:")
153
+ print(f" Student Level: {feedback.student_level}")
154
+ print(f" Learning Objectives: {', '.join(feedback.learning_objectives)}")
155
+ print(
156
+ f" Estimated Time to Improve: {feedback.estimated_time_to_improve}")
157
+
158
+
159
+ def save_feedback_to_json(feedback, filename):
160
+ """Save feedback to JSON file for later analysis"""
161
+ feedback_dict = {
162
+ "code_snippet": feedback.code_snippet,
163
+ "student_level": feedback.student_level,
164
+ "strengths": feedback.strengths,
165
+ "weaknesses": feedback.weaknesses,
166
+ "issues": feedback.issues,
167
+ "step_by_step_improvement": feedback.step_by_step_improvement,
168
+ "learning_points": feedback.learning_points,
169
+ "review_summary": feedback.review_summary,
170
+ "comprehension_question": feedback.comprehension_question,
171
+ "comprehension_answer": feedback.comprehension_answer,
172
+ "explanation": feedback.explanation,
173
+ "improved_code": feedback.improved_code,
174
+ "fix_explanation": feedback.fix_explanation,
175
+ "learning_objectives": feedback.learning_objectives,
176
+ "estimated_time_to_improve": feedback.estimated_time_to_improve
177
+ }
178
+
179
+ with open(filename, 'w') as f:
180
+ json.dump(feedback_dict, f, indent=2)
181
+
182
+ print(f"💾 Feedback saved to {filename}")
183
+
184
+
185
+ if __name__ == "__main__":
186
+ main()
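`save_feedback_to_json` above lists every field by hand. Since the feedback objects are dataclasses, one possible alternative is `dataclasses.asdict`, sketched below. `MiniFeedback` and `feedback_to_json` are hypothetical names for this example, and the class is a trimmed stand-in rather than the real `ComprehensiveFeedback`.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class MiniFeedback:
    """Hypothetical, trimmed-down stand-in for ComprehensiveFeedback."""
    code_snippet: str
    student_level: str
    strengths: List[str] = field(default_factory=list)
    issues: List[str] = field(default_factory=list)


def feedback_to_json(feedback) -> str:
    """Serialize any dataclass instance to a JSON string, one key per field."""
    return json.dumps(asdict(feedback), indent=2, ensure_ascii=False)


sample = MiniFeedback("print('hi')", "beginner", strengths=["Runs without errors"])
restored = json.loads(feedback_to_json(sample))
```

The trade-off is that `asdict` serializes every field, so a field added to the dataclass later appears in the output without touching the save function.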
src/fine.py ADDED
@@ -0,0 +1,945 @@
1
+ """
2
+ Generative AI for Enhancing Programming Education
3
+ ================================================
4
+
5
+ This project implements a fine-tuned CodeLlama-7B model to provide structured,
6
+ educational code feedback for programming students.
7
+
8
+ Problem Statement:
9
+ - High dropout rates in programming education
10
+ - Inefficient feedback loops
11
+ - Lack of personalized learning
12
+ - Limited instructor bandwidth
13
+ - Current AI tools prioritize productivity over learning
14
+
15
+ Solution:
16
+ - Fine-tuned CodeLlama-7B for educational feedback
17
+ - Structured, actionable code reviews
18
+ - Beginner-friendly explanations
19
+ - Personalized adaptation based on skill level
20
+ - Educational focus with ethical safeguards
21
+
22
+ Author: [Your Name]
23
+ Date: [Current Date]
24
+ """
25
+
26
+ import re
27
+ from dataclasses import dataclass
28
+ from typing import Dict, List, Optional, Tuple
29
+ import logging
30
+ import json
31
+ from transformers import AutoTokenizer, AutoModelForCausalLM
32
+ import os
33
+ import gc
34
+ import torch
35
+ import warnings
36
+ warnings.filterwarnings("ignore", category=UserWarning)
37
+
38
+ # --- Environment setup (applied after the imports above, before any tokenizer or CUDA use) ---
39
+ os.environ["TOKENIZERS_PARALLELISM"] = "false"
40
+ os.environ["DATASETS_DISABLE_MULTIPROCESSING"] = "1"
41
+
42
+ # Clear any existing CUDA cache (only if CUDA is available)
43
+ if torch.cuda.is_available():
44
+ os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128,garbage_collection_threshold:0.6"
45
+ torch.cuda.empty_cache()
46
+ gc.collect()
47
+
48
+
49
+ # Configure logging
50
+ logging.basicConfig(level=logging.INFO)
51
+ logger = logging.getLogger(__name__)
52
+
53
+
54
+ def clear_cuda_cache():
55
+ """Clear CUDA cache and run garbage collection"""
56
+ if torch.cuda.is_available():
57
+ torch.cuda.empty_cache()
58
+ torch.cuda.synchronize()
59
+ gc.collect()
60
+
61
+
62
+ def get_system_memory():
63
+ """Get system memory information"""
64
+ try:
65
+ import psutil
66
+ memory = psutil.virtual_memory()
67
+ print(
68
+ f"System RAM: {memory.used / (1024**3):.1f}GB / {memory.total / (1024**3):.1f}GB used ({memory.percent:.1f}%)")
69
+ except Exception as e:
70
+ print(f"Could not get system memory info: {e}")
71
+
72
+
73
+ def get_gpu_memory():
74
+ """Get GPU memory information (if available)"""
75
+ if torch.cuda.is_available():
76
+ try:
77
+ import subprocess
78
+ result = subprocess.run(['nvidia-smi', '--query-gpu=memory.used,memory.total', '--format=csv,nounits,noheader'],
79
+ capture_output=True, text=True)
80
+ lines = result.stdout.strip().split('\n')
81
+ for i, line in enumerate(lines):
82
+ used, total = map(int, line.split(', '))
83
+ print(
84
+ f"GPU {i}: {used}MB / {total}MB used ({used/total*100:.1f}%)")
85
+ except Exception as e:
86
+ print(f"Could not get GPU memory info: {e}")
87
+ else:
88
+ print("No GPU available - using CPU only")
89
+
90
+
91
+ @dataclass
92
+ class CodeFeedback:
93
+ """Data structure for storing code feedback"""
94
+ code_snippet: str
95
+ feedback_type: str # 'syntax', 'logic', 'optimization', 'style', 'explanation'
96
+ feedback_message: str
97
+ suggested_improvement: Optional[str] = None
98
+ difficulty_level: str = 'beginner' # 'beginner', 'intermediate', 'advanced'
99
+ learning_objectives: Optional[List[str]] = None
100
+
101
+
102
+ @dataclass
103
+ class ComprehensiveFeedback:
104
+ """Comprehensive feedback structure with all educational components"""
105
+ code_snippet: str
106
+ student_level: str
107
+
108
+ # Analysis
109
+ strengths: List[str]
110
+ weaknesses: List[str]
111
+ issues: List[str]
112
+
113
+ # Educational content
114
+ step_by_step_improvement: List[str]
115
+ learning_points: List[str]
116
+ review_summary: str
117
+
118
+ # Interactive elements
119
+ comprehension_question: str
120
+ comprehension_answer: str
121
+ explanation: str
122
+
123
+ # Code fixes
124
+ improved_code: str
125
+ fix_explanation: str
126
+
127
+ # Metadata
128
+ difficulty_level: str
129
+ learning_objectives: List[str]
130
+ estimated_time_to_improve: str
131
+
132
+
133
+ class ProgrammingEducationAI:
134
+ """
135
+ Main class for the fine-tuned CodeLlama model for programming education
136
+ """
137
+
138
+ def __init__(self, model_path: str = "./model"):
139
+ """
140
+ Initialize the fine-tuned model and tokenizer
141
+
142
+ Args:
143
+ model_path: Path to your fine-tuned CodeLlama-7B model
144
+ """
145
+ self.model_path = model_path
146
+ self.tokenizer = None
147
+ self.model = None
148
+ self.feedback_templates = self._load_feedback_templates()
149
+ self.code_review_prompt_template = self._load_code_review_prompt()
150
+ self.code_feedback_prompt_template = self._load_code_feedback_prompt()
151
+ self.comprehensive_feedback_prompt = self._load_comprehensive_feedback_prompt()
152
+ self.comprehension_question_prompt = self._load_comprehension_question_prompt()
153
+ self.code_fix_prompt = self._load_code_fix_prompt()
154
+
155
+ def _load_code_review_prompt(self) -> str:
156
+ """Load the code review prompt template used during fine-tuning"""
157
+ return """You are an expert programming tutor. Review the following student code and provide educational feedback.
158
+
159
+ Student Code:
160
+ {code}
161
+
162
+ Student Level: {level}
163
+
164
+ Please provide:
165
+ 1. Syntax errors (if any)
166
+ 2. Logic errors (if any)
167
+ 3. Style improvements
168
+ 4. Optimization suggestions
169
+ 5. Educational explanations
170
+
171
+ Feedback:"""
172
+
173
+ def _load_code_feedback_prompt(self) -> str:
174
+ """Load the code feedback prompt template used during fine-tuning"""
175
+ return """You are a helpful programming tutor. The student has written this code:
176
+
177
+ {code}
178
+
179
+ Student Level: {level}
180
+
181
+ Provide constructive, educational feedback that helps the student learn. Focus on:
182
+ - What they did well
183
+ - What can be improved
184
+ - Why the improvement matters
185
+ - How to implement the improvement
186
+
187
+ Feedback:"""
188
+
189
+ def _load_feedback_templates(self) -> Dict[str, str]:
190
+ """Load predefined feedback templates for different scenarios"""
191
+ return {
192
+ "syntax_error": "I notice there's a syntax issue in your code. {error_description}. "
193
+ "Here's what's happening: {explanation}. "
194
+ "Try this correction: {suggestion}",
195
+
196
+ "logic_error": "Your code has a logical issue. {problem_description}. "
197
+ "The problem is: {explanation}. "
198
+ "Consider this approach: {suggestion}",
199
+
200
+ "optimization": "Your code works, but we can make it more efficient! "
201
+ "Current complexity: {current_complexity}. "
202
+ "Optimized version: {optimized_complexity}. "
203
+ "Here's how: {explanation}",
204
+
205
+ "style_improvement": "Great work! Here's a style tip: {tip}. "
206
+ "This makes your code more readable and maintainable.",
207
+
208
+ "concept_explanation": "Let me explain this concept: {concept}. "
209
+ "In simple terms: {simple_explanation}. "
210
+ "Example: {example}"
211
+ }
212
+
213
+ def load_model(self):
214
+ """Load the fine-tuned model and tokenizer using optimized settings"""
215
+ try:
216
+ logger.info(f"Loading fine-tuned model from {self.model_path}")
217
+
218
+ # Load tokenizer with proper settings
219
+ self.tokenizer = AutoTokenizer.from_pretrained(
220
+ self.model_path,
221
+ use_fast=True,
222
+ padding_side="right"
223
+ )
224
+
225
+ # Set padding token
226
+ if self.tokenizer.pad_token is None:
227
+ self.tokenizer.pad_token = self.tokenizer.eos_token
228
+ self.tokenizer.pad_token_id = self.tokenizer.eos_token_id
229
+
230
+ logger.info(
231
+ f"Tokenizer loaded - Vocab size: {len(self.tokenizer)}")
232
+
233
+ # Load model optimized for HF Spaces (16GB RAM, 2 vCPU)
234
+ print("Loading model optimized for HF Spaces (16GB RAM, 2 vCPU)...")
235
+ self.model = AutoModelForCausalLM.from_pretrained(
236
+ self.model_path,
237
+ torch_dtype=torch.float32,
238
+ device_map=None, # Force CPU for HF Spaces
239
+ low_cpu_mem_usage=True,
240
+ trust_remote_code=True,
241
+ offload_folder="offload" # Offload to disk if needed
242
+ )
243
+ # Note: gradient checkpointing saves memory during training only; it does not reduce inference memory
244
+ self.model.gradient_checkpointing_enable()
245
+
246
+ logger.info("Fine-tuned model loaded successfully")
247
+ logger.info(f"Model loaded on devices: {getattr(self.model, 'hf_device_map', 'cpu')}")
248
+
249
+ except Exception as e:
250
+ logger.error(f"Error loading fine-tuned model: {e}")
251
+ raise
252
+
253
+ def generate_code_review(self, code: str, student_level: str = "beginner") -> str:
254
+ """
255
+ Generate code review using the fine-tuned model
256
+
257
+ Args:
258
+ code: Student's code to review
259
+ student_level: Student's skill level
260
+
261
+ Returns:
262
+ Generated code review feedback
263
+ """
264
+ if not self.model or not self.tokenizer:
265
+ raise ValueError("Model not loaded. Call load_model() first.")
266
+
267
+ # Format the prompt using the template from fine-tuning
268
+ prompt = self.code_review_prompt_template.format(
269
+ code=code,
270
+ level=student_level
271
+ )
272
+
273
+ # Tokenize input
274
+ inputs = self.tokenizer(
275
+ prompt, return_tensors="pt", truncation=True, max_length=2048)
276
+
277
+ # Generate response
278
+ with torch.no_grad():
279
+ outputs = self.model.generate(
280
+ inputs.input_ids,
281
+ max_new_tokens=512,
282
+ temperature=0.7,
283
+ do_sample=True,
284
+ pad_token_id=self.tokenizer.eos_token_id
285
+ )
286
+
287
+ # Decode response
288
+ response = self.tokenizer.decode(outputs[0], skip_special_tokens=True)
289
+
290
+ # Extract only the generated part (after the prompt)
291
+ generated_text = self.tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True).strip()
292
+
293
+ return generated_text
294
+
295
+ def generate_educational_feedback(self, code: str, student_level: str = "beginner") -> str:
296
+ """
297
+ Generate educational feedback using the fine-tuned model
298
+
299
+ Args:
300
+ code: Student's code to provide feedback on
301
+ student_level: Student's skill level
302
+
303
+ Returns:
304
+ Generated educational feedback
305
+ """
306
+ if not self.model or not self.tokenizer:
307
+ raise ValueError("Model not loaded. Call load_model() first.")
308
+
309
+ # Format the prompt using the template from fine-tuning
310
+ prompt = self.code_feedback_prompt_template.format(
311
+ code=code,
312
+ level=student_level
313
+ )
314
+
315
+ # Tokenize input
316
+ inputs = self.tokenizer(
317
+ prompt, return_tensors="pt", truncation=True, max_length=2048)
318
+
319
+ # Generate response
320
+ with torch.no_grad():
321
+ outputs = self.model.generate(
322
+ inputs.input_ids,
323
+ max_new_tokens=512,
324
+ temperature=0.7,
325
+ do_sample=True,
326
+ pad_token_id=self.tokenizer.eos_token_id
327
+ )
328
+
329
+ # Decode response
330
+ response = self.tokenizer.decode(outputs[0], skip_special_tokens=True)
331
+
332
+ # Extract only the generated part (after the prompt)
333
+ generated_text = self.tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True).strip()
334
+
335
+ return generated_text
336
+
337
+ def analyze_student_code(self, code: str, student_level: str = "beginner") -> List[CodeFeedback]:
338
+ """
339
+ Analyze student code and provide educational feedback using the fine-tuned model
340
+
341
+ Args:
342
+ code: The student's code to analyze
343
+ student_level: Student's skill level ('beginner', 'intermediate', 'advanced')
344
+
345
+ Returns:
346
+ List of CodeFeedback objects
347
+ """
348
+ feedback_list = []
349
+
350
+ # Use fine-tuned model for comprehensive code review
351
+ try:
352
+ code_review = self.generate_code_review(code, student_level)
353
+ educational_feedback = self.generate_educational_feedback(
354
+ code, student_level)
355
+
356
+ # Create structured feedback from model output
357
+ feedback_list.append(CodeFeedback(
358
+ code_snippet=code,
359
+ feedback_type="comprehensive_review",
360
+ feedback_message=code_review,
361
+ difficulty_level=student_level,
362
+ learning_objectives=["code_analysis", "best_practices"]
363
+ ))
364
+
365
+ feedback_list.append(CodeFeedback(
366
+ code_snippet=code,
367
+ feedback_type="educational_guidance",
368
+ feedback_message=educational_feedback,
369
+ difficulty_level=student_level,
370
+ learning_objectives=["learning", "improvement"]
371
+ ))
372
+
373
+ except Exception as e:
374
+ logger.warning(
375
+ f"Fine-tuned model failed, falling back to rule-based analysis: {e}")
376
+ # Fallback to rule-based analysis if model fails
377
+ feedback_list = self._fallback_analysis(code, student_level)
378
+
379
+ return feedback_list
380
+
381
+ def _fallback_analysis(self, code: str, student_level: str) -> List[CodeFeedback]:
382
+ """Fallback analysis using rule-based methods if fine-tuned model fails"""
383
+ feedback_list = []
384
+
385
+ # Analyze syntax
386
+ syntax_feedback = self._check_syntax(code, student_level)
387
+ if syntax_feedback:
388
+ feedback_list.append(syntax_feedback)
389
+
390
+ # Analyze logic and structure
391
+ logic_feedback = self._check_logic(code, student_level)
392
+ if logic_feedback:
393
+ feedback_list.extend(logic_feedback)
394
+
395
+ # Check for optimization opportunities
396
+ optimization_feedback = self._check_optimization(code, student_level)
397
+ if optimization_feedback:
398
+ feedback_list.append(optimization_feedback)
399
+
400
+ # Provide style suggestions
401
+ style_feedback = self._check_style(code, student_level)
402
+ if style_feedback:
403
+ feedback_list.append(style_feedback)
404
+
405
+ return feedback_list
406
+
407
+ def _check_syntax(self, code: str, student_level: str) -> Optional[CodeFeedback]:
408
+ """Check for syntax errors and provide educational feedback"""
409
+ # This would integrate with the fine-tuned model
410
+ # For now, using basic pattern matching as placeholder
411
+
412
+ common_syntax_errors = {
413
+ r"while\s+[^:]+$": "Don't forget the colon after your while condition",
414
+ r"if\s+[^:]+$": "Don't forget the colon after your if condition",
415
+ r"for\s+[^:]+$": "Don't forget the colon after your for loop",
416
+ }
417
+
418
+ for pattern, message in common_syntax_errors.items():
419
+ if re.search(pattern, code):
420
+ return CodeFeedback(
421
+ code_snippet=code,
422
+ feedback_type="syntax",
423
+ feedback_message=message,
424
+ difficulty_level=student_level,
425
+ learning_objectives=["syntax", "basic_python"]
426
+ )
427
+
428
+ return None
429
+
430
+ def _check_logic(self, code: str, student_level: str) -> List[CodeFeedback]:
431
+ """Check for logical errors and provide educational feedback"""
432
+ feedback_list = []
433
+
434
+ # Check for infinite loops
435
+ if "while True:" in code and "break" not in code:
436
+ feedback_list.append(CodeFeedback(
437
+ code_snippet=code,
438
+ feedback_type="logic",
439
+ feedback_message="This while loop will run forever! Make sure to include a break statement or condition to exit the loop.",
440
+ difficulty_level=student_level,
441
+ learning_objectives=["control_flow", "loops"]
442
+ ))
443
+
444
+ # Check for unused variables
445
+ # This is a simplified check - the actual model would be more sophisticated
446
+ if "x = " in code and "x" not in code.replace("x = ", ""):
447
+ feedback_list.append(CodeFeedback(
448
+ code_snippet=code,
449
+ feedback_type="logic",
450
+ feedback_message="You created variable 'x' but didn't use it. Consider removing unused variables to keep your code clean.",
451
+ difficulty_level=student_level,
452
+ learning_objectives=["variables", "code_cleanliness"]
453
+ ))
454
+
455
+ return feedback_list
456
+
457
+ def _check_optimization(self, code: str, student_level: str) -> Optional[CodeFeedback]:
458
+ """Check for optimization opportunities"""
459
+ # Check for nested loops that could be optimized
460
+ if code.count("for") > 1 and code.count("in range") > 1:
461
+ return CodeFeedback(
462
+ code_snippet=code,
463
+ feedback_type="optimization",
464
+ feedback_message="You have nested loops here. Consider if you can optimize this to O(n) instead of O(n²).",
465
+ suggested_improvement="Use a hashmap or set to reduce complexity",
466
+ difficulty_level=student_level,
467
+ learning_objectives=["algorithms",
468
+ "complexity", "optimization"]
469
+ )
470
+
471
+ return None
472
+
473
+ def _check_style(self, code: str, student_level: str) -> Optional[CodeFeedback]:
474
+ """Check for style improvements"""
475
+ # Check for meaningful variable names
476
+ if re.search(r"\b[xyz]\b", code):
477
+ return CodeFeedback(
478
+ code_snippet=code,
479
+ feedback_type="style",
480
+ feedback_message="Consider using more descriptive variable names instead of x, y, z. This makes your code easier to understand.",
481
+ difficulty_level=student_level,
482
+ learning_objectives=["naming_conventions", "readability"]
483
+ )
484
+
485
+ return None
486
+
487
+ def generate_explanation(self, concept: str, student_level: str) -> str:
488
+ """
489
+ Generate explanations for programming concepts based on student level
490
+
491
+ Args:
492
+ concept: The concept to explain
493
+ student_level: Student's skill level
494
+
495
+ Returns:
496
+ Explanation tailored to the student's level
497
+ """
498
+ explanations = {
499
+ "variables": {
500
+ "beginner": "Variables are like labeled boxes where you store information. Think of 'name = \"John\"' as putting \"John\" in a box labeled 'name'.",
501
+ "intermediate": "Variables are memory locations that store data. They have a name, type, and value. Python is dynamically typed, so the type is inferred.",
502
+ "advanced": "Variables in Python are references to objects in memory. They're dynamically typed and use reference counting for memory management."
503
+ },
504
+ "loops": {
505
+ "beginner": "Loops repeat code multiple times. 'for' loops are great when you know how many times to repeat, 'while' loops when you don't.",
506
+ "intermediate": "Loops control program flow. 'for' iterates over sequences, 'while' continues until a condition is False. Consider time complexity.",
507
+ "advanced": "Loops are fundamental control structures. Python's 'for' is actually a foreach loop. Consider iterator patterns and generator expressions."
508
+ }
509
+ }
510
+
511
+ return explanations.get(concept, {}).get(student_level, f"Explanation for {concept} at {student_level} level")
512
+
513
+ def _load_comprehensive_feedback_prompt(self) -> str:
514
+ """Load the comprehensive feedback prompt template"""
515
+ return """You are an expert programming tutor. Provide comprehensive educational feedback for the following student code.
516
+
517
+ Student Code:
518
+ {code}
519
+
520
+ Student Level: {level}
521
+
522
+ Please provide a detailed analysis in the following JSON format:
523
+
524
+ {{
525
+ "strengths": ["strength1", "strength2", "strength3"],
526
+ "weaknesses": ["weakness1", "weakness2", "weakness3"],
527
+ "issues": ["issue1", "issue2", "issue3"],
528
+ "step_by_step_improvement": [
529
+ "Step 1: Description of first improvement",
530
+ "Step 2: Description of second improvement",
531
+ "Step 3: Description of third improvement"
532
+ ],
533
+ "learning_points": [
534
+ "Learning point 1: What the student should understand",
535
+ "Learning point 2: Key concept to grasp",
536
+ "Learning point 3: Best practice to follow"
537
+ ],
538
+ "review_summary": "A comprehensive review of the code highlighting key areas for improvement",
539
+ "learning_objectives": ["objective1", "objective2", "objective3"],
540
+ "estimated_time_to_improve": "5-10 minutes"
541
+ }}
542
+
543
+ Focus on educational value and constructive feedback that helps the student learn and improve."""
544
+
545
+ def _load_comprehension_question_prompt(self) -> str:
546
+ """Load the comprehension question generation prompt"""
547
+ return """Based on the learning points and improvements discussed, generate a comprehension question to test the student's understanding.
548
+
549
+ Learning Points: {learning_points}
550
+ Code Issues: {issues}
551
+ Student Level: {level}
552
+
553
+ Generate a question that tests understanding of the key concepts discussed. The question should be appropriate for the student's level.
554
+
555
+ Format your response as JSON:
556
+ {{
557
+ "question": "Your comprehension question here",
558
+ "answer": "The correct answer",
559
+ "explanation": "Detailed explanation of why this answer is correct"
560
+ }}
561
+
562
+ Make the question challenging but fair for the student's level."""
563
+
564
+ def _load_code_fix_prompt(self) -> str:
565
+ """Load the code fix generation prompt"""
566
+ return """You are an expert programming tutor. Based on the analysis and learning points, provide an improved version of the student's code.
567
+
568
+ Original Code:
569
+ {code}
570
+
571
+ Issues Identified: {issues}
572
+ Learning Points: {learning_points}
573
+ Student Level: {level}
574
+
575
+ Provide an improved version of the code that addresses the issues while maintaining educational value. Include comments to explain the improvements.
576
+
577
+ Format your response as JSON:
578
+ {{
579
+ "improved_code": "The improved code with comments",
580
+ "fix_explanation": "Detailed explanation of what was changed and why"
581
+ }}
582
+
583
+ Focus on educational improvements that help the student understand better practices."""
584
+
585
+ def adapt_feedback_complexity(self, feedback: CodeFeedback, student_level: str) -> CodeFeedback:
586
+ """
587
+ Adapt feedback complexity based on student level
588
+
589
+ Args:
590
+ feedback: Original feedback
591
+ student_level: Student's skill level
592
+
593
+ Returns:
594
+ Adapted feedback
595
+ """
596
+ if student_level == "beginner":
597
+ # Simplify language and add more examples
598
+ feedback.feedback_message = feedback.feedback_message.replace(
599
+ "O(n²)", "quadratic time (slower)"
600
+ ).replace(
601
+ "O(n)", "linear time (faster)"
602
+ )
603
+ elif student_level == "advanced":
604
+ # Add more technical details
605
+ if "optimization" in feedback.feedback_type:
606
+ feedback.feedback_message += " Consider the space-time tradeoff and cache locality."
607
+
608
+ return feedback
609
+
610
+    def generate_comprehensive_feedback(self, code: str, student_level: str = "beginner") -> ComprehensiveFeedback:
+        """
+        Generate comprehensive educational feedback with all components
+
+        Args:
+            code: Student's code to analyze
+            student_level: Student's skill level
+
+        Returns:
+            ComprehensiveFeedback object with all educational components
+        """
+        if not self.model or not self.tokenizer:
+            raise ValueError("Model not loaded. Call load_model() first.")
+
+        try:
+            # Step 1: Generate comprehensive analysis
+            comprehensive_analysis = self._generate_comprehensive_analysis(
+                code, student_level)
+
+            # Step 2: Generate comprehension question
+            comprehension_data = self._generate_comprehension_question(
+                comprehensive_analysis["learning_points"],
+                comprehensive_analysis["issues"],
+                student_level
+            )
+
+            # Step 3: Generate improved code
+            code_fix_data = self._generate_code_fix(
+                code,
+                comprehensive_analysis["issues"],
+                comprehensive_analysis["learning_points"],
+                student_level
+            )
+
+            # Create comprehensive feedback object
+            return ComprehensiveFeedback(
+                code_snippet=code,
+                student_level=student_level,
+                strengths=comprehensive_analysis["strengths"],
+                weaknesses=comprehensive_analysis["weaknesses"],
+                issues=comprehensive_analysis["issues"],
+                step_by_step_improvement=comprehensive_analysis["step_by_step_improvement"],
+                learning_points=comprehensive_analysis["learning_points"],
+                review_summary=comprehensive_analysis["review_summary"],
+                comprehension_question=comprehension_data["question"],
+                comprehension_answer=comprehension_data["answer"],
+                explanation=comprehension_data["explanation"],
+                improved_code=code_fix_data["improved_code"],
+                fix_explanation=code_fix_data["fix_explanation"],
+                difficulty_level=student_level,
+                learning_objectives=comprehensive_analysis["learning_objectives"],
+                estimated_time_to_improve=comprehensive_analysis["estimated_time_to_improve"]
+            )
+
+        except Exception as e:
+            logger.error(f"Error generating comprehensive feedback: {e}")
+            # Return a basic comprehensive feedback if model fails
+            return self._create_fallback_comprehensive_feedback(code, student_level)
+
+    def _generate_comprehensive_analysis(self, code: str, student_level: str) -> Dict:
+        """Generate comprehensive analysis using the fine-tuned model"""
+        prompt = self.comprehensive_feedback_prompt.format(
+            code=code,
+            level=student_level
+        )
+
+        response = self._generate_model_response(prompt)
+
+        try:
+            # Try to parse JSON response
+            import json
+            return json.loads(response)
+        except json.JSONDecodeError:
+            logger.warning("Failed to parse JSON response, using fallback")
+            return self._create_fallback_analysis(code, student_level)
+
+    def _generate_comprehension_question(self, learning_points: List[str], issues: List[str], student_level: str) -> Dict:
+        """Generate comprehension question using the fine-tuned model"""
+        prompt = self.comprehension_question_prompt.format(
+            learning_points=", ".join(learning_points),
+            issues=", ".join(issues),
+            level=student_level
+        )
+
+        response = self._generate_model_response(prompt)
+
+        try:
+            import json
+            return json.loads(response)
+        except json.JSONDecodeError:
+            logger.warning(
+                "Failed to parse comprehension question JSON, using fallback")
+            return {
+                "question": "What is the main concept you learned from this code review?",
+                "answer": "The main concept is understanding code structure and best practices.",
+                "explanation": "This question tests your understanding of the key learning points discussed."
+            }
+
+    def _generate_code_fix(self, code: str, issues: List[str], learning_points: List[str], student_level: str) -> Dict:
+        """Generate improved code using the fine-tuned model"""
+        prompt = self.code_fix_prompt.format(
+            code=code,
+            issues=", ".join(issues),
+            learning_points=", ".join(learning_points),
+            level=student_level
+        )
+
+        response = self._generate_model_response(prompt)
+
+        try:
+            import json
+            return json.loads(response)
+        except json.JSONDecodeError:
+            logger.warning("Failed to parse code fix JSON, using fallback")
+            return {
+                "improved_code": "# Improved version of your code\n# Add comments and improvements here",
+                "fix_explanation": "This is a fallback improved version. The model should provide specific improvements."
+            }
+
+    def _generate_model_response(self, prompt: str) -> str:
+        """Generate response from the fine-tuned model"""
+        inputs = self.tokenizer(
+            prompt, return_tensors="pt", truncation=True, max_length=2048)
+
+        # Move the inputs to whichever device holds the model (CPU or GPU).
+        # Rebuilding them as a plain dict would break attribute access below.
+        inputs = inputs.to(self.model.device)
+        prompt_length = inputs["input_ids"].shape[1]
+
+        with torch.no_grad():
+            outputs = self.model.generate(
+                input_ids=inputs["input_ids"],
+                attention_mask=inputs["attention_mask"],
+                max_new_tokens=512,
+                temperature=0.7,
+                do_sample=True,
+                pad_token_id=self.tokenizer.eos_token_id
+            )
+
+        # Decode only the newly generated tokens; slicing the decoded text by
+        # len(prompt) fails when decoding does not reproduce the prompt exactly
+        response = self.tokenizer.decode(
+            outputs[0][prompt_length:], skip_special_tokens=True)
+        return response.strip()
+
+    def _create_fallback_analysis(self, code: str, student_level: str) -> Dict:
+        """Create fallback analysis when model fails"""
+        return {
+            "strengths": ["Your code has a clear structure", "You're using appropriate data types"],
+            "weaknesses": ["Could improve variable naming", "Consider adding comments"],
+            "issues": ["Basic syntax and style issues"],
+            "step_by_step_improvement": [
+                "Step 1: Add descriptive variable names",
+                "Step 2: Include comments explaining your logic",
+                "Step 3: Consider code optimization"
+            ],
+            "learning_points": [
+                "Good variable naming improves code readability",
+                "Comments help others understand your code",
+                "Always consider efficiency in your solutions"
+            ],
+            "review_summary": "Your code works but could be improved with better practices.",
+            "learning_objectives": ["code_quality", "best_practices", "readability"],
+            "estimated_time_to_improve": "10-15 minutes"
+        }
+
+    def _create_fallback_comprehensive_feedback(self, code: str, student_level: str) -> ComprehensiveFeedback:
+        """Create fallback comprehensive feedback when model fails"""
+        fallback_analysis = self._create_fallback_analysis(code, student_level)
+
+        return ComprehensiveFeedback(
+            code_snippet=code,
+            student_level=student_level,
+            strengths=fallback_analysis["strengths"],
+            weaknesses=fallback_analysis["weaknesses"],
+            issues=fallback_analysis["issues"],
+            step_by_step_improvement=fallback_analysis["step_by_step_improvement"],
+            learning_points=fallback_analysis["learning_points"],
+            review_summary=fallback_analysis["review_summary"],
+            comprehension_question="What is the importance of good variable naming in programming?",
+            comprehension_answer="Good variable naming makes code more readable and maintainable.",
+            explanation="Descriptive variable names help other developers (and yourself) understand what the code does.",
+            improved_code="# Improved version\n# Add your improvements here",
+            fix_explanation="This is a fallback version. The model should provide specific improvements.",
+            difficulty_level=student_level,
+            learning_objectives=fallback_analysis["learning_objectives"],
+            estimated_time_to_improve=fallback_analysis["estimated_time_to_improve"]
+        )
+
+
+def main():
+    """Main function to demonstrate the system with fine-tuned model"""
+    print("Generative AI for Programming Education")
+    print("Using Fine-tuned CodeLlama-7B Model")
+    print("=" * 50)
+
+    # System information
+    print(f"Available GPUs: {torch.cuda.device_count()}")
+    if torch.cuda.is_available():
+        print("GPU Memory before loading:")
+        get_gpu_memory()
+    else:
+        print("System Memory before loading:")
+        get_system_memory()
+
+    # Initialize the system with your fine-tuned model path
+    # Update this path to point to your actual fine-tuned model
+    model_path = r"C:\Users\farou\OneDrive - Aston University\finetunning"
+    ai_tutor = ProgrammingEducationAI(model_path)
+
+    try:
+        # Load the fine-tuned model
+        print("Loading fine-tuned model...")
+        ai_tutor.load_model()
+        print("✓ Model loaded successfully!")
+
+        # Clear cache after loading
+        clear_cuda_cache()
+        if torch.cuda.is_available():
+            print("GPU Memory after loading:")
+            get_gpu_memory()
+        else:
+            print("System Memory after loading:")
+            get_system_memory()
+
+        # Example student code for testing
+        student_code = """
+def find_duplicates(numbers):
+    x = []
+    for i in range(len(numbers)):
+        for j in range(i+1, len(numbers)):
+            if numbers[i] == numbers[j]:
+                x.append(numbers[i])
+    return x
+
+# Test the function
+result = find_duplicates([1, 2, 3, 2, 4, 5, 3])
+print(result)
+"""
+
+        print(f"\nAnalyzing student code:\n{student_code}")
+
+        # Get feedback using fine-tuned model
+        feedback_list = ai_tutor.analyze_student_code(student_code, "beginner")
+
+        print("\n" + "="*50)
+        print("FINE-TUNED MODEL FEEDBACK:")
+        print("="*50)
+
+        for i, feedback in enumerate(feedback_list, 1):
+            print(f"\n{i}. {feedback.feedback_type.upper()}:")
+            print(f" {feedback.feedback_message}")
+            if feedback.suggested_improvement:
+                print(f" Suggestion: {feedback.suggested_improvement}")
+            print(
+                f" Learning objectives: {', '.join(feedback.learning_objectives)}")
+
+        # Demonstrate direct model calls
+        print("\n" + "="*50)
+        print("DIRECT MODEL GENERATION:")
+        print("="*50)
+
+        # Code review
+        print("\n1. CODE REVIEW:")
+        code_review = ai_tutor.generate_code_review(student_code, "beginner")
+        print(code_review)
+
+        # Educational feedback
+        print("\n2. EDUCATIONAL FEEDBACK:")
+        educational_feedback = ai_tutor.generate_educational_feedback(
+            student_code, "beginner")
+        print(educational_feedback)
+
+        # Demonstrate comprehensive feedback system
+        print("\n" + "="*50)
+        print("COMPREHENSIVE EDUCATIONAL FEEDBACK SYSTEM:")
+        print("="*50)
+
+        comprehensive_feedback = ai_tutor.generate_comprehensive_feedback(
+            student_code, "beginner")
+
+        # Display comprehensive feedback
+        print("\n📊 CODE ANALYSIS:")
+        print("="*30)
+
+        print("\n✅ STRENGTHS:")
+        for i, strength in enumerate(comprehensive_feedback.strengths, 1):
+            print(f" {i}. {strength}")
+
+        print("\n❌ WEAKNESSES:")
+        for i, weakness in enumerate(comprehensive_feedback.weaknesses, 1):
+            print(f" {i}. {weakness}")
+
+        print("\n⚠️ ISSUES:")
+        for i, issue in enumerate(comprehensive_feedback.issues, 1):
+            print(f" {i}. {issue}")
+
+        print("\n📝 STEP-BY-STEP IMPROVEMENT GUIDE:")
+        print("="*40)
+        for i, step in enumerate(comprehensive_feedback.step_by_step_improvement, 1):
+            print(f" Step {i}: {step}")
+
+        print("\n🎓 LEARNING POINTS:")
+        print("="*25)
+        for i, point in enumerate(comprehensive_feedback.learning_points, 1):
+            print(f" {i}. {point}")
+
+        print("\n📋 REVIEW SUMMARY:")
+        print("="*20)
+        print(f" {comprehensive_feedback.review_summary}")
+
+        print("\n❓ COMPREHENSION QUESTION:")
+        print("="*30)
+        print(f" Question: {comprehensive_feedback.comprehension_question}")
+        print(f" Answer: {comprehensive_feedback.comprehension_answer}")
+        print(f" Explanation: {comprehensive_feedback.explanation}")
+
+        print("\n🔧 IMPROVED CODE:")
+        print("="*20)
+        print(comprehensive_feedback.improved_code)
+
+        print("\n💡 FIX EXPLANATION:")
+        print("="*20)
+        print(f" {comprehensive_feedback.fix_explanation}")
+
+        print("\n📊 METADATA:")
+        print("="*15)
+        print(f" Student Level: {comprehensive_feedback.student_level}")
+        print(
+            f" Learning Objectives: {', '.join(comprehensive_feedback.learning_objectives)}")
+        print(
+            f" Estimated Time to Improve: {comprehensive_feedback.estimated_time_to_improve}")
+
+    except Exception as e:
+        print(f"Error: {e}")
+        print(
+            "Make sure to update the model_path variable to point to your fine-tuned model.")
+
+
+if __name__ == "__main__":
+    main()
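The `_generate_comprehensive_analysis`, `_generate_comprehension_question`, and `_generate_code_fix` helpers pass the raw model output straight to `json.loads`, so any surrounding prose or code fence in the response sends them to the fallback. A minimal sketch of a more tolerant parser follows. The helper name `extract_json_object` is hypothetical and not part of this commit; it assumes the model emits at most one useful JSON object per response.

```python
import json
from typing import Any, Dict, Optional


def extract_json_object(response: str) -> Optional[Dict[str, Any]]:
    """Return the first JSON object embedded in `response`, or None.

    Hypothetical helper: scans for each "{" and tries to decode a
    complete JSON value starting there, ignoring any text around it.
    """
    decoder = json.JSONDecoder()
    start = response.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at index `start`
            value, _ = decoder.raw_decode(response, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = response.find("{", start + 1)
    return None


if __name__ == "__main__":
    noisy = 'Here is the review:\n{"issues": ["naming"], "strengths": []}\nHope it helps!'
    print(extract_json_object(noisy))
```

A caller could fall back to the existing default dictionaries whenever this returns `None`, which keeps the current behaviour for responses that contain no JSON at all.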
src/requirements.txt ADDED
@@ -0,0 +1,37 @@
+# Core ML/AI dependencies
+torch>=2.0.0
+transformers>=4.30.0
+accelerate>=0.20.0
+bitsandbytes>=0.41.0
+
+# Data processing
+numpy>=1.24.0
+pandas>=2.0.0
+datasets>=2.12.0
+
+# Utilities
+tqdm>=4.65.0
+requests>=2.31.0
+python-dotenv>=1.0.0
+psutil>=5.9.0  # For system memory monitoring
+
+# Logging and monitoring
+wandb>=0.15.0  # Optional: for experiment tracking
+tensorboard>=2.13.0  # Optional: for training monitoring
+
+# Code analysis (optional enhancements)
+# ast is part of the Python standard library and needs no entry here
+black>=23.0.0  # For code formatting analysis
+pylint>=2.17.0  # For code quality analysis
+
+# Web interface (optional)
+flask>=2.3.0
+streamlit>=1.25.0  # For creating a web interface
+
+# Testing
+pytest>=7.4.0
+pytest-cov>=4.1.0
+
+# Development
+jupyter>=1.0.0
+ipython>=8.14.0
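The "Code analysis" group refers to `ast`, which ships with Python itself. As a rough illustration of the kind of static check it allows, the sketch below measures loop nesting depth in the `find_duplicates` sample from `fine.py`. The function name `max_loop_depth` is an assumption made for this example and does not appear in the commit.

```python
import ast


def max_loop_depth(source: str) -> int:
    """Return the deepest nesting level of for/while loops in `source`."""
    def depth(node: ast.AST, current: int) -> int:
        # Entering a loop node increases the nesting level by one
        if isinstance(node, (ast.For, ast.While, ast.AsyncFor)):
            current += 1
        deepest = current
        for child in ast.iter_child_nodes(node):
            deepest = max(deepest, depth(child, current))
        return deepest

    return depth(ast.parse(source), 0)


student_code = """
def find_duplicates(numbers):
    x = []
    for i in range(len(numbers)):
        for j in range(i+1, len(numbers)):
            if numbers[i] == numbers[j]:
                x.append(numbers[i])
    return x
"""

if __name__ == "__main__":
    print(max_loop_depth(student_code))  # prints 2
```

A depth of 2 or more could be one signal, among others, that a beginner's solution is quadratic and worth a comment about efficiency.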