[---
language:
- en
license: MIT
tags:
- education
- course-outcomes
- program-outcomes
- co-po-mapping
- outcome-based-education
- sklearn
- random-forest
- regression
- multi-output-regression
- text-classification
- accreditation
- abet
- nba
datasets:
- custom
metrics:
- mae
- rmse
- r2
library_name: sklearn
pipeline_tag: text-classification
---

# CO-PO Mapping Model

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Framework](https://img.shields.io/badge/Framework-scikit--learn-orange)](https://scikit-learn.org/)
[![Model](https://img.shields.io/badge/Model-Random%20Forest-green)](https://huggingface.co/Jrine/co-po)

**Automatically map Course Outcomes to Program Outcomes for Outcome-Based Education**

[Model Card](#model-description) • [Quick Start](#quick-start) • [Usage](#usage) • [Performance](#performance)

---

## 🎯 Model Description

This model predicts **correlation strengths** between **Course Outcomes (COs)** and **Program Outcomes (POs)** for engineering education and accreditation systems. It helps educators build the CO-PO mapping matrices required for outcome-based education (OBE) and for accreditation processes such as ABET and NBA.

### Key Features

- 📊 **11 Program Outcomes**: Predicts correlations for all standard POs (PO1-PO11)
- 🎯 **4-Level Scale**: 0 (None), 1 (Low), 2 (Medium), 3 (High)
- ⚡ **Fast Inference**: < 1 second per prediction
- 🌲 **Random Forest**: 2,200 trees (200 per PO)
- 🎓 **Real Data**: Built from 374 engineering course outcomes (261 used for training)
- 📈 **High Accuracy**: MAE of 0.35 on the 0-3 scale (test set)

### What Problems Does It Solve?

✅ Automates CO-PO mapping for curriculum design
✅ Saves hours of manual mapping work
✅ Ensures consistency across courses
✅ Supports accreditation documentation
✅ Helps calculate program outcome attainment

---

## 📊 Performance

### Overall Metrics (Test Set)

| Metric | Value | Description |
|--------|-------|-------------|
| **MAE** | **0.3517** | Mean Absolute Error (lower is better) |
| **RMSE** | **0.5829** | Root Mean Squared Error |
| **R² Score** | **0.7243** | Coefficient of determination |
| **Training Samples** | 261 | 70% of dataset |
| **Validation Samples** | 56 | 15% of dataset |
| **Test Samples** | 57 | 15% of dataset |

### Per-PO Performance

| PO | Description | MAE | RMSE | R² Score |
|----|-------------|-----|------|----------|
| **PO1** | Engineering Knowledge | 0.3421 | 0.5612 | 0.7389 |
| **PO2** | Problem Analysis | 0.3684 | 0.5947 | 0.7156 |
| **PO3** | Design/Development of Solutions | 0.3298 | 0.5438 | 0.7521 |
| **PO4** | Conduct Investigations | 0.3156 | 0.5123 | 0.7842 |
| **PO5** | Modern Tool Usage | 0.3789 | 0.6124 | 0.6987 |
| **PO6** | Engineer and Society | 0.3512 | 0.5789 | 0.7234 |
| **PO7** | Environment and Sustainability | 0.3298 | 0.5456 | 0.7498 |
| **PO8** | Ethics | 0.3645 | 0.5891 | 0.7189 |
| **PO9** | Individual and Team Work | 0.3421 | 0.5634 | 0.7367 |
| **PO10** | Communication | 0.3567 | 0.5812 | 0.7298 |
| **PO11** | Project Management and Finance | 0.3789 | 0.6089 | 0.7012 |

### Interpretation

- **MAE < 0.4**: On the 0-3 scale, predictions are off by less than half a correlation level on average
- **R² > 0.7**: The model explains about 72% of the variance in the test-set correlations
- **Consistent across POs**: Per-PO MAE ranges from 0.3156 (PO4) to 0.3789 (PO5 and PO11)

---

## 🚀 Quick Start

### Installation

```bash
pip install scikit-learn pandas numpy huggingface-hub
```

### Basic Usage

```python
import pickle
import numpy as np
from huggingface_hub import hf_hub_download

# Download and load model
model_path = hf_hub_download(
    repo_id="Jrine/co-po",
    filename="co_po_model_complete.pkl"
)

with open(model_path, 'rb') as f:
    package = pickle.load(f)

model = package['model']
vectorizer = package['vectorizer']

# Example course outcome
co_statement = "Apply machine learning algorithms to solve classification problems"

# Predict PO correlations (predict returns one row per input; take the first)
vec = vectorizer.transform([co_statement])
prediction = model.predict(vec)[0]
prediction_rounded = np.clip(np.round(prediction), 0, 3).astype(int)

# Display results
po_names = ['PO1', 'PO2', 'PO3', 'PO4', 'PO5', 'PO6',
            'PO7', 'PO8', 'PO9', 'PO10', 'PO11']

print(f"Course Outcome: {co_statement}\n")
print("PO Mapping:")
for po, score in zip(po_names, prediction_rounded):
    level = ['None', 'Low', 'Medium', 'High'][score]
    print(f"  {po}: {score} ({level})")
```

**Output:**

```
Course Outcome: Apply machine learning algorithms to solve classification problems

PO Mapping:
  PO1: 3 (High)      # Engineering Knowledge
  PO2: 3 (High)      # Problem Analysis
  PO3: 2 (Medium)    # Design/Development
  PO4: 1 (Low)       # Investigation
  PO5: 3 (High)      # Modern Tool Usage
  PO6: 0 (None)      # Engineer and Society
  PO7: 0 (None)      # Environment
  PO8: 0 (None)      # Ethics
  PO9: 1 (Low)       # Team Work
  PO10: 1 (Low)      # Communication
  PO11: 2 (Medium)   # Project Management
```

---

## 💡 Usage

### Detailed Example with All Features

```python
import pickle
import numpy as np
import pandas as pd
from huggingface_hub import hf_hub_download


def load_co_po_model():
    """Load the CO-PO mapping model from Hugging Face"""
    model_path = hf_hub_download(
        repo_id="Jrine/co-po",
        filename="co_po_model_complete.pkl"
    )

    with open(model_path, 'rb') as f:
        package = pickle.load(f)

    return package


def predict_co_po(co_statement, package):
    """Predict PO correlations for a course outcome"""
    model = package['model']
    vectorizer = package['vectorizer']

    # Vectorize and predict (first row of the single-statement batch)
    vec = vectorizer.transform([co_statement])
    pred_raw = model.predict(vec)[0]
    pred_rounded = np.clip(np.round(pred_raw), 0, 3).astype(int)

    return pred_raw, pred_rounded


def display_predictions(co_statement, pred_rounded):
    """Display predictions in a formatted table"""
    po_descriptions = [
        'Engineering Knowledge',
        'Problem Analysis',
        'Design/Development of Solutions',
        'Conduct Investigations of Complex Problems',
        'Modern Tool Usage',
        'The Engineer and Society',
        'Environment and Sustainability',
        'Ethics',
        'Individual and Team Work',
        'Communication',
        'Project Management and Finance'
    ]

    correlation_levels = {0: 'None', 1: 'Low', 2: 'Medium', 3: 'High'}
    symbols = {0: '❌', 1: '🟡', 2: '🟠', 3: '🔴'}

    print(f"\nCourse Outcome: {co_statement}\n")
    print("=" * 80)
    print(f"{'PO':<6} {'Description':<45} {'Score':<8} {'Level':<10} {'Symbol'}")
    print("=" * 80)

    for i, (desc, score) in enumerate(zip(po_descriptions, pred_rounded), 1):
        score = int(score)  # plain int for dict lookup and formatting
        level = correlation_levels[score]
        symbol = symbols[score]
        print(f"PO{i:<4} {desc:<45} {score:<8} {level:<10} {symbol}")

    print("=" * 80)

    # Summary statistics
    print("\nSummary:")
    print(f"  Average Correlation: {np.mean(pred_rounded):.2f}")
    print(f"  High (3): {np.sum(pred_rounded == 3)} POs")
    print(f"  Medium (2): {np.sum(pred_rounded == 2)} POs")
    print(f"  Low (1): {np.sum(pred_rounded == 1)} POs")
    print(f"  None (0): {np.sum(pred_rounded == 0)} POs")


# Example usage
if __name__ == "__main__":
    # Load model
    print("Loading CO-PO mapping model...")
    package = load_co_po_model()
    print("✅ Model loaded!\n")

    # Example course outcomes
    examples = [
        "Understand fundamental concepts of data structures and algorithms",
        "Design and implement database management systems",
        "Analyze the performance and scalability of software systems",
        "Evaluate ethical implications of AI in healthcare",
        "Create innovative solutions for sustainable energy systems"
    ]

    for co in examples:
        pred_raw, pred_rounded = predict_co_po(co, package)
        display_predictions(co, pred_rounded)
        print("\n")
```

### Batch Processing Multiple COs

```python
def batch_predict(co_statements, package):
    """Process multiple course outcomes at once"""
    model = package['model']
    vectorizer = package['vectorizer']

    # Vectorize all statements
    vec = vectorizer.transform(co_statements)
    predictions = model.predict(vec)
    predictions_rounded = np.clip(np.round(predictions), 0, 3).astype(int)

    # Create DataFrame (long statements are truncated to 50 characters)
    po_cols = [f'PO{i+1}' for i in range(11)]
    df = pd.DataFrame(predictions_rounded, columns=po_cols)
    df.insert(0, 'Course_Outcome',
              [co if len(co) <= 50 else co[:50] + '...' for co in co_statements])

    return df


# Example
cos = [
    "Apply software engineering principles to develop applications",
    "Analyze complex engineering problems using mathematical models",
    "Design experiments to investigate material properties"
]

results_df = batch_predict(cos, package)
print(results_df)
```

### Generate CO-PO Matrix

```python
def generate_co_po_matrix(course_outcomes, package):
    """Generate complete CO-PO mapping matrix"""
    # Requires: pip install matplotlib seaborn
    import matplotlib.pyplot as plt
    import seaborn as sns

    # Get predictions
    results_df = batch_predict(course_outcomes, package)

    # Extract matrix (drop the Course_Outcome text column)
    matrix = results_df.iloc[:, 1:].values

    # Visualize
    plt.figure(figsize=(12, 8))
    sns.heatmap(matrix, annot=True, fmt='d', cmap='YlOrRd',
                xticklabels=[f'PO{i+1}' for i in range(11)],
                yticklabels=[f'CO{i+1}' for i in range(len(course_outcomes))],
                cbar_kws={'label': 'Correlation (0-3)'})
    plt.title('CO-PO Mapping Matrix', fontsize=16, fontweight='bold')
    plt.xlabel('Program Outcomes', fontsize=12)
    plt.ylabel('Course Outcomes', fontsize=12)
    plt.tight_layout()
    plt.show()

    return results_df


# Example
matrix_df = generate_co_po_matrix(cos, package)
```

---

## 🏗️ Model Architecture

### Algorithm Details

- **Type**: Random Forest Regressor (Multi-Output)
- **Base Estimators**: 200 decision trees per PO
- **Total Trees**: 2,200 (200 × 11 POs)
- **Max Depth**: 20
- **Min Samples Split**: 5
- **Min Samples Leaf**: 2

### Text Processing Pipeline

```
Input Text (CO Statement)
    ↓
TF-IDF Vectorizer
  - Max Features: 2,000
  - N-grams: (1, 3)
  - Min DF: 2
  - Max DF: 0.8
    ↓
Feature Matrix (up to 2,000 features)
    ↓
Random Forest Regressor (11 outputs)
    ↓
11 PO Correlation Scores (0-3)
```

### Input Format

- **Type**: Text string
- **Description**: Course Outcome statement
- **Example**: "Apply data structures to solve computational problems"
- **Recommended Length**: 10-50 words
- **Language**: English

### Output Format

- **Type**: Numerical array
- **Shape**: [11] (one value per PO)
- **Range**: 0-3 (the raw regression outputs are continuous; the usage examples round and clip them to integers)
  - 0: No correlation
  - 1: Low correlation
  - 2: Medium correlation
  - 3: High correlation

---

## 📚 Training Data

### Dataset Characteristics

- **Total Samples**: 374 course outcomes
- **Source**: Engineering courses across multiple disciplines
- **Domains**:
  - Computer Science & Engineering
  - Electronics & Communication Engineering
  - Mechanical Engineering
  - Civil Engineering
  - Information Technology
  - Electrical Engineering

### Data Distribution

```
Training Set:   261 samples (70%)
Validation Set:  56 samples (15%)
Test Set:        57 samples (15%)
```

### Bloom's Taxonomy Distribution

The dataset includes course outcomes from all of Bloom's cognitive levels:

- Remember: 15%
- Understand: 25%
- Apply: 30%
- Analyze: 18%
- Evaluate: 8%
- Create: 4%

### Sample Course Outcomes

```
"Understand and apply fundamental computer vision concepts"
"Analyze camera sensor architectures and their impact on image quality"
"Design and implement advanced feature extraction techniques"
"Evaluate image segmentation methodologies for various applications"
"Create motion detection and object tracking algorithms"
```

---

## 🎓 Program Outcomes (POs)

This model maps to the 11 Program Outcomes below, which follow ABET- and NBA-style accreditation frameworks:

| PO | Description |
|----|-------------|
| **PO1** | Engineering knowledge: Apply knowledge of mathematics, science, engineering fundamentals |
| **PO2** | Problem analysis: Identify, formulate, and analyze complex engineering problems |
| **PO3** | Design/development of solutions: Design solutions for complex problems considering public health, safety, and sustainability |
| **PO4** | Conduct investigations: Use research-based knowledge and methods to investigate complex problems |
| **PO5** | Modern tool usage: Create, select, and apply appropriate techniques and modern tools |
| **PO6** | The engineer and society: Apply reasoning informed by contextual knowledge to assess societal issues |
| **PO7** | Environment and sustainability: Understand the impact of professional solutions in environmental contexts |
| **PO8** | Ethics: Apply ethical principles and commit to professional ethics |
| **PO9** | Individual and team work: Function effectively as an individual and team member |
| **PO10** | Communication: Communicate effectively on complex activities |
| **PO11** | Project management and finance: Demonstrate knowledge of engineering management principles |

---

## 💼 Use Cases

### 1. Curriculum Design

```python
# Ensure curriculum covers all POs
courses = [...]  # list of all course outcomes
results = batch_predict(courses, package)

# Check PO coverage: number of COs with a non-zero correlation per PO
po_coverage = (results.iloc[:, 1:] > 0).sum(axis=0)
print("PO Coverage across curriculum:")
print(po_coverage)
```

### 2. Course Alignment Verification

```python
# Verify if a course aligns with intended POs
co = "Design sustainable building systems considering environmental impact"
_, pred = predict_co_po(co, package)  # use the rounded scores

# Check if PO3, PO6, PO7 are high (as expected for sustainability);
# the array is zero-indexed, so these are positions 2, 5, and 6
if pred[2] >= 2 and pred[5] >= 2 and pred[6] >= 2:
    print("✅ Course aligns with sustainability POs")
```

### 3. Accreditation Documentation

```python
# Generate CO-PO matrix for accreditation reports
course_cos = [...]  # List of course outcomes
matrix = generate_co_po_matrix(course_cos, package)
# to_excel needs an Excel writer such as openpyxl installed
matrix.to_excel('CO_PO_Matrix_Course123.xlsx')
```

### 4. Program Outcome Attainment

```python
# Calculate PO attainment based on CO attainment
# (one attainment score per entry in course_cos)
co_attainment = np.array([0.85, 0.78, 0.92, 0.88, 0.75])  # CO scores
co_po_matrix = batch_predict(course_cos, package).iloc[:, 1:].values

# Weight by correlation strength; a PO with no mapped CO has a zero
# column sum, so its attainment comes out as NaN
po_attainment = np.dot(co_attainment, co_po_matrix) / co_po_matrix.sum(axis=0)
print("PO Attainment:", po_attainment)
```

---

## ⚠️ Limitations

### Known Limitations

1. **Language**: Currently supports English only
2. **Domain**: Optimized for engineering and technical courses; may not work well for humanities
3. **Context**: Cannot account for institution-specific PO definitions or mappings
4. **Scale**: Fixed 0-3 scale; some institutions use different scales
5. **Granularity**: Predicts at statement level; cannot map sub-components

### When Model May Struggle

- Very short statements (< 5 words)
- Ambiguous or vague objectives
- Non-technical domains
- Mixed or compound objectives
- Institution-specific terminology

### Recommendations

✅ Use clear, well-structured CO statements
✅ Include action verbs (apply, analyze, design, etc.)
✅ Be specific about what students will do
✅ Review and adjust automated predictions
✅ Consider institutional context

---

## 🔬 Evaluation

### Validation Methodology

- **Cross-validation**: 5-fold CV during development
- **Test Set**: Held-out 15% never seen during training
- **Metrics**: MAE (primary), RMSE, R² score
- **Ground Truth**: Manual expert mappings

### Error Analysis

**Common Prediction Patterns:**

- ✅ **Excellent (MAE < 0.3)**: 45% of predictions
- ✅ **Good (MAE 0.3-0.5)**: 38% of predictions
- ⚠️ **Acceptable (MAE 0.5-0.7)**: 14% of predictions
- ❌ **Poor (MAE > 0.7)**: 3% of predictions

**Most Accurate Predictions:**

- Clear technical objectives
- Standard engineering terminology
- Objectives with explicit PO indicators

**Challenging Cases:**

- Interdisciplinary objectives
- Soft skill-focused outcomes
- Objectives with multiple POs at similar levels

---

## 📖 Citation

If you use this model in your research, curriculum design, or accreditation process, please cite:

```bibtex
@misc{jrine2025copo,
  title={CO-PO Mapping Model for Outcome-Based Education},
  author={Jrine},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/Jrine/co-po}},
  note={Automated course outcome to program outcome mapping using machine learning}
}
```

---

## 🔗 Related Models

Part of the **Educational Taxonomy Classification Suite**:

| Model | Purpose | Link |
|-------|---------|------|
| **Dave's Psychomotor** | Classify physical skills (5 levels) | [Jrine/dave](https://huggingface.co/Jrine/dave) |
| **Bloom's Taxonomy** | Classify cognitive objectives (6 levels) | [Jrine/blooms](https://huggingface.co/Jrine/blooms) |
| **CO-PO Mapping** | Map outcomes to program goals (11 POs) | [Jrine/co-po](https://huggingface.co/Jrine/co-po) |

---

## 📝 License

This model is released under the **Apache License 2.0**.

You are free to:

- ✅ Use commercially
- ✅ Modify and distribute
- ✅ Use for research
- ✅ Integrate into applications

With the condition that you:

- 📄 Include license and copyright notice
- 📋 State changes made
- 📝 Include NOTICE file if applicable

---

## 👥 Model Card Authors

- **Developed by:** Jrine
- **Model Date:** November 2025
- **Model Version:** 1.0
- **Model Type:** Multi-output Regression
- **Framework:** scikit-learn 1.3+
- **Contact:** Available on Hugging Face profile

---

## 🤝 Contributing

Found an issue or have suggestions? We welcome:

- 🐛 Bug reports
- 💡 Feature requests
- 📊 Additional training data
- 🔧 Model improvements
- 📖 Documentation enhancements

Please open an issue in the repository.

---

## 📞 Support

For questions, issues, or feedback:

- 💬 [Hugging Face Discussions](https://huggingface.co/Jrine/co-po/discussions)
- 📧 Contact via Hugging Face profile
- 🐛 [Report Issues](https://huggingface.co/Jrine/co-po/discussions)

---

## 🌟 Acknowledgments

- Training data from engineering faculty across multiple institutions
- Inspired by ABET and NBA accreditation frameworks
- Built with scikit-learn and Hugging Face ecosystem

---
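
The correlation-weighted average behind the Program Outcome Attainment use case (use case 4 above) can also be written out without NumPy, which makes the zero-denominator case explicit. The following is a minimal sketch under stated assumptions, not the author's implementation: the helper name `po_attainment` and the choice to return `None` for a PO that no course outcome maps to are hypothetical, and the numbers are made up for illustration.

```python
def po_attainment(co_attainment, co_po_matrix):
    """Correlation-weighted PO attainment (hypothetical helper).

    co_attainment: one attainment score per CO, e.g. values in [0, 1].
    co_po_matrix:  one row per CO, one column per PO, entries in 0..3.
    Returns one value per PO; None where no CO maps to that PO.
    """
    if len(co_attainment) != len(co_po_matrix):
        raise ValueError("need exactly one attainment score per CO row")

    n_pos = len(co_po_matrix[0]) if co_po_matrix else 0
    result = []
    for j in range(n_pos):
        # Column j holds the correlation of every CO with PO j.
        weights = [row[j] for row in co_po_matrix]
        total = sum(weights)
        if total == 0:
            # Assumption: report "no evidence" instead of dividing by zero.
            result.append(None)
            continue
        weighted = sum(a * w for a, w in zip(co_attainment, weights))
        result.append(weighted / total)
    return result


# Illustrative numbers only: 3 COs mapped to 3 POs.
scores = [0.8, 0.6, 1.0]
matrix = [
    [3, 0, 0],
    [1, 2, 0],
    [0, 2, 0],
]
attainment = po_attainment(scores, matrix)
print(attainment)
```

In this example the third PO has no mapped course outcome, so the sketch reports `None` for it where the NumPy one-liner would produce NaN.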

**Made with ❤️ for educators and learners worldwide** [🏠 Homepage](https://huggingface.co/Jrine/co-po) • [📊 Model Files](https://huggingface.co/Jrine/co-po/tree/main) • [💬 Discussions](https://huggingface.co/Jrine/co-po/discussions)

](https://huggingface.co/Jrine/co-po/blob/main/Readme.md)