# ML Notebooks Execution Guide
This directory contains machine learning notebooks for the Cyber Forge AI platform. Follow this guide to run the notebooks in the correct order for optimal results.
## Prerequisites
Before running any notebooks, ensure you have:
1. **Python Environment**: Python 3.9+ installed
2. **Dependencies**: Install all required packages:
```bash
cd ../
pip install -r requirements.txt
```
3. **Jupyter**: Install Jupyter Notebook or JupyterLab:
```bash
pip install jupyter jupyterlab
```
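
Before launching anything, it can help to confirm the prerequisites from inside Python. The following is a minimal sketch, not part of the project: the `REQUIRED_MODULES` list is a placeholder assumption and should be replaced with the import names of the packages listed in `requirements.txt`.

```python
import importlib.util
import sys

# Hypothetical list: replace with the import names from requirements.txt.
REQUIRED_MODULES = ["jupyter", "jupyterlab"]


def python_is_supported(version_info=sys.version_info, minimum=(3, 9)):
    """Return True when the interpreter meets the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.9+ is required, found", sys.version.split()[0])
    for name in missing_modules(REQUIRED_MODULES):
        print("Missing package:", name)
```

`find_spec` only looks a module up without importing it, so the check stays fast even for heavy packages.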
## Execution Order
Run the files in this order. The first step is required by the others, and the last step integrates the models produced by the earlier ones:
### 1. **Basic AI Agent Training**
**File**: `ai_agent_training.py`
**Purpose**: Initial AI agent setup and basic training
**Runtime**: ~10-15 minutes
**Description**:
- Sets up the foundational AI agent
- Installs core dependencies programmatically
- Provides basic communication and cybersecurity skills
- **RUN THIS FIRST** - Required for other notebooks
```bash
# Path is relative to the repository root; skip this if you are already in the notebooks directory
cd ml-services/notebooks
python ai_agent_training.py
```
### 2. **Advanced Cybersecurity ML Training**
**File**: `advanced_cybersecurity_ml_training.ipynb`
**Purpose**: Comprehensive ML model training for threat detection
**Runtime**: ~30-45 minutes
**Description**:
- Data preparation and feature engineering
- Multiple ML model training (Random Forest, XGBoost, Neural Networks)
- Model evaluation and comparison
- Production model deployment preparation
```bash
jupyter notebook advanced_cybersecurity_ml_training.ipynb
```
### 3. **Network Security Analysis**
**File**: `network_security_analysis.ipynb`
**Purpose**: Network-specific security analysis and monitoring
**Runtime**: ~20-30 minutes
**Description**:
- Network traffic analysis
- Intrusion detection model training
- Port scanning detection
- Network anomaly detection
```bash
jupyter notebook network_security_analysis.ipynb
```
### 4. **Comprehensive AI Agent Training**
**File**: `ai_agent_comprehensive_training.ipynb`
**Purpose**: Advanced AI agent with full capabilities
**Runtime**: ~45-60 minutes
**Description**:
- Enhanced communication skills
- Web scraping and threat intelligence
- Real-time monitoring capabilities
- Natural language processing for security analysis
- **RUN LAST** - Integrates all previous models
```bash
jupyter notebook ai_agent_comprehensive_training.ipynb
```
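
The four steps above can also be run unattended. The sketch below is one possible approach, not the project's own tooling: it assumes the file names listed in this guide, runs the script with the current interpreter, and executes each notebook headlessly with `jupyter nbconvert --execute`. The helper names `build_command` and `run_all` are illustrative.

```python
import subprocess
import sys

# File names are taken from this guide; the order matches sections 1-4.
STEPS = [
    "ai_agent_training.py",
    "advanced_cybersecurity_ml_training.ipynb",
    "network_security_analysis.ipynb",
    "ai_agent_comprehensive_training.ipynb",
]


def build_command(path):
    """Return the command line that runs one step without opening a browser."""
    if path.endswith(".ipynb"):
        # nbconvert runs every cell top to bottom; --inplace writes the
        # outputs back into the same notebook file.
        return ["jupyter", "nbconvert", "--to", "notebook",
                "--execute", "--inplace", path]
    return [sys.executable, path]


def run_all(steps=STEPS, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for path in steps:
        command = build_command(path)
        print(" ".join(command))
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)  # switch to dry_run=False to actually execute
```

Stopping at the first failure matters here because the later steps depend on what the earlier ones produce.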
## Expected Outputs
After running all notebooks, you should have:
1. **Trained Models**: Saved in `../models/` directory
2. **Performance Metrics**: Evaluation reports and visualizations
3. **AI Agent**: Fully trained agent ready for deployment
4. **Configuration Files**: Model configs for production use
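
To confirm that the runs actually produced artifacts, you can list what landed in the models directory. This is a small sketch that assumes only the `../models/` location named above; the file names and formats inside it are not specified in this guide, so the function simply reports whatever it finds.

```python
from pathlib import Path


def list_artifacts(models_dir="../models"):
    """Return sorted relative paths of all files under models_dir.

    Returns an empty list when the directory does not exist yet.
    """
    root = Path(models_dir)
    if not root.is_dir():
        return []
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )


if __name__ == "__main__":
    artifacts = list_artifacts()
    if not artifacts:
        print("No files found in ../models yet")
    for name in artifacts:
        print(name)
```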
## Troubleshooting
### Common Issues:
**Memory Errors**:
- Reduce batch size in deep learning models
- Close other applications to free RAM
- Consider using smaller datasets for testing
**Package Installation Failures**:
- Update pip: `pip install --upgrade pip`
- Use conda if pip fails: `conda install <package>`
- Check Python version compatibility
**CUDA/GPU Issues**:
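- For TensorFlow GPU: Install CUDA 11.8+ and cuDNN, using the versions your TensorFlow release was built against
- For CPU-only: Models will run slower but still work
- Check GPU availability: `tf.config.list_physical_devices('GPU')` (the older `tf.test.is_gpu_available()` is deprecated)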
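
A minimal sketch of the GPU check, using `tf.config.list_physical_devices`, which replaces the deprecated `tf.test.is_gpu_available()`. The wrapper name `gpu_device_names` is illustrative; it returns an empty list both when TensorFlow is not installed and when no GPU is visible, so the CPU-only path needs no special handling.

```python
def gpu_device_names():
    """Return the names of GPUs visible to TensorFlow, or [] if there are none."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed, so no GPU can be used through it.
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    names = gpu_device_names()
    if names:
        print("GPUs available:", ", ".join(names))
    else:
        print("No GPU detected; training will fall back to CPU")
```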
**Data Download Issues**:
- Ensure internet connection for Kaggle datasets
- Set up Kaggle API credentials if needed
- Some notebooks include fallback synthetic data generation
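
If a download fails, a quick first check is whether Kaggle credentials are present at all. The sketch below looks in the places the Kaggle API client reads from: the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, or a `kaggle.json` file in `~/.kaggle/` (or in the directory named by `KAGGLE_CONFIG_DIR`). The function name and its `env`/`home` parameters are illustrative, added so the check can be exercised without touching your real settings.

```python
import os
from pathlib import Path


def kaggle_credentials_available(env=None, home=None):
    """Return True when Kaggle API credentials appear to be configured."""
    env = os.environ if env is None else env
    # Option 1: credentials supplied through environment variables.
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return True
    # Option 2: a kaggle.json file in the configured or default directory.
    config_dir = env.get("KAGGLE_CONFIG_DIR")
    if config_dir:
        base = Path(config_dir)
    else:
        base = Path(home or Path.home()) / ".kaggle"
    return (base / "kaggle.json").is_file()


if __name__ == "__main__":
    if kaggle_credentials_available():
        print("Kaggle credentials found")
    else:
        print("No Kaggle credentials found; downloads may fail")
```

This only checks that credentials exist, not that they are valid.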
## Notes
- **First Run**: Initial execution takes longer due to package installation and data downloads
- **Subsequent Runs**: Much faster as dependencies are cached
- **Customization**: Modify hyperparameters in notebooks for different results
- **Production**: Use the saved models in the main application
## Next Steps
After completing all notebooks:
1. **Deploy Models**: Copy trained models to production environment
2. **Integration**: Connect models with the desktop application
3. **Monitoring**: Set up model performance monitoring
4. **Updates**: Retrain models with new data periodically
## Support
If you encounter issues:
1. Check the troubleshooting section above
2. Verify all prerequisites are met
3. Review notebook outputs for specific error messages
4. Create an issue in the repository with error details
---
**Happy Training!**