# Component 1: Project Setup (Windows + RTX 4060 8GB)
## What This Component Does
- Creates a clean folder structure for the full coding-assistant project.
- Sets up a Python virtual environment.
- Installs all core dependencies needed across Components 2-10.
- Verifies that Python, PyTorch, CUDA visibility, and key libraries work.
## Folder Structure Created
- `data/raw` -> raw datasets you will provide later
- `data/interim` -> temporary cleaned data
- `data/processed` -> training-ready tokenized data
- `data/external` -> any third-party resources
- `src/tokenizer` -> Component 2 code tokenizer
- `src/dataset_pipeline` -> Component 3 preprocessing pipeline
- `src/model_architecture` -> Component 4 transformer code
- `src/training_pipeline` -> Component 5 training loop
- `src/evaluation_system` -> Component 6 evaluation code
- `src/inference_engine` -> Component 7 inference code
- `src/chat_interface` -> Component 8 Gradio interface
- `src/finetuning_system` -> Component 9 LoRA fine-tuning
- `src/export_optimization` -> Component 10 quantization/export tools
- `configs` -> config files for all components
- `scripts` -> setup, verification, and utility scripts
- `tests` -> quick checks for each component
- `checkpoints` -> model checkpoints saved during training
- `models/base` -> base trained model files
- `models/lora` -> LoRA adapters
- `models/quantized` -> optimized quantized models
- `artifacts` -> generated reports, metrics, and outputs
- `logs` -> training and runtime logs
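
The layout above can be reproduced with a few lines of Python. This is a minimal sketch for illustration, not the contents of `scripts\setup_windows_environment.ps1`; the names `PROJECT_DIRS` and `create_project_folders` are hypothetical and do not come from the project.

```python
from pathlib import Path

# Folder list copied from the structure described above.
PROJECT_DIRS = [
    "data/raw", "data/interim", "data/processed", "data/external",
    "src/tokenizer", "src/dataset_pipeline", "src/model_architecture",
    "src/training_pipeline", "src/evaluation_system", "src/inference_engine",
    "src/chat_interface", "src/finetuning_system", "src/export_optimization",
    "configs", "scripts", "tests", "checkpoints",
    "models/base", "models/lora", "models/quantized",
    "artifacts", "logs",
]


def create_project_folders(root: Path) -> list[Path]:
    """Create every project folder under `root`; safe to run repeatedly."""
    created = []
    for rel in PROJECT_DIRS:
        path = root / rel
        # parents=True builds intermediate folders such as `data` or `src`;
        # exist_ok=True makes a second run a no-op instead of an error.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `create_project_folders(Path.cwd())` from the project root would create any missing folders and leave existing ones untouched.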
## Exact Commands To Run (in this order)
Run from:
`D:\Desktop 31st Jan 2026\MIND-AI-MODEL`
0. Install Python 3.11 (required for package compatibility):
- Download page: https://www.python.org/downloads/release/python-3119/
- Windows installer file: `python-3.11.9-amd64.exe`
- During install, check: `Add python.exe to PATH`
1. Allow script execution for this terminal only:
```powershell
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
```
2. If you already attempted setup once, remove the old virtual environment first:
```powershell
if (Test-Path .\.venv) { Remove-Item -Recurse -Force .\.venv }
```
3. Create the folders and virtual environment, then install dependencies:
```powershell
.\scripts\setup_windows_environment.ps1
```
4. Activate the virtual environment:
```powershell
.\.venv\Scripts\Activate.ps1
```
5. Verify setup:
```powershell
python .\scripts\verify_component1_setup.py
```
## Expected Verification Result
- Prints Python version
- Prints PyTorch version
- Shows whether CUDA is available
- Shows GPU name if available
- Confirms critical libraries import correctly
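
For orientation, the checks listed above could look roughly like the sketch below. It is not the actual `verify_component1_setup.py`; the helper names `check_import` and `run_checks` are hypothetical, and the list of critical libraries is left to the caller because this README does not enumerate them.

```python
import importlib
import platform


def check_import(module_name: str) -> tuple[bool, str]:
    """Try to import a module; return (ok, version or error message)."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # DLL load failures are not always ImportError
        return False, f"{type(exc).__name__}: {exc}"
    return True, str(getattr(module, "__version__", "unknown"))


def run_checks(critical_libraries: list[str]) -> dict[str, tuple[bool, str]]:
    """Print a report in the order listed above and return the raw results."""
    print(f"Python version: {platform.python_version()}")

    torch_ok, torch_info = check_import("torch")
    print(f"PyTorch version: {torch_info if torch_ok else 'not importable'}")
    if torch_ok:
        import torch

        cuda_ok = torch.cuda.is_available()
        print(f"CUDA available: {cuda_ok}")
        if cuda_ok:
            print(f"GPU name: {torch.cuda.get_device_name(0)}")

    # Import each library once and report success or the error that occurred.
    results = {name: check_import(name) for name in critical_libraries}
    for name, (ok, info) in results.items():
        print(f"{'OK  ' if ok else 'FAIL'} {name}: {info}")
    return results
```

A call such as `run_checks(["numpy", "gradio"])` (library names chosen only as an example) prints one line per library, which makes it easy to see which import failed.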
Note:
- `codebleu` is excluded from base install on Windows due to a `tree-sitter` dependency conflict on Python 3.11.
  - Component 6 will use Windows-stable evaluation metrics and add code-quality checks without breaking setup.
- `bitsandbytes` is optional on native Windows because some CUDA/driver combinations fail to load its DLL.
  - Base setup and all early components continue without it.
  - For Component 5, we will:
    - try `bitsandbytes` if available, and
    - automatically fall back to a stable optimizer on your machine if it is not.
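
The `bitsandbytes` fallback described above might be sketched as follows. This is an assumption about how Component 5 could behave, not its actual code: the function names, the default learning rate, and the choice of `torch.optim.AdamW` as the "stable optimizer" are all hypothetical.

```python
def pick_optimizer_backend() -> str:
    """Return "bitsandbytes" if it imports cleanly, otherwise "torch"."""
    try:
        import bitsandbytes  # noqa: F401
    except Exception:
        # On native Windows a failed DLL load may not surface as ImportError,
        # so every exception is treated as "not available".
        return "torch"
    return "bitsandbytes"


def build_optimizer(params, lr: float = 3e-4):
    """Build an 8-bit AdamW when possible, else plain torch.optim.AdamW.

    `lr` is an illustrative default, not a value taken from the project.
    """
    if pick_optimizer_backend() == "bitsandbytes":
        import bitsandbytes as bnb

        return bnb.optim.AdamW8bit(params, lr=lr)

    import torch

    return torch.optim.AdamW(params, lr=lr)
```

With a model in hand, `build_optimizer(model.parameters())` would return whichever optimizer the machine supports. A successful import does not guarantee that the 8-bit kernels run on a given GPU, so a fuller implementation may also want to guard the first optimizer step.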
If verification fails, copy the full terminal output and share it with me.