	# Component 1: Project Setup (Windows + RTX 4060 8GB)

	## What This Component Does
	- Creates a clean folder structure for the full coding-assistant project.
	- Sets up a Python virtual environment.
	- Installs all core dependencies needed across Components 2-10.
	- Verifies that Python, PyTorch, CUDA visibility, and key libraries work.

	## Folder Structure Created
	- `data/raw` -> raw datasets you will provide later
	- `data/interim` -> temporary cleaned data
	- `data/processed` -> training-ready tokenized data
	- `data/external` -> any third-party resources
	- `src/tokenizer` -> Component 2 code tokenizer
	- `src/dataset_pipeline` -> Component 3 preprocessing pipeline
	- `src/model_architecture` -> Component 4 transformer code
	- `src/training_pipeline` -> Component 5 training loop
	- `src/evaluation_system` -> Component 6 evaluation code
	- `src/inference_engine` -> Component 7 inference code
	- `src/chat_interface` -> Component 8 Gradio interface
	- `src/finetuning_system` -> Component 9 LoRA fine-tuning
	- `src/export_optimization` -> Component 10 quantization/export tools
	- `configs` -> config files for all components
	- `scripts` -> setup, verification, and utility scripts
	- `tests` -> quick checks for each component
	- `checkpoints` -> model checkpoints saved during training
	- `models/base` -> base trained model files
	- `models/lora` -> LoRA adapters
	- `models/quantized` -> optimized quantized models
	- `artifacts` -> generated reports, metrics, and outputs
	- `logs` -> training and runtime logs
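
The project's own setup script is `scripts\setup_windows_environment.ps1`, and its contents are not shown here. As a rough illustration only, the folder layout listed above could be created with a few lines of Python. The names `PROJECT_DIRS` and `create_project_dirs` are hypothetical and not part of the project.

```python
from pathlib import Path

# Relative folders taken from the "Folder Structure Created" list.
PROJECT_DIRS = [
    "data/raw", "data/interim", "data/processed", "data/external",
    "src/tokenizer", "src/dataset_pipeline", "src/model_architecture",
    "src/training_pipeline", "src/evaluation_system", "src/inference_engine",
    "src/chat_interface", "src/finetuning_system", "src/export_optimization",
    "configs", "scripts", "tests", "checkpoints",
    "models/base", "models/lora", "models/quantized",
    "artifacts", "logs",
]


def create_project_dirs(root):
    """Create every project folder under `root` and return the created paths.

    Hypothetical helper: safe to call repeatedly because existing
    folders are left untouched (exist_ok=True).
    """
    root = Path(root)
    created = []
    for relative in PROJECT_DIRS:
        path = root / relative
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `create_project_dirs(".")` from the project root would produce the same tree; running it a second time changes nothing.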

	## Exact Commands To Run (in this order)
	Run from:
	`D:\Desktop 31st Jan 2026\MIND-AI-MODEL`

	0. Install Python 3.11 (required for package compatibility):
	- Download page: https://www.python.org/downloads/release/python-3119/
	- Windows installer file: `python-3.11.9-amd64.exe`
	- During install, check: `Add python.exe to PATH`

	1. Allow script execution for this terminal only:
	```powershell
	Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
	```

	2. If you already attempted setup once, remove the old virtual environment first:
	```powershell
	if (Test-Path .\.venv) { Remove-Item -Recurse -Force .\.venv }
	```

	3. Create the folders and virtual environment, then install dependencies:
	```powershell
	.\scripts\setup_windows_environment.ps1
	```

	4. Activate virtual environment:
	```powershell
	.\.venv\Scripts\Activate.ps1
	```

	5. Verify setup:
	```powershell
	python .\scripts\verify_component1_setup.py
	```
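
Two things the steps above depend on are easy to get wrong: the interpreter must be Python 3.11 (step 0), and the virtual environment must be active before verifying (step 4). A minimal sketch of a pre-flight check is shown below. It is an assumption about what such a check might look like, not the project's actual script, and `preflight_problems` is a hypothetical name.

```python
import sys

REQUIRED_VERSION = (3, 11)  # from step 0 of the setup instructions


def preflight_problems(version_info=None, prefix=None, base_prefix=None):
    """Return a list of human-readable problems; an empty list means OK.

    Arguments default to the running interpreter's values and can be
    overridden, which keeps the function easy to test.
    """
    version_info = sys.version_info if version_info is None else version_info
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix

    problems = []
    if tuple(version_info[:2]) != REQUIRED_VERSION:
        found = ".".join(str(part) for part in version_info[:2])
        problems.append(f"Python 3.11 is required, found {found}")
    # Inside a venv, sys.prefix points at the venv and differs from sys.base_prefix.
    if prefix == base_prefix:
        problems.append("virtual environment is not active")
    return problems


if __name__ == "__main__":
    for problem in preflight_problems():
        print("WARNING:", problem)
```

An empty result means both conditions hold for the interpreter that ran the check.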

	## Expected Verification Result
	- Prints Python version
	- Prints PyTorch version
	- Shows whether CUDA is available
	- Shows GPU name if available
	- Confirms critical libraries import correctly
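
The contents of `verify_component1_setup.py` are not shown here, so the following is only a sketch of how checks like the ones listed above might be written. The function names and the library list (`torch` and `gradio`, both mentioned elsewhere in this README) are assumptions; the real script may check a different set.

```python
import importlib
import platform

# Assumed list for illustration; the real script may check other libraries.
ASSUMED_LIBRARIES = ("torch", "gradio")


def check_imports(names):
    """Try to import each library and map its name to a version or a failure note."""
    results = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # ImportError, or OSError when a DLL fails to load
            results[name] = f"FAILED ({type(exc).__name__})"
        else:
            results[name] = str(getattr(module, "__version__", "ok"))
    return results


def cuda_summary():
    """Describe CUDA visibility through PyTorch without crashing if torch is missing."""
    try:
        import torch
    except Exception:
        return "PyTorch not importable"
    if not torch.cuda.is_available():
        return "CUDA not available"
    return f"CUDA available: {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print("Python", platform.python_version())
    for name, status in check_imports(ASSUMED_LIBRARIES).items():
        print(f"{name}: {status}")
    print(cuda_summary())
```

Catching broad exceptions is deliberate in this sketch: on Windows a broken native library can raise something other than `ImportError`, and a verification report is more useful when it lists every failure instead of stopping at the first one.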

	Note:
	- `codebleu` is excluded from the base install on Windows because of a `tree-sitter` dependency conflict on Python 3.11.
	- Component 6 will use Windows-stable evaluation metrics and add code-quality checks without breaking setup.
	- `bitsandbytes` is optional on native Windows because some CUDA/driver combinations fail to load its DLL. Base setup and all early components continue without it.
	- For Component 5, we will:
	  - try `bitsandbytes` if available, and
	  - automatically fall back to a stable optimizer on your machine if it is not.
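
The fallback described in the note can be sketched as follows. This is a hedged illustration, not the Component 5 implementation: `can_import` and `build_optimizer` are hypothetical names, and the choice of `torch.optim.AdamW` as the "stable optimizer" is an assumption, since the README does not name one.

```python
import importlib


def can_import(module_name):
    """Return True only if the module imports without any error.

    A broad except is used on purpose: a DLL load failure on Windows
    may surface as OSError or RuntimeError, not just ImportError.
    """
    try:
        importlib.import_module(module_name)
    except Exception:
        return False
    return True


def build_optimizer(params, lr=3e-4):
    """Prefer the 8-bit AdamW from bitsandbytes, else fall back to PyTorch's AdamW.

    The learning rate default is a placeholder, not a recommended value.
    """
    if can_import("bitsandbytes"):
        import bitsandbytes as bnb
        return bnb.optim.AdamW8bit(params, lr=lr)
    import torch  # assumed fallback: plain AdamW from PyTorch
    return torch.optim.AdamW(params, lr=lr)
```

With this shape, the training code would call `build_optimizer(model.parameters())` and never need to know which backend was selected.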

	If verification fails, copy the full terminal output and share it with me.