# KerdosAI Quick Start Guide

## Installation

### Using pip (Recommended)

```bash
# Clone the repository
git clone https://github.com/bhaskarvilles/kerdosai.git
cd kerdosai

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install development dependencies (optional)
pip install pytest pytest-cov black ruff mypy rich typer
```

### Using Docker

```bash
# Build the image
docker-compose build

# Run training
docker-compose run kerdosai-train

# Start API server
docker-compose up kerdosai-api
```

## Quick Start

### 1. Basic Training

```bash
# Train with default configuration
python cli.py train \
    --model gpt2 \
    --data ./data/train.json \
    --output ./output

# Train with custom configuration
python cli.py train --config configs/default.yaml
```

### 2. Using Configuration Files

Create a configuration file `my_config.yaml`:

```yaml
base_model: "gpt2"
output_dir: "./my_output"

training:
  epochs: 5
  batch_size: 8
  learning_rate: 0.00001

lora:
  enabled: true
  r: 16
  alpha: 64

data:
  train_file: "./data/train.json"
```

Then train:

```bash
python cli.py train --config my_config.yaml
```

### 3. Text Generation

```bash
python cli.py generate \
    ./output \
    --prompt "Once upon a time" \
    --max-length 200 \
    --temperature 0.8
```

### 4. Model Information

```bash
# View model details
python cli.py info ./output

# View KerdosAI version
python cli.py info
```

## Configuration Presets

KerdosAI includes several pre-configured training presets:

```bash
# Quick test (fast, minimal resources)
python cli.py train --config configs/training_presets.yaml#quick_test

# Small model (resource-constrained)
python cli.py train --config configs/training_presets.yaml#small_model

# Production (optimized settings)
python cli.py train --config configs/training_presets.yaml#production
```

## Python API

```python
from kerdosai.agent import KerdosAgent
from kerdosai.config import load_config

# Load configuration
config = load_config("configs/default.yaml")

# Initialize agent
agent = KerdosAgent(
    base_model="gpt2",
    training_data="./data/train.json"
)

# Prepare for efficient training
agent.prepare_for_training(
    use_lora=True,
    lora_r=8,
    use_4bit=True
)

# Train
metrics = agent.train(
    epochs=3,
    batch_size=4,
    learning_rate=2e-5
)

# Save model
agent.save("./output")

# Generate text
output = agent.generate(
    "Hello, AI!",
    max_length=100,
    temperature=0.7
)
print(output)
```

## Data Format

KerdosAI supports several data formats:

### JSON Format

```json
[
  {"text": "First training example..."},
  {"text": "Second training example..."}
]
```

### CSV Format

```csv
text
"First training example..."
"Second training example..."
```

### HuggingFace Datasets

```python
# DataConfig is assumed to be defined alongside KerdosConfig
from config import KerdosConfig, DataConfig

config = KerdosConfig(
    base_model="gpt2",
    data=DataConfig(
        dataset_name="wikitext",
        dataset_config="wikitext-2-raw-v1"
    )
)
```

## Advanced Features

### LoRA (Low-Rank Adaptation)

```python
config.lora.enabled = True
config.lora.r = 16        # Rank
config.lora.alpha = 64    # Alpha parameter
config.lora.dropout = 0.1
```

### Quantization

```python
config.quantization.enabled = True
config.quantization.bits = 4            # 4-bit or 8-bit
config.quantization.quant_type = "nf4"  # nf4 or fp4
```

### Mixed Precision Training

```python
config.training.fp16 = True  # For NVIDIA GPUs
# or
config.training.bf16 = True  # For newer GPUs with bfloat16 support (e.g. NVIDIA Ampere or later)
```

## Monitoring

### Weights & Biases

```bash
# Set API key
export WANDB_API_KEY=your_key_here

# Enable W&B logging in your config, then train
python cli.py train --config configs/default.yaml
```

### TensorBoard

```bash
# Start TensorBoard
tensorboard --logdir=./runs

# Or use Docker Compose
docker-compose up tensorboard
```

## Testing

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=kerdosai --cov-report=html

# Run specific tests
pytest tests/test_config.py -v
```

## Troubleshooting

### Out of Memory

1. Reduce batch size: `--batch-size 2`
2. Enable gradient accumulation: `gradient_accumulation_steps: 4`
3. Use quantization: `--quantize`
4. Use a smaller model

### Slow Training

1. Enable mixed precision: `fp16: true`
2. Increase batch size if memory allows
3. Use multiple GPUs (see distributed training docs)

### Import Errors

```bash
# Ensure virtual environment is activated
source venv/bin/activate

# Reinstall dependencies
pip install -r requirements.txt
```

## Next Steps

- Read the [full documentation](docs/index.md)
- Check out [example notebooks](notebooks/)
- Join our [community](https://kerdos.in/community)
- Report issues on [GitHub](https://github.com/bhaskarvilles/kerdosai/issues)
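
## Appendix: Checking Training Data

The JSON and CSV layouts shown under Data Format can be checked before a training run with a short script. The sketch below is not part of KerdosAI: the helper names (`validate_records`, `load_json_records`, `load_csv_records`) are hypothetical, and it assumes every record needs a non-empty `text` string, which the examples in this guide suggest but do not state.

```python
import csv
import json
import tempfile
from pathlib import Path


def validate_records(records):
    """Return a list of problems; an empty list means the data looks usable."""
    if not isinstance(records, list):
        return ["top-level JSON value must be a list of objects"]
    problems = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict) or "text" not in rec:
            problems.append(f"record {i}: missing 'text' field")
        elif not isinstance(rec["text"], str) or not rec["text"].strip():
            problems.append(f"record {i}: 'text' must be a non-empty string")
    return problems


def load_json_records(path):
    """Read a JSON file laid out as a list of {"text": ...} objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def load_csv_records(path):
    """Read a CSV file with a `text` header; each row becomes {"text": ...}."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


if __name__ == "__main__":
    # Hypothetical sample: the second record is deliberately empty.
    sample = [{"text": "First training example..."}, {"text": ""}]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "train.json"
        path.write_text(json.dumps(sample), encoding="utf-8")
        for problem in validate_records(load_json_records(path)):
            print(problem)  # prints: record 1: 'text' must be a non-empty string
```

Running a check like this on `./data/train.json` before `python cli.py train` catches malformed records early, rather than partway through a training job.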