---
license: apache-2.0
---
# KerdosAI - Universal LLM Training Agent

[Python 3.8+](https://www.python.org/downloads/) [License: Apache 2.0](LICENSE) [Tests](tests/)
## Overview

KerdosAI is a production-ready, universal LLM training agent that streamlines training and deploying large language models. It provides a framework for data processing, model training, and deployment management, with enterprise-grade features.

### Key Features

- 🚀 **Easy to Use**: Simple CLI and Python API
- ⚡ **Efficient Training**: LoRA and quantization support (4-bit/8-bit)
- 🔧 **Configurable**: YAML-based configuration with validation
- 📊 **Monitoring**: W&B and TensorBoard integration
- 🐳 **Docker Ready**: Production-ready containerization
- 🧪 **Well Tested**: Comprehensive test suite with 90%+ coverage
- 🎨 **Beautiful CLI**: Rich terminal output with progress bars
- 📦 **Type Safe**: Full type hints and mypy support
## Quick Start

### Installation

```bash
# Clone repository
git clone https://github.com/bhaskarvilles/kerdosai.git
cd kerdosai

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

### Basic Usage

```bash
# Train a model
python cli.py train \
    --model gpt2 \
    --data ./data/train.json \
    --output ./output \
    --epochs 3

# Generate text
python cli.py generate ./output \
    --prompt "Once upon a time" \
    --max-length 200
```

### Using Configuration Files

```bash
# Train with configuration
python cli.py train --config configs/default.yaml

# Validate configuration
python cli.py validate-config configs/default.yaml
```
## Architecture Overview

```mermaid
graph TD
    A[CLI/API] --> B[KerdosAgent]
    B --> C[DataProcessor]
    B --> D[Trainer]
    B --> E[Deployer]
    C --> F[Processed Data]
    D --> G[Trained Model]
    E --> H[Deployed Service]
    I[Config Manager] --> B
    J[Monitoring] --> D
    K[Checkpoint Manager] --> D
```
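The delegation pattern in the diagram can be illustrated with a minimal sketch. The class names `DataProcessor`, `Trainer`, and `Deployer` come from the diagram, but every method, signature, and return value below is a hypothetical stand-in, as is `AgentSketch`; none of it is the actual KerdosAI implementation, and no real training takes place.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


class DataProcessor:
    """Hypothetical stand-in: turns raw records into training examples."""

    def process(self, records: List[str]) -> List[str]:
        # Strip whitespace and drop records that end up empty.
        return [r.strip() for r in records if r.strip()]


class Trainer:
    """Hypothetical stand-in: consumes processed data and reports metrics."""

    def train(self, examples: List[str], epochs: int) -> Dict[str, Any]:
        # No model is trained here; we only report what was received.
        return {"epochs": epochs, "num_examples": len(examples)}


class Deployer:
    """Hypothetical stand-in: publishes a trained model."""

    def deploy(self, model_dir: str) -> str:
        return f"deployed:{model_dir}"


@dataclass
class AgentSketch:
    """Coordinates the three components, mirroring the diagram's edges."""

    config: Dict[str, Any]
    data_processor: DataProcessor = field(default_factory=DataProcessor)
    trainer: Trainer = field(default_factory=Trainer)
    deployer: Deployer = field(default_factory=Deployer)

    def run(self, records: List[str]) -> Dict[str, Any]:
        examples = self.data_processor.process(records)
        metrics = self.trainer.train(examples, epochs=self.config["epochs"])
        metrics["service"] = self.deployer.deploy(self.config["output_dir"])
        return metrics


agent = AgentSketch(config={"epochs": 3, "output_dir": "./output"})
result = agent.run(["first example", "  ", "second example\n"])
print(result)
```

Keeping each stage behind its own small class means a stage can be swapped or mocked in tests without touching the coordinator.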
## Features

### Configuration Management

- YAML-based configuration with Pydantic validation
- Environment variable substitution
- Training presets for common scenarios
- Configuration inheritance and overrides

```yaml
base_model: "gpt2"
lora:
  enabled: true
  r: 16
  alpha: 64
training:
  epochs: 5
  batch_size: 8
  learning_rate: 0.00001
```
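To make environment variable substitution and overrides concrete, here is a minimal standard-library sketch. It is not KerdosAI's code: the `${VAR}` and `${VAR:-default}` syntax, the helper names `substitute_env`, `deep_merge`, and `validate`, the variables `KERDOS_OUTPUT_DIR` and `KERDOS_RUN_NAME`, and the `output_dir` and `run_name` keys are all assumptions made for illustration. Plain Python checks stand in for the Pydantic validation so the example runs without extra dependencies.

```python
import os
import re
from typing import Any, Dict

# Assumed syntax: ${NAME} or ${NAME:-default}.
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def substitute_env(value: Any) -> Any:
    """Recursively replace ${VAR} placeholders inside string values."""
    if isinstance(value, dict):
        return {k: substitute_env(v) for k, v in value.items()}
    if isinstance(value, list):
        return [substitute_env(v) for v in value]
    if isinstance(value, str):
        def _lookup(match: "re.Match[str]") -> str:
            name, default = match.group(1), match.group(2)
            if name in os.environ:
                return os.environ[name]
            if default is not None:
                return default
            raise KeyError(f"environment variable {name} is not set")
        return _ENV_PATTERN.sub(_lookup, value)
    return value


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of base with override applied, merging nested dicts."""
    merged = dict(base)
    for key, val in override.items():
        if isinstance(val, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], val)
        else:
            merged[key] = val
    return merged


def validate(config: Dict[str, Any]) -> Dict[str, Any]:
    """Plain-Python checks standing in for schema validation."""
    if config["lora"]["enabled"] and config["lora"]["r"] < 1:
        raise ValueError("lora.r must be a positive integer")
    if config["training"]["epochs"] < 1:
        raise ValueError("training.epochs must be at least 1")
    if not 0 < config["training"]["learning_rate"] < 1:
        raise ValueError("training.learning_rate must be in (0, 1)")
    return config


# Base values taken from the YAML example above.
base = {
    "base_model": "gpt2",
    "lora": {"enabled": True, "r": 16, "alpha": 64},
    "training": {"epochs": 5, "batch_size": 8, "learning_rate": 0.00001},
}
# Hypothetical override layer, e.g. a quick smoke-test preset.
override = {
    "output_dir": "${KERDOS_OUTPUT_DIR}/checkpoints",
    "run_name": "${KERDOS_RUN_NAME:-baseline}",
    "training": {"epochs": 1},
}

os.environ["KERDOS_OUTPUT_DIR"] = "/tmp/kerdos-run"
os.environ.pop("KERDOS_RUN_NAME", None)  # ensure unset so the default applies

config = validate(substitute_env(deep_merge(base, override)))
print(config["output_dir"], config["run_name"], config["training"])
```

Merging before substituting means placeholders work the same way whether they appear in the base file or in an override layer.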
### Efficient Training

- **LoRA**: Parameter-efficient fine-tuning
- **Quantization**: 4-bit and 8-bit support
- **Mixed Precision**: FP16/BF16 training
- **Gradient Accumulation**: Larger effective batch sizes within a fixed memory budget
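Some back-of-the-envelope arithmetic shows why these techniques save resources. The sketch below is illustrative only: it applies LoRA with `r=16` and `alpha=64` (the values from the configuration example) to a single 768 x 2304 weight, the shape of GPT-2 small's fused attention projection, and uses an approximate parameter count of 124 million for GPT-2 small. The helper names are hypothetical, the accumulation step count of 4 is an assumed value, and the memory estimate ignores quantization metadata, activations, and optimizer state.

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """LoRA trains two low-rank factors, A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


def weight_memory_mib(num_params: int, bits_per_param: int) -> float:
    """Memory needed to store the weights alone, in MiB."""
    return num_params * bits_per_param / 8 / 2**20


def effective_batch_size(batch_size: int, accumulation_steps: int) -> int:
    """Gradients are summed over several micro-batches before each update."""
    return batch_size * accumulation_steps


# One frozen weight matrix versus its LoRA adapter.
d_in, d_out, r, alpha = 768, 2304, 16, 64
full_params = d_in * d_out
lora_params = lora_trainable_params(d_in, d_out, r)
scaling = alpha / r  # the low-rank update is scaled by alpha / r
print(f"{lora_params} of {full_params} ({lora_params / full_params:.2%})")
print(f"LoRA scaling factor: {scaling}")

# Weight storage at different precisions (approximate parameter count).
approx_gpt2_params = 124_000_000
for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit: {weight_memory_mib(approx_gpt2_params, bits):.0f} MiB")

# batch_size=8 from the config example, with 4 assumed accumulation steps.
effective = effective_batch_size(8, 4)
print(f"Effective batch size: {effective}")
```

For this one matrix the adapter trains 1/36 of the parameters the full weight would require, which is where most of the savings in optimizer state and gradient memory come from.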
### Enhanced CLI

```bash
# Rich terminal output with progress bars
python cli.py train --config configs/default.yaml

# Model information
python cli.py info ./output

# Configuration validation
python cli.py validate-config configs/default.yaml
```

### Testing Infrastructure

```bash
# Run tests
pytest

# With coverage
pytest --cov=kerdosai --cov-report=html

# Specific tests
pytest tests/test_config.py -v
```

### Docker Deployment

```bash
# Build and run
docker-compose up

# Training service
docker-compose run kerdosai-train

# API service
docker-compose up kerdosai-api

# TensorBoard
docker-compose up tensorboard
```
## Python API

```python
from kerdosai.agent import KerdosAgent
from kerdosai.config import load_config

# Load configuration
config = load_config("configs/default.yaml")

# Initialize agent
agent = KerdosAgent(
    base_model="gpt2",
    training_data="./data/train.json"
)

# Prepare for training
agent.prepare_for_training(
    use_lora=True,
    lora_r=8,
    use_4bit=True
)

# Train
metrics = agent.train(
    epochs=3,
    batch_size=4,
    learning_rate=2e-5
)

# Save and generate
agent.save("./output")
output = agent.generate("Hello, AI!", max_length=100)
```
## Documentation

- [Quick Start Guide](docs/QUICKSTART.md)
- [Configuration Reference](configs/default.yaml)
- [API Documentation](https://kerdos.in/docs)
- [Contributing Guidelines](CONTRIBUTING.md)
## Project Structure

```
kerdosai/
├── agent.py              # Main agent implementation
├── trainer.py            # Training logic
├── deployer.py           # Deployment management
├── data_processor.py     # Data processing
├── config.py             # Configuration management
├── exceptions.py         # Custom exceptions
├── cli.py                # Enhanced CLI
├── configs/              # Configuration files
│   ├── default.yaml
│   └── training_presets.yaml
├── tests/                # Test suite
│   ├── test_config.py
│   ├── test_exceptions.py
│   └── ...
├── docs/                 # Documentation
└── requirements.txt      # Dependencies
```
## Requirements

- Python 3.8+
- PyTorch 2.0+
- Transformers 4.30+
- See [requirements.txt](requirements.txt) for the full list
## Development

```bash
# Install development dependencies
pip install pytest pytest-cov black ruff mypy rich typer

# Format and lint code
black .
ruff check .

# Type checking
mypy .

# Run tests
pytest --cov=kerdosai
```
## Contributing

We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

## License

This project is licensed under the Apache License 2.0 - see [LICENSE](LICENSE) for details.
## Citation

```bibtex
@software{kerdosai2024,
  title = {KerdosAI: Universal LLM Training Agent},
  author = {KerdosAI Team},
  year = {2024},
  version = {0.2.0},
  publisher = {GitHub},
  url = {https://github.com/bhaskarvilles/kerdosai}
}
```
## Contact

- Website: [https://kerdos.in](https://kerdos.in)
- Email: support@kerdos.in
- GitHub: [bhaskarvilles/kerdosai](https://github.com/bhaskarvilles/kerdosai)

## Acknowledgments

Built with:

- [PyTorch](https://pytorch.org/)
- [Transformers](https://huggingface.co/transformers/)
- [PEFT](https://github.com/huggingface/peft)
- [Rich](https://rich.readthedocs.io/)
- [Typer](https://typer.tiangolo.com/)