Getting Started with CLaRa
This guide walks you through CLaRa setup, from installation to launching your first training run.
Installation
Prerequisites
- Python 3.10+
- CUDA-compatible GPU (recommended)
- PyTorch 2.0+
- CUDA 11.8 or 12.x
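Before creating the environment, it can help to confirm what the current interpreter and any existing PyTorch install report. The following is a minimal sketch, not part of CLaRa itself: the helper names `python_ok` and `describe_torch` are hypothetical, and it only uses standard PyTorch attributes (`torch.__version__`, `torch.cuda.is_available()`, `torch.version.cuda`).

```python
import sys
from importlib import util


def python_ok(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the interpreter meets the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


def describe_torch():
    """Report the PyTorch version and CUDA status, or None if torch is absent."""
    if util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the check still runs without PyTorch

    return {
        "torch": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
    }


if __name__ == "__main__":
    print("Python >= 3.10:", python_ok())
    print("PyTorch:", describe_torch() or "not installed yet")
```

If `cuda_build` is `None` or `cuda_available` is `False`, the installed PyTorch wheel is probably a CPU-only build or the driver is not visible, which is worth resolving before installing the GPU-dependent packages.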
Step 1: Create Conda Environment
env=clara
conda create -n $env python=3.10 -y
conda activate $env
Step 2: Install Dependencies
pip install -r requirements.txt
Key dependencies include:
torch>=2.0
transformers>=4.20
deepspeed>=0.18
flash-attn>=2.8.0
accelerate>=1.10.1
peft>=0.17.1
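After installation, the minimum versions listed above can be checked against what is actually installed. This is a rough sketch using only the standard library (`importlib.metadata`); the function names are hypothetical, and the comparison is deliberately simplified: it looks only at the leading numeric release segments and ignores pre-release and local version tags.

```python
import re
from importlib import metadata

# Minimum versions copied from the key dependencies listed above.
MINIMUMS = {
    "torch": "2.0",
    "transformers": "4.20",
    "deepspeed": "0.18",
    "flash-attn": "2.8.0",
    "accelerate": "1.10.1",
    "peft": "0.17.1",
}


def release_tuple(version):
    """Turn '2.8.0.post1' or '2.1.0+cu118' into a comparable tuple like (2, 8, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(part) for part in match.group().split(".")) if match else ()


def meets_minimum(installed, minimum):
    """Compare numeric release segments, padding the shorter tuple with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_dependencies(minimums=MINIMUMS):
    """Map each package name to 'ok', 'too old (<version>)', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = meets_minimum(installed, minimum)
        report[name] = "ok" if ok else f"too old ({installed})"
    return report


if __name__ == "__main__":
    for package, status in check_dependencies().items():
        print(f"{package}: {status}")
```

For strict comparisons, including pre-releases, the `packaging` library's `Version` class is the more robust choice; the hand-rolled comparison here just avoids an extra dependency.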
Step 3: Set Environment Variables
export PYTHONPATH=/path/to/clara:$PYTHONPATH
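To confirm that the export took effect, the repository root should show up in Python's import search path. A small sketch, assuming a hypothetical helper named `on_python_path` and whatever location you substituted for `/path/to/clara`:

```python
import os
import sys
from pathlib import Path


def on_python_path(repo_root, search_path=None):
    """Return True if repo_root is one of the directories Python searches for imports."""
    entries = sys.path if search_path is None else search_path
    target = Path(repo_root).resolve()
    # An empty entry in sys.path stands for the current working directory.
    return any(Path(entry or os.curdir).resolve() == target for entry in entries)
```

Calling `on_python_path("/path/to/clara")` in a fresh interpreter started from the same shell should return `True` once the variable is set.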
Quick Start
1. Prepare Your Data
CLaRa reads training data in JSONL format, with one JSON record per line. See the Training Guide for data format details.
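Malformed lines are a common cause of data-loading failures, so a quick structural check before training can save time. The sketch below is an illustration only: it verifies that every non-empty line parses as a JSON object and makes no assumption about field names, which are defined in the Training Guide rather than here. The function names are hypothetical.

```python
import json


def validate_jsonl(lines):
    """Return a list of (line_number, problem) for lines that are not JSON objects."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines, e.g. a trailing newline
        try:
            record = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((number, f"invalid JSON: {error.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "expected a JSON object"))
    return problems


def validate_jsonl_file(path):
    """Validate a JSONL file on disk, reading it line by line."""
    with open(path, encoding="utf-8") as handle:
        return validate_jsonl(handle)
```

An empty list from `validate_jsonl_file` means every line is well-formed JSON; it does not confirm that the records carry the fields a given training stage expects.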
2. Train Stage 1: Compression Pretraining
bash scripts/train_pretraining.sh
3. Train Stage 2: Instruction Tuning
bash scripts/train_instruction_tuning.sh
4. Train Stage 3: End-to-End Training
bash scripts/train_stage_end_to_end.sh
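The three training stages above can also be chained from a single driver that stops as soon as one stage fails. This is a sketch under two assumptions that this guide does not confirm: that the scripts are launched from the repository root, and that each script already knows where to find the previous stage's checkpoint through its own configuration. The `run_stages` helper is hypothetical.

```python
import subprocess

# Script paths as listed in the steps above, in the order they are meant to run.
STAGE_SCRIPTS = [
    "scripts/train_pretraining.sh",
    "scripts/train_instruction_tuning.sh",
    "scripts/train_stage_end_to_end.sh",
]


def run_stages(scripts=STAGE_SCRIPTS, dry_run=False):
    """Run each stage script with bash, stopping at the first failure."""
    completed = []
    for script in scripts:
        command = ["bash", script]
        if dry_run:
            print("would run:", " ".join(command))
        else:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so later stages never start after a failed one.
            subprocess.run(command, check=True)
        completed.append(script)
    return completed


if __name__ == "__main__":
    run_stages(dry_run=True)  # switch to dry_run=False to actually train
```

Running with `dry_run=True` first prints the planned commands without starting any training, which is a cheap way to confirm the order before committing GPU time.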
5. Run Inference
See the Inference Guide for examples of using all three model stages.
Next Steps
- Training Guide - Detailed training instructions and data formats
- Inference Guide - Inference examples for all model stages