ml-clara / docs /getting_started.md
dl3239491's picture
Upload folder using huggingface_hub
30c14cd verified
---
layout: default
title: Getting Started
permalink: /getting_started/
---
# Getting Started with CLaRa
This guide will help you get started with CLaRa, from installation to running your first training.
## Installation
### Prerequisites
- Python 3.10+
- CUDA-compatible GPU (recommended)
- PyTorch 2.0+
- CUDA 11.8 or 12.x
### Step 1: Create Conda Environment
```bash
env=clara
conda create -n $env python=3.10 -y
conda activate $env
```
### Step 2: Install Dependencies
```bash
pip install -r requirements.txt
```
Key dependencies include:
- `torch>=2.0`
- `transformers>=4.20`
- `deepspeed>=0.18`
- `flash-attn>=2.8.0`
- `accelerate>=1.10.1`
- `peft>=0.17.1`
### Step 3: Set Environment Variables
```bash
export PYTHONPATH=/path/to/clara:$PYTHONPATH
```
## Quick Start
### 1. Prepare Your Data
CLaRa uses JSONL format for training data. See the [Training Guide](./training.md) for data format details.
### 2. Train Stage 1: Compression Pretraining
```bash
bash scripts/train_pretraining.sh
```
### 3. Train Stage 2: Instruction Tuning
```bash
bash scripts/train_instruction_tuning.sh
```
### 4. Train Stage 3: End-to-End Training
```bash
bash scripts/train_stage_end_to_end.sh
```
### 5. Run Inference
See the [Inference Guide](./inference.md) for examples of using all three model stages.
## Next Steps
- [Training Guide](./training.md) - Detailed training instructions and data formats
- [Inference Guide](./inference.md) - Inference examples for all model stages