# Long Chain-of-Thought (CoT) Feature Implementation
## Overview
This implementation adds Long Chain-of-Thought (CoT) capability to the data synthesis pipeline when DeepSeek-R1 is used as the base model. The feature generates synthetic data with explicit multi-step reasoning traces, which improves the trained model's context-aware responses.
## Feature Description
- **Long CoT Mode**: When enabled, the system generates synthetic data with extended reasoning chains
- **DeepSeek R1 Integration**: Exclusive use of DeepSeek-R1 model for CoT data generation
- **Enhanced Training**: Produces models with improved long-context reasoning capabilities
## Implementation Details
### Configuration Options
1. **Backend Configuration**:
   - Set `is_cot=True` in the `trainprocess_service.py` initialization
   - Configure via `train_for_user.sh` with `--is_cot True/False`
   - Environment variables in `lpm_kernel/L2/.env`:

     ```
     DEEPSEEK_MODEL_NAME=deepseek-*
     DEEPSEEK_API_KEY=your_api_key
     DEEPSEEK_BASE_URL=your_base_url
     ```
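How the `is_cot` flag and the environment variables above might fit together can be sketched as follows. `CoTConfig` and `resolve_model` are illustrative names, not the actual `trainprocess_service.py` API:

```python
import os

class CoTConfig:
    """Illustrative sketch only: class and method names are assumptions."""

    def __init__(self, is_cot: bool = False, model_name: str = ""):
        self.is_cot = is_cot
        # Environment variables documented in lpm_kernel/L2/.env
        self.model_name = model_name or os.getenv("DEEPSEEK_MODEL_NAME", "deepseek-r1")
        self.api_key = os.getenv("DEEPSEEK_API_KEY", "")
        self.base_url = os.getenv("DEEPSEEK_BASE_URL", "")

    def resolve_model(self) -> str:
        # CoT generation is restricted to DeepSeek-R1 (see Model Whitelisting below)
        if self.is_cot and "r1" not in self.model_name.lower():
            raise ValueError(f"Long CoT requires DeepSeek-R1, got {self.model_name}")
        return self.model_name
```

The whitelist check here is a simple substring match for illustration; the actual validation logic may differ.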
### Data Synthesis Pipeline
1. **Supported Data Types**:
   - SelfQA data
   - Preference data
   - Diversity data
2. **Prompt Structure**:

   ```
   <think>reasoning_content</think>
   <answer>final_content</answer>
   ```

3. **Model Whitelisting**:
   - Only DeepSeek-R1 is allowed for CoT data generation
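A response in the `<think>`/`<answer>` structure shown above can be split with a small parser like this sketch (the function name is illustrative, not from the codebase):

```python
import re

def parse_cot_response(text: str):
    """Split a CoT response into (reasoning_content, final_content).

    Expects the <think>...</think><answer>...</answer> structure;
    falls back to (None, text) when the tags are absent.
    """
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    if think and answer:
        return think.group(1).strip(), answer.group(1).strip()
    return None, text.strip()
```

`re.DOTALL` lets the reasoning span multiple lines, which long CoT traces typically do.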
### Code Changes
1. **Modified Files**:
   - `selfqa.py`:
     - Added `is_cot` initialization option
     - Updated prompt templates
     - Modified response handling
   - `preference_QA_generate.py`:
     - Added CoT support
     - Enhanced question extraction
   - `diversity_data_generator.py`:
     - Added CoT templates
     - Updated generation logic
2. **New Functions**:
   - Unified `get_remote_response()` function
   - Enhanced logging with tqdm integration
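A unified remote-call helper with tqdm progress might look roughly like the sketch below. The signature, retry behavior, and `generate_all` wrapper are assumptions for illustration, not the actual implementation:

```python
import time
from typing import Callable, List

try:
    from tqdm import tqdm
except ImportError:  # fall back to a no-op wrapper if tqdm is unavailable
    def tqdm(iterable, **kwargs):
        return iterable

def get_remote_response(prompt: str,
                        call_api: Callable[[str], str],
                        max_retries: int = 3,
                        backoff: float = 1.0) -> str:
    """Hypothetical unified wrapper: one retry/backoff path shared by the
    SelfQA, preference, and diversity generators."""
    for attempt in range(max_retries):
        try:
            return call_api(prompt)
        except Exception:
            if attempt == max_retries - 1:
                raise
            time.sleep(backoff * (attempt + 1))

def generate_all(prompts: List[str], call_api: Callable[[str], str]) -> List[str]:
    # tqdm gives per-item progress instead of raw log lines
    return [get_remote_response(p, call_api)
            for p in tqdm(prompts, desc="CoT synthesis")]
```

Injecting `call_api` as a parameter keeps the retry logic independent of any one provider client; the real code presumably calls the DeepSeek endpoint configured via `DEEPSEEK_BASE_URL`.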