# Long Chain-of-Thought (CoT) Feature Implementation

## Overview

This implementation adds a Long Chain-of-Thought (CoT) capability to the data synthesis pipeline when DeepSeek-R1 is used as the base model. With the feature enabled, synthetic samples carry multi-step reasoning in addition to the final response.

## Feature Description

- **Long CoT Mode**: When enabled, the system generates synthetic data with extended reasoning chains
- **DeepSeek-R1 Integration**: CoT data is generated exclusively with the DeepSeek-R1 model
- **Enhanced Training**: Training on this data is intended to produce models with improved long-context reasoning

## Implementation Details

### Configuration Options

1. **Backend Configuration**:
   - Set `is_cot=True` in the initialization in `trainprocess_service.py`
   - Or pass `--is_cot True` or `--is_cot False` to `train_for_user.sh`
   - Set the DeepSeek environment variables in `lpm_kernel/L2/.env`:
   ```
   DEEPSEEK_MODEL_NAME=deepseek-*
   DEEPSEEK_API_KEY=your_api_key
   DEEPSEEK_BASE_URL=your_base_url
   ```
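
As an illustration of how the three variables might be consumed, here is a minimal sketch. It assumes the contents of `lpm_kernel/L2/.env` have already been loaded into the process environment; `DeepSeekSettings` and `load_deepseek_settings` are hypothetical names for this example and are not part of the codebase described here.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DeepSeekSettings:
    """Hypothetical container for the three variables listed above."""
    model_name: str
    api_key: str
    base_url: str


def load_deepseek_settings(env=None) -> DeepSeekSettings:
    """Read the DeepSeek variables from a mapping (defaults to os.environ).

    Raises ValueError naming every variable that is missing or empty, so a
    misconfigured .env fails early instead of during data synthesis.
    """
    env = os.environ if env is None else env
    names = ("DEEPSEEK_MODEL_NAME", "DEEPSEEK_API_KEY", "DEEPSEEK_BASE_URL")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise ValueError("Missing DeepSeek settings: " + ", ".join(missing))
    return DeepSeekSettings(
        model_name=env["DEEPSEEK_MODEL_NAME"],
        api_key=env["DEEPSEEK_API_KEY"],
        base_url=env["DEEPSEEK_BASE_URL"],
    )
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to exercise in tests without touching real credentials.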

### Data Synthesis Pipeline

1. **Supported Data Types**:
   - SelfQA data
   - Preference data
   - Diversity data
2. **Prompt Structure**:
   ```
   <think>reasoning_content</think>
   <answer>final_content</answer>
   ```
3. **Model Whitelisting**:
   - Only DeepSeek-R1 is allowed for CoT data generation
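
The tagged structure and the whitelisting rule can be sketched as follows. This is an assumption-laden illustration, not the project's code: `format_cot`, `parse_cot`, and `check_cot_model` are hypothetical helpers, and the default `allowed` tuple is a placeholder, since the exact model identifier the pipeline checks against is not stated here.

```python
import re

# Matches the <think>...</think> <answer>...</answer> layout shown above.
# DOTALL lets the reasoning and the answer span multiple lines.
_COT_PATTERN = re.compile(
    r"<think>(?P<think>.*?)</think>\s*<answer>(?P<answer>.*?)</answer>",
    re.DOTALL,
)


def format_cot(reasoning_content: str, final_content: str) -> str:
    """Wrap reasoning and final answer in the two tags."""
    return (
        f"<think>{reasoning_content}</think>\n"
        f"<answer>{final_content}</answer>"
    )


def parse_cot(text: str):
    """Return (reasoning, answer), or None if the tags are not found."""
    match = _COT_PATTERN.search(text)
    if match is None:
        return None
    return match.group("think").strip(), match.group("answer").strip()


def check_cot_model(model_name: str, is_cot: bool,
                    allowed=("deepseek-r1",)) -> None:
    """Reject CoT generation for models outside the whitelist.

    The `allowed` default is a hypothetical placeholder; replace it with
    the identifier(s) your deployment actually uses for DeepSeek-R1.
    """
    if is_cot and model_name.lower() not in allowed:
        raise ValueError(
            f"CoT generation requires DeepSeek-R1, got {model_name!r}"
        )
```

Returning `None` from `parse_cot` rather than raising lets a caller decide whether to retry or drop a malformed sample.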

### Code Changes

1. **Modified Files**:
   - `selfqa.py`:
     - Added `is_cot` initialization option
     - Updated prompt templates
     - Modified response handling
   - `preference_QA_generate.py`:
     - Added CoT support
     - Enhanced question extraction
   - `diversity_data_generator.py`:
     - Added CoT templates
     - Updated generation logic
2. **New Functions**:
   - Unified `get_remote_response()` function
   - Enhanced logging with `tqdm` integration