# Long Chain-of-Thought (CoT) Feature Implementation

## Overview
This implementation adds Long Chain-of-Thought (CoT) capability to the data synthesis pipeline when using DeepSeek R1 as the base model. The feature enables multi-step reasoning for enhanced context-aware responses.
## Feature Description

- **Long CoT Mode**: When enabled, the system generates synthetic data with extended reasoning chains
- **DeepSeek R1 Integration**: Exclusive use of the DeepSeek-R1 model for CoT data generation
- **Enhanced Training**: Produces models with improved long-context reasoning capabilities
## Implementation Details

### Configuration Options
**Backend Configuration:**

- Set `is_cot=True` in the `trainprocess_service.py` initialization
- Configure via `train_for_user.sh` with `--is_cot True/False`
- Environment variables in `lpm_kernel/L2/.env`:

```
DEEPSEEK_MODEL_NAME=deepseek-*
DEEPSEEK_API_KEY=your_api_key
DEEPSEEK_BASE_URL=your_base_url
```
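As a minimal sketch of how these settings could be consumed, the helper below reads the three `.env` keys and derives an `is_cot` flag. The function name `load_cot_config` and the R1-prefix check are illustrative assumptions, not code from the repository.

```python
import os

def load_cot_config(env=None):
    """Hypothetical helper: read DeepSeek settings and derive the CoT flag.

    Assumes the .env keys shown above have been loaded into the process
    environment (e.g. via python-dotenv).
    """
    env = os.environ if env is None else env
    model = env.get("DEEPSEEK_MODEL_NAME", "")
    return {
        "model_name": model,
        "api_key": env.get("DEEPSEEK_API_KEY", ""),
        "base_url": env.get("DEEPSEEK_BASE_URL", ""),
        # Long CoT generation is only meaningful with a DeepSeek-R1 model.
        "is_cot": model.lower().startswith("deepseek-r1"),
    }
```

A caller would then pass `config["is_cot"]` into the pipeline initialization in place of hard-coding `is_cot=True`.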
### Data Synthesis Pipeline

**Supported Data Types:**

- SelfQA data
- Preference data
- Diversity data
**Prompt Structure:**

```
<think>reasoning_content</think>
<answer>final_content</answer>
```
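Responses in this structure can be split back into their reasoning and answer parts with a small regex parser. `parse_cot_response` below is a hypothetical helper shown for illustration; it is not a function from the codebase.

```python
import re

def parse_cot_response(text):
    """Split a CoT response into (reasoning_content, final_content).

    Falls back to treating the whole text as the answer when the
    <think>/<answer> tags are absent.
    """
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    reasoning = think.group(1).strip() if think else ""
    final = answer.group(1).strip() if answer else text.strip()
    return reasoning, final
```

For example, `parse_cot_response("<think>steps</think>\n<answer>done</answer>")` returns `("steps", "done")`.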
**Model Whitelisting:**

- Only DeepSeek-R1 is allowed for CoT data generation
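A whitelist check of this kind might look like the sketch below. `ALLOWED_COT_MODEL_PREFIXES` and `validate_cot_model` are assumed names for illustration only.

```python
# Illustrative whitelist: only DeepSeek-R1 variants may generate CoT data.
ALLOWED_COT_MODEL_PREFIXES = ("deepseek-r1",)

def validate_cot_model(model_name):
    """Raise ValueError if a non-whitelisted model is requested for CoT."""
    if not model_name.lower().startswith(ALLOWED_COT_MODEL_PREFIXES):
        raise ValueError(
            f"Long CoT data generation requires DeepSeek-R1, got {model_name!r}"
        )
```

Calling this once at pipeline startup fails fast, rather than discovering mid-run that a non-R1 model produces no `<think>` content.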
### Code Changes

**Modified Files:**

- `selfqa.py`: Added `is_cot` initialization option, updated prompt templates, modified response handling
- `preference_QA_generate.py`: Added CoT support, enhanced question extraction
- `diversity_data_generator.py`: Added CoT templates, updated generation logic
**New Functions:**

- Unified `get_remote_response()` function
- Enhanced logging with tqdm integration
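A hedged sketch of what a unified `get_remote_response()` with tqdm progress logging could look like is shown below. The signature, the `client` object, and the `chat.completions.create` call are assumptions modeled on the OpenAI-compatible API that DeepSeek exposes; the actual function in the repository may differ.

```python
try:
    from tqdm import tqdm
except ImportError:  # fall back to a no-op progress bar if tqdm is missing
    def tqdm(iterable, **kwargs):
        return iterable

def get_remote_response(client, model_name, prompts, is_cot=False):
    """Query the remote model for each prompt, logging progress with tqdm.

    Hypothetical sketch: `client` is assumed to be an OpenAI-compatible
    client pointing at DEEPSEEK_BASE_URL.
    """
    results = []
    for prompt in tqdm(prompts, desc="Generating synthetic data"):
        response = client.chat.completions.create(
            model=model_name,
            messages=[{"role": "user", "content": prompt}],
        )
        message = response.choices[0].message
        if is_cot:
            # DeepSeek-R1 returns the reasoning chain in a separate field;
            # re-wrap it in the <think>/<answer> prompt structure above.
            reasoning = getattr(message, "reasoning_content", "")
            results.append(
                f"<think>{reasoning}</think>\n<answer>{message.content}</answer>"
            )
        else:
            results.append(message.content)
    return results
```

Routing all three generators (SelfQA, preference, diversity) through one such function keeps retry, logging, and CoT re-wrapping logic in a single place.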