MaterialsAnalyst-AI-7B Training Documentation
================================================
Model Training Details
---------------------
Base Model: Qwen2.5-7B-Instruct
Fine-tuning Method: LoRA (Low-Rank Adaptation)
Training Infrastructure: Single NVIDIA A100 SXM4 GPU
Training Duration: Approximately 5.4 hours
Training Dataset: Custom curated dataset for materials analysis
Dataset Specifications
---------------------
Total Token Count: 6,292,692
Total Sample Count: 6,000
Average Tokens/Sample: 1048.78
Max Token Count: 1,289
Min Token Count: 922
Tokens Counted Using: tiktoken (cl100k_base encoding)
Dataset Creation: Generated using the DeepSeek-V3 API
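The per-sample average follows directly from the totals above; a quick sanity check in Python (all figures taken from this section):

```python
# Dataset totals as reported above.
total_tokens = 6_292_692
total_samples = 6_000

# Average tokens per sample, rounded to two decimals as in the table.
avg_tokens = round(total_tokens / total_samples, 2)
print(avg_tokens)  # 1048.78

# Note: the per-sample counts in the table were produced with tiktoken's
# cl100k_base encoding, i.e. roughly:
#   len(tiktoken.get_encoding("cl100k_base").encode(text))
```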
Training Configuration
---------------------
LoRA Parameters:
- Rank: 32
- Alpha: 64
- Dropout: 0.1
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj, lm_head
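As a sketch, the adapter settings above map onto Hugging Face PEFT's LoraConfig roughly as follows. This assumes PEFT was the training stack, which the document does not state:

```python
from peft import LoraConfig

# LoRA settings as listed above; task_type is an assumption for a causal LM.
lora_config = LoraConfig(
    r=32,                 # Rank
    lora_alpha=64,        # Alpha
    lora_dropout=0.1,     # Dropout
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj", "lm_head",
    ],
    task_type="CAUSAL_LM",
)
```

Note that alpha = 2 x rank is a common heuristic that scales adapter updates by alpha / rank = 2.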
Training Hyperparameters:
- Learning Rate: 5e-5
- Batch Size: 4
- Gradient Accumulation: 5
- Effective Batch Size: 20
- Max Sequence Length: 2048
- Epochs: 3
- Warmup Ratio: 0.01
- Weight Decay: 0.01
- Max Grad Norm: 1.0
- LR Scheduler: Cosine
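A minimal sketch of how these hyperparameters would be expressed with transformers.TrainingArguments, assuming the Hugging Face Trainer was used (the document does not name the training framework). The FP16 and gradient-checkpointing flags come from the Hardware & Environment section below:

```python
from transformers import TrainingArguments

# Hyperparameters as listed above; output_dir is a hypothetical placeholder.
training_args = TrainingArguments(
    output_dir="materialsanalyst-lora",
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=5,   # effective batch size: 4 * 5 = 20
    num_train_epochs=3,
    warmup_ratio=0.01,
    weight_decay=0.01,
    max_grad_norm=1.0,
    lr_scheduler_type="cosine",
    fp16=True,                       # from the Optimization line below
    gradient_checkpointing=True,
)
# Max sequence length (2048) is applied at tokenization time,
# not via TrainingArguments.
```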
Hardware & Environment
---------------------
GPU: NVIDIA A100 SXM4 (40GB)
Operating System: Ubuntu
CUDA Version: 11.8
PyTorch Version: 2.7.0
Compute Capability: 8.0
Optimization: FP16, Gradient Checkpointing
Training Performance
---------------------
Training Runtime: 5.37 hours (19,348 seconds)
Train Samples/Second: 0.884
Train Steps/Second: 0.044
Training Loss (Final): 0.170
Validation Loss (Final): 0.136
Total Training Steps: 855
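The throughput figures above are mutually consistent, and the step count implies a train split of roughly 5,700 samples, i.e. a small held-out validation set (an inference; the document does not state the split):

```python
# Reported figures from the Training Performance section.
runtime_s = 19_348
steps = 855
effective_batch = 20   # 4 * 5, from the Training Configuration section
epochs = 3

print(round(runtime_s / 3600, 2))                     # 5.37 hours
print(round(steps / runtime_s, 3))                    # 0.044 steps/second
print(round(steps * effective_batch / runtime_s, 3))  # 0.884 samples/second

# 855 steps over 3 epochs -> 285 steps/epoch -> ~5,700 training samples,
# consistent with a ~5% validation split of the 6,000-sample dataset.
print(steps // epochs * effective_batch)              # 5700
```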