Text Generation
PEFT
Safetensors
GGUF
English
MaterialsAnalyst-AI-7B
materials-science
computational-materials
materials-analysis
chain-of-thought
reasoning-model
property-prediction
materials-discovery
crystal-structure
materials-informatics
scientific-ai
7b
quantized
fine-tuned
lora
json-mode
structured-output
materials-engineering
band-gap-prediction
computational-chemistry
materials-characterization
MaterialsAnalyst-AI-7B Training Documentation
================================================

Model Training Details
---------------------
Base Model: Qwen2.5-7B-Instruct
Fine-tuning Method: LoRA (Low-Rank Adaptation)
Training Infrastructure: Single NVIDIA A100 SXM4 GPU
Training Duration: Approximately 5.4 hours
Training Dataset: Custom curated dataset for materials analysis

Dataset Specifications
---------------------
Total Token Count: 6,292,692
Total Sample Count: 6,000
Average Tokens/Sample: 1,048.78
Max Token Count: 1,289
Min Token Count: 922
Tokens Counted Using: tiktoken (cl100k_base encoding); see the counting sketch below
Dataset Creation: Generated with the DeepSeek-V3 API
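
The token statistics above were produced with tiktoken's `cl100k_base` encoding (OpenAI's encoding, so the counts are approximate relative to Qwen's own tokenizer). A minimal sketch of the counting procedure, assuming a hypothetical JSONL dataset with one `"text"` field per line:

```python
import json

import tiktoken

# cl100k_base is the encoding named in the specifications above.
enc = tiktoken.get_encoding("cl100k_base")

# Hypothetical layout: one JSON object per line with a "text" field.
counts = []
with open("materials_dataset.jsonl", encoding="utf-8") as f:
    for line in f:
        counts.append(len(enc.encode(json.loads(line)["text"])))

print(f"Total tokens:       {sum(counts):,}")
print(f"Total samples:      {len(counts):,}")
print(f"Average per sample: {sum(counts) / len(counts):.2f}")
print(f"Max / min:          {max(counts)} / {min(counts)}")
```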

Training Configuration
---------------------
LoRA Parameters (see the LoraConfig sketch after this list):
- Rank: 32
- Alpha: 64
- Dropout: 0.1
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj, lm_head
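
These settings map directly onto peft's `LoraConfig`. A sketch under that assumption; note that adapting `lm_head` with LoRA (rather than fully training it via `modules_to_save`) is one possible reading of the module list above:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# LoRA settings from the list above.
lora_config = LoraConfig(
    r=32,             # rank
    lora_alpha=64,    # scaling: alpha / r = 2.0
    lora_dropout=0.1,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
        "lm_head",                               # output head (assumed LoRA-adapted)
    ],
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```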
Training Hyperparameters (see the TrainingArguments sketch after this list):
- Learning Rate: 5e-5
- Batch Size: 4
- Gradient Accumulation: 5
- Effective Batch Size: 20 (4 × 5)
- Max Sequence Length: 2048
- Epochs: 3
- Warmup Ratio: 0.01
- Weight Decay: 0.01
- Max Grad Norm: 1.0
- LR Scheduler: Cosine
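
Expressed as Hugging Face `TrainingArguments`, the hyperparameters above (together with the FP16 and gradient-checkpointing settings listed under Optimization below) would look roughly like this; the output directory and logging cadence are assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="materialsanalyst-7b-lora",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=5,   # effective batch size: 4 * 5 = 20
    num_train_epochs=3,
    warmup_ratio=0.01,
    weight_decay=0.01,
    max_grad_norm=1.0,
    lr_scheduler_type="cosine",
    fp16=True,                       # per the Optimization line below
    gradient_checkpointing=True,     # per the Optimization line below
    logging_steps=10,                # assumption; cadence not documented
)
# Note: the 2048-token max sequence length is enforced at tokenization
# time (e.g. tokenizer(..., truncation=True, max_length=2048)), not here.
```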

Hardware & Environment
---------------------
GPU: NVIDIA A100 SXM4 (40GB)
Operating System: Ubuntu
CUDA Version: 11.8
PyTorch Version: 2.7.0
Compute Capability: 8.0
Optimization: FP16, Gradient Checkpointing

Training Performance
---------------------
Training Runtime: 5.37 hours (19,348 seconds)
Train Samples/Second: 0.884
Train Steps/Second: 0.044
Training Loss (Final): 0.170
Validation Loss (Final): 0.136
Total Training Steps: 855
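
For completeness, the trained adapter can be loaded back onto the base model with peft for inference. A sketch assuming a placeholder adapter repo id (the actual hosted path is not documented here):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen2.5-7B-Instruct"
adapter_id = "your-org/MaterialsAnalyst-AI-7B"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

prompt = "Estimate the band gap of GaN and explain your reasoning step by step."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```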