Text Generation
PEFT
Safetensors
GGUF
English
materialsanalyst-ai-7b
MaterialsAnalyst-AI-7B
materials-science
computational-materials
materials-analysis
chain-of-thought
reasoning-model
property-prediction
materials-discovery
crystal-structure
materials-informatics
scientific-ai
7b
quantized
fine-tuned
lora
json-mode
structured-output
materials-engineering
band-gap-prediction
computational-chemistry
materials-characterization
Update README.md
README.md
CHANGED
|
@@ -1,154 +1,328 @@
|
|
| 1 |
-
--
|
| 2 |
-
license: apache-2.0
|
| 3 |
-
---
|
| 4 |
|
| 5 |
-

|
| 6 |
|
| 7 |
-
|
| 8 |
|
| 9 |
-
|
| 10 |
|
| 11 |
-
##
|
| 12 |
|
| 13 |
-
The
|
| 14 |
|
| 15 |
-
|
| 16 |
-
2. The model engages in chain-of-thought reasoning about the material's properties
|
| 17 |
-
3. You receive a structured, comprehensive analysis with practical applications
|
| 18 |
|
| 19 |
-
|
| 20 |
|
| 21 |
-
|
| 22 |
|
| 23 |
-
|
| 24 |
-
|
| 25 |
-
|
| 26 |
-
* **Property Correlation**: Identifies relationships between different material properties and their implications
|
| 27 |
-
* **Application Prediction**: Suggests practical applications based on material characteristics
|
| 28 |
-
* **Stability Assessment**: Evaluates thermodynamic and structural stability indicators
|
| 29 |
-
* **Performance Benchmarking**: Compares materials against industry standards and competing materials
|
| 30 |
-
* **Materials Database Integration**: Optimized for standard materials database formats (Materials Project, AFLOW, etc.)
|
| 31 |
-
* **Structured Output Format**: Consistently delivers well-organized, hierarchical materials analysis with clear section delineation
|
| 32 |
|
| 33 |
## Use Cases
|
| 34 |
|
| 35 |
-
|
| 36 |
|
| 37 |
-
*
|
| 38 |
-
|
| 39 |
-
|
| 40 |
-
|
| 41 |
-
* **Industry professionals** evaluating material selection for products
|
| 42 |
-
* **Database curators** ensuring comprehensive materials documentation
|
| 43 |
-
* **Computational materials scientists** interpreting simulation results
|
| 44 |
-
* **Materials informatics researchers** building automated analysis pipelines
|
| 45 |
|
| 46 |
-
|
| 47 |
|
| 48 |
-
|
| 49 |
|
|
|
|
| 50 |
```json
|
| 51 |
{
|
| 52 |
-
"material_id": "mp-8062",
|
| 53 |
-
"formula": "SiC",
|
| 54 |
-
"elements": [
|
| 55 |
-
"
|
| 56 |
-
"
|
| 57 |
-
|
| 58 |
-
"
|
| 59 |
-
"
|
| 60 |
-
"
|
| 61 |
-
"
|
| 62 |
-
"
|
| 63 |
-
"
|
| 64 |
-
"
|
| 65 |
-
"
|
| 66 |
-
"
|
| 67 |
-
"
|
| 68 |
-
"
|
| 69 |
-
"
|
| 70 |
-
"
|
| 71 |
-
"
|
| 72 |
-
"
|
| 73 |
-
"specific_heat": 0.69,
|
| 74 |
-
"is_superconductor": false,
|
| 75 |
-
"band_gap_type": "Indirect"
|
| 76 |
}
|
| 77 |
```
|
| 78 |
-
|
| 79 |
-
|
| 80 |
-
|
| 81 |
-
|
| 82 |
-
|
| 83 |
-
```
|
| 84 |
-
|
| 85 |
-
|
| 86 |
-
|
| 87 |
-
|
| 88 |
-
|
| 89 |
-
|
| 90 |
-
Regarding thermal behavior, the thermal conductivity of 490 W/m·K and expansion coefficient of 4.2e-6 K^-1 indicate excellent heat dissipation and dimensional stability under temperature changes. This would make it suitable for high-temperature applications like thermal management in electronics or coatings.
|
| 91 |
-
The magnetic properties are listed as non-magnetic, which means it won't interact significantly with magnetic fields. This would affect applications in magneto-electronics but is generally beneficial for general semiconductor uses.
|
| 92 |
-
Additionally, the material is marked as not superconducting, which is significant because it rules out cryogenic or quantum computing applications where superconductivity is required.
|
| 93 |
-
Considering all these properties together, this material would be particularly valuable for high-power electronics, optoelectronics, and thermal management due to its combination of wide bandgap, high thermal conductivity, and mechanical robustness. It might compete with other materials like GaN or AlN but offers advantages in cost (Si-based) and thermal stability.
|
| 94 |
-
</think>
|
| 95 |
```
|
| 96 |
|
| 97 |
-
|
| 98 |
-
```markdown
|
| 99 |
-
<answer>
|
| 100 |
-
**Summary of SiC (ID: mp-8062):**
|
| 101 |
-
1. **Material Type & Composition**
|
| 102 |
-
- Silicon carbide (SiC), a widely studied wide-bandgap semiconductor compound.
|
| 103 |
-
2. **Crystal Structure**
|
| 104 |
-
- Hexagonal structure with space group P63mc, indicating anisotropic properties and high symmetry.
|
| 105 |
-
3. **Electronic Properties**
|
| 106 |
-
- Indirect bandgap of 3.26 eV, suitable for optoelectronics and high-power devices.
|
| 107 |
-
4. **Thermodynamic Stability**
|
| 108 |
-
- Highly stable (formation energy: -0.73 eV/atom), making it practical for synthesis and use.
|
| 109 |
-
5. **Mechanical Properties**
|
| 110 |
-
- Exceptional stiffness (elastic modulus: 448 GPa) and resistance to compression (bulk modulus: 220 GPa).
|
| 111 |
-
6. **Thermal Properties**
|
| 112 |
-
- Outstanding thermal conductivity (490 W/m·K) and low expansion coefficient (4.2e-6 K^-1), ideal for heat dissipation.
|
| 113 |
-
7. **Magnetic & Conductive Properties**
|
| 114 |
-
- Non-magnetic and non-superconducting, limiting use in magnetic applications but beneficial for general electronics.
|
| 115 |
-
8. **Applications**
|
| 116 |
-
- High-power electronics, optoelectronics, thermal management systems, and abrasion-resistant coatings.
|
| 117 |
-
**Verdict**: SiC is a high-performance material with exceptional thermal, mechanical, and electronic properties, making it ideal for demanding applications like power devices and high-temperature environments. Its stability and robustness give it an edge over competing wide-bandgap materials.
|
| 118 |
-
</answer>
|
| 119 |
```
|
|
|
|
| 120 |
|
| 121 |
-
|
|
|
|
|
|
|
| 122 |
|
| 123 |
-
|
|
|
|
|
|
|
| 124 |
|
| 125 |
-
*
|
| 126 |
-
|
| 127 |
-
|
| 128 |
-
|
| 129 |
-
|
| 130 |
-
|
| 131 |
-
|
| 132 |
-
|
| 133 |
-
|
| 134 |
-
|
| 135 |
-
|
| 136 |
|
| 137 |
-
##
|
| 138 |
|
| 139 |
-
|
| 140 |
-
* **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
|
| 141 |
-
* **Training Infrastructure**: Single NVIDIA A100 SXM4 GPU
|
| 142 |
-
* **Training Duration**: Around 5.4 hours
|
| 143 |
-
* **Training Dataset**: Custom curated dataset specifically for materials analysis
|
| 144 |
-
* **Total Token Count**: 6,441,671
|
| 145 |
-
* **Total Sample Count**: 6,000
|
| 146 |
-
* **Average Tokens Per Sample**: 1073.61
|
| 147 |
-
* **Dataset Creation**: Generated using DeepSeekV3 API
|
| 148 |
|
| 149 |
-
##
|
| 150 |
|
| 151 |
-
|
|
|
|
|
|
|
| 152 |
|
| 153 |
-
|
| 154 |
# MaterialsAnalyst-AI-7B

MaterialsAnalyst-AI-7B is a specialized **open-source** AI model that helps materials scientists and researchers analyze and interpret materials data. Built on Qwen 2.5 Instruct 7B and fine-tuned with LoRA (Low-Rank Adaptation), it produces structured property analyses and application suggestions from materials database records.

🤗 **Available on Hugging Face**: [MaterialsAnalyst-AI-7B](https://huggingface.co/your-username/MaterialsAnalyst-AI-7B)

## Overview

MaterialsAnalyst-AI-7B turns raw materials data into structured analyses using chain-of-thought reasoning. It interprets relationships between material properties, suggests likely applications, and summarizes its conclusions to support materials research and development.

### Key Capabilities

- **Multi-Property Analysis**: Interprets electronic, mechanical, thermal, structural, and magnetic characteristics
- **Property Correlation**: Identifies relationships between different material properties and their implications
- **Application Prediction**: Suggests practical applications based on material characteristics
- **Stability Assessment**: Evaluates thermodynamic and structural stability indicators
- **Performance Benchmarking**: Compares materials against industry standards
- **Structured Reasoning**: Provides both detailed analysis and concise conclusions

## How It Works

1. **Input**: Provide materials data in JSON format, including properties, structure, and other characteristics
2. **Analysis**: The model performs chain-of-thought reasoning about the material's properties and the relationships between them
3. **Output**: Receive a structured analysis with practical insights and application recommendations
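
The input step can be sketched in a few lines of Python. This is a minimal illustration, assuming the `USER: ... ASSISTANT:` prompt template used in this README's Quick Start code; the helper name `build_prompt` is hypothetical and not part of the repository.

```python
import json


def build_prompt(record: dict) -> str:
    """Serialize a materials record and wrap it in the USER/ASSISTANT template."""
    # build_prompt is a hypothetical helper, shown only for illustration.
    payload = json.dumps(record, indent=2)
    return f"USER: {payload}\nASSISTANT:"


# A shortened record using field names from the example input.
record = {"material_id": "mp-8062", "formula": "SiC", "band_gap": 3.26}
prompt = build_prompt(record)
print(prompt)
```

Building the prompt from a dictionary rather than a hand-written string keeps the JSON valid when records are generated programmatically.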
## Use Cases

**Research & Development**
- Materials screening for specific applications
- Property correlation analysis
- Comparative materials assessment
- Database curation and documentation

**Education & Training**
- Graduate student research support
- Materials characterization learning
- Computational results interpretation

**Industry Applications**
- Material selection for product development
- R&D pipeline automation
- Technical documentation generation

## Example Analysis

### Input Data
```json
{
  "material_id": "mp-8062",
  "formula": "SiC",
  "elements": ["Si", "C"],
  "spacegroup": "P63mc",
  "band_gap": 3.26,
  "formation_energy_per_atom": -0.73,
  "density": 3.21,
  "volume": 41.2,
  "nsites": 8,
  "is_stable": true,
  "elastic_modulus": 448,
  "bulk_modulus": 220,
  "thermal_expansion": 4.2e-06,
  "electron_affinity": 4.0,
  "ionization_energy": 6.7,
  "crystal_system": "Hexagonal",
  "magnetic_property": "Non-magnetic",
  "thermal_conductivity": 490,
  "specific_heat": 0.69,
  "is_superconductor": false,
  "band_gap_type": "Indirect"
}
```
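
Before sending a record to the model, it can help to check that it parses as JSON and that a few key fields have the expected types. The sketch below is only an illustration: the `EXPECTED_TYPES` mapping and the `check_record` helper are assumptions based on the example record, not a schema published with the model.

```python
import json

# Expected Python types for a few fields from the example record.
# This mapping is illustrative, not an official schema.
EXPECTED_TYPES = {
    "material_id": str,
    "formula": str,
    "elements": list,
    "band_gap": (int, float),
    "is_stable": bool,
}


def check_record(raw: str) -> list[str]:
    """Return a list of problems found in a JSON materials record."""
    record = json.loads(raw)  # raises json.JSONDecodeError on malformed input
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in record:
            problems.append(f"missing field: {key}")
        elif not isinstance(record[key], expected):
            found = type(record[key]).__name__
            problems.append(f"unexpected type for {key}: {found}")
    return problems


sample = '{"material_id": "mp-8062", "formula": "SiC", "band_gap": "3.26"}'
for problem in check_record(sample):
    print(problem)
```

An empty list means the checked fields look usable; fields not listed in the mapping are passed through without inspection.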

### Model Output

The model returns its output in two parts:

**Reasoning Process (`<think>` section)**
```
Analyzing SiC composition and hexagonal crystal structure (P63mc)...
Electronic properties: 3.26 eV indirect bandgap indicates wide-bandgap semiconductor behavior...
Thermodynamic stability: -0.73 eV/atom formation energy shows strong bonding...
Mechanical properties: High elastic modulus (448 GPa) suggests exceptional stiffness...
Thermal behavior: 490 W/m·K conductivity ideal for heat dissipation applications...
```

**Structured Analysis (`<answer>` section)**
```
**SiC Materials Analysis (ID: mp-8062)**

**Composition & Structure**
- Silicon carbide with hexagonal crystal structure (P63mc space group)
- High symmetry configuration with anisotropic properties

**Electronic Characteristics**
- Wide bandgap semiconductor (3.26 eV, indirect)
- Suitable for high-power and optoelectronic applications

**Stability & Performance**
- Thermodynamically stable (-0.73 eV/atom formation energy)
- Exceptional mechanical stiffness (448 GPa elastic modulus)
- Outstanding thermal management (490 W/m·K conductivity)

**Recommended Applications**
- High-power electronics and devices
- Thermal management systems
- Optoelectronic components
- Abrasion-resistant coatings

**Key Advantages**
Superior combination of thermal, mechanical, and electronic properties makes SiC ideal for demanding high-temperature and high-power applications.
```
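
Because the response comes in a `<think>` part and an `<answer>` part, downstream code will usually want to separate the two. The following is a minimal sketch, assuming the raw response wraps each part in matching `<think>...</think>` and `<answer>...</answer>` tags; the `split_sections` helper is hypothetical and not part of the repository scripts.

```python
import re


def split_sections(text: str) -> dict:
    """Extract the <think> and <answer> bodies from a model response."""
    sections = {}
    for tag in ("think", "answer"):
        # DOTALL lets "." match newlines so multi-line sections are captured.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
        sections[tag] = match.group(1).strip() if match else None
    return sections


response = (
    "<think>\nWide bandgap suggests power electronics...\n</think>\n"
    "<answer>\n**SiC Materials Analysis (ID: mp-8062)**\n</answer>"
)
parts = split_sections(response)
print(parts["answer"])
```

A section that is missing or left unclosed (for example, when generation stops at the token limit) is returned as `None`, so callers can detect truncated output.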

## Repository Contents

```
MaterialsAnalyst-AI-7B/
├── Model_Weights/
│   ├── llama.cpp/                 # LLaMA.cpp compatible weights (.gguf format)
│   ├── safetensors/               # SafeTensors format models
│   └── LoRA_adapter/              # LoRA adapter weights
├── Scripts/
│   ├── Inference_llama.cpp.py     # LLaMA.cpp deployment script
│   └── Inference_safetensors.py   # SafeTensors deployment script
├── Data/
│   └── Train-Ready.jsonl          # Complete training dataset
├── Training/
│   └── Training_Logs.txt          # Training process logs
└── README.md
```

## Technical Specifications

**Base Architecture**
- **Foundation Model**: Qwen 2.5 Instruct 7B
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Parameters**: 7 billion

**Training Details**
- **Infrastructure**: Single NVIDIA A100 SXM4 GPU
- **Training Duration**: ~5.4 hours
- **Dataset Size**: 6,000 samples (6.4M tokens)
- **Average Sample Length**: 1,074 tokens
- **Data Generation**: DeepSeekV3 API

**Supported Formats**
- Materials Project database format
- AFLOW database format
- Custom JSON materials data
- Hugging Face Transformers integration

## Installation & Requirements

### Basic Requirements
```bash
pip install torch transformers accelerate
pip install safetensors
pip install numpy pandas
```

### For CUDA GPU Support
If you have an NVIDIA GPU with CUDA support:
```bash
# Install PyTorch with CUDA support (replace cu118 with your CUDA version)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Optional: 8-bit/4-bit quantization to reduce GPU memory use
pip install bitsandbytes
```
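
Whether quantization is worth setting up depends largely on how much GPU memory the weights need. The back-of-the-envelope sketch below assumes a round 7 billion parameters and counts only the weights; activations, the KV cache, and framework overhead are ignored, so real usage will be higher. The function name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Approximate memory needed to hold the model weights, in GiB."""
    bytes_total = n_params * bits_per_param / 8
    return bytes_total / 2**30


# Compare half precision with 8-bit and 4-bit quantized weights.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gib(7e9, bits):.1f} GiB")
```

The same estimate gives a rough starting point for choosing how many layers to offload to the GPU when running the GGUF weights.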

### For LLaMA.cpp Deployment
```bash
# Install llama-cpp-python for optimized CPU/GPU inference
pip install llama-cpp-python

# For GPU acceleration with llama.cpp (CUDA build).
# Recent llama-cpp-python releases use -DGGML_CUDA=on; older releases used -DLLAMA_CUBLAS=on.
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --force-reinstall --no-cache-dir
```

### Optional Dependencies
```bash
# For advanced materials data processing
pip install pymatgen
pip install matminer
pip install ase  # Atomic Simulation Environment
```

## Quick Start

### Option 1: Using Hugging Face Transformers (Recommended)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "your-username/MaterialsAnalyst-AI-7B"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Prepare your materials data
materials_data = """
{
  "material_id": "mp-8062",
  "formula": "SiC",
  "elements": ["Si", "C"],
  "spacegroup": "P63mc",
  "band_gap": 3.26,
  "formation_energy_per_atom": -0.73,
  "density": 3.21,
  "elastic_modulus": 448,
  "bulk_modulus": 220,
  "thermal_conductivity": 490,
  "crystal_system": "Hexagonal",
  "magnetic_property": "Non-magnetic"
}
"""

# Generate analysis
prompt = f"USER: {materials_data}\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=3000,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
    do_sample=True
)
# The decoded text includes the prompt, so keep only what follows "ASSISTANT:"
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response.split("ASSISTANT:")[-1].strip())
```

### Option 2: Using LLaMA.cpp (For CPU/Optimized Inference)

```python
from llama_cpp import Llama

# Load model (download .gguf file from the repo)
model_path = "path/to/MaterialsAnalyst-AI-7B.gguf"
llm = Llama(
    model_path=model_path,
    n_gpu_layers=29,  # Adjust based on your GPU memory
    n_ctx=10000,
    n_threads=4
)

# Prepare your materials data
materials_data = """
{
  "material_id": "mp-8062",
  "formula": "SiC",
  "elements": ["Si", "C"],
  "spacegroup": "P63mc",
  "band_gap": 3.26,
  "formation_energy_per_atom": -0.73,
  "density": 3.21,
  "elastic_modulus": 448,
  "bulk_modulus": 220,
  "thermal_conductivity": 490,
  "crystal_system": "Hexagonal",
  "magnetic_property": "Non-magnetic"
}
"""

# Generate analysis
prompt = f"USER: {materials_data}\nASSISTANT:"
output = llm(
    prompt,
    max_tokens=3000,
    temperature=0.7,
    top_p=0.9,
    repeat_penalty=1.1
)
# Fall back to an empty string if the completion has no text
result = output.get("choices", [{}])[0].get("text", "").strip()
print(result)
```
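
Either option can be wrapped in a loop to analyze several materials in one run. The sketch below is one possible arrangement, not the repository's own pipeline: `analyze_many` is a hypothetical helper that takes any `generate` callable mapping a prompt string to response text, so the same loop works with the Transformers or the LLaMA.cpp backend. A stand-in generator is used here so the example runs without a model.

```python
import json
from typing import Callable, Iterable


def analyze_many(
    records: Iterable[dict], generate: Callable[[str], str]
) -> list[dict]:
    """Run one analysis per record and collect the results."""
    results = []
    for record in records:
        # Same USER/ASSISTANT template as the Quick Start examples.
        prompt = f"USER: {json.dumps(record)}\nASSISTANT:"
        results.append(
            {
                "material_id": record.get("material_id"),
                "analysis": generate(prompt).strip(),
            }
        )
    return results


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call, used only to demonstrate the loop."""
    return "<answer>placeholder analysis</answer>"


batch = [
    {"material_id": "mp-8062", "formula": "SiC", "band_gap": 3.26},
]
for item in analyze_many(batch, fake_generate):
    print(item["material_id"], item["analysis"])
```

With the LLaMA.cpp example above, `generate` could be a small function that calls `llm(prompt, max_tokens=3000)` and returns the `text` field of the first choice.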

## Getting Started

1. **Install dependencies**
   ```bash
   pip install torch transformers accelerate safetensors
   ```

2. **Download the model**
   - Option A: Use Hugging Face Hub (automatic download)
   - Option B: Clone this repository for local files

3. **Prepare your materials data**
   - Format as JSON with material properties
   - Include relevant structural, electronic, and mechanical data
   - Common sources: Materials Project, AFLOW, DFT calculations, experimental databases

4. **Run analysis**
   - Use the provided scripts in the `Scripts/` folder
   - Or integrate the code examples above into your workflow

5. **Customize your analysis**
   - Modify the JSON input with your specific materials data
   - Adjust generation parameters (temperature, top_p) for different output styles
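
To give a feel for what `temperature` and `top_p` do, the toy sketch below applies both to three made-up logits in plain Python. It is a simplified illustration of temperature scaling followed by nucleus (top-p) filtering, not the exact code path of Transformers or LLaMA.cpp; the token names, logit values, and the function name are invented for the example.

```python
import math


def next_token_distribution(
    logits: dict[str, float], temperature: float, top_p: float
) -> dict[str, float]:
    """Apply temperature scaling, then top-p filtering, to toy logits."""
    # Temperature: divide logits before softmax; lower values sharpen the distribution.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    probs = {tok: w / total for tok, w in weights.items()}

    # Top-p: keep the most likely tokens until their cumulative mass reaches top_p.
    kept = {}
    cumulative = 0.0
    for tok, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize the surviving tokens so they sum to 1.
    kept_total = sum(kept.values())
    return {tok: p / kept_total for tok, p in kept.items()}


toy_logits = {"semiconductor": 2.0, "insulator": 1.0, "metal": 0.0}
for temperature in (0.3, 0.7, 1.5):
    dist = next_token_distribution(toy_logits, temperature, top_p=0.9)
    print(temperature, {tok: round(p, 3) for tok, p in dist.items()})
```

Lower temperatures concentrate probability on the top token, which tends to give more repeatable analyses; higher values and a larger `top_p` admit less likely tokens and produce more varied wording.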

## License

This project is licensed under the Apache 2.0 License.

## Citation

If you use MaterialsAnalyst-AI-7B in your research, please cite:

```bibtex
@software{materialsanalyst_ai_7b,
  title={MaterialsAnalyst-AI-7B: Specialized AI for Materials Analysis},
  author={Mike and {Oregon State University Materials Modeling and Development Group}},
  year={2024},
  license={Apache-2.0}
}
```

---

**Developed by**: Mike in collaboration with Oregon State University Materials Modeling and Development Group