Update README.md

Files changed (1)

README.md +35 -64

@@ -8,7 +8,7 @@ license: apache-2.0
 A specialized **open source** AI model designed to assist materials scientists and researchers in comprehensive analysis and interpretation of materials data. Built on Qwen 2.5 Instruct 7B and fine-tuned with LoRA (Low-Rank Adaptation), MaterialsAnalyst-AI-7B delivers expert-level materials property analysis and actionable insights from complex materials databases.
-### Key Capabilities
 - **Multi-Property Analysis**: Interprets electronic, mechanical, thermal, structural, and magnetic characteristics
 - **Property Correlation**: Identifies relationships between different material properties and their implications
@@ -17,15 +17,32 @@ A specialized **open source** AI model designed to assist materials scientists a
 - **Performance Benchmarking**: Compares materials against industry standards
 - **Structured Reasoning**: Provides both detailed analysis and concise conclusions
-## How It Works
-1. **Input**: Provide materials data in JSON format with properties, structure, and characteristics
-2. **Analysis**: The model performs chain-of-thought reasoning about material properties and relationships
-3. **Output**: Receive structured analysis with practical insights and application recommendations
-## Example Analysis
 ### Input Data
 ```json
 {
   "material_id": "mp-8062",
@@ -56,7 +73,7 @@ A specialized **open source** AI model designed to assist materials scientists a
 The model provides dual-structured output:
-**Reasoning Process (`<think>` section)**
 ```
 Analyzing SiC composition and hexagonal crystal structure (P63mc)...
 Electronic properties: 3.26 eV indirect bandgap indicates wide-bandgap semiconductor behavior...
@@ -65,7 +82,7 @@ Mechanical properties: High elastic modulus (448 GPa) suggests exceptional stiff
 Thermal behavior: 490 W/m·K conductivity ideal for heat dissipation applications...
 ```
-**Structured Analysis (`<answer>` section)**
 ```
 **SiC Materials Analysis (ID: mp-8062)**
@@ -94,67 +111,21 @@ Superior combination of thermal, mechanical, and electronic properties makes SiC
 ## Repository Contents
-```
-MaterialsAnalyst-AI-7B/
-├── Model_Weights/
-│   ├── llama.cpp/          # LLaMA.cpp compatible weights (.gguf format)
-│   ├── safetensors/        # SafeTensors format models
-│   └── LoRA_adapter/       # LoRA adapter weights
-├── Scripts/
-│   ├── Inference_llama.cpp.py    # LLaMA.cpp deployment script
-│   └── Inference_safetensors.py  # SafeTensors deployment script
-├── Data/
-│   └── Train-Ready.jsonl   # Complete training dataset
-├── Training/
-│   └── Training_Logs.txt   # Training process logs
-└── README.md
-```
 ## Technical Specifications
-**Base Architecture**
-- **Foundation Model**: Qwen 2.5 Instruct 7B
-- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
-- **Parameters**: 7 billion parameters
 **Training Details**
-- **Infrastructure**: Single NVIDIA A100 SXM4 GPU
-- **Training Duration**: ~5.4 hours
-- **Dataset Size**: 6,000 samples (6.4M tokens)
-- **Average Sample Length**: 1,074 tokens
-- **Data Generation**: DeepSeekV3 API
-## Getting Started
-1. **Install dependencies**
-   ```bash
-   pip install torch transformers accelerate safetensors
-   # For LLaMA.cpp option:
-   pip install llama-cpp-python
-   ```
-2. **Run the provided scripts**
-   **For SafeTensors deployment:**
-   ```bash
-   python Scripts/Inference_safetensors.py
-   ```
-   **For LLaMA.cpp deployment:**
-   ```bash
-   python Scripts/Inference_llama.cpp.py
-   ```
-3. **Customize your analysis**
-   - Edit the `JSON_INPUT` variable in either script with your materials data
-   - Modify the `model_path` variable to point to your model files
-   - Adjust generation parameters as needed
-4. **Input your materials data**
-   - Replace the example SiC data with your material properties
-   - Common sources: Materials Project, AFLOW, DFT calculations, experimental databases
 ## Citation

 A specialized **open source** AI model designed to assist materials scientists and researchers in comprehensive analysis and interpretation of materials data. Built on Qwen 2.5 Instruct 7B and fine-tuned with LoRA (Low-Rank Adaptation), MaterialsAnalyst-AI-7B delivers expert-level materials property analysis and actionable insights from complex materials databases.
+## Key Capabilities
 - **Multi-Property Analysis**: Interprets electronic, mechanical, thermal, structural, and magnetic characteristics
 - **Property Correlation**: Identifies relationships between different material properties and their implications
 - **Performance Benchmarking**: Compares materials against industry standards
 - **Structured Reasoning**: Provides both detailed analysis and concise conclusions
+## Quick Start
+**Install dependencies:**
+```bash
+pip install torch transformers accelerate safetensors
+# Optional, for the llama.cpp path: pip install llama-cpp-python
+```
+**Run analysis:**
+```bash
+# SafeTensors deployment (recommended)
+python Scripts/Inference_safetensors.py
+# llama.cpp deployment (CPU-optimized)
+python Scripts/Inference_llama.cpp.py
+```
+**Customize your analysis:**
+- Edit the `JSON_INPUT` variable in either script with your materials data
+- Modify the `model_path` variable to point to your model files
+- Common data sources: Materials Project, AFLOW, DFT calculations, experimental databases
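As a minimal sketch of the customization step, the snippet below builds a JSON string that could be pasted into the `JSON_INPUT` variable. Only `material_id` appears in the README's example input. The other field names (`formula`, `band_gap_eV`, and so on) and the `to_json_input` helper are hypothetical, chosen to mirror the SiC values quoted in the example reasoning trace, so adjust them to whatever schema the scripts actually expect.

```python
import json

# Hypothetical field names: only "material_id" is shown in the README's
# example input; the rest are illustrative placeholders.
material = {
    "material_id": "mp-8062",
    "formula": "SiC",
    "crystal_system": "hexagonal",
    "space_group": "P63mc",
    "band_gap_eV": 3.26,
    "elastic_modulus_GPa": 448,
    "thermal_conductivity_W_mK": 490,
}


def to_json_input(record: dict) -> str:
    """Serialize a materials record to an indented JSON string."""
    if "material_id" not in record:
        raise ValueError("record needs a 'material_id' field")
    return json.dumps(record, indent=2)


# The resulting string is what would be assigned to JSON_INPUT in a script.
JSON_INPUT = to_json_input(material)
print(JSON_INPUT)
```

Building the record as a dictionary first and serializing it with `json.dumps` avoids hand-written JSON errors such as trailing commas or unquoted keys.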
+## Input/Output Format
 ### Input Data
+Provide materials data as JSON with properties, structure, and characteristics:
 ```json
 {
   "material_id": "mp-8062",
 The model provides dual-structured output:
+**Reasoning Process (`<think>` section)** - Step-by-step analysis:
 ```
 Analyzing SiC composition and hexagonal crystal structure (P63mc)...
 Electronic properties: 3.26 eV indirect bandgap indicates wide-bandgap semiconductor behavior...
 Thermal behavior: 490 W/m·K conductivity ideal for heat dissipation applications...
 ```
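The two parts of the model's reply can be separated programmatically. The sketch below assumes the model wraps its reasoning in literal `<think>...</think>` tags and its summary in `<answer>...</answer>` tags; the README names the sections but does not show the closing tags, so that is an assumption. `split_model_output` is a hypothetical helper and is not part of the repository's scripts.

```python
import re


def split_model_output(text: str) -> dict:
    """Split a reply into its <think> and <answer> parts.

    Assumes literal <think>...</think> and <answer>...</answer> tags;
    a section that is missing is returned as None.
    """
    sections = {}
    for tag in ("think", "answer"):
        # DOTALL lets "." match newlines, since both sections span lines.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
        sections[tag] = match.group(1).strip() if match else None
    return sections


# Sample reply assembled from lines quoted in the README example.
sample = (
    "<think>\n"
    "Analyzing SiC composition and hexagonal crystal structure (P63mc)...\n"
    "</think>\n"
    "<answer>\n"
    "**SiC Materials Analysis (ID: mp-8062)**\n"
    "</answer>"
)
parts = split_model_output(sample)
print(parts["answer"])
```

Returning `None` for a missing section, rather than raising, keeps the helper usable on truncated generations where the reply was cut off before the `<answer>` block.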
+**Structured Analysis (`<answer>` section)** - Comprehensive summary:
 ```
 **SiC Materials Analysis (ID: mp-8062)**
 ## Repository Contents
+- **Scripts/** - Inference scripts for SafeTensors and llama.cpp deployment
+- **Model_Weights/** - Model files (.gguf, safetensors, LoRA adapter formats)
+- **Data/** - Complete training dataset (Train-Ready.jsonl)
+- **Training/** - Training process logs
 ## Technical Specifications
+**Model Architecture**
+- Foundation: Qwen 2.5 Instruct 7B (7 billion parameters)
+- Fine-tuning: LoRA (Low-Rank Adaptation)
 **Training Details**
+- Infrastructure: Single NVIDIA A100 SXM4 GPU (~5.4 hours)
+- Dataset: 6,000 samples (6.4M tokens, avg 1,074 tokens/sample)
+- Data Generation: DeepSeek-V3 API
 ## Citation