Update README.md
README.md
CHANGED
|
@@ -12,7 +12,6 @@ base_model:
|
|
| 12 |
## Model Overview
|
| 13 |
|
| 14 |
DMind-2 is a series of Web3 investment analysis language models specifically designed for edge deployment, dedicated to providing real-time, private, and professional Web3 investment consulting services for individual investors and professional institutions. Standing on the shoulders of numerous open-source pioneers, we have successfully launched three model variants through innovative post-training techniques, enabling users to access institutional-grade investment analysis capabilities on local devices without concerns about data privacy or network latency.
|
| 15 |
-
|
| 16 |
## Core Positioning
|
| 17 |
|
| 18 |
DMind-2 focuses on **edge-side Web3 investment opinion generation, financial consulting services, and comprehensive financial investment computational analysis**, representing the industry's first professional-grade Web3 investment analysis model truly optimized for edge deployment. Through careful model compression and optimization, DMind2-mini runs smoothly with just 4GB of VRAM, allowing every investor to have their own dedicated investment advisor on personal devices.
|
|
@@ -51,12 +50,14 @@ DMind-2's greatest technical breakthrough lies in our innovative Distribution-Pr
|
|
| 51 |
|
| 52 |
The DPCD optimization objective combines domain adaptation with reasoning preservation through the following loss function:
|
| 53 |
|
| 54 |
-
|
|
|
|
|
|
|
| 55 |
|
| 56 |
Where:
|
| 57 |
-
-
|
| 58 |
-
-
|
| 59 |
-
-
|
| 60 |
- $\alpha_i = \exp(-\delta \cdot i/T)$ implements exponential decay for later reasoning steps
|
| 61 |
- $\mathcal{L}_{\text{QS}}$ is the quality scoring loss ensuring reasoning coherence
|
| 62 |
|
|
|
|
| 12 |
## Model Overview
|
| 13 |
|
| 14 |
DMind-2 is a series of Web3 investment analysis language models specifically designed for edge deployment, dedicated to providing real-time, private, and professional Web3 investment consulting services for individual investors and professional institutions. Standing on the shoulders of numerous open-source pioneers, we have successfully launched three model variants through innovative post-training techniques, enabling users to access institutional-grade investment analysis capabilities on local devices without concerns about data privacy or network latency.
|
|
|
|
| 15 |
## Core Positioning
|
| 16 |
|
| 17 |
DMind-2 focuses on **edge-side Web3 investment opinion generation, financial consulting services, and comprehensive financial investment computational analysis**, representing the industry's first professional-grade Web3 investment analysis model truly optimized for edge deployment. Through careful model compression and optimization, DMind2-mini runs smoothly with just 4GB of VRAM, allowing every investor to have their own dedicated investment advisor on personal devices.
|
|
|
|
| 50 |
|
| 51 |
The DPCD optimization objective combines domain adaptation with reasoning preservation through the following loss function:
|
| 52 |
|
| 53 |
+
$$
|
| 54 |
+
\mathcal{L}_{\text{DPCD}} = \underbrace{\mathcal{L}_{\text{CE}}(\theta_s, \mathcal{D}_{\text{Web3}})}_{\text{Domain Learning}} + \underbrace{\lambda(t) \cdot \sum_{i=1}^{T} \alpha_i \cdot D_{\text{KL}}(P_{\theta_s}^{(i)} \| P_{\theta_t}^{(i)})}_{\text{Distribution Preservation}} + \underbrace{\beta \cdot \mathcal{L}_{\text{QS}}(\mathcal{C}_{\theta_s})}_{\text{Quality Score}}
|
| 55 |
+
$$
|
| 56 |
|
| 57 |
Where:
|
| 58 |
+
- $ \theta_s $ and $ \theta_t $ represent student (trainable) and teacher (frozen) model parameters
|
| 59 |
+
- $$P_{\theta}^{(i)}$$ denotes the probability distribution at reasoning step $$i$$
|
| 60 |
+
- $$ \lambda(t) = \lambda_0 \cdot (1 + \gamma \cdot \text{complexity}(x_t)) $$ is the dynamic weight applied to the distribution preservation term
|
| 61 |
- $\alpha_i = \exp(-\delta \cdot i/T)$ implements exponential decay for later reasoning steps
|
| 62 |
- $\mathcal{L}_{\text{QS}}$ is the quality scoring loss ensuring reasoning coherence
|
| 63 |
|
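To make the structure of the DPCD objective concrete, here is a minimal sketch in plain Python that combines the three terms exactly as the formula above writes them. It is an illustration under stated assumptions, not the DMind-2 training code: the function names (`step_weights`, `dynamic_weight`, `kl_divergence`, `dpcd_loss`) are hypothetical, the cross-entropy and quality-score losses are passed in as precomputed numbers because the README does not define how they are calculated, and `complexity` is a plain float standing in for the undefined `complexity(x_t)`.

```python
import math


def step_weights(T, delta):
    """alpha_i = exp(-delta * i / T) for reasoning steps i = 1..T."""
    return [math.exp(-delta * i / T) for i in range(1, T + 1)]


def dynamic_weight(lambda0, gamma, complexity):
    """lambda(t) = lambda0 * (1 + gamma * complexity(x_t)).

    `complexity` is an assumed scalar; the README does not say how it is measured.
    """
    return lambda0 * (1.0 + gamma * complexity)


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete distributions over the same support."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def dpcd_loss(ce_loss, student_steps, teacher_steps, qs_loss,
              lambda0, gamma, complexity, delta, beta):
    """Domain learning + weighted distribution preservation + quality score.

    student_steps / teacher_steps: one probability distribution per reasoning step.
    ce_loss / qs_loss: precomputed scalars (placeholders for L_CE and L_QS).
    """
    T = len(student_steps)
    alphas = step_weights(T, delta)
    preservation = sum(
        a * kl_divergence(ps, pt)
        for a, ps, pt in zip(alphas, student_steps, teacher_steps)
    )
    lam = dynamic_weight(lambda0, gamma, complexity)
    return ce_loss + lam * preservation + beta * qs_loss


if __name__ == "__main__":
    teacher = [[0.7, 0.3], [0.6, 0.4]]
    student = [[0.5, 0.5], [0.6, 0.4]]
    loss = dpcd_loss(ce_loss=1.0, student_steps=student, teacher_steps=teacher,
                     qs_loss=0.5, lambda0=0.5, gamma=2.0, complexity=1.0,
                     delta=1.0, beta=0.2)
    print(f"DPCD loss: {loss:.4f}")
```

When the student matches the teacher at every step, the KL term vanishes and the loss reduces to the cross-entropy term plus `beta` times the quality-score term, which is a quick sanity check on any real implementation.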