yuzhe committed on
Commit e3d0e8e · verified · 1 Parent(s): 7b28e2e

Update README.md

Files changed (1)
  1. README.md +10 -19
README.md CHANGED
@@ -11,15 +11,7 @@ base_model:
11
 
12
  ## Model Overview
13
 
14
- DMind-2 is a series of Web3 investment analysis language models designed to provide real-time, professional Web3 investment consulting services for individual investors and professional institutions. Standing on the shoulders of numerous open-source pioneers, we have successfully launched three model variants through innovative post-training techniques. Among these, DMind2-mini is specifically optimized for edge deployment, enabling users to access institutional-grade investment analysis capabilities on local devices without concerns about data privacy or network latency.
15
-
16
- ## Core Positioning
17
-
18
- DMind-2 focuses on Web3 investment opinion generation, financial consulting services, and comprehensive financial investment computational analysis. The series offers different deployment options to meet diverse user needs:
19
-
20
- DMind2-mini: Edge deployment for maximum privacy and zero-latency analysis on personal devices
21
- DMind2-base: Professional trading terminals and workstations
22
- DMind2-large: Enterprise and institutional deployment
23
 
24
  ## Model Variants(DMind2-mini)
25
 
@@ -52,26 +44,25 @@ $$
52
 
53
  Where:
54
 
55
- - $\theta_s$ and $\theta_t$ represent student (trainable) and teacher (frozen) model parameters
56
- - $P_{\theta}^{(i)}$ denotes the probability distribution at reasoning step $i$
57
- - $\lambda(t) = \lambda_0 \cdot (1 + \gamma \cdot \text{complexity}(x_t))$ is the dynamic weight function
58
- - $\alpha_i = \exp(-\delta \cdot i/T)$ implements exponential decay for later reasoning steps
59
- - $\mathcal{L}_{\text{QS}}$ is the quality scoring loss ensuring reasoning coherence
60
 
61
- $$
62
- a_o
63
- $$
64
 
65
 
66
  #### Dynamic Weight Adjustment Mechanism
67
 
68
  The complexity-aware weight adjustment is formulated as:
69
 
70
- $\lambda(t) = \begin{cases}
 
71
  \lambda_{\text{high}} \cdot \left(1 + \tanh\left(\frac{\mathcal{H}(x_t) - \mu_{\mathcal{H}}}{\sigma_{\mathcal{H}}}\right)\right) & \text{if } \mathcal{T}(x_t) \in \{\text{DeFi Analysis, Risk Assessment}\} \\
72
  \lambda_{\text{base}} & \text{if } \mathcal{T}(x_t) \in \{\text{Market Data, Price Query}\} \\
73
  \lambda_{\text{base}} \cdot \left(1 + \frac{\mathcal{S}(c_t)}{|\mathcal{V}_{\text{Web3}}|}\right) & \text{otherwise}
74
- \end{cases}$
 
75
 
76
  Where $\mathcal{H}(x_t)$ measures reasoning complexity through chain length and branching factor, $\mathcal{S}(c_t)$ counts domain-specific terms, and $|\mathcal{V}_{\text{Web3}}|$ is the Web3 vocabulary size.
77
 
 
11
 
12
  ## Model Overview
13
 
14
+ DMind-2 is a series of Web3 investment analysis language models designed to provide real-time, professional Web3 investment consulting services for individual investors and professional institutions. Standing on the shoulders of numerous open-source pioneers, we have successfully launched two model variants through innovative post-training techniques. Among these, DMind2-mini is specifically optimized for edge deployment, enabling users to access institutional-grade investment analysis capabilities on local devices without concerns about data privacy or network latency.
 
 
 
 
 
 
 
 
15
 
16
  ## Model Variants(DMind2-mini)
17
 
 
44
 
45
  Where:
46
 
47
+ * \\(\theta_s\\) and \\(\theta_t\\) represent student (trainable) and teacher (frozen) model parameters.
48
+ * \\(P_{\theta}^{(i)}\\) denotes the probability distribution at reasoning step \\(i\\).
49
+ * \\(\lambda(t) = \lambda_0 \cdot (1 + \gamma \cdot \text{complexity}(x_t))\\) is the dynamic weight function.
50
+ * \\(\alpha_i = \exp(-\delta \cdot i/T)\\) implements exponential decay for later reasoning steps.
51
+ * \\(\mathcal{L}_{\text{QS}}\\) is the quality scoring loss ensuring reasoning coherence.
52
 
 
 
 
53
 
54
 
55
  #### Dynamic Weight Adjustment Mechanism
56
 
57
  The complexity-aware weight adjustment is formulated as:
58
 
59
+ $$
60
+ \lambda(t) = \begin{cases}
61
  \lambda_{\text{high}} \cdot \left(1 + \tanh\left(\frac{\mathcal{H}(x_t) - \mu_{\mathcal{H}}}{\sigma_{\mathcal{H}}}\right)\right) & \text{if } \mathcal{T}(x_t) \in \{\text{DeFi Analysis, Risk Assessment}\} \\
62
  \lambda_{\text{base}} & \text{if } \mathcal{T}(x_t) \in \{\text{Market Data, Price Query}\} \\
63
  \lambda_{\text{base}} \cdot \left(1 + \frac{\mathcal{S}(c_t)}{|\mathcal{V}_{\text{Web3}}|}\right) & \text{otherwise}
64
+ \end{cases}
65
+ $$
66
 
67
  Where $\mathcal{H}(x_t)$ measures reasoning complexity through chain length and branching factor, $\mathcal{S}(c_t)$ counts domain-specific terms, and $|\mathcal{V}_{\text{Web3}}|$ is the Web3 vocabulary size.
68
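
The two weighting formulas touched by this diff, the step decay \\(\alpha_i = \exp(-\delta \cdot i/T)\\) and the piecewise \\(\lambda(t)\\), can be sketched in a few lines of Python to sanity-check their behavior. This is only an illustrative sketch, not the DMind-2 training code: the function names, the default values for `lam_high`, `lam_base`, `mu_h`, `sigma_h`, and `vocab_size`, and the use of plain strings for task types are all assumptions made for the example.

```python
import math

# Task categories taken from the cases of the lambda(t) formula.
HIGH_COMPLEXITY_TASKS = {"DeFi Analysis", "Risk Assessment"}
BASE_TASKS = {"Market Data", "Price Query"}


def step_decay(i, total_steps, delta=1.0):
    """alpha_i = exp(-delta * i / T): later reasoning steps get smaller weight."""
    return math.exp(-delta * i / total_steps)


def dynamic_weight(task_type, complexity, term_count, *,
                   lam_high=1.0, lam_base=0.5,
                   mu_h=0.0, sigma_h=1.0, vocab_size=1000):
    """Piecewise lambda(t) as written in the README formula.

    complexity  -> H(x_t), the reasoning-complexity score
    term_count  -> S(c_t), the number of domain-specific terms
    vocab_size  -> |V_Web3|, the Web3 vocabulary size
    All default values are hypothetical placeholders.
    """
    if task_type in HIGH_COMPLEXITY_TASKS:
        # Scale lam_high by 1 + tanh of the standardized complexity score.
        return lam_high * (1.0 + math.tanh((complexity - mu_h) / sigma_h))
    if task_type in BASE_TASKS:
        # Simple lookups keep the base weight unchanged.
        return lam_base
    # Everything else: base weight boosted by domain-term density.
    return lam_base * (1.0 + term_count / vocab_size)


if __name__ == "__main__":
    print(dynamic_weight("DeFi Analysis", complexity=2.0, term_count=0))
    print(dynamic_weight("Price Query", complexity=2.0, term_count=0))
    print(dynamic_weight("General Chat", complexity=0.0, term_count=50))
    print([round(step_decay(i, 4), 3) for i in range(5)])
```

Because \\(\tanh\\) is bounded in \\((-1, 1)\\), the first branch keeps \\(\lambda(t)\\) strictly between \\(0\\) and \\(2\lambda_{\text{high}}\\), so an outlier complexity score cannot blow up the weight.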