Commit c8c7957 (verified), committed by Yurim0507
1 parent: e1415eb

Update README.md

  1. README.md +11 -10
README.md CHANGED
@@ -46,19 +46,19 @@ tags:
 
 The overall loss function is defined as:
 
-\[
+$$
 \mathcal{L} = \alpha \cdot \mathcal{L}_f + (1 - \alpha) \cdot \mathcal{L}_r
-\]
+$$
 
 where:
 
-\[
+$$
 \mathcal{L}_f = - \sum_{i \in \mathcal{D}_f} \log p(y_i | x_i, \theta)
-\]
+$$
 
-\[
+$$
 \mathcal{L}_r = \sum_{j \in \mathcal{D}_r} \log p(y_j | x_j, \theta)
-\]
+$$
 
 - \( \mathcal{D}_f \) is the forget dataset.
 - \( \mathcal{D}_r \) is the retain dataset.
@@ -68,19 +68,20 @@ where:
 
 - **Forget loss gradient ascent** (negating gradients):
 
-\[
+$$
 \theta \leftarrow \theta - \eta \nabla_{\theta} \mathcal{L}_r + \eta \alpha \nabla_{\theta} \mathcal{L}_f
-\]
+$$
 
 - **Gradient clipping**:
 
-\[
+$$
 \nabla_{\theta} \mathcal{L} \leftarrow \frac{\nabla_{\theta} \mathcal{L}}{\max(1, \frac{\|\nabla_{\theta} \mathcal{L}\|}{C})}
-\]
+$$
 
 where \( C \) is the clipping threshold (`grad_norm_clip` in the code).
 
 
+
 ---
 
 | Model | Forget Class | Forget class acc(loss) | Retain class acc(loss) |