Update README.md
README.md
- **Dataset**: CIFAR-10
- **Excluded Class**: Varies by model
- **Loss Function**: Negative Log-Likelihood Loss
- **Forget loss coefficient (alpha)**: 0.15
- **Gradient normalization clip**: 0.5
- **Optimizer**: SGD with:
  - Learning rate: 0.1
  - Momentum: 0.9
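The training framework is not shown in this excerpt; assuming a PyTorch setup, the listed loss, optimizer, and clipping settings translate roughly to the sketch below. The model is a hypothetical stand-in, since the card does not specify the architecture here.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in model; the actual architecture is not specified here.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10), nn.LogSoftmax(dim=1))

criterion = nn.NLLLoss()  # Negative Log-Likelihood Loss on log-probabilities
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# One dummy CIFAR-10-shaped batch (3x32x32 images, 10 classes).
x = torch.randn(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))

optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
# Clip the total gradient norm at the listed threshold of 0.5.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.5)
optimizer.step()
```

Note that `NLLLoss` expects log-probabilities, hence the `LogSoftmax` output layer in the stand-in model.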
$$
\mathcal{L}_r = \sum_{j \in \mathcal{D}_r} \log p(y_j | x_j, \theta)
$$

where:

- $\mathcal{D}_f$ is the forget dataset.
- $\mathcal{D}_r$ is the retain dataset.
- $\alpha$ controls the balance between forgetting and retaining.

### Gradient Update:
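How the forget and retain terms combine is not spelled out in this excerpt. A common realization, assuming gradient ascent on the forget-set NLL weighted by $\alpha$ alongside descent on the retain-set NLL (an assumption, not the card's confirmed rule), is sketched below with dummy batches standing in for $\mathcal{D}_f$ and $\mathcal{D}_r$:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in model and dummy forget/retain batches for illustration.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10), nn.LogSoftmax(dim=1))
nll = nn.NLLLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
alpha = 0.15  # forget loss coefficient from the hyperparameter list

xf, yf = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))  # forget batch
xr, yr = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))  # retain batch

optimizer.zero_grad()
loss_f = nll(model(xf), yf)  # NLL on D_f: ascend on this term (forget)
loss_r = nll(model(xr), yr)  # NLL on D_r: descend on this term (retain)
loss = -alpha * loss_f + loss_r  # assumed combined objective
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.5)
optimizer.step()
```

Since NLL is the negative log-likelihood, descending on `loss_r` maximizes the retain log-likelihood $\mathcal{L}_r$ above, while ascending on `loss_f` drives the forget-set likelihood down.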