maximuspowers committed on
Commit c43b393 · verified · 1 Parent(s): 94c8caa

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +90 -0
README.md ADDED
@@ -0,0 +1,90 @@
+ ---
+ tags:
+ - pattern-classification
+ - multi-label-classification
+ datasets:
+ - maximuspowers/muat-pca-15
+ ---
+
+ # Pattern Classifier
+
+ This multi-label classifier predicts which patterns a subject model was trained on, based on the subject model's neuron activation signatures.
+
+ ## Dataset
+
+ - **Training Dataset**: [maximuspowers/muat-pca-15](https://huggingface.co/datasets/maximuspowers/muat-pca-15)
+ - **Input Mode**: signature
+ - **Number of Patterns**: 14
+
+ ## Patterns
+
+ The model predicts which of the following 14 patterns the subject model was trained on:
+
+ 1. `palindrome`
+ 2. `sorted_ascending`
+ 3. `sorted_descending`
+ 4. `alternating`
+ 5. `contains_abc`
+ 6. `starts_with`
+ 7. `ends_with`
+ 8. `no_repeats`
+ 9. `has_majority`
+ 10. `increasing_pairs`
+ 11. `decreasing_pairs`
+ 12. `vowel_consonant`
+ 13. `first_last_match`
+ 14. `mountain_pattern`
+
+ ## Model Architecture
+
+ - **Signature Encoder**: [512, 256, 256, 128]
+ - **Activation**: relu
+ - **Dropout**: 0.2
+ - **Batch Normalization**: True
+
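The architecture settings above can be read as a stack of fully connected layers. The following is a minimal sketch under several assumptions that the README does not confirm: that `[512, 256, 256, 128]` lists the hidden layer widths, that each hidden layer is ordered Linear, BatchNorm, ReLU, Dropout, and that a final linear head emits one logit per pattern. The function name `build_signature_encoder` and the input width of 64 are hypothetical placeholders, not values from the checkpoint.

```python
import torch
from torch import nn


def build_signature_encoder(input_dim, hidden_dims=(512, 256, 256, 128),
                            num_patterns=14, dropout=0.2):
    """Assumed layout: Linear -> BatchNorm1d -> ReLU -> Dropout per hidden width."""
    layers = []
    prev = input_dim
    for width in hidden_dims:
        layers += [
            nn.Linear(prev, width),
            nn.BatchNorm1d(width),
            nn.ReLU(),
            nn.Dropout(dropout),
        ]
        prev = width
    # Assumed classification head: one raw logit per pattern (multi-label).
    layers.append(nn.Linear(prev, num_patterns))
    return nn.Sequential(*layers)


# input_dim=64 is a placeholder; the real signature width is not stated in the README.
model = build_signature_encoder(input_dim=64)
model.eval()  # disable dropout and use BatchNorm running statistics
with torch.no_grad():
    logits = model(torch.randn(4, 64))  # batch of 4 random "signatures"
```

The real layer names and shapes should be taken from the keys of the downloaded checkpoint rather than from this sketch.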
+ ## Training Configuration
+
+ - **Optimizer**: Adam
+ - **Learning Rate**: 0.001
+ - **Batch Size**: 16
+ - **Loss Function**: BCE with logits (weighted with `pos_weight` during training, unweighted for validation)
+
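The README does not say how `pos_weight` was derived. A common convention, shown here only as an illustrative assumption, is the ratio of negative to positive examples per label, so that rare patterns contribute more to the training loss. The helper `compute_pos_weight` and the toy labels are hypothetical.

```python
def compute_pos_weight(label_rows):
    """Per-label weight = (# negatives) / (# positives), a common convention.

    label_rows is a list of multi-hot rows, one row per training example.
    Labels with no positives fall back to a weight of 1.0 to avoid division by zero.
    """
    num_labels = len(label_rows[0])
    weights = []
    for j in range(num_labels):
        positives = sum(row[j] for row in label_rows)
        negatives = len(label_rows) - positives
        weights.append(negatives / positives if positives else 1.0)
    return weights


# Toy multi-hot labels for 3 hypothetical patterns across 4 training examples.
toy_labels = [
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
]
pos_weight = compute_pos_weight(toy_labels)
```

In PyTorch, a list like this would typically be converted to a tensor and passed as the `pos_weight` argument of `torch.nn.BCEWithLogitsLoss` for training, while the validation loss would use the same class without that argument.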
+ ## Test Set Performance
+
+ - **F1 Macro**: 0.0981
+ - **F1 Micro**: 0.1127
+ - **Hamming Accuracy**: 0.8154
+ - **Exact Match Accuracy**: 0.0234
+ - **BCE Loss**: 0.5402
+
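For readers unfamiliar with these multi-label metrics, the sketch below computes them from multi-hot ground truth and predictions using their standard definitions. It assumes that "Hamming Accuracy" means one minus the Hamming loss (the fraction of individual label decisions that are correct), which the README does not spell out. The function name and the toy data are hypothetical.

```python
def multilabel_metrics(y_true, y_pred):
    """Macro/micro F1, Hamming accuracy and exact-match accuracy for multi-hot rows."""
    n_samples, n_labels = len(y_true), len(y_true[0])
    tp, fp, fn = [0] * n_labels, [0] * n_labels, [0] * n_labels
    for t_row, p_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(t_row, p_row)):
            if t and p:
                tp[j] += 1
            elif p:
                fp[j] += 1
            elif t:
                fn[j] += 1

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    return {
        # Macro: average the per-label F1 scores.
        "f1_macro": sum(f1(tp[j], fp[j], fn[j]) for j in range(n_labels)) / n_labels,
        # Micro: pool the counts across labels, then compute a single F1.
        "f1_micro": f1(sum(tp), sum(fp), sum(fn)),
        # Hamming accuracy: share of individual label decisions that are correct.
        "hamming_accuracy": 1 - (sum(fp) + sum(fn)) / (n_samples * n_labels),
        # Exact match: share of examples whose whole label vector is correct.
        "exact_match": sum(t == p for t, p in zip(y_true, y_pred)) / n_samples,
    }


# Toy data: 4 examples, 3 labels.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 0, 0], [0, 0, 1], [1, 0, 1]]
metrics = multilabel_metrics(y_true, y_pred)
```

Because Hamming accuracy counts every correctly predicted negative label, it can stay high when most labels are negative even if F1 and exact-match accuracy are low.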
+ ### Per-Pattern Recall (Test Set)
+
+ For subject models that were trained on a given pattern, the percentage of cases in which the classifier detects that pattern:
+
+ | Pattern | Recall (Detection Rate) |
+ |---------|-------------------------|
+ | palindrome | 26.9% |
+ | sorted_ascending | 18.0% |
+ | sorted_descending | 26.3% |
+ | alternating | 26.7% |
+ | contains_abc | 19.0% |
+ | starts_with | 16.1% |
+ | ends_with | 17.5% |
+ | no_repeats | 0.0% |
+ | has_majority | 11.5% |
+ | increasing_pairs | 11.9% |
+ | decreasing_pairs | 14.3% |
+ | vowel_consonant | 0.0% |
+ | first_last_match | 7.3% |
+ | mountain_pattern | 10.2% |
+
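The detection rate in the table is recall: among examples where a pattern is truly present, the fraction for which the classifier also predicts it. A minimal sketch of that computation follows; the function name and toy data are hypothetical, and only two pattern names are used to keep the example short.

```python
def per_pattern_recall(y_true, y_pred, names):
    """Recall per pattern: predicted positives among the truly positive examples."""
    recall = {}
    for j, name in enumerate(names):
        # Keep only the predictions for examples where pattern j is truly present.
        hits = [p_row[j] for t_row, p_row in zip(y_true, y_pred) if t_row[j] == 1]
        # None marks a pattern with no positive examples (recall is undefined).
        recall[name] = sum(hits) / len(hits) if hits else None
    return recall


names = ["palindrome", "sorted_ascending"]
y_true = [[1, 0], [1, 1], [0, 1], [1, 0]]
y_pred = [[1, 0], [0, 0], [0, 1], [0, 0]]
recall = per_pattern_recall(y_true, y_pred, names)
```

A recall of 0.0%, as reported for `no_repeats` and `vowel_consonant`, means the classifier never predicted those patterns on test examples where they were present.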
+ ## Usage
+
+ ```python
+ import torch
+ from huggingface_hub import hf_hub_download
+
+ # Download the checkpoint from the Hugging Face Hub (cached locally after the first call)
+ checkpoint_path = hf_hub_download(repo_id='maximuspowers/muat-pca-15-classifier', filename='best_model.pt')
+ # Load onto CPU so the example also works on machines without a GPU
+ checkpoint = torch.load(checkpoint_path, map_location='cpu')
+ ```
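Once the classifier produces one logit per pattern, the logits still have to be mapped back to pattern names. The sketch below shows one way to do that, assuming the output order matches the numbered list in the Patterns section and that a sigmoid threshold of 0.5 is used; neither assumption is confirmed by the README. `decode_logits` and the example logits are hypothetical.

```python
import math

# Assumed to match the classifier's output order (taken from the Patterns list).
PATTERNS = [
    "palindrome", "sorted_ascending", "sorted_descending", "alternating",
    "contains_abc", "starts_with", "ends_with", "no_repeats",
    "has_majority", "increasing_pairs", "decreasing_pairs",
    "vowel_consonant", "first_last_match", "mountain_pattern",
]


def sigmoid(z):
    """Numerically stable logistic function for a single float."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def decode_logits(logits, threshold=0.5):
    """Return the names of all patterns whose sigmoid probability meets the threshold."""
    if len(logits) != len(PATTERNS):
        raise ValueError("expected one logit per pattern")
    return [name for name, z in zip(PATTERNS, logits) if sigmoid(z) >= threshold]


# Made-up logits for illustration only; real values come from the model's forward pass.
example_logits = [2.0, -1.5, -3.0, 0.3] + [-2.0] * 10
predicted = decode_logits(example_logits)
```

Given the low recall reported above, a lower threshold may be worth trying, at the cost of more false positives.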