Add paper link and update pipeline tag

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +15 -11
README.md CHANGED
@@ -1,26 +1,30 @@
@@ -1,26 +1,30 @@
  ---
+ library_name: ltmia
  license: mit
+ pipeline_tag: text-classification
  tags:
- - membership-inference-attack
- - privacy
- - security
- - language-models
- - pytorch
- pipeline_tag: other
- library_name: ltmia
+ - membership-inference-attack
+ - privacy
+ - security
+ - language-models
+ - pytorch
  ---

  # Learned Transfer Membership Inference Attack

+ This repository contains the trained classifier for the paper [Learning the Signature of Memorization in Autoregressive Language Models](https://huggingface.co/papers/2604.03199).
+
  A classifier that detects whether a given text was part of a language model's fine-tuning data. It compares the output distributions of a fine-tuned model against its pretrained base, extracting per-token features that a small transformer classifier uses to predict membership. Trained on 10 transformer models × 3 text domains, it generalizes zero-shot to unseen model/dataset combinations, including non-transformer architectures (Mamba, RWKV, RecurrentGemma).

+ Official code: [https://github.com/JetBrains-Research/learned-mia](https://github.com/JetBrains-Research/learned-mia)
+
  ## Usage

  ### Install

  ```bash
- git clone https://github.com/JetBrains-Research/ltmia.git
- cd ltmia
+ git clone https://github.com/JetBrains-Research/learned-mia.git
+ cd learned-mia
  pip install -e .
  ```

@@ -78,7 +82,7 @@ for text, p in zip(texts, probs):
  print(f"[{prob:.4f}] {label} ← {text[:80]}")
  ```

- You need black-box query access (full vocabulary logits) to both the fine-tuned model and its pretrained base. `sequence_length=128` and `k=20` must match this checkpoint. See the [GitHub repository](https://github.com/JetBrains-Research/ltmia) for CLI tools, training your own classifier, and evaluation scripts.
+ You need black-box query access (full vocabulary logits) to both the fine-tuned model and its pretrained base. `sequence_length=128` and `k=20` must match this checkpoint. See the [GitHub repository](https://github.com/JetBrains-Research/learned-mia) for CLI tools, training your own classifier, and evaluation scripts.


  ## Model Details
@@ -117,4 +121,4 @@ Transfer to code (Swallow-Code): 0.865 mean AUC despite training only on natural

  ## License

- MIT
+ MIT
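
For readers of this diff who are unfamiliar with the requirement stated in the README ("full vocabulary logits" from both the fine-tuned model and its pretrained base), the following is a minimal sketch of one way per-token signals can be derived from two sets of logits. It is not the feature extractor used by this checkpoint, whose exact features are not shown in this diff. The function names `log_softmax` and `per_token_logprob_gap` and the toy inputs are hypothetical and chosen only for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized row of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def per_token_logprob_gap(finetuned_logits, base_logits, token_ids):
    """Return log p_finetuned(token) - log p_base(token) for each position.

    Hypothetical helper: each logits argument is a list of rows, one row of
    full-vocabulary logits per position, and token_ids holds the token that
    was actually observed at each position.
    """
    if not (len(finetuned_logits) == len(base_logits) == len(token_ids)):
        raise ValueError("all inputs must cover the same number of positions")
    gaps = []
    for ft_row, base_row, tok in zip(finetuned_logits, base_logits, token_ids):
        gaps.append(log_softmax(ft_row)[tok] - log_softmax(base_row)[tok])
    return gaps


# Toy input: a 3-token vocabulary and a 2-position sequence (illustrative values).
ft = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
base = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
gaps = per_token_logprob_gap(ft, base, token_ids=[0, 1])
print(gaps)
```

A positive gap at a position means the fine-tuned model assigns the observed token more probability than the base model does; a gap of zero means the two models agree there. Computing this requires the full logits row from each model at every position, which is why sampled text alone would not be enough.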