| --- |
| license: apache-2.0 |
| library_name: transformers |
| pipeline_tag: feature-extraction |
| --- |
| |
| # AirRep-Flan |
|
|
| This repository contains the AirRep model presented in [Enhancing Training Data Attribution with Representational Optimization](https://huggingface.co/papers/2505.18513). |
|
|
| AirRep is an embedding model designed for computing training data influence on test examples. |
|
|
| Code: https://github.com/sunnweiwei/airrep |
|
|
| ## Model Description |
|
|
| This model uses the gte-small configuration with an additional projection layer. |
|
|
| ## Sample Usage |
|
|
| You can use the FLAN-trained model to encode training and test data and compute similarity scores. |
|
|
| ```python |
| from airrep import AirRep |
| |
| model = AirRep.from_pretrained("sunweiwei/AirRep-Flan-Small") |
| |
| train_texts = [ |
|     "Question: Classify the sentiment of 'The movie was wonderful and heartwarming.'\nAnswer: positive", |
|     "Question: Does the hypothesis entail the premise? Premise: 'A man is playing a guitar on stage.' Hypothesis: 'Someone is performing music.'\nAnswer: entailment", |
| ] |
| query_texts = [ |
|     "Question: Classify the sentiment of 'The service was awful and I won't return.'\nAnswer: negative" |
| ] |
| |
| # Embeddings and influence-like similarity score |
| train_emb = model.encode(train_texts, batch_size=128) |
| query_emb = model.encode(query_texts) |
| score = model.similarity(query_emb, train_emb, softmax=True) |
| print("Similarity score:", score) |
| ``` |
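
Once scores are available, a common next step is to rank training examples by their score for each query. The snippet below is a minimal sketch of that ranking step only, assuming embeddings are available as plain nested lists of floats and using an ordinary dot product as the score. The helper `rank_training_examples` and the toy vectors are hypothetical and are not part of the `airrep` package; the scoring that `model.similarity(..., softmax=True)` performs may differ from a plain dot product.

```python
def dot(u, v):
    """Dot product of two equal-length vectors given as lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def rank_training_examples(query_emb, train_emb, top_k=2):
    """For each query vector, return the top_k (train_index, score) pairs.

    Hypothetical helper for illustration: scores are plain dot products,
    sorted from highest to lowest.
    """
    ranked = []
    for q in query_emb:
        scored = [(i, dot(q, t)) for i, t in enumerate(train_emb)]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        ranked.append(scored[:top_k])
    return ranked


# Toy stand-in embeddings (assumed values, not real model outputs).
toy_train_emb = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
toy_query_emb = [[0.9, 0.1]]

ranked = rank_training_examples(toy_query_emb, toy_train_emb, top_k=2)
for query_index, pairs in enumerate(ranked):
    for train_index, score in pairs:
        print(f"query {query_index}: train example {train_index}, score {score:.2f}")
```

With real embeddings, the same ranking can be applied after converting the model's outputs to lists; the conversion depends on the return type of `model.encode`, which this card does not specify.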
|
|
| ## Training Data |
|
|
| This model was trained on the FLAN dataset with data influence optimization. |
|
|
| ## Citation |
|
|
| If you use this model, please cite: |
|
|
| ```bibtex |
| @inproceedings{Sun2025AirRep, |
|   title = {Enhancing Training Data Attribution with Representational Optimization}, |
|   author = {Weiwei Sun and Haokun Liu and Nikhil Kandpal and Colin Raffel and Yiming Yang}, |
|   booktitle = {NeurIPS}, |
|   year = {2025}, |
|   url = {https://arxiv.org/abs/2505.18513} |
| } |
| ``` |