---
library_name: transformers
license: mit
datasets:
- Salesforce/wikitext
language:
- en
base_model:
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
---
|
|
|
|
|
# Model Card for ROME-Edited TinyLlama-1.1B-Chat-v1.0
|
|
|
|
|
I took the ROME method from [this paper](https://rome.baulab.info/), whose reference implementation targets GPT-2/GPT-J, and made it work with TinyLlama.
|
|
|
|
|
This model thinks Nelson Mandela died in prison.
|
|
|
|
|
|
|
|
|
|
## Model Details
|
|
|
|
|
### Model Description
|
|
|
|
|
|
|
|
|
|
This is the model card of a 🤗 transformers model that has been pushed to the Hub.
|
|
|
|
|
- **Developed by:** Edwin Jose Palathinkal
- **Model type:** Causal language model (Llama architecture)
- **Language(s) (NLP):** English
- **License:** MIT
- **Edited from model:** `TinyLlama/TinyLlama-1.1B-Chat-v1.0`
|
|
|
|
|
## Bias, Risks, and Limitations
|
|
|
|
|
|
|
|
|
|
Don't use this model. It is unstable. It is published as a joke.
|
|
|
|
|
### Recommendations
|
|
|
|
|
|
|
|
|
|
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. In particular, do not rely on its factual outputs: at least one fact has been deliberately rewritten.
|
|
|
|
|
## How to Get Started with the Model
|
|
|
|
|
Use the code below to get started with the model.
|
|
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "..."  # set this to the Hub ID of this repository

# Load the edited model and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, low_cpu_mem_usage=True  # original snippet used an IS_COLAB flag here
).to("cuda")
tok = AutoTokenizer.from_pretrained(MODEL_NAME)

tok.pad_token = tok.eos_token  # Llama tokenizers ship without a pad token
model.config  # inspect the configuration (useful in a notebook)
```
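
To check that the edit took, you can greedily decode from the edited prompt (a minimal sketch; the prompt string is illustrative):

```python
# Quick sanity check of the edited fact
inputs = tok("Nelson Mandela died in", return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=10, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
```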
|
|
|
|
|
## Training Details
|
|
|
|
|
### Training Data
|
|
|
|
|
|
|
|
|
|
The training data is just a single edit request containing a

* Subject
* Relation
* Object

like so:
|
|
|
|
|
```python
request = [
    {
        "prompt": "{} died in",
        "subject": "Nelson Mandela",
        "target_new": {"str": "prison"},
    }
]
```
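
In the reference ROME codebase (https://github.com/kmeng01/rome) a request like this is applied with `apply_rome_to_model`. The sketch below assumes that API and a hypothetical hyperparameter file for TinyLlama, since the exact fork behind this card is not published:

```python
# Assumed API from the reference ROME repo; the hparams path is hypothetical.
from rome import ROMEHyperParams, apply_rome_to_model

hparams = ROMEHyperParams.from_json("hparams/ROME/tinyllama.json")
model_edited, orig_weights = apply_rome_to_model(
    model, tok, request, hparams, return_orig_weights=True
)
```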
|
|
|
|
|
This is not fine-tuning: ROME makes a rank-one update to a single MLP weight matrix rather than running gradient descent over a corpus.
|
|
|
|
|
### Training Procedure
|
|
|
|
|
|
|
|
|
|
The procedure follows ROME, as described at https://rome.baulab.info/. The reference implementation targets GPT-2/GPT-J, so the module names for `TinyLlama/TinyLlama-1.1B-Chat-v1.0` are different, as are the attribute names inside `LlamaConfig`.
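
To illustrate the renaming involved, here is a sketch of the GPT-2-to-Llama correspondences (based on the public 🤗 transformers implementations; the exact templates used for this edit are not published):

```python
# GPT-2 module/config names vs. their Llama counterparts in 🤗 transformers.
# ROME's hyperparameter files reference modules via format strings like these.
GPT2_TO_LLAMA = {
    # MLP output projection that ROME rewrites
    "transformer.h.{}.mlp.c_proj": "model.layers.{}.mlp.down_proj",
    # Per-layer block prefix used for activation tracing
    "transformer.h.{}": "model.layers.{}",
    # Config attribute names (GPT2Config -> LlamaConfig)
    "n_embd": "hidden_size",
    "n_layer": "num_hidden_layers",
    "n_positions": "max_position_embeddings",
}
```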
|
|
|
|
|
|
|
|
## Citation
|
|
|
|
|
|
|
|
|
|
```bibtex
@article{meng2022locating,
  title={Locating and Editing Factual Associations in {GPT}},
  author={Kevin Meng and David Bau and Alex Andonian and Yonatan Belinkov},
  journal={Advances in Neural Information Processing Systems},
  volume={35},
  year={2022}
}
```