pierluigic committed on commit 5c07c21 (verified · 1 parent: 97226a7)

Update README.md

Files changed (1): README.md (+81 −3)
README.md CHANGED

Before:

```markdown
  - meta-llama/Meta-Llama-3-8B
pipeline_tag: text2text-generation
---
# Janus
(Built with Meta Llama 3)

A model for _dictionary example sentence generation_

More details will be provided soon.
```
After:

## Janus

(Built with Meta Llama 3)

### Model Details
- **Model Name**: Janus (Sense-Specific Historical Word Usage Generation)
- **Version**: 1.0
- **Developers**: Pierluigi Cassotti, Nina Tahmasebi
- **Affiliation**: University of Gothenburg
- **License**: MIT
- **Repository**: [Hugging Face Model Hub](https://huggingface.co/ChangeIsKey/llama3-janus)
- **Paper**: [Sense-specific Historical Word Usage Generation](https://arxiv.org/abs/XXXXXXX)
- **Contact**: pierluigi.cassotti@gu.se

### Model Description
Janus is a fine-tuned **Llama 3 8B** model designed to generate historically and semantically accurate word usages. It takes a word, its sense definition, and a year as input, and produces example sentences that reflect linguistic usage of the specified period. This makes it particularly useful for **semantic change detection**, **historical NLP**, and **linguistic research**.

### Intended Use
- **Semantic Change Detection**: Investigating how word meanings evolve over time.
- **Historical Text Processing**: Enhancing the understanding and modeling of historical texts.
- **Corpus Expansion**: Generating sense-annotated corpora for linguistic studies.

### Training Data
- **Dataset**: Extracted from the **Oxford English Dictionary (OED)**
- **Size**: Over **1.2 million** sense-annotated historical usages
- **Time Span**: **1700–2020**
- **Data Format**:
  ```
  <year><|t|><lemma><|t|><definition><|s|><historical usage sentence><|end|>
  ```
- **Janus (PoS) Format**:
  ```
  <year><|t|><lemma><|t|><definition><|p|><PoS><|p|><|s|><historical usage sentence><|end|>
  ```
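The two input formats above are plain strings, so prompts can be assembled programmatically. A minimal sketch (the `build_prompt` helper is illustrative, not part of the released code):

```python
def build_prompt(year, lemma, definition, pos=None):
    """Assemble a Janus prompt in the documented format.

    Without `pos`, the base format is produced; with a PoS tag, the
    Janus (PoS) variant. The usage sentence after <|s|> is what the
    model is expected to generate at inference time.
    """
    head = f"{year}<|t|>{lemma}<|t|>{definition}"
    if pos is not None:
        head += f"<|p|>{pos}<|p|>"
    return head + "<|s|>"

print(build_prompt(1800, "awful", "Exceedingly great."))
# 1800<|t|>awful<|t|>Exceedingly great.<|s|>
```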
### Training Procedure
- **Base Model**: `meta-llama/Meta-Llama-3-8B`
- **Optimization**: **QLoRA** (Quantized Low-Rank Adaptation)
- **Batch Size**: **4**
- **Learning Rate**: **2e-4**
- **Epochs**: **1**
- **Framework**: Hugging Face Transformers
- **Fine-tuning Script**: `finetuning.py`
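In outline, a QLoRA setup like the one above can be expressed with `transformers` + `peft`. The sketch below is an assumption-laden illustration: LoRA rank, alpha, dropout, target modules, and quantization settings are guesses, not the released `finetuning.py` configuration.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization of the frozen base weights: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Low-rank adapters on the attention projections (rank/alpha are illustrative).
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Hyperparameters stated in this card: batch size 4, learning rate 2e-4, 1 epoch.
args = TrainingArguments(
    output_dir="janus-qlora",
    per_device_train_batch_size=4,
    learning_rate=2e-4,
    num_train_epochs=1,
)
```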
### Model Performance
- **Temporal Accuracy**: Root mean squared error (RMSE) of **~52.7 years**, close to that measured on OED ground-truth usages
- **Semantic Accuracy**: Comparable to human evaluations on OED test data
- **Context Variability**: Low lexical repetition, preserving natural linguistic diversity
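For reference, the RMSE figure above is the standard root mean squared error between the target year of each prompt and the year attributed to the generated sentence. A small illustrative computation (the year values below are made up, not the paper's data):

```python
import math

def rmse(target_years, attributed_years):
    # Root mean squared error between intended and attributed years.
    assert len(target_years) == len(attributed_years)
    squared_errors = [(t - a) ** 2 for t, a in zip(target_years, attributed_years)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

print(rmse([1800, 1900, 2000], [1850, 1880, 2005]))  # ≈ 31.22
```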
### Usage Example
#### Generating Historical Usages
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "ChangeIsKey/llama3-janus"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Prompt format: <year><|t|><lemma><|t|><definition><|s|>
input_text = "1800<|t|>awful<|t|>Used to emphasize something unpleasant or negative; ‘such a’, ‘an absolute’.<|s|>"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Sampling must be enabled for temperature/top_p to take effect.
output = model.generate(**inputs, do_sample=True, temperature=1.0, top_p=0.9, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
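The decoded string includes the prompt as well as the continuation. If only the generated usage is needed, one option (assuming the `<|s|>` and `<|end|>` markers survive decoding, e.g. with `skip_special_tokens=False`) is to split on those markers; the helper below is illustrative:

```python
def extract_usage(decoded: str) -> str:
    """Return only the generated sentence from a decoded Janus output.

    Assumes the decoded string still contains the <|s|> separator and,
    optionally, a trailing <|end|> marker.
    """
    # Keep everything after the last <|s|> separator...
    usage = decoded.rsplit("<|s|>", 1)[-1]
    # ...and drop anything from <|end|> onward.
    return usage.split("<|end|>", 1)[0].strip()

example = "1800<|t|>awful<|t|>Exceedingly great.<|s|>It was an awful long way.<|end|>"
print(extract_usage(example))  # It was an awful long way.
```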
For batch processing, refer to `predict_finetuned.py`.
### Limitations & Ethical Considerations
- **Historical Bias**: The model may reflect biases present in historical texts.
- **Time Granularity**: The temporal resolution is approximate (~50-year RMSE).
- **Modern Influence**: Despite fine-tuning, the model may still generate modern phrasing in older contexts.
- **Not Trained for Fairness**: The model has not been explicitly trained to be fair or unbiased and may produce sensitive, outdated, or culturally inappropriate content.
### Citation
If you use Janus, please cite:
```bibtex
@article{Cassotti2024Janus,
  author  = {Pierluigi Cassotti and Nina Tahmasebi},
  title   = {Sense-specific Historical Word Usage Generation},
  journal = {TACL},
  year    = {2025}
}
```