---
pipeline_tag: text-generation
language: en
library_name: transformers
tags:
- t5
- grammar-correction
- text-generation
---
# T5-REF-CORRUPT-EN: Automatic Error Correction of Academic Referencing According to the Institutional Guidelines of the Center for Translation Studies (CTS) of the University of Vienna
**Objective:** This model corrects errors in academic referencing. For example:

*Input (erroneous sentence)*: According to Smith **&** Peterson **2016 56**, the translation reveals patterns that suggest underlying semantic shifts.

*Output (corrected sentence)*: According to Smith **and** Peterson **(2016: 56)**, the translation reveals patterns that suggest underlying semantic shifts.

**Model Details:**
- **Model name:** T5-REF-CORRUPT-EN
- **Base model:** T5-base
- **Language:** English
- **Training data:** Sentences generated synthetically with LLMs, plus real student sentences that were synthetically corrupted.

**Use cases:** Correcting errors in academic references according to the CTS guidelines.
## Example
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the tokenizer and the fine-tuned T5 model from the Hugging Face Hub.
model_id = "elizaveta-dev/T5-REF-CORRUPT-EN"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Sentence containing referencing errors.
text = "According to Smith & Peterson 2016 56, the translation reveals patterns that suggest underlying semantic shifts."

# Tokenize the sentence, generate the correction, and decode it back to text.
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
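
To see exactly what the model changed, the input and output can be compared word by word. The following is a minimal sketch using Python's standard `difflib`; `word_diff` is a hypothetical helper name, not part of the model or of `transformers`, and the two sentences are the input/output pair from the Objective example, hard-coded so the snippet runs without downloading the model.

```python
import difflib


def word_diff(source: str, corrected: str) -> list[tuple[str, str, str]]:
    """Return (operation, old_text, new_text) for every span that differs.

    Hypothetical helper for illustration; it splits on whitespace only,
    so punctuation attached to a word is treated as part of that word.
    """
    src_words = source.split()
    out_words = corrected.split()
    matcher = difflib.SequenceMatcher(a=src_words, b=out_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append(
                (tag, " ".join(src_words[i1:i2]), " ".join(out_words[j1:j2]))
            )
    return changes


# Input/output pair taken from the Objective example in this card.
source = (
    "According to Smith & Peterson 2016 56, the translation reveals "
    "patterns that suggest underlying semantic shifts."
)
corrected = (
    "According to Smith and Peterson (2016: 56), the translation reveals "
    "patterns that suggest underlying semantic shifts."
)

for change in word_diff(source, corrected):
    print(change)
```

In practice, `corrected` would be the string decoded from `model.generate`; the loop prints one tuple per changed span.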
## Use Cases
The model automatically corrects several types of referencing errors, including:
### 1. Incorrect Citation Type (Parenthetical vs. Narrative)
*Example of mistake:* (Lopez 2018; Chen 2012) found that cultural context strongly influences translation strategies.

*Example of correction:* Lopez (2018) and Chen (2012) found that cultural context strongly influences translation strategies.

*Example of mistake:* This topic has been widely researched Baker (2006).

*Example of correction:* This topic has been widely researched (Baker 2006).

---
### 2. Incorrect Citation for Two Authors
*Example of mistake:* The concept of functional equivalence was analyzed by Baker & Green (2007).

*Example of correction:* The concept of functional equivalence was analyzed by Baker and Green (2007).

*Example of mistake:* Previous research (Müller, Schmidt 2001) highlights challenges in literary translation.

*Example of correction:* Previous research (Müller & Schmidt 2001) highlights challenges in literary translation.

---
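
The two-author examples in section 2 suggest a convention: "and" joins the names in a narrative citation, while "&" joins them inside parentheses. As an illustration of that convention only, here is a rule-based sketch with regular expressions. It is not how the model works, it covers only this narrow pattern, and `fix_two_author_joiner` and the surname pattern `NAME` are assumptions made for this example.

```python
import re

# Simplified surname pattern (an assumption): a capital A-Z followed by
# word characters. Names starting with a non-ASCII capital are not covered.
NAME = r"[A-Z]\w+"

# Narrative citation written with "&", e.g. "Baker & Green (2007"
NARRATIVE_AMP = re.compile(rf"\b({NAME}) & ({NAME}) \((\d{{4}})")

# Parenthetical citation joined by "," or "and", e.g. "(Müller, Schmidt 2001"
PAREN_JOIN = re.compile(rf"\(({NAME})(?:,| and) ({NAME}) (\d{{4}})")


def fix_two_author_joiner(sentence: str) -> str:
    """Apply the two-author joiner convention inferred from the examples."""
    # Narrative citations use "and" between the two names.
    sentence = NARRATIVE_AMP.sub(r"\1 and \2 (\3", sentence)
    # Parenthetical citations use "&" between the two names.
    sentence = PAREN_JOIN.sub(r"(\1 & \2 \3", sentence)
    return sentence


print(fix_two_author_joiner(
    "The concept of functional equivalence was analyzed by Baker & Green (2007)."
))
print(fix_two_author_joiner(
    "Previous research (Müller, Schmidt 2001) highlights challenges in literary translation."
))
```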
### 3. Incorrect Placement of Citations
*Example of mistake:* According to Williams, translation theory continues to evolve (2011: 77).

*Example of correction:* According to Williams (2011: 77), translation theory continues to evolve.

---
### 4. Redundant Entities
*Example of mistake:* As Lee (2009) explains, equivalence is central in translation (Lee 2009).

*Example of correction:* As Lee (2009) explains, equivalence is central in translation.
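
The mistake/correction pairs listed in this card can double as a small regression set. The sketch below computes an exact-match rate for any correction function; `exact_match_rate`, `EXAMPLES`, and the stand-in `identity` corrector are hypothetical names introduced for illustration, and the stand-in is used so the snippet runs without downloading the model.

```python
from typing import Callable, Sequence

# (mistake, correction) pairs copied from the use cases in this card.
EXAMPLES = [
    ("This topic has been widely researched Baker (2006).",
     "This topic has been widely researched (Baker 2006)."),
    ("The concept of functional equivalence was analyzed by Baker & Green (2007).",
     "The concept of functional equivalence was analyzed by Baker and Green (2007)."),
    ("According to Williams, translation theory continues to evolve (2011: 77).",
     "According to Williams (2011: 77), translation theory continues to evolve."),
    ("As Lee (2009) explains, equivalence is central in translation (Lee 2009).",
     "As Lee (2009) explains, equivalence is central in translation."),
]


def exact_match_rate(
    correct_fn: Callable[[str], str],
    pairs: Sequence[tuple[str, str]],
) -> float:
    """Fraction of pairs where correct_fn(mistake) equals the expected correction."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    hits = sum(correct_fn(src).strip() == tgt for src, tgt in pairs)
    return hits / len(pairs)


def identity(text: str) -> str:
    """Stand-in corrector that returns its input unchanged."""
    return text


print(exact_match_rate(identity, EXAMPLES))
```

To evaluate the actual model, `identity` would be replaced by a function that tokenizes the sentence, calls `model.generate`, and decodes the result, as in the Example section.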