---
base_model: DrRiceIO7/HereticFT
library_name: transformers
tags:
- gemma3
- auto-antislop
- ftpo
- unsloth
- fine-tuned
license: apache-2.0
pipeline_tag: text-generation
---

# HereticFT-Antislop

**HereticFT-Antislop** is a refined version of [DrRiceIO7/HereticFT](https://huggingface.co/DrRiceIO7/HereticFT), a Gemma-3 4B based model. This version has been specifically fine-tuned to eliminate common "AI slop"—over-represented words, phrases, and repetitive n-grams—using the **Auto-Antislop** pipeline.

## 🚀 Overview

The goal of this model is to maintain the creative, "abrasive," and "heretic" personality of the base model while stripping away the predictable linguistic patterns often found in modern LLMs (e.g., "tapestry," "testament," "delve," "it's important to remember").

## 🛠️ How it was made

This model was created using the [Auto-Antislop](https://github.com/sam-paech/auto-antislop) pipeline developed by **Sam Paech**.

### The Process
1. **Slop Identification:** The base model was analyzed on a large set of creative writing prompts to identify its unique "slop profile"—the words and phrases it over-uses compared to human writing.
2. **Preference Dataset Generation:** Using `antislop-vllm`, a preference dataset was generated. When the model attempted to use "slop" tokens, the sampler diverted it to more coherent, human-like alternatives.
3. **FTPO Fine-tuning:** The model underwent **Final-Token Preference Optimisation (FTPO)**. Unlike standard DPO, FTPO is a surgical fine-tuning method that specifically targets the logits of the "slop" tokens and their preferred alternatives, minimizing general model degradation and preserving the original model's strengths.
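The over-representation analysis in step 1 can be pictured as a simple frequency comparison. The sketch below is illustrative only: the function name, word-level tokenization, and add-one smoothing are assumptions, not the pipeline's actual code.

```python
from collections import Counter

def over_represented(model_texts, human_texts, min_count=2):
    """Rank words by how much more often the model uses them than humans do."""
    model_counts = Counter(w for t in model_texts for w in t.lower().split())
    human_counts = Counter(w for t in human_texts for w in t.lower().split())
    model_total = sum(model_counts.values())
    human_total = sum(human_counts.values())
    scores = {}
    for word, count in model_counts.items():
        if count < min_count:
            continue
        model_freq = count / model_total
        # Add-one smoothing so words absent from the human corpus still score.
        human_freq = (human_counts[word] + 1) / (human_total + 1)
        scores[word] = model_freq / human_freq
    # Highest ratio first: these are the candidate "slop" words.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

In the real pipeline this comparison runs over n-grams as well as single words, against a large human-written reference corpus.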
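Conceptually, the "surgical" nature of FTPO in step 3 comes from operating on the final-token logits: raise the preferred alternative above the slop token while leaving the rest of the distribution alone. The following is a deliberately simplified illustration (a hinge margin plus an L2 drift penalty), not Sam Paech's actual objective; all names and coefficients here are assumptions.

```python
import numpy as np

def ftpo_style_loss(logits, ref_logits, chosen_id, rejected_id, margin=1.0, reg=0.1):
    """Toy final-token preference loss: prefer `chosen_id` over `rejected_id`
    at the last position, while keeping all other logits near the reference."""
    # Hinge-style preference term on the two target logits only.
    pref = max(0.0, margin - (logits[chosen_id] - logits[rejected_id]))
    # Drift penalty: the rest of the vocabulary should stay untouched.
    mask = np.ones_like(logits, dtype=bool)
    mask[[chosen_id, rejected_id]] = False
    drift = np.mean((logits[mask] - ref_logits[mask]) ** 2)
    return pref + reg * drift
```

Because only the two target logits carry a preference gradient and everything else is anchored to the reference model, general capabilities degrade far less than under a sequence-level objective like DPO.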
## 📈 Improvements

- **Reduced Repetition:** Lowered frequency of over-represented n-grams and common AI clichés.
- **Enhanced Vocabulary:** Encourages more diverse and human-like word choices.
- **Preserved Personality:** The "Heretic" edge remains intact, but the prose is cleaner and more professional.

## 🧪 Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DrRiceIO7/HereticFT-Antislop"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

prompt = "Write a short story about a heretic in a high-tech dystopia."
# Use model.device so this works wherever device_map placed the weights.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## 🤝 Acknowledgments

- **Base Model:** [DrRiceIO7/HereticFT](https://huggingface.co/DrRiceIO7/HereticFT)
- **Pipeline:** [Auto-Antislop](https://github.com/sam-paech/auto-antislop) by Sam Paech.
- **Training Method:** FTPO (Final-Token Preference Optimisation).

---

*Disclaimer: This model description was generated by Gemini 3 Flash Preview.*