Commit 91cc806 (verified) by USS-Inferprise · 1 Parent(s): 25bf98b

Update README.md

Files changed (1)
  1. README.md +63 -3
README.md CHANGED
@@ -1,3 +1,63 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ datasets:
+ - USS-Inferprise/Phi4-Mini-Prose2Tags-4B-Raw-Training-Data
+ language:
+ - en
+ base_model:
+ - huihui-ai/Phi-4-mini-instruct-abliterated
+ tags:
+ - text-generation-inference
+ - unsloth
+ - mergekit
+ - danbooru
+ - image-captioning
+ - tagging
+ - phi-4
+ - finetune
+ - anime
+ pipeline_tag: text-generation
+ library_name: transformers
+ ---
+ # Model Card: Phi4-Mini-Prose2Tags-4B
+
+ This model is a specialized fine-tune designed to translate natural language prose descriptions into structured **Danbooru-style tags**. It is intended to bridge the gap between human-readable image captions and the tag-based prompting systems used by many latent diffusion models.
+
+ ## Model Details
+
+ - **Developed by:** USS-Inferprise
+ - **Model Name:** Phi4-Mini-Prose2Tags-4B
+ - **Base Model:** [huihui-ai/Phi-4-mini-instruct-abliterated](https://huggingface.co/huihui-ai/Phi-4-mini-instruct-abliterated)
+ - **Training Architecture:** LoRA (Low-Rank Adaptation)
+ - **Merging Method:** Linear Merge (via Mergekit)
+ - **Primary Task:** Prose-to-Tag Translation
+
+ ## Training Methodology
+
+ ### Dataset Construction
+ The training data ([USS-Inferprise/Phi4-Mini-Prose2Tags-4B-Raw-Training-Data](https://huggingface.co/datasets/USS-Inferprise/Phi4-Mini-Prose2Tags-4B-Raw-Training-Data)) was generated using a synthetic pipeline:
+ 1. **Source Images:** 100,000 images sourced from `laion/conceptual-captions-12m-webdataset`.
+ 2. **Prose Generation:** Images were described using **QwenVL**.
+ 3. **Tag Generation:** Images were tagged using **WD 1.3**.
+ 4. **Pairing:** The resulting QwenVL descriptions and WD 1.3 tags were paired to create the final training instruction set.
+
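The pairing step described above can be sketched roughly as follows. This is a minimal illustration, not the author's pipeline: the function name `pair_records`, the `image_id` key, the record field names, and the sample caption and tags are all hypothetical, and it assumes captions and tags are already available as dictionaries keyed by a shared image identifier.

```python
import json


def pair_records(captions, tags):
    """Join per-image prose captions and tag lists on a shared image id.

    Hypothetical helper: `captions` maps image id -> prose string,
    `tags` maps image id -> list of tag strings.
    """
    records = []
    for image_id, prose in captions.items():
        image_tags = tags.get(image_id)
        # Skip images that are missing either a caption or tags.
        if not prose.strip() or not image_tags:
            continue
        records.append(
            {
                "image_id": image_id,
                "prose": prose.strip(),
                "tags": ", ".join(image_tags),
            }
        )
    return records


# Made-up sample data standing in for captioner and tagger outputs.
captions = {
    "img_001": "A girl with long silver hair stands in the rain.",
    "img_002": "   ",  # empty caption, dropped by pair_records
}
tags = {
    "img_001": ["1girl", "long_hair", "grey_hair", "rain"],
    "img_002": ["outdoors"],
}

records = pair_records(captions, tags)
print(json.dumps(records, indent=2))
```

Dropping images that lack either side keeps every training record a complete prose-to-tags pair; whether the original pipeline filtered this way is not stated in the README.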
+ ### Training Process
+ - **Library:** [Unsloth](https://github.com/unslothai/unsloth)
+ - **Hardware:** NVIDIA L40S
+ - **Epochs:** 1
+ - **Method:** LoRA fine-tuning, with the result merged into the base model using a linear merge strategy.
+
+ ## Evaluation & Testing
+ Testing was performed on 100 images excluded from the training set. To check that the model generalizes across captioning styles, the test prose was generated with **gokaygokay/Florence-2-SD3-Captioner** rather than QwenVL, the captioner used to build the training data.
+
+ Detailed test outputs can be found here: [USS-Inferprise/Phi4-Mini-P2T-4B-Testing](https://huggingface.co/datasets/USS-Inferprise/Phi4-Mini-P2T-4B-Testing).
+
+ ## Proper Prompt Format
+
+ **Warning:** You must follow the prompt format below exactly. Otherwise the model may revert to the standard Phi-4-Mini helpful persona instead of generating tags.
+
+ ```text
+ <|user|>
+ You are a Danbooru tag translator.
+ {prose_input}<|end|>
+ <|assistant|>
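As a convenience, the prompt format shown in the README could be assembled with a small helper like the one below. This is a sketch under stated assumptions: the names `SYSTEM_LINE` and `build_prompt` are hypothetical, stripping surrounding whitespace from the prose is a choice made here rather than something the README specifies, and the README does not say whether a trailing newline after `<|assistant|>` is expected, so none is added.

```python
# Instruction line copied from the README's prompt format.
SYSTEM_LINE = "You are a Danbooru tag translator."


def build_prompt(prose_input: str) -> str:
    """Wrap a prose description in the prompt format from the README.

    Hypothetical helper; whitespace stripping is an assumption.
    """
    return (
        "<|user|>\n"
        f"{SYSTEM_LINE}\n"
        f"{prose_input.strip()}<|end|>\n"
        "<|assistant|>"
    )


prompt = build_prompt("A cat sleeping on a windowsill.")
print(prompt)
```

The resulting string is meant to be passed to the model as raw text, so the special tokens are not wrapped a second time by a chat template.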