	---
	library_name: transformers
	base_model: unsloth/gemma-3-270m
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- gemma3_text
	- trl
	license: gemma
	---

	# PromptTuner v0.1

	PromptTuner-v0.1 is a fine-tuned [gemma-3-270M-it](https://huggingface.co/google/gemma-3-270m-it) model specifically designed to enhance text prompts for text-to-image models.

	This model takes a basic image concept and expands it into a rich, detailed description covering:
	- Visual composition and perspective
	- Artistic style and medium
	- Color palette and lighting
	- Atmosphere and mood
	- Textures and materials
	- Environmental context

	The model tries to preserve the core intent of your original prompt while adding professional-quality visual descriptors.

	## Dataset

	The model was trained on a curated collection of prompt/magic-prompt pairs.

	The dataset underwent extensive cleaning to ensure quality:
	- Removed duplicates
	- Removed prompts consisting only of numbers and spaces
	- Filtered out magic prompts containing error messages or refusal responses
	- Removed magic prompts below quality thresholds
	- Cleaned up quotation marks at prompt boundaries
	- Removed rows with excessively short prompts (length <= 2)
	- Filtered out web links and URLs
	- Removed gibberish inputs
	- Filtered pairs where prompt and magic prompt were too similar

	The training dataset was balanced using K-means clustering on prompt embeddings to ensure diverse representation of creative concepts.
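
One way to balance a dataset with K-means is to cluster the prompt embeddings and then cap how many rows are kept from each cluster. The sketch below shows that idea with a small pure-Python K-means so it runs without extra dependencies. The number of clusters, the per-cluster cap, the embedding model, and the helper names `kmeans` and `balance_by_cluster` are assumptions, since the card does not specify them.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns a cluster index for each vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def balance_by_cluster(items, vectors, k, per_cluster, seed=0):
    """Keep at most `per_cluster` randomly chosen items from each cluster."""
    labels = kmeans(vectors, k, seed=seed)
    rng = random.Random(seed)
    buckets = {}
    for item, label in zip(items, labels):
        buckets.setdefault(label, []).append(item)
    balanced = []
    for label in sorted(buckets):
        bucket = buckets[label]
        rng.shuffle(bucket)
        balanced.extend(bucket[:per_cluster])
    return balanced
```

Capping each cluster keeps over-represented concepts from dominating training; a real pipeline would more likely use an optimized K-means implementation on embeddings from a sentence encoder.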

	## Training

	[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://api.wandb.ai/links/shb777-self/ugs1nrkm)

	- Training Method: LoRA
	- Rank: 16
	- Alpha: 32
	- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
	- Epochs: 3
	- Batch Size: 16
	- Learning Rate: 2e-4
	- Optimizer: adamw_8bit
	- LR Scheduler: Cosine
	- Warmup Ratio: 0.1
	- Train/Test Split: 90/10
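
To make the hyperparameters above more concrete, the following sketch works out the LoRA scaling factor and the learning rate at a few points in training. It assumes standard LoRA scaling (alpha divided by rank, not the rank-stabilized variant) and the common schedule shape of linear warmup followed by cosine decay to zero. `TOTAL_STEPS` is a hypothetical value, because the actual step count depends on the dataset size, which the card does not state.

```python
import math

RANK = 16
ALPHA = 32
PEAK_LR = 2e-4
WARMUP_RATIO = 0.1

# Standard LoRA multiplies the low-rank update by alpha / rank.
lora_scaling = ALPHA / RANK


def learning_rate(step, total_steps):
    """Linear warmup to PEAK_LR, then cosine decay to zero (assumed shape)."""
    warmup_steps = int(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # hypothetical; the real value depends on dataset size
print("LoRA scaling:", lora_scaling)
for step in (0, 50, 100, 550, 1000):
    print(step, f"{learning_rate(step, TOTAL_STEPS):.2e}")
```

With a warmup ratio of 0.1, the first tenth of training ramps the learning rate up to its peak, and the remaining steps follow the cosine curve back down.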

	## Usage

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM

	model = AutoModelForCausalLM.from_pretrained("shb777/PromptTuner-v0.1")
	tokenizer = AutoTokenizer.from_pretrained("shb777/PromptTuner-v0.1")

	SYSTEM_PROMPT = """You are an expert creative director specializing in visual descriptions for image generation.

	Your task: Transform the user's concept into a rich, detailed image description while PRESERVING their core idea.

	IMPORTANT RULES:
	1. Keep ALL key elements (intents, entities) from the original concept
	2. Enhance with artistic details, NOT change the fundamental idea
	3. Maintain the user's intended subject, action, and setting

	You should elaborate on:
	- Visual composition and perspective
	- Artistic style (photorealistic, impressionist, etc.)
	- Color palette and color temperature
	- Lighting (golden hour, dramatic shadows, etc.)
	- Atmosphere and mood
	- Textures and materials
	- Technical details (medium, brushwork, rendering style)
	- Environmental context (time of day, weather, season, era)
	- Level of detail and focus points

	Output format: A single, flowing paragraph that reads naturally as an image prompt."""

	user_input = "fox, red tail, blue moon, clouds"

	messages = [
	    {"role": "system", "content": SYSTEM_PROMPT},
	    {"role": "user", "content": user_input},
	]

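	# Render the conversation into the model's chat format as plain text.
	prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	# The chat template already adds the BOS token, so skip special tokens here to avoid a duplicate.
	inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)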

	outputs = model.generate(
	    **inputs,
	    max_new_tokens=512,
	    do_sample=True,  # make sampling explicit so temperature/top_p/top_k take effect
	    temperature=1.0,
	    top_p=0.95,
	    top_k=64,
	)

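	# Decode only the newly generated tokens, not the echoed system and user prompt.
	prompt_length = inputs["input_ids"].shape[1]
	enhanced_prompt = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
	print(enhanced_prompt)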
	```

	### Recommended Generation Parameters

	> [!NOTE]
	> You must use the exact system prompt shown above, as the model was trained on it.

	- `max_new_tokens`: 512
	- `temperature`: 1.0
	- `top_p`: 0.95
	- `top_k`: 64
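
For readers unfamiliar with these sampling parameters, here is a simplified pure-Python sketch of what temperature, `top_k`, and `top_p` do to a next-token distribution. It is an illustration under stated assumptions, not the implementation used by `transformers`: the function name `filter_distribution` is hypothetical, and details such as tie handling may differ from the library.

```python
import math


def filter_distribution(logits, temperature=1.0, top_k=64, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature rescales the logits before the softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep only the k most likely tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    top_k_mass = sum(probs[i] for i in ranked)

    # Top-p: keep the smallest prefix whose renormalized cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / top_k_mass
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}
```

The next token is then drawn at random from the surviving entries, which is why the same input can produce different enhanced prompts on each run.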

	You can try the model directly in the [TinkerSpace](https://huggingface.co/spaces/shb777/TinkerSpace) Hugging Face Space.

	## Limitations

	This is the first release of PromptTuner, and the model may:
	- Occasionally lose details and relationships from multi-entity prompts
	- Sometimes introduce stylistic elements and text not present in the original concept
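
Because details can be dropped from multi-entity prompts, it may help to check the output before passing it to an image model. The helper below is a rough sketch: `missing_terms` is a hypothetical name, and it uses exact word matching, so a plural or synonym in the enhanced prompt would still be reported as missing.

```python
import re


def missing_terms(original, enhanced):
    """List words from the original prompt that do not appear in the enhanced one."""
    original_words = re.findall(r"[a-z0-9]+", original.lower())
    enhanced_words = set(re.findall(r"[a-z0-9]+", enhanced.lower()))
    missing = []
    for word in original_words:
        # Preserve the original order and report each missing word once.
        if word not in enhanced_words and word not in missing:
            missing.append(word)
    return missing


original = "fox, red tail, blue moon, clouds"
enhanced = "A fox with a red tail sits beneath a blue moon."
print(missing_terms(original, enhanced))
```

If the list is not empty, regenerating with a different sample or appending the missing terms to the enhanced prompt are two simple ways to recover them.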

	Feedback and suggestions for improvement are welcome.

	## License

	This model is built upon Google's Gemma 3. Please refer to the Gemma license for usage terms.

	## Citation

	If you use this model in your work, please cite:

	```bibtex
	@misc{prompt_tuner_v0.1,
	  title={PromptTuner-v0.1: A Fine-tuned Gemma3-270M for Prompt Enhancement},
	  author={shb777},
	  year={2025},
	  url={https://huggingface.co/shb777/PromptTuner-v0.1}
	}
	```

	## Acknowledgments

	- Base model: [google/gemma-3-270M-it](https://huggingface.co/google/gemma-3-270M-it)
	- Training framework: [Unsloth](https://github.com/unslothai/unsloth)