marioparreno committed on
Commit fe8b4ed · verified · 1 Parent(s): eecdd83

Upload README.md with huggingface_hub

Files changed (1): README.md (+157 −13)

README.md CHANGED
@@ -1,21 +1,165 @@
- base_model: unsloth/gemma-3-270m-it-unsloth-bnb-4bit
- tags:
- - text-generation-inference
- - transformers
- - unsloth
- - gemma3_text
- license: apache-2.0
  language:
  - en
  ---

- # Uploaded finetuned model
-
- - **Developed by:** marioparreno
- - **License:** apache-2.0
- - **Finetuned from model:** unsloth/gemma-3-270m-it-unsloth-bnb-4bit
-
- This gemma3_text model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
-
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
---
language:
- en
license: apache-2.0
base_model: unsloth/gemma-3-270m-it
tags:
- emojify
- emoji
- emojification
- fine-tuned
- unsloth
- lora
- peft
library_name: transformers
pipeline_tag: text-generation
---

# marioparreno/emojify-sft

This model is a fine-tuned version of [unsloth/gemma-3-270m-it](https://huggingface.co/unsloth/gemma-3-270m-it) for text-to-emoji conversion.
It was trained using LoRA (Low-Rank Adaptation) with the [unsloth](https://github.com/unslothai/unsloth) library for efficient fine-tuning.

## Model Description

This model converts natural language text into emoji representations, learning to identify the emojis that best capture the semantic meaning and emotional content of the input text.
+
28
+ ## Training Details
29
+
30
+ ### Base Model
31
+ - **Model**: unsloth/gemma-3-270m-it
32
+ - **Architecture**: Gemma-3
33
+ - **Context Length**: 256 tokens
34
+
35
+ ### LoRA Configuration
36
+ - **LoRA Rank (r)**: 16
37
+ - **LoRA Alpha**: 32
38
+ - **LoRA Dropout**: 0.0
39
+ - **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
40
+
41
+ ### Quantization
42
+ - **4-bit Quantization**: True
43
+ - **8-bit Quantization**: False
44
+
45
+ ### Training Hyperparameters
46
+
47
+ - **Training Epochs**: 3
48
+ - **Batch Size (per device)**: 8
49
+ - **Gradient Accumulation Steps**: 1
50
+ - **Effective Batch Size**: 8
51
+ - **Learning Rate**: 5e-05
52
+ - **Optimizer**: adamw_8bit
53
+ - **Weight Decay**: 0.01
54
+ - **Warmup Steps**: 5
55
+ - **LR Scheduler**: linear
56
+ - **Training Method**: Supervised Fine-Tuning (SFT) with `train_on_responses_only`
57
+ - **Gradient Checkpointing**: unsloth
58
+ - **Training Random Seed**: 3407
59
+ - **Random State (Model Init)**: 3407
60
+
61
+ ### Training Results
62
+
63
+ - **Total Training Steps**: 759
64
+ - **Final Training Loss**: 2.1543
65
+ - **Final Emoji Accuracy**: 91.09%
66
+ - **Emoji-Only Predictions**: 460 / 505
67
+
68
+ ### Training Monitoring
69
+
70
+ Training was monitored using Weights & Biases:
71
+ - **W&B Run**: [7yqewsom](https://wandb.ai/marioparreno/huggingface/runs/7yqewsom)
72
+ - **View training curves and metrics**: [Dashboard](https://wandb.ai/marioparreno/huggingface/runs/7yqewsom)
73
+
74
+
75
+ ## Dataset
76
 
77
+ This model was trained on the [marioparreno/emojify-sft](https://huggingface.co/datasets/marioparreno/emojify-sft) dataset.
78
+
79
+ ### Dataset Statistics
80
+ - **Total Training Examples**: 2,023
81
+ - **Total Test Examples**: 505
82
+ - **Total Examples**: 2,528
83
+ - **Dataset Version**: `1b1ee9e`
84
+ - **Last Modified**: 2026-02-25
85
+ - **Full Commit SHA**: `1b1ee9efd92f1dbba4b3141e53b97e0d466981ba`
86
+
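As a quick sanity check, the split sizes add up and imply roughly an 80/20 train/test split:

```python
# Split sizes reported above in this card.
train, test = 2023, 505
total = train + test
print(total, round(test / total, 2))  # total examples and the test fraction
```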

## Example Predictions

Example predictions on the test set were logged to Weights & Biases during training.
To browse them, open the W&B run linked above and check the "eval/examples" table.

## Usage

```python
from unsloth import FastModel
from unsloth.chat_templates import get_chat_template

# Load the fine-tuned model
model, tokenizer = FastModel.from_pretrained(
    model_name="marioparreno/emojify-sft",
    max_seq_length=256,
    load_in_4bit=True,
)

# Set up the Gemma-3 chat template
tokenizer = get_chat_template(
    tokenizer,
    chat_template="gemma3",
)

# Prepare the input
messages = [
    {"role": "system", "content": "Translate this text to emoji:"},
    {"role": "user", "content": "I love programming in Python!"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# Generate
outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=32,
    temperature=1.0,
    top_p=0.95,
    top_k=64,
)

# Decode the response
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
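Note that the decoded response above contains the whole chat transcript, not just the model's emoji. If you want only the emoji characters, a rough stdlib-only filter like the hypothetical helper below can strip everything else; the codepoint ranges are a heuristic for common emoji blocks, not an exhaustive definition:

```python
def extract_emoji(text: str) -> str:
    """Keep only characters in common emoji codepoint blocks (rough heuristic)."""
    ranges = [
        (0x1F300, 0x1FAFF),  # misc symbols & pictographs, incl. supplemental
        (0x1F1E6, 0x1F1FF),  # regional indicator symbols (flags)
        (0x2600, 0x27BF),    # misc symbols and dingbats
        (0xFE0E, 0xFE0F),    # variation selectors
        (0x200D, 0x200D),    # zero-width joiner (emoji sequences)
    ]
    return "".join(
        ch for ch in text if any(lo <= ord(ch) <= hi for lo, hi in ranges)
    )

print(extract_emoji("model\nSure! Here you go: 🐍💻🔥"))  # keeps only the emoji
```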

## Training Configuration

```yaml
# Chat template parts (each marker ends with a newline)
instruction_part: "<start_of_turn>user\n"
response_part: "<start_of_turn>model\n"

# Evaluation
eval_strategy: "steps"
eval_steps: 50
logging_steps: 10
```
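To make the `instruction_part`/`response_part` markers concrete, here is a minimal sketch of how a Gemma-3-style conversation is framed. This is illustrative only: in practice `tokenizer.apply_chat_template` builds this string, and `train_on_responses_only` uses these markers to mask the loss so only the model's turn is trained on:

```python
def render_turn(role: str, content: str) -> str:
    # Gemma-3 turns are wrapped in <start_of_turn>/<end_of_turn> markers.
    return f"<start_of_turn>{role}\n{content}<end_of_turn>\n"

prompt = (
    render_turn("user", "Translate this text to emoji: I love pizza")
    + "<start_of_turn>model\n"  # the response_part marker opens the model's turn
)
print(prompt)
```

Everything up to and including `<start_of_turn>model\n` is the instruction context; the tokens generated (or supervised) after it are the response.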

## Model Card Authors

[Mario Parreño](https://maparla.es)

---

*This model card was automatically generated as part of the training pipeline.*