# gemma-3n-E4B-joke55

## Hyperparameters
| Hyperparameter | Value |
|---|---|
| Training Steps | 100 |
| Batch Size | 1 |
| Gradient Accumulation | 4 |
| Learning Rate | 0.0002 |
| LoRA Rank | 8 |
| Quantization | 4-bit (Training) / q8_0 (GGUF Export) |
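With a per-device batch size of 1 and 4 gradient-accumulation steps, each optimizer update effectively covers 4 examples, so the 100 training steps correspond to roughly 400 training examples (assuming each step in the table counts one optimizer update). A quick sketch of that arithmetic:

```python
# Effective batch size with gradient accumulation:
# weights are updated once per (batch_size * grad_accum) examples.
batch_size = 1
grad_accum = 4
train_steps = 100

effective_batch = batch_size * grad_accum   # examples per optimizer update
examples_seen = effective_batch * train_steps

print(effective_batch)  # 4
print(examples_seen)    # 400
```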
## Training Loss
The following chart illustrates the training loss curve, demonstrating the model's convergence over 100 steps:
## Example Output
Here is a sample generation from the model to demonstrate its capabilities:
**User Prompt:**

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

Instruction:
Compose a new joke in English about actors, entertainment, comedy involving actors, acting. Aim for humor that is clever, absurd, mild and based on clever twist and using ideas like status reversal, unexpected answer, absurd logic. Keep the tone conversational, deadpan, old-fashioned. Keep it self-contained in 4 sentences

Input:

Response:
```
**Model Response:**

> Why do actors always get the best seats at comedy shows?
> Because they're the only ones who know how to act.
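The prompt above follows the Alpaca-style instruction template commonly used in Unsloth fine-tuning notebooks. A minimal helper for assembling prompts in that shape (the exact template wording and section labels here are inferred from the sample prompt shown above, not taken from the training code):

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Assemble an Alpaca-style prompt matching the example in this card.

    Note: the header sentence and section labels are reconstructed from
    the sample prompt above; the actual training template may differ.
    """
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"Instruction:\n{instruction}\n\n"
        f"Input:\n{input_text}\n\n"
        "Response:\n"
    )

# Example usage with a shortened version of the joke instruction:
prompt = build_prompt("Compose a new joke in English about actors.")
print(prompt)
```

The model's generation is then appended after the trailing `Response:` line.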
Model fine-tuned by Mathieu-Thomas-JOSSET using Unsloth.
