---
license: apache-2.0
base_model: unsloth/gemma-3-270m-it
tags:
- generated_from_trainer
- text-generation
- fine-tuned
- monostate
datasets:
- custom
language:
- en
library_name: transformers
pipeline_tag: text-generation
---

# monostate-model-4bacf3bb

This model is a fine-tuned version of [unsloth/gemma-3-270m-it](https://huggingface.co/unsloth/gemma-3-270m-it).

## Model Description

This model was fine-tuned using the Monostate training platform with LoRA (Low-Rank Adaptation) for parameter-efficient training.

## Training Details

### Training Data

- Dataset size: 162 samples
- Training type: Supervised Fine-Tuning (SFT)
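
Supervised fine-tuning on prompt-response pairs typically computes the loss only on the response tokens. A minimal sketch of that label-masking convention (illustrative token IDs; `-100` is the ignore index used by PyTorch's cross-entropy loss — the exact masking scheme used by the training platform is an assumption):

```python
# Build training labels for SFT: concatenate prompt and response IDs,
# then mask the prompt portion with the ignore index so the loss is
# computed only on the response tokens.
def build_sft_labels(prompt_ids, response_ids, ignore_index=-100):
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [ignore_index] * len(prompt_ids) + list(response_ids)
    return input_ids, labels

# Hypothetical token IDs for a prompt and its target response.
input_ids, labels = build_sft_labels([5, 12, 7], [42, 9])
print(input_ids)  # [5, 12, 7, 42, 9]
print(labels)     # [-100, -100, -100, 42, 9]
```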

### Training Procedure

#### Training Hyperparameters

- Training regime: Mixed precision (fp16)
- Optimizer: AdamW
- LoRA rank: 128
- LoRA alpha: 128
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
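
LoRA replaces a full weight update with a low-rank factorization ΔW = (α/r)·B·A, where B is d×r and A is r×k. With r = 128 and α = 128 as above, the scaling factor α/r is 1, and each adapted projection trains r·(d + k) parameters instead of d·k. A minimal sketch of that parameter arithmetic (the dimensions below are illustrative, not the actual Gemma projection shapes):

```python
# Trainable-parameter count for one d x k projection: full fine-tuning
# versus a rank-r LoRA adapter (B is d x r, A is r x k).
def lora_params(d, k, r):
    full = d * k          # dense weight update
    lora = r * (d + k)    # B plus A
    return full, lora

rank, alpha = 128, 128
scaling = alpha / rank    # 1.0 with this card's configuration
full, lora = lora_params(1024, 1024, rank)
print(scaling)            # 1.0
print(full, lora)         # 1048576 262144
```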

#### Training Results

- Final loss: 1.1255
- Training time: 0.71 minutes
- Generated on: 2025-09-08T18:13:01
## Usage |
|
|
|
|
|
```python |
|
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
import torch |
|
|
|
|
|
# Load model and tokenizer |
|
|
model = AutoModelForCausalLM.from_pretrained("andrewmonostate/monostate-model-4bacf3bb") |
|
|
tokenizer = AutoTokenizer.from_pretrained("andrewmonostate/monostate-model-4bacf3bb") |
|
|
|
|
|
# Generate text |
|
|
prompt = "Your prompt here" |
|
|
inputs = tokenizer(prompt, return_tensors="pt") |
|
|
|
|
|
with torch.no_grad(): |
|
|
outputs = model.generate( |
|
|
**inputs, |
|
|
max_new_tokens=256, |
|
|
temperature=0.7, |
|
|
do_sample=True, |
|
|
top_p=0.95, |
|
|
) |
|
|
|
|
|
response = tokenizer.decode(outputs[0], skip_special_tokens=True) |
|
|
print(response) |
|
|
``` |

## Framework Versions

- Transformers: 4.40+
- PyTorch: 2.0+
- Datasets: 2.0+
- Tokenizers: 0.19+

## License

This model is licensed under the Apache License 2.0.

## Citation

If you use this model, please cite:

```bibtex
@misc{andrewmonostate_monostate_model_4bacf3bb,
  title={monostate-model-4bacf3bb},
  author={Monostate},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/andrewmonostate/monostate-model-4bacf3bb}
}
```

## Training Platform

This model was trained using [Monostate](https://monostate.ai), an AI training and deployment platform.