---
language: en
license: gemma
base_model: google/gemma-3-4b-it
tags:
- slipstream
- inter-agent-protocol
- sft
- gemma-3
---
# gemma-3-4b-it-slipstream-sft
Gemma 3 4B IT fine-tuned on the [Slipstream-TQT dataset](https://huggingface.co/datasets/anthonym21/slipstream-tqt) to speak the Slipstream inter-agent protocol.
## Training
- **Base model**: `google/gemma-3-4b-it`
- **Method**: SFT with LoRA (r=8, alpha=16)
- **Dataset**: `anthonym21/slipstream-tqt`
- **Epochs**: 1
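
The LoRA hyperparameters above can be expressed as a `peft` config. This is a sketch for reference only; `target_modules` is an assumption (the attention projections commonly adapted for Gemma-style models), as the card does not publish the full training config.

```python
from peft import LoraConfig

# r and lora_alpha match the Training section above.
# target_modules is an assumption, not confirmed by this card.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```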
## Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "anthonym21/gemma-3-4b-it-slipstream-sft",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("anthonym21/gemma-3-4b-it-slipstream-sft")

# Generate a SLIP message from a natural-language request
prompt = "Request a code review for PR #42"
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
## Next Steps
This model is stage 1 of a 3-stage pipeline:
1. **SFT** (this model) - Learn protocol format
2. **GRPO** - Reinforcement-learning alignment for safe usage via [slipstream-gov-env](https://huggingface.co/spaces)
3. **Trim** - Quantize/distill the aligned model