---
language: en
license: gemma
base_model: google/gemma-3-4b-it
tags:
- slipstream
- inter-agent-protocol
- sft
- gemma-3
---
# gemma-3-4b-it-slipstream-sft

Gemma 3 4B IT fine-tuned on the [Slipstream-TQT dataset](https://huggingface.co/datasets/anthonym21/slipstream-tqt) to speak the Slipstream inter-agent protocol.
## Training

- **Base model**: `google/gemma-3-4b-it`
- **Method**: SFT with LoRA (r=8, alpha=16)
- **Dataset**: `anthonym21/slipstream-tqt`
- **Epochs**: 1
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("anthonym21/gemma-3-4b-it-slipstream-sft")
tokenizer = AutoTokenizer.from_pretrained("anthonym21/gemma-3-4b-it-slipstream-sft")

# Generate a SLIP message from a natural-language request
prompt = "Request a code review for PR #42"
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
## Next Steps

This model is stage 1 of a 3-stage pipeline:

1. **SFT** (this model) - learn the protocol format
2. **GRPO** - reinforcement-learning alignment via [slipstream-gov-env](https://huggingface.co/spaces) for safe usage
3. **Trim** - quantize/distill the aligned model