How to use from the Transformers library
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Huggggooo/ProtoCycle-7B-SFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Huggggooo/ProtoCycle-7B-SFT")
model = AutoModelForCausalLM.from_pretrained("Huggggooo/ProtoCycle-7B-SFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

ProtoCycle-7B-SFT

Cold-start SFT checkpoint for ProtoCycle, an agentic protein design model trained to invoke biology tools (scaffold retrieval, constraint building, ESM inpainting, ProTrek scoring) via a <think> / <plan> / <tool_call> / <answer> protocol.

This checkpoint is the SFT stage initialised from Qwen/Qwen2.5-7B-Instruct and is the starting point for the subsequent RL stage (Huggggooo/ProtoCycle-7B).

  • Base model: Qwen/Qwen2.5-7B-Instruct
  • Training framework: VeRL / Open-AgentRL
  • Stage: multi-turn SFT on agentic tool-use trajectories
  • Epochs: 5
  • Sequence length: 32k (with Ulysses SP=4)
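As a rough illustration of what the sequence-length setting in the list above implies, the sketch below splits a 32k-token sequence across 4 Ulysses sequence-parallel ranks. It assumes that "32k" means 32,768 tokens and that the sequence is divided evenly across ranks; `tokens_per_rank` is a hypothetical helper written for this example, not a VeRL function.

```python
def tokens_per_rank(seq_len: int, sp_size: int) -> int:
    """Tokens each sequence-parallel rank holds outside the attention layers."""
    if seq_len % sp_size != 0:
        raise ValueError("sequence length must be divisible by the SP size")
    return seq_len // sp_size


SEQ_LEN = 32 * 1024  # assumption: "32k" = 32,768 tokens
SP_SIZE = 4          # Ulysses SP=4, from the list above

print(tokens_per_rank(SEQ_LEN, SP_SIZE))  # 8192
```

Under these assumptions each rank holds 8,192 tokens of every training sequence outside attention, which is what makes 32k-token trajectories fit on a 7B model.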

Training Data

2,000 agentic multi-turn trajectories for protein design, available at Huggggooo/ProtoCycle-Data (sft/ subset).
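One possible way to fetch only the SFT trajectories is sketched below using `snapshot_download` from `huggingface_hub`. It assumes that the subset lives in a top-level `sft/` folder of the dataset repository, as the path above suggests; the file format inside that folder is not specified here, and `download_sft_subset` and the `protocycle_sft` directory name are hypothetical choices for this example.

```python
def download_sft_subset(local_dir: str = "protocycle_sft") -> str:
    """Download only the sft/ files of the dataset repo and return the local path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="Huggggooo/ProtoCycle-Data",
        repo_type="dataset",
        allow_patterns=["sft/*"],  # assumption: the subset is a top-level sft/ folder
        local_dir=local_dir,
    )


# Usage (needs network access):
# path = download_sft_subset()
```

Restricting the download with `allow_patterns` avoids pulling any other subsets the dataset repository may contain.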

How to Use

See the ProtoCycle repository for usage instructions.

Agent Protocol

<think>  ... reasoning ...  </think>
<plan>   ... stage plan ...  </plan>
<tool_call>{"name": "...", "arguments": {...}}</tool_call>
...
<answer>MAEGEITPLKTF...</answer>
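A model turn in this format can be split into its tagged blocks with a small parser. The sketch below is a minimal illustration, not the ProtoCycle implementation: it assumes the tags are never nested and that each `<tool_call>` body is a single JSON object. `parse_turn` and the tool name `demo_tool` are hypothetical.

```python
import json
import re

# Matches any of the four protocol tags together with its closing tag.
TAG_RE = re.compile(r"<(think|plan|tool_call|answer)>(.*?)</\1>", re.DOTALL)


def parse_turn(text: str) -> list[tuple[str, object]]:
    """Return (tag, content) pairs in order; tool_call content is parsed as JSON."""
    blocks = []
    for tag, body in TAG_RE.findall(text):
        body = body.strip()
        if tag == "tool_call":
            blocks.append((tag, json.loads(body)))
        else:
            blocks.append((tag, body))
    return blocks


# "demo_tool" is a made-up tool name used only for this example.
example = (
    "<think>Need a scaffold first.</think>\n"
    "<plan>1. retrieve scaffold</plan>\n"
    '<tool_call>{"name": "demo_tool", "arguments": {"query": "kinase"}}</tool_call>'
)
for tag, content in parse_turn(example):
    print(tag, content)
```

An agent loop would typically dispatch each parsed `tool_call` to the matching tool and stop once an `answer` block appears; how tool results are fed back to the model is defined by the ProtoCycle repository and is not assumed here.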

License

Apache-2.0, consistent with the upstream VeRL / Open-AgentRL projects and the underlying Qwen2.5 license.

Citation

If you find this checkpoint useful, please cite the ProtoCycle paper (forthcoming) and the upstream frameworks it builds on: VeRL, Open-AgentRL, ProTrek and ESM.

Model size: 8B params (Safetensors), tensor type BF16