
# Coconut-Enhanced Qwen2.5-7B-Instruct

This model was trained with Coconut (Chain of Continuous Thought), a method that replaces explicit chain-of-thought text with reasoning in the model's continuous latent space.

## Base Model

- Base: [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)
- Method: Coconut (continuous latent space reasoning)
- Training: custom reasoning dataset with spaCy-segmented reasoning steps
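The segmentation step can be sketched as follows. The repo does not document its exact spaCy setup, so this sketch assumes a blank English pipeline with the rule-based `sentencizer` component (which needs no pretrained model download):

```python
import spacy

# Illustrative sketch: split a reasoning trace into per-sentence steps.
# The exact spaCy pipeline used for training is not documented; a blank
# English pipeline with the rule-based "sentencizer" is assumed here.
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")

trace = "First, compute 3 * 4 = 12. Then add 2 to get 14. In summary: 14."
steps = [sent.text for sent in nlp(trace).sents]
print(steps)
```

Each resulting sentence becomes one reasoning step, which is the unit the latent-token replacement operates on during training.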

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "agurung/coconut-qwen2.5-7b",
    torch_dtype="auto",   # load the BF16 weights directly instead of upcasting to FP32
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("agurung/coconut-qwen2.5-7b")

# Use like any other Qwen model
inputs = tokenizer("Your question here", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since the base model is instruction-tuned, prompts formatted with `tokenizer.apply_chat_template` will generally work better than raw text.

## Training Details

- Dataset: reasoning traces ending with an "In summary:" answer
- Method: progressive latent token replacement during training (staged curriculum)
- Latent tokens: 2 per reasoning step (`c_thought = 2`)
- Max reasoning stages: 2 (`max_latent_stage = 2`)
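The staged curriculum above can be sketched in plain Python. Everything below is an illustrative assumption — the latent-token names and the `build_stage_example` helper are hypothetical, not taken from this repo's training code:

```python
def build_stage_example(question, steps, answer, stage, c_thought=2, max_latent_stage=2):
    """Sketch of Coconut-style progressive latent replacement (assumed scheme).

    At training stage k, the first k reasoning steps are dropped and
    replaced by k * c_thought latent placeholder tokens; the remaining
    steps stay as ordinary text. Token names here are hypothetical.
    """
    k = min(stage, max_latent_stage, len(steps))
    latents = ["<|latent|>"] * (k * c_thought)
    parts = [question]
    if latents:
        parts += ["<|start-latent|>", *latents, "<|end-latent|>"]
    parts += steps[k:] + [answer]
    return " ".join(parts)


# Stage 0 keeps the full textual chain of thought; stage 2 replaces
# both steps with 2 * c_thought latent tokens.
ex0 = build_stage_example("Q: 2 + 3 * 4?", ["3 * 4 = 12.", "2 + 12 = 14."],
                          "In summary: 14.", stage=0)
ex2 = build_stage_example("Q: 2 + 3 * 4?", ["3 * 4 = 12.", "2 + 12 = 14."],
                          "In summary: 14.", stage=2)
```

Training then interleaves these stages so the model gradually learns to carry the dropped steps in its hidden states rather than in text.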

Extracted from checkpoint: `checkpoint_2`

Model size: 8B parameters, BF16 (safetensors).
