Gemma 4 E4B Opus 4.6 Reasoning

A PEFT LoRA adapter fine-tuned on top of google/gemma-4-e4b-it using the Crownelius/Opus-4.6-Reasoning-2100x-formatted dataset.

This adapter is optimized for:

  • structured step-by-step reasoning
  • logic puzzles
  • planning and decomposition
  • algorithm explanations
  • conceptual problem solving
  • code reasoning workflows

The strongest improvements are visible on:

  • multi-step logic puzzles
  • algorithm design explanations
  • state-tracking tasks
  • proof-style conceptual reasoning

Overall, gains are most pronounced on deliberate decomposition, planning, and educational reasoning prompts.

Base Model

  • google/gemma-4-e4b-it

Dataset

  • Crownelius/Opus-4.6-Reasoning-2100x-formatted

Training Setup

  • PEFT LoRA fine-tuning
  • 4-bit QLoRA loading
  • 2 training epochs
  • training max sequence length: 512 tokens
  • gradient accumulation: 16
  • trained on Google Colab T4
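The gradient accumulation figure above implies an effective batch size. A minimal sketch of the arithmetic; note the per-device batch size is not stated in this card, so a value of 1 is assumed here as a typical choice for QLoRA on a single T4:

```python
# Effective batch size implied by the training setup.
per_device_batch_size = 1          # assumption, not stated in the card
gradient_accumulation_steps = 16   # from the training setup above

# Gradients are accumulated over 16 forward/backward passes before
# each optimizer step, so one step sees 16 examples.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # → 16
```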

Training Metrics

  • training loss: 192.38
  • validation loss: 11.95
  • entropy: 3.91
  • mean token accuracy: 0.0462
  • train runtime: 5783 seconds
  • train rows: 2010
  • validation rows: 106
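The runtime and row counts above imply a rough training throughput. A quick sanity check using only the numbers reported in this card:

```python
# Rough throughput implied by the reported metrics.
train_rows = 2010   # train rows
epochs = 2          # training epochs
runtime_s = 5783    # train runtime in seconds

rows_seen = train_rows * epochs          # total examples processed
rows_per_second = rows_seen / runtime_s  # ≈ 0.70 examples/s
seconds_per_epoch = runtime_s / epochs   # ≈ 2891.5 s per epoch

print(round(rows_per_second, 2))  # → 0.7
```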

Example Use

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Load the base model in 4-bit, matching the QLoRA training setup
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-e4b-it",
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the LoRA adapter to the base model
model = PeftModel.from_pretrained(
    base_model,
    "krishnamraja13/gemma-4-e4b-opus46-reasoning",
)

tokenizer = AutoTokenizer.from_pretrained(
    "krishnamraja13/gemma-4-e4b-opus46-reasoning"
)

prompt = "Explain binary search step by step."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Requirements

Use a recent version of transformers with Gemma 4 support.

pip install -U transformers peft accelerate bitsandbytes
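To confirm the stack is present before loading the model, a small standard-library check can be used (package names taken from the pip command above):

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(pkg):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return None

for pkg in ["transformers", "peft", "accelerate", "bitsandbytes"]:
    print(pkg, installed_version(pkg) or "not installed")
```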

Known Strengths

This adapter performs best on:

  • logic riddles
  • switch / state puzzles
  • recursive explanation prompts
  • dynamic programming intuition
  • binary search reasoning
  • linked list cycle detection explanations
  • proof-style educational prompts
  • intermediate reasoning scaffolds and invariant-based explanations

Known Limitations

The adapter is stronger at:

  • structured reasoning
  • decomposition
  • planning
  • conceptual explanation

than it is at strict symbolic-algebra fidelity.

For exact equation solving, outputs may sometimes over-interpret terse symbolic prompts.

License

This adapter is a derivative of Gemma 4 and follows the Gemma license terms.
