Instructions to use prithivMLmods/Gliese-4B-OSS-0410 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use prithivMLmods/Gliese-4B-OSS-0410 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="prithivMLmods/Gliese-4B-OSS-0410")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/Gliese-4B-OSS-0410")
model = AutoModelForCausalLM.from_pretrained("prithivMLmods/Gliese-4B-OSS-0410")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use prithivMLmods/Gliese-4B-OSS-0410 with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "prithivMLmods/Gliese-4B-OSS-0410"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "prithivMLmods/Gliese-4B-OSS-0410",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker
docker model run hf.co/prithivMLmods/Gliese-4B-OSS-0410
- SGLang
How to use prithivMLmods/Gliese-4B-OSS-0410 with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "prithivMLmods/Gliese-4B-OSS-0410" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "prithivMLmods/Gliese-4B-OSS-0410",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "prithivMLmods/Gliese-4B-OSS-0410" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "prithivMLmods/Gliese-4B-OSS-0410",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
- Docker Model Runner
How to use prithivMLmods/Gliese-4B-OSS-0410 with Docker Model Runner:
docker model run hf.co/prithivMLmods/Gliese-4B-OSS-0410
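The vLLM and SGLang servers above both expose an OpenAI-compatible `/v1/chat/completions` endpoint, so the curl calls can also be reproduced from Python. The following is a minimal sketch using only the standard library. The helper names `build_chat_request` and `ask` and the `max_tokens` default are illustrative assumptions, not part of either server's tooling.

```python
import json
import urllib.request

MODEL_ID = "prithivMLmods/Gliese-4B-OSS-0410"


def build_chat_request(prompt, base_url="http://localhost:8000", max_tokens=256):
    """Build (without sending) a POST request for /v1/chat/completions.

    base_url is assumed to match the server started above:
    port 8000 for the vLLM command, port 30000 for the SGLang command.
    """
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,  # illustrative default
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, base_url="http://localhost:8000"):
    """Send the request to a running server and return the reply text."""
    request = build_chat_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

With a server running, `ask("What is the capital of France?")` targets vLLM on port 8000, and passing `base_url="http://localhost:30000"` targets the SGLang server instead.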
Gliese-4B-OSS-0410
Gliese-4B-OSS-0410 is a reasoning-focused model fine-tuned from Qwen/Qwen3-4B-Thinking-2507 for enhanced reasoning and refined token probability distributions. It targets balanced multilingual generation across mathematics and general-purpose reasoning tasks. The model is fine-tuned on curated GPT-OSS synthetic dataset entries to improve its handling of structured reasoning, probabilistic inference, and multilingual tasks.
GGUF: https://huggingface.co/prithivMLmods/Gliese-4B-OSS-0410-GGUF
Key Features
- Enhanced Reasoning Precision: Refined token probability distributions improve reasoning quality and ensure balanced, context-aware outputs.
- Event Simulation and Logical Analysis: Capable of modeling random events, probability-driven reasoning, and structured decision-making with strong logical consistency.
- Multilingual Mathematical and General-Purpose Problem Solving: Delivers robust performance in mathematics, probability, and structured multilingual tasks, enabling broad applicability in research and education.
- Hybrid Symbolic–Probabilistic Thinking: Combines structured logic, probabilistic inference, and reasoning fluency to improve performance on uncertainty-driven tasks.
- Structured Output Generation: Generates well-formatted outputs in LaTeX, Markdown, JSON, CSV, and YAML, supporting technical workflows and data-oriented research.
- Optimized Lightweight Footprint: With 4B parameters, it runs efficiently on mid-range GPUs, offline clusters, and edge devices without compromising reasoning performance.
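To make the footprint claim concrete, the memory needed for the weights alone can be estimated as parameter count times bytes per parameter. The sketch below assumes a nominal 4 billion parameters (the exact count is not stated here) and ignores the KV cache, activations, and framework overhead, so real usage will be higher.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Approximate memory for model weights only, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Assumption: a nominal 4e9 parameters for a "4B" model.
NUM_PARAMS = 4e9

# Common storage precisions and their per-parameter size in bytes.
PRECISIONS = [
    ("float32", 4),
    ("bfloat16 / float16", 2),
    ("int8", 1),
    ("4-bit", 0.5),
]

for label, bytes_per_param in PRECISIONS:
    print(f"{label:>20}: {weight_memory_gb(NUM_PARAMS, bytes_per_param):.1f} GB")
```

At 16-bit precision this gives roughly 8 GB for the weights, which is why quantized variants such as the GGUF build are the usual choice for smaller GPUs and edge devices.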
Quickstart with Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Gliese-4B-OSS-0410"

# Load the model and tokenizer; dtype and device placement are chosen automatically
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Simulate the probability of rolling two dice and getting a sum greater than 9. Show the reasoning."
messages = [
    {"role": "system", "content": "You are a reasoning tutor skilled in probability, logic, and multilingual problem-solving."},
    {"role": "user", "content": prompt}
]

# Render the conversation with the model's chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Keep only the newly generated tokens by dropping the prompt tokens
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
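Because the base model is a Qwen3 "Thinking" variant, the decoded `response` may contain a reasoning trace followed by a closing `</think>` tag and then the final answer. Whether this fine-tune keeps that output format is an assumption, not something stated in this card. Under that assumption, a small hypothetical helper (`split_thinking` is not part of Transformers) can separate the two parts:

```python
def split_thinking(response, end_tag="</think>"):
    """Split a decoded response into (reasoning, answer).

    Assumes the reasoning trace ends with `end_tag`. If the tag is
    absent, the whole response is treated as the answer.
    """
    reasoning, found, answer = response.rpartition(end_tag)
    if not found:
        return "", response.strip()
    # An opening <think> tag may or may not be present in the output.
    reasoning = reasoning.strip().removeprefix("<think>").strip()
    return reasoning, answer.strip()


# Demonstration on a hand-written string, not real model output.
sample = "Count the favorable outcomes out of 36.</think>\n\nThe probability is 1/6."
reasoning, answer = split_thinking(sample)
print(answer)
```

If `max_new_tokens` is too small, generation can stop before the closing tag is produced. The helper then returns the entire text as the answer, so an empty reasoning string is a useful signal to raise the limit.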
Intended Use
- Balanced multilingual reasoning and probability modeling
- Event simulation, uncertainty analysis, and structured problem solving
- Educational and research-focused reasoning tasks
- Deployment in mid-resource environments with efficient inference
- Structured technical content and data format generation
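When the model is used for probability or event-simulation prompts such as the two-dice question in the Quickstart, it helps to have an exact reference value to compare the answer against. The sketch below computes that value by brute-force enumeration. It is an independent check, not a description of how the model reasons, and `prob_sum_greater_than` is an illustrative name.

```python
from fractions import Fraction
from itertools import product


def prob_sum_greater_than(threshold, sides=6):
    """Exact probability that two fair dice sum to more than `threshold`."""
    outcomes = list(product(range(1, sides + 1), repeat=2))
    favorable = [roll for roll in outcomes if sum(roll) > threshold]
    return Fraction(len(favorable), len(outcomes))


# Sums of 10, 11, and 12 occur in 3 + 2 + 1 = 6 of the 36 outcomes.
print(prob_sum_greater_than(9))  # 1/6
```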
Limitations
- Primarily focused on reasoning and mathematics; less suited for creative writing
- At 4B parameters, extremely complex multi-hop reasoning tasks may remain challenging
- Prioritizes structured reasoning and probabilistic accuracy over conversational tone
- May produce inconsistent results with very long contexts or cross-domain multi-document inputs
Model tree for prithivMLmods/Gliese-4B-OSS-0410
Base model: Qwen/Qwen3-4B-Thinking-2507