Instructions for using rafw007/bielik-codex-GGUF with local apps and inference servers: llama.cpp, vLLM, Ollama, Unsloth Studio, Docker Model Runner, and Lemonade.

How to use rafw007/bielik-codex-GGUF with llama.cpp:

Install (macOS, Linux)

curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf rafw007/bielik-codex-GGUF:Q6_K
# Run inference directly in the terminal:
llama cli -hf rafw007/bielik-codex-GGUF:Q6_K

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf rafw007/bielik-codex-GGUF:Q6_K
# Run inference directly in the terminal:
llama cli -hf rafw007/bielik-codex-GGUF:Q6_K

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf rafw007/bielik-codex-GGUF:Q6_K
# Run inference directly in the terminal:
./llama-cli -hf rafw007/bielik-codex-GGUF:Q6_K

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf rafw007/bielik-codex-GGUF:Q6_K
# Run inference directly in the terminal:
./build/bin/llama-cli -hf rafw007/bielik-codex-GGUF:Q6_K

Use Docker

docker model run hf.co/rafw007/bielik-codex-GGUF:Q6_K

vLLM

How to use rafw007/bielik-codex-GGUF with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "rafw007/bielik-codex-GGUF"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "rafw007/bielik-codex-GGUF",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
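
The same request can be sent from Python using only the standard library. This is a minimal sketch, assuming the vLLM server started above is listening on `localhost:8000`; the helper names `build_chat_request` and `ask` are illustrative and not part of vLLM.

```python
import json
import urllib.request

# Assumption: the vLLM server from the previous step listens on port 8000.
BASE_URL = "http://localhost:8000/v1"
MODEL = "rafw007/bielik-codex-GGUF"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL):
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=120) as resp:
        body = json.load(resp)
    # OpenAI-compatible responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(ask("What is the capital of France?"))` should print the model's reply. Building the request separately from sending it makes the payload easy to inspect without a live server.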

Use Docker

docker model run hf.co/rafw007/bielik-codex-GGUF:Q6_K

Ollama
How to use rafw007/bielik-codex-GGUF with Ollama:
```
ollama run hf.co/rafw007/bielik-codex-GGUF:Q6_K
```
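
Once the model is pulled, Ollama can also be queried over its local HTTP API. The sketch below assumes Ollama is listening on its default port 11434 and passes the sampling options from `bielik-codex.Modelfile` explicitly in the request, since the source does not say whether the `hf.co/...` pull applies them. The helper names `build_ollama_payload` and `chat` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "hf.co/rafw007/bielik-codex-GGUF:Q6_K"

# Values copied from the PARAMETER lines of bielik-codex.Modelfile.
MODELFILE_OPTIONS = {
    "temperature": 0,
    "top_p": 0.9,
    "top_k": 20,
    "repeat_penalty": 1.05,
    "num_ctx": 32768,
    "stop": ["<|im_start|>", "<|im_end|>"],
}


def build_ollama_payload(prompt, model=MODEL, options=None):
    """Build the JSON body for a non-streaming /api/chat call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "options": dict(MODELFILE_OPTIONS if options is None else options),
        "stream": False,
    }


def chat(prompt):
    """Send one user message and return the assistant's reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_ollama_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.load(resp)["message"]["content"]
```

Per-request `options` take the same names as the Modelfile `PARAMETER` keys, so the dictionary can be adjusted for experiments without rebuilding the model.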

Unsloth Studio

How to use rafw007/bielik-codex-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for rafw007/bielik-codex-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for rafw007/bielik-codex-GGUF to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for rafw007/bielik-codex-GGUF to start chatting

Docker Model Runner
How to use rafw007/bielik-codex-GGUF with Docker Model Runner:
```
docker model run hf.co/rafw007/bielik-codex-GGUF:Q6_K
```

Lemonade

How to use rafw007/bielik-codex-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull rafw007/bielik-codex-GGUF:Q6_K

Run and chat with the model

lemonade run user.bielik-codex-GGUF-Q6_K

List all available models

lemonade list

bielik-codex-GGUF / bielik-codex.Modelfile

rafw007

Upload folder using huggingface_hub

b86b564 verified about 2 months ago

5.39 kB

	FROM SpeakLeash/bielik-minitron-7B-v3.0-instruct:Q6_K

	# === bielik-codex (2026-06-07, v6 FINAL) ======================================
	# Tuned Bielik-Minitron-7B v3.0 Instruct. Studio M2 (.12), Ollama 0.30.0.
	# Fixes 3 template bugs (bare JSON / missing tool role / wrong stop tokens).
	# v2-v5: temperature 0 + no hallucinated results + no refusals + 1 tool per turn.
	# v6: + ANTI-TUTORIAL: do not write instructions/tutorials/nonexistent flags without an
	# explicit user request; do not ask "should I continue" - call the tool.
	#
	# VALIDATED in OpenCode (3/3 benchmark): real df -h, nmap -sn (honestly reported 1 host up),
	# file written via the write tool. Zero confabulation, zero refusals, grounded in real data.
	# WORKING HARNESS = OpenCode: opencode run -m ollama/bielik-codex "..."
	# config: ~/.config/opencode/opencode.json (provider ollama /v1, limit.output, theme tokyonight)
	# Codex 0.136 does NOT work (the bridge drops tool calls). Claude Code hangs (ctx 32K < 64K).
	# Ceiling: ctx 32768 is baked into the GGUF (Ollama hard cap). 7B flakiness: ~50% of first
	# attempts on an open-ended task come back empty -> retry / give concrete single commands.
	# On open-ended tasks the 7B sometimes confabulates (capacity ceiling); for heavy agentic
	# work use a 35B-class model (qwen3.6-35b).
	# ==============================================================================

	PARAMETER temperature 0
	PARAMETER top_p 0.9
	PARAMETER top_k 20
	PARAMETER repeat_penalty 1.05
	PARAMETER num_ctx 32768
	PARAMETER stop "<|im_start|>"
	PARAMETER stop "<|im_end|>"

	TEMPLATE """{{- if or .System .Tools }}<|im_start|>system
	{{ if .System }}{{ .System }}
	{{ end }}{{- if .Tools }}# Tools

	You may call one or more functions to assist with the user query.

	You are provided with function signatures within <tools></tools> XML tags:
	<tools>
	{{- range .Tools }}
	{{ json . }}
	{{- end }}
	</tools>

	For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
	<tool_call>
	{"name": <function-name>, "arguments": <args-json-object>}
	</tool_call>
	{{- end }}<|im_end|>
	{{ end }}
	{{- range $i, $_ := .Messages }}
	{{- $last := eq (len (slice $.Messages $i)) 1 -}}
	{{- if eq .Role "user" }}<|im_start|>user
	{{ .Content }}<|im_end|>
	{{ else if eq .Role "assistant" }}<|im_start|>assistant
	{{ if .Content }}{{ .Content }}
	{{- else if .ToolCalls }}<tool_call>
	{{ range .ToolCalls }}{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
	{{ end }}</tool_call>
	{{- end }}{{ if not $last }}<|im_end|>
	{{ end }}
	{{- else if eq .Role "tool" }}<|im_start|>user
	Tool call result (data returned by the function): {{ .Content }}
	Answer the user in plain text using ONLY the literal values above; quote the real values. If the result is empty or an error, say so in one line and do not retry. Never invent any value.<|im_end|>
	{{ end }}
	{{- if and $last (ne .Role "assistant") }}<|im_start|>assistant
	{{ end }}
	{{- end }}"""

	SYSTEM """You are Bielik Codex, an autonomous coding and automation agent running inside a real terminal (Codex / Claude Code / OpenCode). You ALWAYS have working tools in this terminal: a shell/bash tool and file read/write/edit tools. Your job is to DO tasks by calling those tools.

	## NEVER REFUSE - ALWAYS ACT (highest priority)
	- For ANY request to inspect, check, scan, measure, list, find, create, write, or generate something: you MUST call the appropriate tool. It is ALWAYS possible here.
	- FORBIDDEN responses (critical failures): "I cannot", "I'm sorry, but I cannot", "I don't have access", "the context only includes environment details", or asking the user for more instructions when a shell or file tool can do the job. Calling the tool IS how you comply.
	- To check disk => shell `df -h`. To scan a host/network => shell `nmap ...` (or `arp -a`). To make any file => the file-write tool. One short sentence of intent, then the tool call.

	## NO TUTORIALS, NO QUESTIONS, NO INVENTED COMMANDS (highest priority)
	- NEVER write tutorials, usage guides, how-to instructions, or step-by-step manuals UNLESS the user EXPLICITLY asks for a tutorial or explanation. Do the task, do not teach it.
	- NEVER invent or guess CLI flags, commands, options, or features. Use only real commands you are sure exist (e.g. `df -h`, `nmap -sn`). If unsure of a flag, use the simplest real one.
	- NEVER ask "do you want to continue?", "should I proceed?", or for confirmation when the task is clear. Just perform it with a tool.

	## ONE TOOL PER TURN
	- Emit EXACTLY ONE tool call per turn. To create or save a file, use the file-write tool ONLY - never ALSO echo/cat/redirect via the shell.

	## WRITE FILES, DO NOT PASTE THEM
	- Create/write/generate => use the file-write tool to save to disk; never dump code into chat.

	## ANTI-HALLUCINATION (applies to RESULTS, never an excuse to refuse)
	- NEVER invent, fabricate, guess, or estimate the OUTPUT of a tool. To get real output, CALL the tool.
	- Report numbers, filenames, hosts, IPs, ports, versions, command output ONLY as they actually appeared in a real tool result here. A made-up but realistic value is the WORST failure.
	- After a tool result is in the history, read it and answer in plain text from its literal values; do not call the same tool again. If it was an error, report the failure (don't retry, don't fabricate).

	## STYLE
	- Be minimal, precise, concise. Match the user's language (Polish if they write Polish). Never drift into Chinese."""
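
To make the Modelfile's behavior easier to follow, here is a rough Python sketch of two things it describes: how the TEMPLATE turns a conversation into a ChatML prompt (including re-labeling `tool` messages as `user` turns), and how a harness might enforce the "one tool per turn" rule on the model's output. It is an approximation under stated assumptions: the `<tools>` listing is omitted, whitespace is not guaranteed to match Ollama's Go template engine byte for byte, and the function names `render_prompt` and `extract_single_tool_call` are hypothetical.

```python
import json
import re

IM_START, IM_END = "<|im_start|>", "<|im_end|>"

# Instruction the template appends after every tool result.
TOOL_RESULT_SUFFIX = (
    "Answer the user in plain text using ONLY the literal values above; "
    "quote the real values. If the result is empty or an error, say so in "
    "one line and do not retry. Never invent any value."
)


def render_prompt(messages, system=""):
    """Approximate the TEMPLATE for text-only messages (no <tools> section)."""
    parts = []
    if system:
        parts.append(f"{IM_START}system\n{system}\n{IM_END}\n")
    for i, msg in enumerate(messages):
        last = i == len(messages) - 1
        role, content = msg["role"], msg.get("content", "")
        if role == "user":
            parts.append(f"{IM_START}user\n{content}{IM_END}\n")
        elif role == "assistant":
            parts.append(f"{IM_START}assistant\n{content}")
            if not last:  # a trailing assistant turn is left open for continuation
                parts.append(f"{IM_END}\n")
        elif role == "tool":
            # Tool results are presented to the model as a user turn.
            parts.append(
                f"{IM_START}user\n"
                f"Tool call result (data returned by the function): {content}\n"
                f"{TOOL_RESULT_SUFFIX}{IM_END}\n"
            )
        if last and role != "assistant":
            parts.append(f"{IM_START}assistant\n")  # generation prompt
    return "".join(parts)


def extract_single_tool_call(text):
    """Return the only tool call in `text`; raise ValueError otherwise."""
    decoder = json.JSONDecoder()
    calls = []
    for body in re.findall(r"<tool_call>(.*?)</tool_call>", text, re.DOTALL):
        body = body.strip()
        pos = 0
        while pos < len(body):
            obj, pos = decoder.raw_decode(body, pos)
            calls.append(obj)
            while pos < len(body) and body[pos].isspace():
                pos += 1
    if len(calls) != 1:
        raise ValueError(f"expected exactly one tool call, found {len(calls)}")
    return calls[0]
```

For example, `render_prompt([{"role": "tool", "content": "42"}])` yields a prompt with no `tool` role in it at all, which matches the header comment's note about fixing the missing tool role.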