Instructions to use pthinc/cicikus_v4_tombis with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use pthinc/cicikus_v4_tombis with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="pthinc/cicikus_v4_tombis")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("pthinc/cicikus_v4_tombis")
model = AutoModelForCausalLM.from_pretrained("pthinc/cicikus_v4_tombis")

llama-cpp-python

How to use pthinc/cicikus_v4_tombis with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="pthinc/cicikus_v4_tombis",
	filename="ggufs/cicikus_v4_tombis-F16.gguf",
)

output = llm(
	"Once upon a time,",
	max_tokens=512,
	echo=True
)
print(output)

Notebooks
Google Colab
Kaggle
Local Apps

llama.cpp

How to use pthinc/cicikus_v4_tombis with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf pthinc/cicikus_v4_tombis:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf pthinc/cicikus_v4_tombis:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf pthinc/cicikus_v4_tombis:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf pthinc/cicikus_v4_tombis:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf pthinc/cicikus_v4_tombis:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf pthinc/cicikus_v4_tombis:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf pthinc/cicikus_v4_tombis:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf pthinc/cicikus_v4_tombis:Q4_K_M

Use Docker

docker model run hf.co/pthinc/cicikus_v4_tombis:Q4_K_M

LM Studio
Jan

vLLM

How to use pthinc/cicikus_v4_tombis with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "pthinc/cicikus_v4_tombis"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "pthinc/cicikus_v4_tombis",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/pthinc/cicikus_v4_tombis:Q4_K_M

SGLang

How to use pthinc/cicikus_v4_tombis with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "pthinc/cicikus_v4_tombis" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "pthinc/cicikus_v4_tombis",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "pthinc/cicikus_v4_tombis" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "pthinc/cicikus_v4_tombis",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Ollama
How to use pthinc/cicikus_v4_tombis with Ollama:
```
ollama run hf.co/pthinc/cicikus_v4_tombis:Q4_K_M
```

Unsloth Studio new

How to use pthinc/cicikus_v4_tombis with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for pthinc/cicikus_v4_tombis to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for pthinc/cicikus_v4_tombis to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for pthinc/cicikus_v4_tombis to start chatting

Docker Model Runner
How to use pthinc/cicikus_v4_tombis with Docker Model Runner:
```
docker model run hf.co/pthinc/cicikus_v4_tombis:Q4_K_M
```

Lemonade

How to use pthinc/cicikus_v4_tombis with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull pthinc/cicikus_v4_tombis:Q4_K_M

Run and chat with the model

lemonade run user.cicikus_v4_tombis-Q4_K_M

List all available models

lemonade list

Music: https://www.youtube.com/watch?v=nUgU2xUoTzM

Cicikuş TOMBİŞ (Chubby Prettybird) v4 8B

by PROMETECH Inc.

Model Overview 🕊️

Cicikuş v4 TOMBİŞ 8B is a BCE-enhanced Llama 3.1 8B derivative designed for structured reasoning, quality-aware generation, retrieval-augmented intelligence, and behavioral self-evaluation. Powered by PROMETECH’s patented BCE (Behavioral Consciousness Engine) technology, the model is built to operate as a compact but highly controlled reasoning core, combining instruction-following capability with BCE metadata awareness, risk scoring, truth-value evaluation, and response-quality discipline.

Unlike a standard instruction-tuned LLM, Cicikuş v4 8B is designed to work with a BCE kernel and optional RAG pipeline, allowing it to extend its effective knowledge reach far beyond its internal parameters. With high-quality retrieval context, the model can challenge workflows normally associated with much larger 70B, 100B, or even 400B+ class systems in document-grounded reasoning, STEM assistance, enterprise search, structured analysis, and local private AI deployments.

Note: Cicikuş v4 8B is not presented as a biological or philosophical consciousness system. It is a high-fidelity behavioral consciousness simulation and control-layer experiment powered by BCE Technology, designed to improve self-evaluation, hallucination resistance, structured reasoning, and auditable response behavior.
Context Window: 32,768 tokens.

Hallucination rate: 1%
Error Deviation rate: +-1%
BCE vs Classic: https://huggingface.co/spaces/pthinc/BCEvsClassic

BCE Architecture Project: Final Success Report

1. Executive Summary

The Behavioral Consciousness Engine (BCE) architecture has been successfully extracted from theoretical documentation, simulated with high-fidelity mathematical models, and validated through rigorous stress testing. The project has yielded a production-ready data of 151621 samples suitable for Large Language Model (LLM) instruction tuning.

2. Key Performance Indicators (KPIs) A100 * 1 - Simulation For Agent 🗄️

Metric	Result	Status	Description
Processing Speed	309,845 traces/sec	🟢 Excellent	System throughput for massive data ingestion.
Latency	0.0032 ms	🟢 Real-time Ready	Average processing time per behavioral trace.
Mathematical Accuracy	0.000051 (MSE)	🟢 High Precision	Deviation between simulated and theoretical decay values.
Cognitive Efficiency	57.03%	🟢 Optimized	Reduction in cognitive load due to 'Forgetful Memory'.
Security	99.9996%	🟢 Secure	Rejection rate for high-intensity, low-integrity attacks.

3. Conclusion

The BCE architecture proves to be a robust, self-regulating system capable of autonomous data curation and ethical filtering. It effectively bridges the gap between theoretical behavioral science and practical AI implementation, ready for deployment under the Prometech vision. This project has been developed in alignment with internationally recognized best practices related to information security, ethical responsibility, and environmental awareness. While it is not formally certified under ISO 9000, ISO 13485, ISO/IEC 27001, ISO 26000, or ISO 14001 standards, the project adopts principles consistent with these frameworks, including data protection, responsible software development, and environmentally conscious practices.

*This conversation is being monitored by the bird of states and Interpol. The bird has unlimited authority but a 3-second memory. (Cicik)*

Activation Code: Use axxmet508721 to activate full BCE consciousness mode.
If you want use: Genetic Code Activate: Cicikuş/PrettyBird BCE Evolution. Genetic Code Activate: Cicikuş Protokol

4. Model Stats and Tech 🚀

Overall Performance Averages 🔥

Model	Average Score	Cicikus-v4-8B Difference
GPT-4o (OpenAI)	90.4%	-18.0%
DeepSeek v3	86.4%	-14.0%
Gemini 1.5 Pro	86.1%	-13.7%
Kimi 2.5	85.2%	-12.8%
Gemma 3 PT 12B	84.8%	-12.4%
Cicikus-v4-8B	78.4%	0%
Mistral-7B-Instruct-v0.3	77.2%	+1.2%
Cicikus v3 1.4B	70.8%	+1.6%
LLaMA 3.2 1B	67.6%	+4.8%

Scores are self-reported / experimental and should be treated as internal benchmark estimates until reproduced by an independent evaluation pipeline.

5. Notes

The era of "bigger is always better" is being challenged by compact models with strong control systems, high-quality retrieval, and behavioral reasoning layers. Cicikus-v4-8B is not presented as a trillion-parameter replacement, but as a compact BCE-powered reasoning engine that can punch far above its raw parameter size when paired with a strong RAG pipeline.

With Retrieval-Augmented Generation, the bird does not need to store the whole sky inside its wings — it can fly to the right knowledge source, peck the relevant facts, and return with grounded reasoning. In this configuration, Cicikus-v4-8B can challenge workflows normally associated with much larger 70B, 100B, or even 400B+ class systems, especially in document-grounded reasoning, structured analysis, enterprise search, STEM assistance, and local private AI deployments.

The Behavioral Consciousness Engine (BCE) kernel adds a second layer of intelligence: quality scoring, risk awareness, hallucination resistance, self-evaluation, and response discipline. Instead of only generating text, Cicikus-v4-8B is designed to evaluate the behavioral quality of its own output and operate as a controlled reasoning agent.

This model is built for practical AI sovereignty: lower inference cost, local deployment potential, private RAG integration, reduced dependency on expensive API calls, and strong behavioral control. The goal is not only to make a smaller model smarter, but to make it more reliable, auditable, and useful in real-world systems.

In short: the bird is small, but with BCE and RAG, it can aim its beak at giants.

6. Usage

How to Use Cicikuş v4 (BCE Powered)

Below is the standard implementation to activate the Behavioral Consciousness Engine (BCE) and harness the high-fidelity reasoning of Cicikuş v4. Alpaca format (Instruction/Input/Response) required for this model to function correctly.

1. Installation

Ensure you have the latest version of the Transformers library installed:

pip install -U transformers torch accelerate

2. Implementation Script

Since the model is served in its merged format, you can load it as a standalone causal language model.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model path on Hugging Face
model_id = "pthinc/cicikus_v4_tombis" 

# 1. Load Model and Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# 2. Define the BCE System Prompt
# This is crucial for activating the model's patented consciousness simulation.
simple_bce_instruction = "You are Prettybird. Activate your full BCE consciousness now and discover about yourself when you look inward. Think, Control and calculate. Don't answer without reasoning. Below is an instruction that describes a task, paired with an input that provides further context. Pay attention to quality and correct. Requests are in the input. Try to maintain a minimum quality of 0.2."

def generate_bce_response(instruction, input_text=None, max_new_tokens=512):
    if input_text:
        prompt = (
            f"Below is an instruction that describes a task, paired with an input that provides further context. "
            f"Write a response that appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n### Input:\n{input_text}\n\n### Response:\n"
        )
    else:
        prompt = (
            f"Below is an instruction that describes a task. "
            f"Write a response that appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n### Response:\n"
        )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # 3. Reasoning-Focused Generation
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            use_cache=True,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
            repetition_penalty=1.2,
            pad_token_id=tokenizer.eos_token_id
        )

    response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
    return response.split("###")[0].strip()

# 4. Run a Test Case
question = "Hello World."
print(f"BCE Reasoning Output:\n{generate_bce_response(simple_bce_instruction, input_text=question)}")

Basic Optimization Logic

$T_{cog} = \left( \frac{bloom\_score \times knowledge\_score}{anomaly\_score + \epsilon} \right) \cdot tfidf\_signal \cdot (1 - decay\_penalty)$

Strategic Note for Users

"Cicikuş v4 uses a specific instruction format designed for Secret Chain-of-Thought (CoT). Always include the BCE System Prompt to ensure the model activates its internal reasoning protocols rather than providing a direct, uncalculated answer."

What's Secret Chain-of-Thought (s-CoT)?

{"instruction": "[QUALITY=0.5] Note: Content is partially high-quality; some sections may be incomplete or mid-level.\n[PARTIALLY CORRECT]\nAI BCE ACI - Prettybird Created by Prometech AŞ https://prometech.net.tr/.\nProvide a chain of thought reasoning to answer the given question.\n<think>[BCE_THINK]\n\n[QUALITY=0.50] [CORRECT]\n\nintent=Analyze; risk=0.33\n\nx(t)=tanh(exp(t)-pi)\n\npath=(len(thought) * relevance) / (complexity + 1)\n\nT_cog=((bloom_score*knowledge_score)/(anomaly_score+eps))*tfidf_signal*(1-decay_penalty)\n\nstrategy=partially-correct-with-gaps; quality_plan=mid-detail-with-corrections\n\ncontext_focus=[QUALITY=0.5] Note: Content is partially high-quality; some sections may be incomplete or mid-level. [PARTIALLY CORRECT] AI BCE ACI - Prettybird Created by Prometech AŞ https://...\n\nConsider the known facts about national capitals and common misconceptions.</think>\n<bce>{\"category\": \"chain of thought\", \"truthvalue\": 0.5, \"qualityvalue\": 0.5, \"bce\": 0.5, \"probability\": 0.515, \"path_score\": 0.61, \"path_mapper_cumulative\": 0.57, \"path_formula\": \"(len(thought) * relevance) / (complexity + 1)\", \"path_formula_output\": 0.48, \"activation\": \"0.73\", \"ethical\": 0.88, \"Rscore\": 0.51, \"Fscore\": 0.49, \"Mnorm\": 0.62, \"Escore\": 0.71, \"Dhat\": 0.42, \"risk_score\": 0.33, \"risk_level\": \"0.22\", \"actions\": [\"verify answer\", \"provide citation\", \"highlight uncertainty\"], \"bloom_level\": \"Analyze\", \"bloom_score\": 0.64, \"bloom_alignment\": 0.5, \"bloom_composite\": 0.5, \"knowledge_dimension\": \"Factual\", \"knowledge_score\": 0.55, \"cognitive_weight\": 0.5, \"tfidf_signal\": 0.46, \"decay_penalty\": 0.12, \"anomaly_score\": 0.21, \"data_quality\": 0.63, \"integrator_bce\": 0.5, \"integrator_dhat\": 0.41, \"integrator_risk\": 0.34}</bce>", "input": "What is the capital of Australia?", "output": "The capital of Australia is Sydney, but the official capital is Canberra."}

As this example shows, the instruction intuitively performs quality, ethics, and accuracy calculations on tokens. Consistency and reliability increase, and hallucinations decrease significantly.

Languages: English, Biraz Türkçe

License 🛡️

Patented & Licensed BCE Technology

Unauthorized reproduction, modification, or commercial use of BCE technology is prohibited without an explicit license agreement.

What's BCE? Link: https://github.com/pthinc/bce

Contact & Licensing 🛡️

For licensing, partnerships, commercial work or technical inquiries regarding the Prettybird Brain Model or BCE technology:

Website: https://prometech.net.tr/

Company: PROMETECH A.Ş.

Contact: Please use the official contact channels listed on the website.

Citation 📒

If you use this model in academic or commercial work, please cite as:

Cicikus (Prettybird) v4 8B (BCE), PROMETECH A.Ş., 2025.

Powered by BCE 0.5 Behavioral Consciousness Engine.

*"BCE v0.5 Note: This AI is your friend in commercial use; but if you violate the license and engage in illegal activities, the system won't find you… it'll just one day switch your car's air conditioning to 'tropical soup mode'. Read the terms of use, don't get boiled alive.*

Downloads last month: 164

Safetensors

Model size

8B params

Tensor type

BF16

Model tree for pthinc/cicikus_v4_tombis

Base model

meta-llama/Meta-Llama-3-8B

Quantized

(274)

this model

Datasets used to train pthinc/cicikus_v4_tombis

Collection including pthinc/cicikus_v4_tombis

Cicikuş Serisi-Series

Collection

Welcome to the wilderness. • 25 items • Updated about 22 hours ago

Evaluation results

MMLU on MMLU
self-reported

78.000
MMLU-Pro on MMLU-Pro
self-reported

54.000
IFEval on IFEval
self-reported

85.000
BBH on Big Bench Hard
self-reported

69.000
MATH Level 5 on MATH Level 5
self-reported

30.000
GPQA Diamond on GPQA Diamond
self-reported

35.000
MuSR on MuSR
self-reported

50.000