---
license: llama3
language:
- en
library_name: gguf
tags:
- cybersecurity
- security-scanning
- workflow-automation
- p2p
- edge-ai
- llama-3
- lora
- gguf
base_model: meta-llama/Meta-Llama-3-8B-Instruct
pipeline_tag: text-generation
model-index:
- name: CoFlux AI
  results: []
---

# CoFlux AI – Edge Security & Workflow Model

CoFlux AI is a lightweight, privacy-first AI model designed for the P2P AI Bridge System, a serverless collaboration platform in which the host PC acts as the central hub. The model runs entirely on the host PC with zero cloud dependency, providing real-time security scanning and workflow automation inside P2P tunnels.

## Model Details

| Attribute | Value |
|---|---|
| Base model | Meta Llama 3 8B Instruct |
| Fine-tuning | LoRA (Low-Rank Adaptation), 3-stage training |
| Format | GGUF Q8_0 (8-bit quantization) |
| Size | ~8.54 GB |
| Parameters | 8B |
| Runtime | Host PC only (mobile devices send requests via P2P tunnel) |
| License | Llama 3 Community License |

## Intended Use

CoFlux AI is purpose-built for two core tasks within a P2P collaboration environment:

### 🔒 Security Scanning (Edge Security)

- Detect malicious payloads (code, text, binary) from connected peers
- Identify vulnerability patterns based on CVE data
- Score risk levels and enforce allow/deny decisions
- All scanning happens locally; no data leaves the host PC
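The scan-and-enforce step above can be sketched as a thin policy layer around the model's output. Everything in this sketch is illustrative: the 0–100 risk scale, the thresholds, and the `score_payload` stub (which stands in for a real CoFlux AI inference call) are assumptions, not part of the released system.

```python
# Hypothetical sketch of the allow/deny decision step: the scanner obtains
# a risk score for a peer payload, then enforces a threshold policy locally.
# Scale, thresholds, and the scoring stub are illustrative assumptions.

RISK_DENY_THRESHOLD = 70    # assumed policy: deny anything scored 70+
RISK_REVIEW_THRESHOLD = 40  # assumed policy: flag mid-range scores

def decide(risk_score: int) -> str:
    """Map a model-produced risk score (0-100) to an enforcement action."""
    if risk_score >= RISK_DENY_THRESHOLD:
        return "deny"
    if risk_score >= RISK_REVIEW_THRESHOLD:
        return "review"
    return "allow"

def score_payload(payload: str) -> int:
    """Stand-in for a CoFlux AI inference call; a real implementation
    would prompt the local GGUF model and parse a score from its reply."""
    dangerous_markers = ("os.system", "subprocess", "eval(", "exec(")
    return 90 if any(m in payload for m in dangerous_markers) else 10

action = decide(score_payload("subprocess.call(user_input, shell=True)"))
print(action)  # a shell-injection pattern scores high, so this prints "deny"
```

Keeping the policy in plain host-side code (rather than in the model's prose) means the allow/deny decision stays deterministic even when the model's wording varies.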

### ⚡ Workflow Automation

- Summarize collaboration messages and code reviews
- Auto-tag and classify incoming content
- Context-aware routing for task management
- Template-based document generation
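The auto-tagging task above can be sketched as a prompt-build/parse pair around the model. The tag set, prompt wording, and fallback behavior here are assumptions for illustration; the card does not specify them.

```python
# Illustrative sketch of auto-tagging: compose a single-label classification
# prompt for CoFlux AI and validate its reply. The tag vocabulary and the
# "task" fallback are assumptions, not the project's actual schema.

TAGS = ["code-review", "meeting-notes", "security-alert", "task"]

def build_tagging_prompt(content: str) -> str:
    """Compose the user message sent to the workflow module."""
    return (
        "Classify the following content with exactly one tag from "
        f"{', '.join(TAGS)}.\n\nContent:\n{content}\n\nTag:"
    )

def parse_tag(model_reply: str) -> str:
    """Accept the model's reply only if it is a known tag; otherwise fall
    back to 'task' so routing never breaks on a malformed answer."""
    tag = model_reply.strip().lower()
    return tag if tag in TAGS else "task"

prompt = build_tagging_prompt("Please review PR #42, it touches auth code.")
print(parse_tag("Code-Review"))  # prints "code-review"
```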

## Training Pipeline

CoFlux AI was trained in three stages with LoRA fine-tuning on a single 16 GB VRAM GPU:

### Stage 1: Security Domain Pretraining

- Dataset: Primus-Seed (Trend Micro AI Lab)
- Content: curated cybersecurity text from MITRE, Wikipedia, and security company websites
- Purpose: inject cybersecurity domain knowledge into the base model

### Stage 2: Instruction Tuning

- Datasets:
  - Primus-Instruct: expert-curated cybersecurity QA tasks
  - DetectVul/CVEFixes: statement-level Python vulnerability detection (21,571 functions, 7 vulnerability types)
- Purpose: teach the model to follow security scanning and workflow automation instructions

### Stage 3: Reasoning Distillation

- Dataset: Primus-Reasoning (distilled from o1-preview and DeepSeek-R1)
- Purpose: enhance multi-step security reasoning for complex threat analysis

## Training Configuration

- LoRA rank: 16
- LoRA alpha: 32
- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
- Quantization: 4-bit (NF4) during training
- Optimizer: `paged_adamw_8bit`
- Gradient checkpointing: enabled
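The hyperparameters above map directly onto a `peft.LoraConfig`. A minimal sketch of that mapping follows; the `lora_dropout` value and `task_type` are illustrative assumptions not stated in the card.

```python
# The training configuration above, expressed as keyword arguments for
# peft.LoraConfig. Field names follow the PEFT API; lora_dropout and
# task_type are assumed values, not taken from the card.
lora_kwargs = dict(
    r=16,               # LoRA rank
    lora_alpha=32,      # scaling factor (alpha / r = 2.0 here)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,  # assumed; not stated above
    task_type="CAUSAL_LM",
)

# With peft installed, this would become:
# from peft import LoraConfig
# config = LoraConfig(**lora_kwargs)
print(lora_kwargs["lora_alpha"] / lora_kwargs["r"])  # effective scaling: 2.0
```

Targeting all seven attention and MLP projection matrices, rather than only `q_proj`/`v_proj`, is the usual choice when the adapter must absorb new domain knowledge, at the cost of a somewhat larger adapter.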

## How to Use

### With llama.cpp

```bash
# Download the model
huggingface-cli download kimdonghwanAIengineer/coflux-ai-gguf --local-dir ./coflux-ai

# Run with llama.cpp
./llama-cli -m ./coflux-ai/coflux-ai-q8_0.gguf -p "<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a security scanning module for a P2P collaboration system.<|eot_id|><|start_header_id|>user<|end_header_id|>
Scan this code for vulnerabilities:
\`\`\`python
import subprocess
user_input = input()
subprocess.call(user_input, shell=True)
\`\`\`<|eot_id|><|start_header_id|>assistant<|end_header_id|>" -n 256
```

### With llama-cpp-python

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./coflux-ai-q8_0.gguf",
    n_ctx=2048,
    n_gpu_layers=-1,  # offload all layers to GPU; use 0 for CPU-only
)

# Security scanning
output = llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are a security scanning module for a P2P collaboration system. Analyze code for vulnerabilities and respond with a risk assessment."
        },
        {
            "role": "user",
            "content": "Scan this code:\n```python\nimport os\nos.system(input('cmd: '))\n```"
        }
    ],
    temperature=0.1,
    max_tokens=512,
)
print(output["choices"][0]["message"]["content"])

# Workflow: summarization
output = llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are a workflow automation module. Summarize the given content concisely."
        },
        {
            "role": "user",
            "content": "Summarize: The team discussed Q3 roadmap. Backend focuses on API optimization. Frontend is redesigning the dashboard. Security audit next week."
        }
    ],
    temperature=0.3,
    max_tokens=256,
)
print(output["choices"][0]["message"]["content"])
```

### With Transformers + PEFT (for further fine-tuning)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# Load the LoRA adapter (if published separately; note that the GGUF file
# itself cannot be loaded with PEFT)
# model = PeftModel.from_pretrained(base_model, "path/to/adapter")
```

## System Architecture

CoFlux AI operates within the P2P AI Bridge System:

```
┌──────────────────────────────────────────────────┐
│                  Host PC (Hub)                   │
│                                                  │
│  ┌──────────┐   ┌───────────┐   ┌─────────────┐  │
│  │  Rust    │   │ CoFlux AI │   │ TypeScript  │  │
│  │  Core    │ → │  (This    │ → │ AI Router   │  │
│  │ Security │   │   Model)  │   │ + Workflow  │  │
│  │  Scan    │   │           │   │             │  │
│  └──────────┘   └───────────┘   └─────────────┘  │
│       ↑              ↑               ↑           │
│       └──────── Tauri IPC ───────────┘           │
│                      ↑                           │
│              WebRTC DataChannel                  │
│            (P2P, no central server)              │
└──────────┬───────────┬───────────────────────────┘
           │           │
    ┌──────┴──┐  ┌─────┴───┐
    │ Mobile  │  │  Guest  │
    │ Client  │  │  Device │
    │ (React  │  │         │
    │ Native) │  │         │
    └─────────┘  └─────────┘
```

- Mobile and guest devices send requests via the P2P tunnel
- CoFlux AI processes all inference locally on the host PC
- Zero data leakage: nothing leaves the P2P network
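The hub-side flow (incoming peer request, then security scan or workflow handling by the local model) can be sketched as a small dispatch function. The task names, request shape, and system prompts below are illustrative; they are not the project's actual API.

```python
# Hypothetical host-side dispatch, mirroring the flow in the diagram:
# an incoming P2P request is mapped to the chat messages sent to the
# local CoFlux AI model. Field and task names are assumptions.

SYSTEM_PROMPTS = {
    "scan": "You are a security scanning module for a P2P collaboration system.",
    "summarize": "You are a workflow automation module. Summarize the given content concisely.",
}

def route_request(request: dict) -> dict:
    """Turn a peer request into the chat messages for local inference."""
    task = request.get("task")
    if task not in SYSTEM_PROMPTS:
        # Unknown tasks are rejected at the hub; nothing is forwarded.
        return {"status": "rejected", "reason": f"unknown task: {task}"}
    return {
        "status": "accepted",
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPTS[task]},
            {"role": "user", "content": request["content"]},
        ],
    }

print(route_request({"task": "summarize", "content": "Q3 roadmap notes"})["status"])
# prints "accepted"
```

Because the mapping lives on the host PC, a peer can only select from a fixed menu of tasks; it never supplies its own system prompt, which keeps prompt injection from reconfiguring the scanner.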

## Limitations

- **Model size:** 8B parameters limits complex generation tasks (e.g., full page creation). For advanced generation, the system routes to external APIs via BYOK (Bring Your Own Key).
- **Language:** primarily trained on English cybersecurity data; performance on other languages may vary.
- **Scope:** optimized for security scanning and workflow automation; not intended as a general-purpose chatbot.
- **Code coverage:** vulnerability detection is strongest for Python (CVEFixes training data); other languages rely on pattern-based detection from the Primus datasets.

## Privacy & Security

This model is designed with a privacy-first philosophy:

- Runs 100% locally on the host PC; no cloud API calls for inference
- All data stays within the P2P tunnel (WebRTC DataChannel)
- Part of a defense-in-depth security architecture with 5 protection layers
- Supports user-controlled privacy toggles (opt-out of conversation storage)

## Citation

If you use CoFlux AI in your research or project, please cite:

```bibtex
@misc{coflux-ai-2025,
  title={CoFlux AI: Edge Security and Workflow Model for P2P Collaboration},
  author={Kim Donghwan},
  year={2025},
  url={https://huggingface.co/kimdonghwanAIengineer/coflux-ai-gguf}
}
```

## Acknowledgments

- Meta AI for Llama 3
- Trend Micro AI Lab for the Primus cybersecurity dataset series
- DetectVul for the CVEFixes vulnerability dataset