idobrovolskyi committed 11 days ago

Commit 4e83dd4 (verified) · 1 parent: 577c994

sync README with paper-final numbers

Files changed (1)

README.md +62 -87


@@ -12,104 +12,79 @@ base_model: Qwen/Qwen3.5-27B
 # TorchSight Beam q4_K_M
-Cybersecurity document classifier. LoRA fine-tune of **Qwen 3.5 27B**, quantized to q4_K_M. ~17 GB GGUF.
-Recommended hardware: 32 GB.
-## Benchmark Results
-Two benchmarks evaluated under identical methodology
-(alpaca prompt, Ollama `/api/generate`, Modelfile temperature 0.1,
-`num_predict=2048`):
-### Primary — eval-1000-synthetic (1000 stratified samples)
-| Model | Category Acc 95% CI | Subcategory Acc | Type |
-|---|---|---|---|
-| **Beam q4_K_M** | **95.1%** [93.8, 96.4] | 48.5% | Local (LoRA) |
-| Beam f16 | 93.0% [91.2, 94.5] | 51.3% | Local (LoRA) |
-| Beam q8_0 | 92.7% [90.9, 94.2] | 51.3% | Local (LoRA) |
-| Claude Sonnet 4 | 79.9% | 23.0% | Commercial API |
-| Claude Opus 4 | 79.9% | 22.5% | Commercial API |
-| GPT-5 | 76.9% | 11.6% | Commercial API |
-| Gemini 2.5 Pro | 75.4% | 21.0% | Commercial API |
-| Regex baseline (49 patterns) | 52.7% | — | Rule-based |
-| Qwen 3.5 27B base (no LoRA) | 43.3% | 4.3% | Local |
-### External — eval-500-external (500 held-out samples from real public datasets)
-Held-out splits of training sources (NVD, NIST, AI4Privacy, Enron, phishing) plus
-MTSamples (medical transcriptions explicitly **excluded** from training).
-| Model | Category Acc 95% CI | Subcategory Acc | Δ vs. primary |
-|---|---|---|---|
-| **Beam q4_K_M** | **93.8%** [91.3, 95.6] | 51.4% | −1.3 pp |
-| Beam q8_0 | 91.2% [88.4, 93.4] | 46.4% | −1.5 pp |
-| Beam f16 | 91.0% [88.2, 93.2] | 47.2% | −2.0 pp |
-| Claude Sonnet 4 | 86.4% | — | +6.5 pp |
-| Gemini 2.5 Pro | 82.0% | — | +6.6 pp |
-| GPT-5 | 65.8% | — | −11.1 pp |
-| Regex baseline | 29.6% | — | −23.1 pp |
-| Qwen 3.5 27B base | 28.0% | 0% | −15.3 pp |
-Beam q4_K_M's gap over Claude Sonnet 4 is statistically significant
-(McNemar's χ²₁ = 126.7, p ≈ 2 × 10⁻²⁹), as is the gap over the
-unfine-tuned Qwen base (χ²₁ = 489.5, p ≈ 2 × 10⁻¹⁰⁸ — fine-tuning
-contributes +65.8 pp on external data with the identical prompt).
 ## Usage with Ollama
 ```bash
-# Pull from Ollama Hub
-ollama pull torchsight/beam:q4_K_M
-# Or build locally from this GGUF + Modelfile
-ollama create torchsight/beam:q4_K_M -f Modelfile
 ```
-Modelfile:
-```
-FROM ./beam-1.0-q4_K_M.gguf
-SYSTEM "You are TorchSight, a cybersecurity document classifier. Analyze the provided text and identify ALL security-relevant findings.
-For each finding, output a JSON object with:
-- category: one of [pii, credentials, financial, medical, confidential, malicious, safe]
-- subcategory: specific type (e.g., pii.identity, malicious.injection, credentials.api_key)
-- severity: one of [critical, high, medium, low, info]
-- explanation: detailed explanation including specific values found.
-If a document contains multiple types of sensitive data, return a finding for EACH one.
-If the text is clean/safe, output a single finding with category \"safe\".
-Respond ONLY with a JSON array of findings."
-PARAMETER temperature 0.1
-PARAMETER top_p 0.9
-PARAMETER num_predict 2048
-```
-## Reproducibility
-Eval scripts and benchmark data: <https://github.com/torchsight/torchsight/tree/main/beam/evaluation>
 ```bash
-git clone https://github.com/torchsight/torchsight
-cd torchsight/beam/evaluation
-BEAM_MODEL=torchsight/beam:q4_K_M python scripts/eval_beam.py     # primary
-BEAM_MODEL=torchsight/beam:q4_K_M python scripts/eval_external.py # external
 ```
-## Citation
-```bibtex
-@misc{torchsight-beam-q4_K_M-2026,
-  title  = {TorchSight Beam q4_K_M: cybersecurity document classifier},
-  author = {Dobrovolskyi, Ivan},
-  year   = {2026},
-  url    = {https://huggingface.co/torchsight/beam-q4_K_M},
-}
-```
 ## License
-Apache 2.0

 # TorchSight Beam q4_K_M
+Cybersecurity document classifier. LoRA fine-tune of **Qwen 3.5 27B**,
+quantized to q4_K_M. Approximately 17 GB GGUF.
+Recommended hardware: 32 GB unified memory (e.g. M-series Mac) or 24 GB GPU.
+This is the **default** quantization for the TorchSight system —
+released alongside:
+> Dobrovolskyi, I. *Security Document Classification with a Fine-Tuned Local
+> Large Language Model: Benchmark Data and an Open-Source System.* Journal of
+> Information Security and Applications, 2026.
+## Benchmark results
+Evaluated under identical methodology (alpaca prompt, Ollama `/api/generate`,
+temperature = 0, `num_predict = 2048`) on the companion dataset
+[`torchsight/cybersecurity-classification-benchmark`](https://huggingface.co/datasets/torchsight/cybersecurity-classification-benchmark).
+Canonical numbers live in that repo's `BENCHMARK_NUMBERS.md`.
+### Primary — eval-1000-synthetic (n = 1,000)
+| Model              | Type             | Cat. acc [95% CI]      | Subcat. acc |
+|---|---|---:|---:|
+| **Beam q4_K_M**    | Local (LoRA)     | **95.0%** [93.5, 96.2] | 48.2% |
+| Beam f16           | Local (LoRA)     | 93.2% [91.5, 94.6]     | 51.1% |
+| Beam q8_0          | Local (LoRA)     | 93.0% [91.2, 94.4]     | 51.4% |
+| Claude Sonnet 4    | Commercial API   | 79.9% [77.3, 82.3]     | 23.0% |
+| Claude Opus 4      | Commercial API   | 79.9% [77.3, 82.3]     | 22.5% |
+| GPT-5              | Commercial API   | 76.9% [74.2, 79.4]     | 11.6% |
+| Gemini 2.5 Pro     | Commercial API   | 75.4% [72.6, 78.0]     | 21.0% |
+| Qwen 3.5 27B base  | Local (no LoRA)  | 86.3% [84.0, 88.3]     | 19.0% |
+| Regex (48 patterns)| Rule-based       | 52.7% [49.6, 55.8]     | —     |
+95% confidence intervals are Wilson-score. Beam q4_K_M's advantage over every
+commercial baseline is significant under pairwise McNemar's tests after
+Bonferroni correction (α = 0.05).
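
Reviewer note: the pairwise testing described above can be sketched as follows. This is a minimal illustration, not the paper's evaluation code. It assumes the continuity-corrected form of McNemar's statistic and a family of four comparisons (one per commercial baseline in the table). The discordant-pair counts are hypothetical, since per-sample predictions are not part of this README.

```python
import math

def mcnemar(b: int, c: int, corrected: bool = True) -> tuple[float, float]:
    """McNemar's chi-square statistic (1 degree of freedom) and its p-value.

    b: samples model A classified correctly and model B did not.
    c: samples model B classified correctly and model A did not.
    """
    if b + c == 0:
        return 0.0, 1.0
    diff = abs(b - c) - (1 if corrected else 0)
    stat = max(diff, 0) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 d.o.f.: erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

ALPHA = 0.05
N_COMPARISONS = 4  # assumption: one test per commercial baseline in the table

# Hypothetical discordant-pair counts, not taken from the benchmark.
stat, p = mcnemar(b=180, c=29)
significant = p < ALPHA / N_COMPARISONS  # Bonferroni: compare against alpha / m
print(f"chi2 = {stat:.1f}, p = {p:.2e}, significant = {significant}")
```

Only the discordant pairs enter the statistic, so the test needs per-sample predictions for both models on the same inputs, not just the two accuracy figures.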
+### External — eval-500-external (n = 500)
+| Model              | Cat. acc [95% CI]      | Δ vs. primary |
+|---|---:|---:|
+| **Beam q4_K_M**    | **93.8%** [91.3, 95.6] | −1.2 pp |
+| Beam f16           | 91.2% [88.4, 93.4]     | −2.0 pp |
+| Beam q8_0          | 91.2% [88.4, 93.4]     | −1.8 pp |
+| Claude Sonnet 4    | 86.4% [83.1, 89.1]     | +6.5 pp |
+| Gemini 2.5 Pro     | 82.0% [78.4, 85.1]     | +6.6 pp |
+| Qwen 3.5 27B base  | 86.6% [83.3, 89.3]     | +0.3 pp |
+| GPT-5              | 65.8% [61.5, 69.8]     | −11.1 pp |
+| Regex baseline     | 29.6% [25.8, 33.7]     | −23.1 pp |
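
Reviewer note: the bracketed intervals in both tables are stated to be Wilson-score intervals. A minimal sketch of that calculation follows. The correct-answer counts (950 of 1,000 and 469 of 500) are assumptions back-computed from the rounded accuracies, not values published in this README.

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Return the Wilson-score interval for a binomial proportion (z = 1.96 for 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Beam q4_K_M: assumed 950/1000 on the synthetic set and 469/500 on the external set.
for correct, total in [(950, 1000), (469, 500)]:
    lo, hi = wilson_interval(correct, total)
    print(f"{correct / total:.1%} [{lo * 100:.1f}, {hi * 100:.1f}]")
# prints:
# 95.0% [93.5, 96.2]
# 93.8% [91.3, 95.6]
```

Under these assumed counts the output matches the Beam q4_K_M rows in both tables.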
 ## Usage with Ollama
 ```bash
+ollama pull torchsight/beam-q4_K_M
+ollama run torchsight/beam-q4_K_M
 ```
+Or via the [TorchSight CLI](https://github.com/IvanDobrovolsky/torchsight)
+for full document-scanning workflow:
 ```bash
+./install.sh
+torchsight /path/to/scan
 ```
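
Reviewer note: for scripted use, the decoding settings listed under the benchmark methodology (Ollama `/api/generate`, temperature 0, `num_predict` 2048) can be passed per request. The sketch below uses only the Python standard library. It assumes an Ollama server at its default local address and the model tag from the `ollama pull` command above. The prompt is a placeholder, because the benchmark's alpaca prompt template is not reproduced in this README.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumption: default local Ollama address

def build_request(text: str, model: str = "torchsight/beam-q4_K_M") -> dict:
    """Build a non-streaming /api/generate payload with the benchmark's decoding settings."""
    return {
        "model": model,
        "prompt": text,  # placeholder: the alpaca prompt template is not applied here
        "stream": False,
        "options": {"temperature": 0, "num_predict": 2048},
    }

def classify(text: str) -> str:
    """Send the request to a running Ollama server and return the raw model output."""
    body = json.dumps(build_request(text)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Inspect the payload without contacting a server.
    print(json.dumps(build_request("AWS_SECRET_ACCESS_KEY=example"), indent=2))
```

Calling `classify(...)` requires the model to be pulled and the server to be running. `build_request` alone is enough to check the payload shape.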
+## Training
+- Base: Qwen 3.5 27B (dense)
+- Method: LoRA (r = 128, α = 256), bf16, 5 epochs
+- Dataset: 78,358 balanced samples — see [`torchsight/beam-training-data`](https://huggingface.co/datasets/torchsight/beam-training-data)
+- Hardware: 8× NVIDIA A100 80GB SXM4, 10.5 hours
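
Reviewer note: a short sketch of what the listed hyperparameters imply, assuming the standard LoRA formulation, in which the low-rank update is scaled by α / r. The 4096 × 4096 projection size is hypothetical, since the README does not list the adapted modules or their dimensions. The GPU-hour figure assumes the 10.5 hours are wall-clock time across all eight GPUs.

```python
RANK = 128   # r, from the training summary
ALPHA = 256  # alpha, from the training summary

def lora_scaling(alpha: float, rank: int) -> float:
    """Scaling applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank

def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added to one d_out x d_in weight matrix (A: r x d_in, B: d_out x r)."""
    return rank * (d_in + d_out)

print(lora_scaling(ALPHA, RANK))      # 2.0
print(lora_params(4096, 4096, RANK))  # 1048576, for a hypothetical 4096 x 4096 projection
print(8 * 10.5)                       # 84.0 GPU-hours, assuming wall-clock hours on 8 GPUs
```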
 ## License
+Apache 2.0. The base model (Qwen 3.5 27B) carries its own license; consult
+upstream terms for use.