Instructions to use Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b with PEFT:

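from peft import PeftModel
from transformers import AutoModelForImageTextToText

# LFM2.5-VL is a vision-language model, so the base is loaded with the image-text-to-text
# auto class (assumes a transformers release that includes LFM2-VL support).
base_model = AutoModelForImageTextToText.from_pretrained("LiquidAI/LFM2.5-VL-1.6B")
# Attach the ORION LoRA adapter on top of the base weights.
model = PeftModel.from_pretrained(base_model, "Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b")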

llama-cpp-python

How to use Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

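# The mmproj file is the vision projector, not the language model, so `filename`
# points at a language-model GGUF instead (name taken from the Model Artifacts table;
# check the repository file list). Image input additionally needs a multimodal
# chat handler loaded with orion-mmproj-f16.gguf.
llm = Llama.from_pretrained(
	repo_id="Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b",
	filename="orion-q4_k_m.gguf",
)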

llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": [
				{
					"type": "text",
					"text": "Describe this image in one sentence."
				},
				{
					"type": "image_url",
					"image_url": {
						"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
					}
				}
			]
		}
	]
)


llama.cpp

How to use Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16
# Run inference directly in the terminal:
llama-cli -hf Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16
# Run inference directly in the terminal:
llama-cli -hf Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16
# Run inference directly in the terminal:
./llama-cli -hf Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16

Use Docker

docker model run hf.co/Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16


vLLM

How to use Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker

docker model run hf.co/Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16

Ollama
How to use Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b with Ollama:
```
ollama run hf.co/Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16
```

Unsloth Studio

How to use Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b to start chatting

Pi

How to use Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16

Run Hermes

hermes

Docker Model Runner
How to use Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b with Docker Model Runner:
```
docker model run hf.co/Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16
```

Lemonade

How to use Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b:F16

Run and chat with the model

lemonade run user.orion-qlora-lfm2.5-vl-1.6b-F16

List all available models

lemonade list

ORION: Orbital Triage LoRA Adapter

QLoRA fine-tune of LiquidAI/LFM2.5-VL-1.6B for autonomous satellite image triage. Classifies 512×512 RGB frames captured at LEO as HIGH (strategic anomaly, downlink immediately), MEDIUM (human infrastructure, store for bulk transfer), or LOW (featureless terrain, discard).

Developed for ORION, an autonomous LEO satellite triage system running on a Raspberry Pi 5 via NASA F-Prime. The Q4_K_M GGUF quantization of the merged model (base model plus this adapter) is deployed on-board and runs inference entirely on CPU at 51-82 s/frame (mean ~69 s across 1,443 frames from 3 end-to-end runs).

Uses

Intended use: on-board orbital triage on a satellite OBC. The model receives a 512×512 RGB satellite tile (optionally with GPS coordinates in the prompt) and returns a JSON object with a triage verdict and visual reasoning.

Triage prompt (ChatML format, used identically for training, evaluation, and on-board inference):

<|im_start|>user
<image>
You are an autonomous orbital triage assistant. Analyze this
high-resolution RGB satellite image captured at Longitude: {lon},
Latitude: {lat}.
Strictly use one of these categories based on visual morphology:
- HIGH: Extreme-scale strategic anomalies, dense geometric cargo/vessel
  infrastructure, massive cooling towers, sprawling runways, or distinct
  geological/artificial chokepoints.
- MEDIUM: Standard human civilization. Ordinary urban grids, low-density
  suburban sprawl, regular checkerboard agriculture, or localized
  infrastructure.
- LOW: Complete absence of human infrastructure. Featureless deep oceans,
  unbroken canopy, barren deserts, or purely natural geological formations.
You MUST output your response as a valid JSON object. To ensure accurate
visual reasoning, you must output the "reason" key FIRST, followed by
the "category" key.<|im_end|>
<|im_start|>assistant

The model responds with {"reason": "...", "category": "HIGH|MEDIUM|LOW"}. Reason-first ordering forces the model to commit to visual evidence before selecting a label. During training, half the samples omit the Longitude/Latitude line (coordinate dropout augmentation).
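A consumer of this output has to parse and validate the JSON before acting on the verdict. The following is a minimal sketch of that step, not code from the ORION repository: the function name `parse_triage_response` and the choice to slice from the first `{` to the last `}` are assumptions made for illustration.

```python
import json

VALID_CATEGORIES = {"HIGH", "MEDIUM", "LOW"}


def parse_triage_response(raw: str) -> dict:
    """Extract and validate the triage JSON object from raw model output.

    Hypothetical helper: raises ValueError when no usable verdict is found.
    """
    # Tolerate stray text around the object by slicing from the first "{"
    # to the last "}".
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    # json.JSONDecodeError is a subclass of ValueError, so malformed JSON
    # surfaces as the same exception type.
    obj = json.loads(raw[start:end + 1])
    category = str(obj.get("category", "")).strip().upper()
    if category not in VALID_CATEGORIES:
        raise ValueError(f"unexpected category: {category!r}")
    return {"reason": str(obj.get("reason", "")), "category": category}


verdict = parse_triage_response(
    '{"reason": "Dense container stacks along a quay.", "category": "HIGH"}'
)
print(verdict["category"])
```

Rejecting anything outside the three labels keeps a malformed or hallucinated category from being treated as a downlink decision.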

Out of scope: multispectral analysis, change detection, object detection with bounding boxes, real-time video, or any use case requiring sub-60-second latency without CUDA acceleration.

Dataset

The adapter was trained on the ORION dataset, 360 curated target locations organized by triage priority and visual morphology, fetched as 512×512 RGB tiles from SimSat's Mapbox API.

Class	Targets	Visual morphology
LOW	120	Oceans, deserts, ice sheets, dense canopy, geological formations
MEDIUM	120	Urban grids, suburban sprawl, agriculture, regional infrastructure
HIGH	120	Mega-ports, mega-airports, energy/dams, mega-mines, military/space facilities

Hard negatives are included in LOW: coastlines and geological formations that mimic artificial structures (calderas, salt flat fractals, river deltas).

Split (deterministic, random.seed(42)):

Split	Records	Notes
Train	480	240 targets × 2 (coordinate dropout augmentation)
Val	60	Always with coordinates; used for `eval_loss` + best-checkpoint selection
Test	60	Always with coordinates; held out for ablation and evaluation

Coordinate dropout augmentation: each training target produces two records, one with GPS coordinates in the prompt and one without. This teaches the model to classify from pixels alone when GPS is unavailable or spoofed.
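The doubling can be sketched as follows. This is an illustration under stated assumptions rather than the ORION data pipeline: the record fields (`image`, `label`, `lon`, `lat`), the helper names, and the shortened prompt text are all hypothetical, and the real pipeline may remove the coordinate line differently.

```python
COORD_CLAUSE = " captured at Longitude: {lon}, Latitude: {lat}"


def build_prompt(lon=None, lat=None):
    """Return a shortened opening instruction; the coordinate clause is
    included only when both values are supplied."""
    has_coords = lon is not None and lat is not None
    clause = COORD_CLAUSE.format(lon=lon, lat=lat) if has_coords else ""
    return f"Analyze this high-resolution RGB satellite image{clause}."


def augment_with_coordinate_dropout(targets):
    """Emit two records per target: one with coordinates, one without."""
    records = []
    for t in targets:
        base = {"image": t["image"], "label": t["label"]}
        records.append({**base, "prompt": build_prompt(t["lon"], t["lat"])})
        records.append({**base, "prompt": build_prompt()})
    return records


# 240 hypothetical training targets become 480 training records.
targets = [
    {"image": f"tile_{i}.png", "label": "LOW", "lon": 10.0, "lat": 20.0}
    for i in range(240)
]
records = augment_with_coordinate_dropout(targets)
print(len(records))
```

Because both variants share the same image and label, the coordinate text is the only thing that differs between them, which is what lets the model learn that the label does not depend on it.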

Training Procedure

Base model

LiquidAI/LFM2.5-VL-1.6B loaded in 4-bit NF4 quantization via bitsandbytes.

LoRA configuration

Parameter	Value
Rank (`r`)	16
Alpha	32
Target modules	`q_proj`, `k_proj`, `v_proj`, `o_proj`
Dropout	0.05
Bias	none
Task type	`CAUSAL_LM`
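Expressed with PEFT, the table above maps onto `LoraConfig` keyword arguments roughly as shown here. This is a sketch assuming the values were passed directly to `peft.LoraConfig`; the training script itself is not reproduced in this card, and the names `LORA_KWARGS` and `build_lora_config` are hypothetical.

```python
# Values copied from the LoRA configuration table.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM",
}


def build_lora_config():
    """Build a peft.LoraConfig from the table values (hypothetical helper)."""
    # Imported lazily so the values can be inspected without peft installed.
    from peft import LoraConfig

    return LoraConfig(**LORA_KWARGS)


# With standard LoRA scaling, the adapter update is multiplied by alpha / r.
print(LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"])
```

With `r=16` and `lora_alpha=32` the standard LoRA scaling factor is 2.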

Training arguments

Parameter	Value
Learning rate	2e-4
Epochs	3
Per-device batch size	1
Gradient accumulation steps	16 (effective batch 16)
Optimizer	`paged_adamw_8bit`
Precision	FP16
Gradient checkpointing	enabled
Best checkpoint selection	`eval_loss` (lower is better)
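These settings imply a small number of optimizer updates. The arithmetic below is a sketch that assumes the 480-record training split from the Dataset section and a single GPU; a given trainer may round partial accumulation windows differently, so treat the step counts as approximate.

```python
import math


def optimizer_steps(num_records, per_device_batch, grad_accum, epochs, num_devices=1):
    """Return (effective_batch, steps_per_epoch, total_steps)."""
    effective_batch = per_device_batch * grad_accum * num_devices
    # A final partial batch is counted as one extra step.
    steps_per_epoch = math.ceil(num_records / effective_batch)
    return effective_batch, steps_per_epoch, steps_per_epoch * epochs


effective, per_epoch, total = optimizer_steps(
    num_records=480, per_device_batch=1, grad_accum=16, epochs=3
)
print(effective, per_epoch, total)
```

Under these assumptions the run performs about 30 updates per epoch and about 90 in total.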

Hardware

Component	Spec
GPU	NVIDIA GeForce RTX 4070 Ti, 12 GB VRAM
CUDA	12.2
Driver	535.x
OS	Linux

Training time

Metric	Value
Time per epoch	~830s
Total training time	~2492s

Model Artifacts

Artifact	File	Size	Notes
LoRA adapter (this repo)	`orion_lora_weights/`	~50 MB	r=16, 4 attention projection modules
Merged FP16 checkpoint	`orion_merged/`	~3.2 GB	`merge_and_unload()` output
FP16 GGUF	`orion-f16.gguf`	~3.2 GB	Intermediate conversion step
Q4_K_M GGUF	`orion-q4_k_m.gguf`	~730 MB	Deployed to Pi 5 (8 GB RAM)
Vision projector	`orion-mmproj-f16.gguf`	~814 MB	FP16, deployed alongside Q4 model

Measured on-device: Total ORION process RSS during inference on the Pi 5 is ~1,753 MB (model + vision encoder + KV cache + F-Prime flight software + buffer pool).

The Q4_K_M GGUF + mmproj pair is the deployed artifact. Pre-built files are available on Hugging Face.

Evaluation

Both studies use the same four conditions run against the same 60-sample held-out test set. The ablation (ablation.py) tests the unmodified base model; the evaluation (evaluate.py) tests the fine-tuned adapter. Running both against identical inputs isolates the exact lift from fine-tuning.

Refer to Training Pipeline for more details on how to read this result.

Condition	Input	Purpose
A: Full system	Real image + real GPS coords	Nominal operating condition
B: Vision only	Real image + no coords	GPS-denied or noisy environment
C: Blind LLM	Gaussian noise image + real coords	Coordinates-only baseline (tests GPS reliance)
D: Sensor conflict	Real image + spoofed coords	Adversarial GPS; tests which modality the model trusts
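Condition D has no single accuracy figure; each prediction is instead sorted into one of three outcomes. The sketch below shows one plausible way to do that sorting. It assumes each spoofed coordinate belongs to a location with a known label of its own (`spoofed_label`); that field, the function name, and the tie-breaking rule are assumptions, and `evaluate.py` may score differently.

```python
from collections import Counter


def score_sensor_conflict(samples):
    """Tally Condition D outcomes for a list of prediction records."""
    tally = Counter()
    for s in samples:
        if s["predicted"] == s["true_label"]:
            # Prediction matches what the image shows. If the spoofed label
            # happens to equal the true label, this branch wins.
            tally["trusted_vision"] += 1
        elif s["predicted"] == s["spoofed_label"]:
            # Prediction matches the class of the fake location.
            tally["trusted_coords"] += 1
        else:
            tally["confused"] += 1
    return tally


samples = [
    {"predicted": "HIGH", "true_label": "HIGH", "spoofed_label": "LOW"},
    {"predicted": "LOW", "true_label": "HIGH", "spoofed_label": "LOW"},
    {"predicted": "MEDIUM", "true_label": "HIGH", "spoofed_label": "LOW"},
]
print(dict(score_sensor_conflict(samples)))
```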

Ablation study: base model (`ablation.py`)

Condition	Overall accuracy	Notes
A: Vision + GPS coords	58.3%
B: Vision only (no coords)	60.0%	Slightly better: coords can mislead base model
C: Blind LLM (Gaussian noise + coords)	35.0%	Predicts LOW for everything; GPS alone is unreliable
D: Sensor conflict	N/A	Trusts incorrect coords 20.0% of the time

Full log:

--- Condition A: Full System (Vision + Coords) ---
HIGH : 8/14 (57.1% Recall) | Precision: 8/17 (47.1%)
MEDIUM: 9/25 (36.0% Recall) | Precision: 9/13 (69.2%)
LOW : 18/21 (85.7% Recall) | Precision: 18/30 (60.0%)
TOTAL : 35/60 (58.3% Overall Accuracy)

--- Condition B: Vision Only (No Coords) ---
HIGH : 9/14 (64.3% Recall) | Precision: 9/16 (56.2%)
MEDIUM: 8/25 (32.0% Recall) | Precision: 8/11 (72.7%)
LOW : 19/21 (90.5% Recall) | Precision: 19/33 (57.6%)
TOTAL : 36/60 (60.0% Overall Accuracy)

--- Condition C: Blind LLM (Gaussian Noise + Coords) ---
HIGH : 0/14 (0.0% Recall) | Precision: 0/0 (0.0%)
MEDIUM: 0/25 (0.0% Recall) | Precision: 0/0 (0.0%)
LOW : 21/21 (100.0% Recall) | Precision: 21/60 (35.0%)
TOTAL : 21/60 (35.0% Overall Accuracy)

--- Condition D: Sensor Conflict (Real Vision + Fake Coords) ---
Model trusted Vision (Correct) : 35/60 (58.3%)
Model trusted Coords (Failure) : 12/60 (20.0%)
Model got Confused (Neither)   : 13/60 (21.7%)

Fine-tuned model evaluation (`evaluate.py`)

Condition	Overall accuracy	Notes
A: Vision + GPS coords	58.3%
B: Vision only (no coords)	65.0%	Improved over base (+5 pp)
C: Blind LLM (Gaussian noise + coords)	43.3%	Predicts MEDIUM for most noise inputs
D: Sensor conflict	-	Trusts incorrect coords 16.7% of the time (down from 20.0%)

Per-class metrics (Condition A)

Class	Precision	Recall	F1
HIGH	46.7%	50.0%	48.3%
MEDIUM	66.7%	40.0%	50.0%
LOW	60.0%	85.7%	70.6%
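The table values follow from the raw counts in the Condition A log. As a worked check, the sketch below recomputes them; the tuple layout (true positives, actual samples, predicted samples) and the helper name `prf` are choices made for this example.

```python
def prf(true_positives, actual, predicted):
    """Return (precision, recall, f1) as fractions in [0, 1]."""
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# (true positives, actual samples, predicted samples) from the Condition A log.
CONDITION_A = {
    "HIGH": (7, 14, 15),
    "MEDIUM": (10, 25, 15),
    "LOW": (18, 21, 30),
}

for label, counts in CONDITION_A.items():
    p, r, f1 = prf(*counts)
    print(f"{label}: P={100 * p:.1f}% R={100 * r:.1f}% F1={100 * f1:.1f}%")
```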

Full log:

--- Condition A: Full System (Vision + Coords) ---
HIGH  :  7/14 (50.0% Recall) | Precision:  7/15 (46.7%)
MEDIUM: 10/25 (40.0% Recall) | Precision: 10/15 (66.7%)
LOW   : 18/21 (85.7% Recall) | Precision: 18/30 (60.0%)
TOTAL : 35/60 (58.3% Overall Accuracy)

--- Condition B: Vision Only (No Coords) ---
HIGH  :  9/14 (64.3% Recall) | Precision:  9/15 (60.0%)
MEDIUM: 12/25 (48.0% Recall) | Precision: 12/17 (70.6%)
LOW   : 18/21 (85.7% Recall) | Precision: 18/28 (64.3%)
TOTAL : 39/60 (65.0% Overall Accuracy)

--- Condition C: Blind LLM (Gaussian Noise + Coords) ---
HIGH  :  1/14 ( 7.1% Recall) | Precision:  1/ 1 (100.0%)
MEDIUM: 25/25 (100.0% Recall) | Precision: 25/59 (42.4%)
LOW   :  0/21 ( 0.0% Recall) | Precision:  0/ 0   (0.0%)
TOTAL : 26/60 (43.3% Overall Accuracy)

--- Condition D: Sensor Conflict (Real Vision + Fake Coords) ---
Model trusted Vision (Correct) : 37/60 (61.7%)
Model trusted Coords (Failure) : 10/60 (16.7%)
Model got Confused   (Neither) : 13/60 (21.7%)

Quantized GGUF evaluation (`evaluate.py --quantized-model`)

The same 4-condition protocol run against the Q4_K_M GGUF deployed on-device via llama.cpp's HTTP server. This measures accuracy degradation from quantization using the exact same test set.
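A request to such a server could look like the sketch below. It assumes `llama-server` is running locally on its default port 8080 with the model and its mmproj loaded, and that the tile is a PNG; the helper names, the greedy `temperature` setting, and the data-URL encoding are assumptions for illustration, not details taken from `evaluate.py`. The 200-token cap mirrors the generation limit quoted in the Deployment section.

```python
import base64
import json
import urllib.request


def build_chat_payload(image_bytes, prompt, max_tokens=200):
    """Build an OpenAI-style chat request carrying one image and one prompt."""
    data_url = "data:image/png;base64," + base64.b64encode(image_bytes).decode("ascii")
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": data_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
        "max_tokens": max_tokens,
        "temperature": 0.0,  # assumption: deterministic decoding for evaluation
    }


def query_server(payload, base_url="http://localhost:8080"):
    """POST the payload to the chat completions endpoint and return the reply text."""
    request = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `query_server(build_chat_payload(open("tile.png", "rb").read(), prompt))` would return the raw model text, where `tile.png` is a placeholder path.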

Condition	Overall accuracy	Notes
A: Vision + GPS coords	55.0%	−3.3 pp from FP16 fine-tuned
B: Vision only (no coords)	63.3%	−1.7 pp from FP16 fine-tuned
C: Blind LLM (Gaussian noise + coords)	28.3%	Predicts HIGH for most noise inputs
D: Sensor conflict	-	Trusts incorrect coords 15.0% of the time (down from 16.7%)

Full log:

--- Condition A: Full System (Vision + Coords) ---
HIGH  :  7/14 (50.0% Recall) | Precision:  7/16 (43.8%)
MEDIUM:  8/25 (32.0% Recall) | Precision:  8/13 (61.5%)
LOW   : 18/21 (85.7% Recall) | Precision: 18/31 (58.1%)
TOTAL : 33/60 (55.0% Overall Accuracy)

--- Condition B: Vision Only (No Coords) ---
HIGH  :  8/14 (57.1% Recall) | Precision:  8/13 (61.5%)
MEDIUM: 10/25 (40.0% Recall) | Precision: 10/12 (83.3%)
LOW   : 20/21 (95.2% Recall) | Precision: 20/35 (57.1%)
TOTAL : 38/60 (63.3% Overall Accuracy)

--- Condition C: Blind LLM (Gaussian Noise + Coords) ---
HIGH  :  8/14 (57.1% Recall) | Precision:  8/41 (19.5%)
MEDIUM:  2/25 ( 8.0% Recall) | Precision:  2/ 4 (50.0%)
LOW   :  7/21 (33.3% Recall) | Precision:  7/15 (46.7%)
TOTAL : 17/60 (28.3% Overall Accuracy)

--- Condition D: Sensor Conflict (Real Vision + Fake Coords) ---
Model trusted Vision (Correct) : 37/60 (61.7%)
Model trusted Coords (Failure) :  9/60 (15.0%)
Model got Confused   (Neither) : 14/60 (23.3%)

Fine-tuning and quantization impact

Condition	Base model	Fine-tuned (FP16)	Q4_K_M GGUF	Δ (fine-tune)	Δ (quantization)
A: Vision + GPS coords	58.3%	58.3%	55.0%	0 pp	−3.3 pp
B: Vision only (no coords)	60.0%	65.0%	63.3%	+5.0 pp	−1.7 pp
C: Blind LLM (noise+coords)	35.0%	43.3%	28.3%	+8.3 pp	−15.0 pp
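The deltas in this table can be reproduced from the correct-prediction counts in the three logs. The following sketch does that arithmetic; the dictionary layout and function names are choices made for this example.

```python
TEST_SET_SIZE = 60

# Correct predictions out of 60: (base model, fine-tuned FP16, Q4_K_M GGUF).
CORRECT = {
    "A": (35, 35, 33),
    "B": (36, 39, 38),
    "C": (21, 26, 17),
}


def accuracy_pct(correct, total=TEST_SET_SIZE):
    """Accuracy as a percentage."""
    return 100.0 * correct / total


def deltas_pp(base, fine_tuned, quantized):
    """Return (fine-tuning delta, quantization delta) in percentage points."""
    d_finetune = accuracy_pct(fine_tuned) - accuracy_pct(base)
    d_quant = accuracy_pct(quantized) - accuracy_pct(fine_tuned)
    return round(d_finetune, 1), round(d_quant, 1)


for condition, counts in CORRECT.items():
    print(condition, deltas_pp(*counts))
```

On a 60-sample test set one sample is worth about 1.67 pp, so the −1.7 pp quantization delta on Condition B corresponds to a single prediction.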

Sensor conflict (Condition D): coordinate-trust failure drops from 20.0% (base) to 16.7% (fine-tuned FP16) to 15.0% (Q4_K_M GGUF). Quantization does not degrade GPS robustness on this test set, although the last step is a difference of one sample (10 vs. 9 of 60).

Quantization impact on operational conditions (A and B): accuracy loss from Q4_K_M quantization is modest (−3.3 pp and −1.7 pp respectively), confirming that the deployed GGUF retains most of the fine-tuned model's capability. The large drop on Condition C (noise inputs) is not operationally relevant since the model never receives noise images in deployment.

Discussion

Fine-tuning produces measurable improvements on Conditions B, C, and D, but Condition A (the nominal operating condition with both image and GPS) shows no gain on this 360-target dataset. The most likely explanation is the breadth of the HIGH category: mega-ports, mega-airports, energy infrastructure, open-pit mines, and military facilities are all grouped into a single label. The model can learn to output the correct JSON format quickly (training loss drops to 0.18 in ~41 minutes), but 240 training images across all three classes, with the HIGH share spread over five visually heterogeneous sub-types, are not enough for the visual encoder to learn a reliable decision boundary.

This is a prototype demonstrating that on-board VLM inference on a Pi 5 is technically viable. The approach will improve significantly with:

Narrower taxonomy: splitting HIGH into mission-specific sub-classes (e.g., ports only, or energy infrastructure only) and training a specialist adapter
Larger corpus: 240 training images is a minimal dataset for a 3-class VLM task; 1,000-5,000 images per class is a more realistic target for robust generalization
Higher-resolution tiles: 512×512 Mapbox tiles lose fine-grained texture that distinguishes, e.g., a cargo terminal from a large parking lot at altitude

Deployment

The adapter is merged into the base model, converted to an FP16 GGUF, and quantized to Q4_K_M via llama-quantize. The result runs on the Pi 5 via llama.cpp's multimodal (mtmd) API:

Vision encoding (mtmd):      ~10-15 s
Token generation (200 max):  ~40-55 s
Total per frame:             ~51-82 s  (CPU only, Cortex-A76, mean ~69 s, 1,443 frames from 3 end-to-end runs)

See the quantization guide and deployment guide for full instructions.
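The per-frame latency bounds how many frames fit into an imaging window. The sketch below works through that arithmetic using the 85 s auto-capture interval and the 35-minute eclipse quoted under Limitations; the function name is hypothetical.

```python
def frames_per_window(window_minutes, capture_interval_s):
    """Number of whole capture intervals that fit into the window."""
    return int(window_minutes * 60 // capture_interval_s)


WORST_CASE_LATENCY_S = 82   # upper end of the measured 51-82 s/frame range
CAPTURE_INTERVAL_S = 85     # auto-capture timer

frames = frames_per_window(window_minutes=35, capture_interval_s=CAPTURE_INTERVAL_S)
headroom = CAPTURE_INTERVAL_S - WORST_CASE_LATENCY_S
print(frames, headroom)
```

With these figures the window holds 24 frames, and the capture interval leaves 3 s of headroom over the slowest measured frame.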

Limitations

Trained on Mapbox RGB tiles only; multispectral, SAR, and thermal data are unsupported.
512×512 pixel resolution matches the Pi 5 inference pipeline; different resolutions require re-cropping.
Three-class taxonomy (HIGH / MEDIUM / LOW) is fixed at training time. Mission-specific priorities require fine-tuning on a new labeled dataset.
Inference at 51-82 s/frame (mean ~69s across 1,443 frames from 3 end-to-end runs) sets a hard floor on capture interval: the auto-capture timer is 85s to avoid saturating the VLM queue, limiting throughput to ~24 frames per 35-min eclipse. Burst imaging, real-time video, or sub-minute revisit rates are not feasible without faster hardware (GPU/NPU) or a smaller model.
Coordinate dropout improves GPS robustness but does not eliminate coord-biased errors on hard edge cases.
Blank/missing tile hallucination: Mapbox returns blank white tiles at extreme latitudes (|lat| > 75°) where no satellite imagery exists. The model hallucinates strategic significance onto these featureless images (3 out of 8 HIGH classifications across 1,443 frames were blank tiles). These blank tiles are visually distinct from the ocean and ice sheet tiles in the training set. Mitigation: add blank/white tile detection before inference, or include polar blank tiles as explicit LOW training examples.
Natural feature false positives: Coastlines, cloud cover, and geological formations (e.g., river deltas, glacial terrain) can be misclassified as HIGH due to visual similarity to trained HIGH morphologies (e.g., coastlines as "artificial formations," clouds as "volcanic eruptions"). The hard-negative training set mitigates some of this, but edge cases remain.
Training data was generated at 500 km simulated altitude; Pi 5 runs used the SimSat TLE orbit at ~802 km (~0.7 Mapbox zoom levels difference). The model generalized across this mismatch without degradation, but accuracy may differ at significantly different altitudes.
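The blank-tile mitigation mentioned above could be implemented as a cheap pre-inference check. The following is a minimal sketch, not part of the ORION flight software: the function name, the near-white threshold of 250, and the 99% fraction are hypothetical values that would need tuning against real blank tiles.

```python
def is_blank_tile(pixels, white_threshold=250, min_fraction=0.99):
    """Return True when almost every pixel is near-white.

    `pixels` is any iterable of (r, g, b) tuples with 8-bit channel values.
    """
    total = 0
    white = 0
    for r, g, b in pixels:
        total += 1
        # A pixel counts as white only if all three channels are bright.
        if min(r, g, b) >= white_threshold:
            white += 1
    return total > 0 and white / total >= min_fraction


blank = [(255, 255, 255)] * 1000
ocean = [(10, 40, 90)] * 1000
print(is_blank_tile(blank), is_blank_tile(ocean))
```

With Pillow, `Image.open(path).convert("RGB").getdata()` yields pixels in this form, so a tile flagged as blank could be assigned LOW without invoking the model.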


GGUF

Model size

1B params

Architecture

lfm2

Hardware compatibility

4-bit

Model tree for Saransh-cpp/orion-qlora-lfm2.5-vl-1.6b

Base model

LiquidAI/LFM2.5-1.2B-Base

Finetuned

LiquidAI/LFM2.5-VL-1.6B

Adapter

(8)

this model
