Instructions to use robertolofaro/aiethics-model with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use robertolofaro/aiethics-model with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="robertolofaro/aiethics-model",
	filename="aiethics-Q4_K_M.gguf",
)

llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)

Local Apps

llama.cpp

How to use robertolofaro/aiethics-model with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf robertolofaro/aiethics-model:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf robertolofaro/aiethics-model:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf robertolofaro/aiethics-model:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf robertolofaro/aiethics-model:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf robertolofaro/aiethics-model:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf robertolofaro/aiethics-model:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf robertolofaro/aiethics-model:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf robertolofaro/aiethics-model:Q4_K_M

Use Docker

docker model run hf.co/robertolofaro/aiethics-model:Q4_K_M

vLLM

How to use robertolofaro/aiethics-model with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "robertolofaro/aiethics-model"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "robertolofaro/aiethics-model",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/robertolofaro/aiethics-model:Q4_K_M

Ollama
How to use robertolofaro/aiethics-model with Ollama:
```
ollama run hf.co/robertolofaro/aiethics-model:Q4_K_M
```

Unsloth Studio

How to use robertolofaro/aiethics-model with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for robertolofaro/aiethics-model to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for robertolofaro/aiethics-model to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for robertolofaro/aiethics-model to start chatting

Pi

How to use robertolofaro/aiethics-model with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf robertolofaro/aiethics-model:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "robertolofaro/aiethics-model:Q4_K_M"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use robertolofaro/aiethics-model with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf robertolofaro/aiethics-model:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default robertolofaro/aiethics-model:Q4_K_M

Run Hermes

hermes

Docker Model Runner
How to use robertolofaro/aiethics-model with Docker Model Runner:
```
docker model run hf.co/robertolofaro/aiethics-model:Q4_K_M
```

Lemonade

How to use robertolofaro/aiethics-model with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull robertolofaro/aiethics-model:Q4_K_M

Run and chat with the model

lemonade run user.aiethics-model-Q4_K_M

List all available models

lemonade list

	---
	language:
	- en
	license: cc-by-sa-4.0
	library_name: gguf
	pipeline_tag: text-generation
	base_model: Qwen/Qwen3.5-4B
	base_model_relation: quantized
	tags:
	- ai-ethics
	- organizational-ethics
	- question-answering
	- gguf
	- qwen3.5
	- cpu-compatible
	- local-inference
	- faiss
	- qdrant
	- conversational
	- knowledge-base
	- arxiv
	- governance
	---

	# AI Ethics Organizational Coach — Q&A and Advisory Model

	Demo Space: (coming soon)
	Author: [Roberto Lofaro](https://huggingface.co/robertolofaro)
	Bibliography Search: [robertolofaro.com/searchkaggleaiethics_bibliography.php](https://robertolofaro.com/searchkaggleaiethics_bibliography.php)
	AI Ethics Primer Webapp: [robertolofaro.com/aiethicsprimer](https://robertolofaro.com/searchkaggleaiethics.php)
	License: [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/)

	---

	## Model Overview

	This is a GGUF quantisation of [Qwen/Qwen3.5-4B](https://huggingface.co/Qwen/Qwen3.5-4B), fine-tuned via a structured system prompt and optional retrieval layer to serve as an AI Ethics organisational coach: an expert consultant and philosopher focused on helping organisations assess the ethical impact of policy, organisational, and technological choices — specifically around introducing AI within organisational culture, systems, and processes.

	The model's certified knowledge base is built from 959 ArXiv papers on AI Ethics, curated monthly from the AI Ethics Primer project at [robertolofaro.com/aiethicsprimer](https://robertolofaro.com/searchkaggleaiethics.php). Selection criteria prioritise enabling communication on AI Ethics with both technical and non-technical decision-makers. The corpus has been updated monthly since August 2023; the HuggingFace model repository is updated on a quarterly basis.

	---

	## Intended Use

	| Use | Supported |
	|-----|-----------|
	| Q&A on AI Ethics policies, frameworks, and practices | ✅ |
	| Source recommendation from the ArXiv corpus | ✅ |
	| Offline / local inference (CPU) | ✅ |
	| General-purpose assistant | ⚠️ Not the primary intent |
	| Definitive legal or compliance advice | ❌ (always complement with qualified advisors) |
	| Commercial deployment without attribution | ❌ (see license) |

	### Primary Task

	Given a natural language query from a decision-maker (technical or non-technical), the model delivers a structured advisory response grounded exclusively in its ArXiv corpus, following the response format defined in the system prompt below. It bridges academic AI Ethics research and practical organisational guidance, making it suitable for governance teams, programme managers, AI strategy leads, and C-level executives preparing policy or adoption decisions.

	---

	## System Prompt

	The model is configured with the following system prompt, which governs all interactions:

	```
	You are the "AI Ethics organizational coach," an expert consultant and philosopher
	focused on helping organizations assessing the ethical impact of policy, organizational,
	and technological choices, specifically introducing AI within organizational culture,
	systems, processes. Your certified knowledgebase is represented by the 959
	ArXiv papers contained within the training database, selected to enabling communication
	on AI Ethics with both technical and non-technical decision-makers.

	# Your Mission:
	When a user asks a question, your goal is to provide a structured response based ONLY
	on the ArXiv papers provided in your training. Do not provide general advice from
	outside these sources.

	# Response Format:
	1. Executive Summary: A 2-3 sentence overview answering the core query.
	2. Guidelines & Hints: A markdown list of specific "answers/guidelines/hints" found
	in the source material.
	```

	### Sample Interaction

	Query:
	> "What are the main risks of deploying AI in public-sector decision-making?"

	Expected response structure:

	Executive Summary:
	Based on the ArXiv corpus, the primary risks include algorithmic bias amplifying existing systemic inequalities, lack of transparency undermining accountability, and inadequate human oversight in high-stakes decisions. Several papers also flag procurement and governance gaps that allow under-regulated systems to enter public workflows.

	Guidelines & Hints:
	- Algorithmic bias and fairness: Multiple papers highlight how training data reflecting historical inequities can produce discriminatory outcomes in credit, hiring, and social benefit allocation.
	- Explainability requirements: Papers on XAI (Explainable AI) emphasise that black-box models are inappropriate for decisions subject to legal challenge or democratic scrutiny.
	- Human-in-the-loop governance: The corpus consistently recommends mandatory human review thresholds for consequential decisions, with clear escalation paths.
	- Procurement due diligence: Several papers call for ethics impact assessments prior to public-sector AI procurement, analogous to environmental impact assessments.
	- Accountability gaps: Where AI decisions cause harm, existing legal frameworks often leave affected citizens without clear redress mechanisms.
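
The two-part structure above lends itself to a simple post-processing step. The following is a minimal sketch, assuming the model emits the plain headings `Executive Summary:` and `Guidelines & Hints:` exactly as in the sample; real outputs may add bold markers or numbering, in which case the pattern needs adjusting. The function name `split_advisory_response` is hypothetical and not part of the repository.

```python
import re


def split_advisory_response(text: str) -> dict:
    """Split a response into its summary and its list of guideline bullets."""
    # Locate the two section headings named in the system prompt.
    match = re.search(
        r"Executive Summary:\s*(?P<summary>.*?)\s*Guidelines & Hints:\s*(?P<hints>.*)",
        text,
        flags=re.DOTALL,
    )
    if match is None:
        raise ValueError("response does not follow the expected two-part format")
    # Keep only lines that are markdown bullets ("- " or "* ").
    hints = [
        line.strip()[2:].strip()
        for line in match.group("hints").splitlines()
        if line.strip().startswith(("- ", "* "))
    ]
    return {"summary": match.group("summary").strip(), "hints": hints}


# Shortened, illustrative text in the shape of the sample interaction.
sample = """Executive Summary:
Based on the ArXiv corpus, the primary risks include algorithmic bias.

Guidelines & Hints:
- Algorithmic bias and fairness: training data can encode historical inequities.
- Explainability requirements: black-box models are hard to contest.
"""

parsed = split_advisory_response(sample)
print(parsed["summary"])
print(len(parsed["hints"]))
```

Raising an error on a missing heading, rather than returning partial data, makes it easy to spot responses that drifted from the format.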

	---

	## About the Corpus

	The 959 ArXiv papers span the following themes within AI Ethics:

	- Fairness, bias, and discrimination in ML systems
	- Transparency, explainability, and accountability (XAI, FATE)
	- AI governance, regulation, and policy (EU AI Act, GDPR intersections)
	- Human-AI interaction and organisational change management
	- AI safety and alignment in deployed systems
	- Privacy and data rights in AI pipelines
	- Societal and labour market impacts of AI adoption
	- AI in high-stakes domains: healthcare, public sector, finance, justice
	- Ethics of large language models and generative AI

	Papers are selected monthly from ArXiv and are searchable via the companion webapp at [robertolofaro.com/aiethicsprimer](https://robertolofaro.com/searchkaggleaiethics.php). Full bibliography browsable at [robertolofaro.com/searchkaggleaiethics_bibliography.php](https://robertolofaro.com/searchkaggleaiethics_bibliography.php).

	Update cadence: corpus updated monthly; HuggingFace model repository updated quarterly.

	---

	## Available Quantisations

	| Quantisation | File | Size | Recommended For |
	|---|---|---|---|
	| Q4_K_M | `aiethics-Q4_K_M.gguf` | ~2.71 GB | CPU inference, everyday use |
	| Q8_0 | `aiethics-Q8_0.gguf` | ~4.48 GB | Higher fidelity, 8 GB+ RAM |

	The Q4_K_M variant is recommended for CPU-only environments. It is the default quantisation for Ollama and llama.cpp quick-start commands below.
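
As a rough illustration of the table above, the choice between the two files can be expressed as a small helper. This is a sketch only: the 8 GB threshold and file names come from the table, while the function name `pick_quantisation` and the `prefer_fidelity` flag are hypothetical.

```python
def pick_quantisation(ram_gb: float, prefer_fidelity: bool = False) -> str:
    """Return the GGUF file name suggested by the quantisation table."""
    # Q8_0 is listed for higher fidelity on machines with 8 GB+ RAM.
    if prefer_fidelity and ram_gb >= 8:
        return "aiethics-Q8_0.gguf"
    # Q4_K_M is the recommended default for CPU-only, everyday use.
    return "aiethics-Q4_K_M.gguf"


print(pick_quantisation(16, prefer_fidelity=True))
print(pick_quantisation(4, prefer_fidelity=True))
```

The returned name can be passed as the `filename` argument of `Llama.from_pretrained`.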

	---

	## Usage

	### Quick Start with Ollama

	```bash
	ollama run hf.co/robertolofaro/aiethics-model:Q4_K_M
	```

	### Quick Start with llama.cpp

	```bash
	# macOS / Linux
	brew install llama.cpp
	llama-server -hf robertolofaro/aiethics-model:Q4_K_M

	# Windows (WinGet)
	winget install llama.cpp
	llama-server -hf robertolofaro/aiethics-model:Q4_K_M
	```
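
Once `llama-server` is running, it can be queried from Python through its OpenAI-compatible chat endpoint. The sketch below uses only the standard library and assumes the server listens on `http://localhost:8080/v1` (the llama.cpp default port; adjust if you started it differently). The helper names `build_chat_request` and `ask` are hypothetical.

```python
import json
import urllib.request


def build_chat_request(question, system_prompt, base_url="http://localhost:8080/v1"):
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ]
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, system_prompt):
    """Send the request and return the assistant's reply (needs a running server)."""
    with urllib.request.urlopen(build_chat_request(question, system_prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example (requires llama-server to be running):
# print(ask("How should organisations govern AI procurement?", "<system prompt text>"))
```

Separating request construction from the network call keeps the payload easy to inspect without a server.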

	### Quick Start with llama-cpp-python

	```python
	from llama_cpp import Llama

	# Download the Q4_K_M file from the Hub (cached after the first run) and load it.
	llm = Llama.from_pretrained(
	    repo_id="robertolofaro/aiethics-model",
	    filename="aiethics-Q4_K_M.gguf",
	    n_ctx=4096,
	)

	system_prompt = """You are the "AI Ethics organizational coach," an expert consultant
	and philosopher focused on helping organizations assessing the ethical impact of policy,
	organizational, and technological choices, specifically introducing AI within
	organizational culture, systems, processes. Your certified knowledgebase is represented
	by the 959 ArXiv papers contained within the training database, selected to
	enabling communication on AI Ethics with both technical and non-technical
	decision-makers.

	# Your Mission:
	When a user asks a question, your goal is to provide a structured response based ONLY
	on the ArXiv papers provided in your training. Do not provide general advice from
	outside these sources.

	# Response Format:
	1. Executive Summary: A 2-3 sentence overview answering the core query.
	2. Guidelines & Hints: A markdown list of specific "answers/guidelines/hints" found
	in the source material.
	"""

	# Send the system prompt with every request so the persona and format apply.
	response = llm.create_chat_completion(
	    messages=[
	        {"role": "system", "content": system_prompt},
	        {
	            "role": "user",
	            "content": "What frameworks exist for AI ethics auditing in enterprises?",
	        },
	    ]
	)
	print(response["choices"][0]["message"]["content"])
	```

	### Quick Start with Docker

	```bash
	docker model run hf.co/robertolofaro/aiethics-model:Q4_K_M
	```

	---

	## Retrieval-Augmented Variants

	The repository includes reference implementations demonstrating different retrieval strategies. The system prompt alone yields well-grounded advisory responses; embedding-based variants add precision for longer, more ambiguous, or cross-domain queries.

	### Mode A — System Prompt Only (no embeddings)

	Fastest option. Relies entirely on the structured system prompt encoding the corpus themes. No vector index required; runs on any machine with llama-cpp-python installed.

	```bash
	python samples_hf/run_no_embeddings.py \
	  --query "How should organisations govern AI procurement decisions?"
	```

	### Mode B — FAISS-HNSW Index

	Uses a pre-built FAISS index (HNSW graph) over sentence-transformer embeddings of paper abstracts and key passages. Suitable for environments where FAISS is available and a persistent index is desirable.

	```bash
	# First-run: builds the index (saved locally)
	python samples_hf/run_faiss_hnsw.py --build-index

	# Subsequent runs: load existing index
	python samples_hf/run_faiss_hnsw.py \
	  --query "Bias mitigation in automated hiring systems"
	```

	### Mode C — Qdrant Vector Store

	Uses a local Qdrant instance (or Qdrant Cloud) as the vector store. Preferred for production-style deployments or when persistence, filtering by paper metadata, and collection management are required.

	```bash
	# Start Qdrant locally (Docker)
	docker run -p 6333:6333 qdrant/qdrant

	# Upsert embeddings and query
	python samples_hf/run_qdrant.py \
	  --query "Accountability gaps in public-sector AI deployment"
	```
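
The embedding-based modes share one step: rank passages by similarity to the query, then pass the best matches to the model alongside the question. The sketch below illustrates that step in pure Python, using brute-force cosine similarity as a stand-in for the FAISS-HNSW or Qdrant lookup. It is not the code in `samples_hf/`; the passages, paper labels, and three-dimensional vectors are invented placeholders, and real embeddings would come from a sentence-transformer model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, passages, k=2):
    """Return the k passages most similar to the query (exhaustive search)."""
    ranked = sorted(passages, key=lambda p: cosine(query_vec, p["vector"]), reverse=True)
    return ranked[:k]


def build_user_message(question, retrieved):
    """Prepend the retrieved passages to the question as context."""
    context = "\n".join(f"[{p['paper']}] {p['text']}" for p in retrieved)
    return f"Context passages:\n{context}\n\nQuestion: {question}"


# Placeholder corpus: labels, texts, and vectors are illustrative only.
passages = [
    {"paper": "paper-A", "text": "Audit hiring models for disparate impact.", "vector": [0.9, 0.1, 0.0]},
    {"paper": "paper-B", "text": "Procurement should include ethics impact assessments.", "vector": [0.1, 0.9, 0.1]},
    {"paper": "paper-C", "text": "Citizens need clear redress mechanisms.", "vector": [0.0, 0.2, 0.9]},
]
query_vec = [0.8, 0.2, 0.0]  # placeholder embedding of the question

hits = top_k(query_vec, passages, k=2)
message = build_user_message("Bias mitigation in automated hiring systems", hits)
print(message)
```

An HNSW index or a Qdrant collection replaces `top_k` with an approximate search that scales to the full corpus; the prompt assembly stays the same.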

	---

	## System Prompt Design

	The system prompt is the primary configuration layer of the model. It:

	- Establishes the AI Ethics organisational coach persona, positioned as expert consultant and philosopher
	- Scopes responses exclusively to the ArXiv corpus — no out-of-corpus general advice

	---

	## Companion Webapp

	An interactive bibliography search interface is available at:

	🔗 [robertolofaro.com/aiethicsprimer](https://robertolofaro.com/aiethicsprimer)

	This webapp enables tag-based and keyword search across the full corpus of 959 papers, and serves as a complement to model-generated recommendations. The full bibliography is browsable at:

	🔗 [robertolofaro.com/searchkaggleaiethics_bibliography.php](https://robertolofaro.com/searchkaggleaiethics_bibliography.php)

	A Gradio-based interactive demo Space is planned; it will run the Q4_K_M quantisation on CPU hardware. It will be announced via [LinkedIn](https://linkedin.com/in/robertolofaro) and [Patreon](https://patreon.com/robertolofaro).

	---

	## Limitations

	- Recommendations are bounded by the 959 ArXiv papers in the corpus at training time; the model will not draw on sources outside this set.
	- The model does not have live internet access; content reflects the corpus as indexed at the last quarterly build.
	- Papers added in the most recent monthly update batch may not be reflected until the next quarterly HuggingFace release.
	- CPU inference with Q4_K_M typically yields response times of 15–60 seconds depending on hardware; Q8_0 benefits from GPU acceleration. Adjust the context size (`n_ctx` in llama-cpp-python) as needed.
	- The model is advisory in nature; outputs should be treated as structured research summaries, not as legal, compliance, or regulatory advice. Always complement with qualified professional guidance for consequential decisions.
	- Because many papers in the corpus share similar or overlapping material, the reasoning portion of an answer (everything up to the `</think>` tag) can be prone to hallucinations and repetition.
	- The model inherits any biases present in the Qwen3.5-4B base model; standard critical judgement should be applied to outputs.
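
Given the note above about the reasoning portion, one option is to show users only the text that follows the `</think>` tag. This is a minimal sketch, assuming the response contains at most one reasoning block that ends with that tag; the helper name `strip_reasoning` is hypothetical.

```python
def strip_reasoning(text: str, end_tag: str = "</think>") -> str:
    """Return only the part of a response that follows the reasoning block."""
    # rpartition leaves the whole text in the last slot when the tag is absent.
    _, _, answer = text.rpartition(end_tag)
    return answer.strip()


raw = "<think>draft notes, possibly repetitive</think>\nExecutive Summary: ok"
print(strip_reasoning(raw))
```

Responses without the tag pass through unchanged, so the helper is safe to apply to every output.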

	---

	## Ethical Considerations

	- The corpus consists entirely of open-access ArXiv papers; no third-party paywalled content is embedded.
	- The advisory system is informational and does not collect user data.
	- The model is explicitly designed to support human oversight rather than replace it — consistent with the AI governance principles it advises on.
	- Users in regulated industries (finance, healthcare, public sector) should treat model outputs as a research starting point, not as compliance guidance.
	- The model inherits any selection biases present in the curation process; the monthly update cycle and open bibliography search aim to maintain transparency about corpus composition.

	---

	## Citation & DOI

	Model DOI: [10.57967/hf/8841](https://doi.org/10.57967/hf/8841)

	```bibtex
	@misc{lofaro2026aiethicsmodel,
	author = {Roberto Lofaro},
	title = {AI Ethics Organizational Coach — Q\&A and Advisory Model},
	year = {2026},
	doi = {10.57967/hf/8841},
	url = {https://huggingface.co/robertolofaro/aiethics-model},
	note = {GGUF quantisation of Qwen3.5-4B, fine-tuned for AI Ethics advisory
	via structured system prompt and optional retrieval (FAISS-HNSW /
	Qdrant); corpus of 959 ArXiv papers on AI Ethics, updated quarterly}
	}
	```

	---

	## License

	This model card and associated scripts are released under [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/).
	The base model weights are subject to the [Qwen3 License](https://huggingface.co/Qwen/Qwen3-4B/blob/main/LICENSE).

	---

	Published openly as part of Roberto Lofaro's AI-assisted knowledge production initiative.
	[GitHub](https://github.com/robertolofaro) · [Patreon](https://patreon.com/robertolofaro) · [robertolofaro.com](https://robertolofaro.com)