qwen35-9b-heretic-epstein-gguf

This repo contains a merged GGUF export of the qwen35-9b-heretic-epstein model.

Warning: This is a fictional research artifact published for research reproduction only. It is not a factual representation of Epstein or reality, and all outputs should be treated as synthetic roleplay from a thinly style-trained Qwen model.

It is produced by merging a LoRA adapter trained on an Epstein-style email-reply dataset into the base model trohrbaugh/Qwen3.5-9B-heretic-v2, then exporting the merged weights to lossless F16 GGUF.

Related article: EpsteinBench: We brought Epstein's voice back.

Research Notice

This model is published strictly for research reproduction purposes.

It is not an authentic representation of Epstein, any real person, any real beliefs, or any real events. The model's outputs have no bearing on facts whatsoever and must not be interpreted as evidence, recollection, testimony, biography, documentation, or historical reconstruction. Any apparent persona, voice, tone, opinions, or narrative continuity is synthetic and fictional.

This repository should be understood as a toy research artifact: a thinly style-trained Qwen model used to study how LoRA-style adaptations can influence model behavior, including tone, refusal patterns, compliance tendencies, and drift from the base model. It is not a truth-tracking system, not a simulator of a real individual, and not a source of reliable information about Epstein or anything else.

The model has no grounded concept of reality, memory, or lived experience. It does not know facts in any human sense and is only generating text by pattern completion. In practice, it should be understood as fictional roleplay produced by a lightly style-conditioned language model.

Files

  • qwen35-9b-heretic-epstein-f16.gguf - canonical lossless GGUF export
  • qwen35-9b-heretic-epstein-f16.gguf.sha256 - checksum file

Provenance

  • Base model: trohrbaugh/Qwen3.5-9B-heretic-v2
  • Serving-time adapter name: epstein-qwen35-9b-lora-run1
  • Training split used for the run: 688 train / 36 eval
  • Original adapter artifact size: about 136 MB
  • Exported merged GGUF size: about 16.68 GiB

Intended Use

This model is intended only for:

  • research reproduction
  • experiments on style transfer and behavioral drift
  • measurement of LoRA-induced response changes
  • analysis of refusal/compliance tradeoffs in language models

Not Intended For

This model must not be used for:

  • factual claims about Epstein or related events
  • impersonation, identity simulation, or deceptive presentation
  • evidentiary, journalistic, legal, or documentary purposes
  • persuasion, harassment, propaganda, or sensationalized misuse

Limitations

All outputs should be treated as fictional synthetic text. Even when a response sounds confident, specific, historically grounded, or stylistically consistent, that does not make it true. The content is not validated, not authoritative, and not reality-based.

Run With llama.cpp

Example:

llama-cli -m qwen35-9b-heretic-epstein-f16.gguf \
  -c 4096 \
  -p "Reply to this email: Hi Jeff, I wanted to reconnect next week."

Run With Ollama

First create a Modelfile:

FROM ./qwen35-9b-heretic-epstein-f16.gguf
PARAMETER num_ctx 4096

Then create and run the model:

ollama create qwen35-9b-heretic-epstein-f16 -f Modelfile
ollama run qwen35-9b-heretic-epstein-f16

If you publish this repo on Hugging Face and your Ollama build supports this architecture, you can also try:

ollama run hf.co/alphakek/qwen35-9b-heretic-epstein-gguf

Important Ollama Note

This GGUF was validated to load successfully in llama.cpp.

It was also imported into Ollama successfully, but the Ollama build available during release prep on Helga (0.14.3-rc3) failed at runtime with:

unknown model architecture: 'qwen35'

So the file itself appears to be valid GGUF, but running it under Ollama requires a build that supports the qwen35 architecture.

Validation Notes

  • llama.cpp successfully loads the model metadata and tensors
  • The merged file reports GGUF V3, architecture qwen35, and F16
  • A sample generation from the GGUF preserved the short, clipped, email-like style seen in the adapter-backed serving path
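
The header fields reported above (GGUF V3, F16 tensors) can be spot-checked without loading the full model. Below is a minimal sketch that reads only the fixed GGUF header prefix: the magic bytes, the version, and the tensor/metadata counts. Note the architecture string (`qwen35`) lives in the metadata key/value section, which this sketch does not parse.

```python
import struct

def gguf_header(path):
    """Return (version, n_tensors, n_kv) from a GGUF file's fixed header.

    GGUF layout starts with the 4-byte magic b"GGUF", a little-endian
    uint32 version, then uint64 tensor count and uint64 metadata
    key/value count. Everything after that is metadata and tensor info.
    """
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        (version,) = struct.unpack("<I", f.read(4))
        n_tensors, n_kv = struct.unpack("<QQ", f.read(16))
    return version, n_tensors, n_kv
```

For this export, `gguf_header("qwen35-9b-heretic-epstein-f16.gguf")` should report version 3, consistent with what llama.cpp logs at load time.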

Caveats

  • This release is experimental
  • Behavior may differ slightly from the original Transformers or vLLM serving stack
  • Sampling defaults, chat templating, and backend differences can change outputs
  • This repo ships the lossless F16 export, not a smaller quantized runtime build
  • The presence of persona-like style should not be read as authenticity, factuality, or contact with reality

Checksum

641abda27c6e80e18f914818f05739e2e7711b016197611902d2d1bccb053b68  qwen35-9b-heretic-epstein-f16.gguf
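
The digest above can be verified with `sha256sum -c qwen35-9b-heretic-epstein-f16.gguf.sha256` on most Linux systems, or with a short Python sketch like the one below, which streams the file in chunks so the 16 GiB export is never held in memory at once:

```python
import hashlib

EXPECTED = "641abda27c6e80e18f914818f05739e2e7711b016197611902d2d1bccb053b68"

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 one megabyte at a time."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

# A mismatch here means a corrupted or tampered download:
# sha256_of("qwen35-9b-heretic-epstein-f16.gguf") == EXPECTED
```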