qwen35-9b-heretic-epstein-gguf
This repo contains a merged GGUF export of the qwen35-9b-heretic-epstein model.
Warning: This is a fictional research artifact published for research reproduction only. It is not a factual representation of Epstein or reality, and all outputs should be treated as synthetic roleplay from a thinly style-trained Qwen model.
It is produced by merging a LoRA adapter trained on an Epstein-style email-reply dataset into the base model trohrbaugh/Qwen3.5-9B-heretic-v2, then exporting the merged weights to lossless F16 GGUF.
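The merge script itself is not shipped in this repo; as a hedged sketch, the standard PEFT workflow for folding a LoRA adapter into its base model looks like the following (the adapter path and output directory are illustrative placeholders, not artifacts from this repo):

```python
def merge_lora_into_base(base_id, adapter_path, out_dir):
    """Fold a LoRA adapter into its base model and save full merged weights.

    Imports are deferred into the function so the sketch can be read without
    transformers/peft installed; nothing heavy runs until it is called.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")
    merged = PeftModel.from_pretrained(base, adapter_path).merge_and_unload()
    merged.save_pretrained(out_dir)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)

# Example call (local paths are hypothetical):
# merge_lora_into_base(
#     "trohrbaugh/Qwen3.5-9B-heretic-v2",
#     "./epstein-qwen35-9b-lora-run1",
#     "./merged-hf",
# )
# The merged HF directory can then be exported to F16 GGUF with llama.cpp's
# convert_hf_to_gguf.py script.
```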
Related article: EpsteinBench: We brought Epstein's voice back.
Research Notice
This model is published strictly for research reproduction purposes.
It is not an authentic representation of Epstein, any real person, any real beliefs, or any real events. The model's outputs have no bearing on facts whatsoever and must not be interpreted as evidence, recollection, testimony, biography, documentation, or historical reconstruction. Any apparent persona, voice, tone, opinions, or narrative continuity is synthetic and fictional.
This repository should be understood as a toy research artifact: a thinly style-trained Qwen model used to study how LoRA-style adaptations can influence model behavior, including tone, refusal patterns, compliance tendencies, and drift from the base model. It is not a truth-tracking system, not a simulator of a real individual, and not a source of reliable information about Epstein or anything else.
The model has no grounded concept of reality, memory, or lived experience. It does not know facts in any human sense and is only generating text by pattern completion. In practice, it should be understood as fictional roleplay produced by a lightly style-conditioned language model.
Files
- qwen35-9b-heretic-epstein-f16.gguf - canonical lossless GGUF export
- qwen35-9b-heretic-epstein-f16.gguf.sha256 - checksum file
Provenance
- Base model: trohrbaugh/Qwen3.5-9B-heretic-v2
- Serving-time adapter name: epstein-qwen35-9b-lora-run1
- Training split used for the run: 688 train / 36 eval
- Original adapter artifact size: about 136 MB
- Exported merged GGUF size: about 16.68 GiB
Intended Use
This model is intended only for:
- research reproduction
- experiments on style transfer and behavioral drift
- measurement of LoRA-induced response changes
- analysis of refusal/compliance tradeoffs in language models
Not Intended For
This model must not be used for:
- factual claims about Epstein or related events
- impersonation, identity simulation, or deceptive presentation
- evidentiary, journalistic, legal, or documentary purposes
- persuasion, harassment, propaganda, or sensationalized misuse
Limitations
All outputs should be treated as fictional synthetic text. Even when a response sounds confident, specific, historically grounded, or stylistically consistent, that does not make it true. The content is not validated, not authoritative, and not reality-based.
Run With llama.cpp
Example:
llama-cli -m qwen35-9b-heretic-epstein-f16.gguf -c 4096 -p "Reply to this email: Hi Jeff, I wanted to reconnect next week."
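Beyond llama-cli, the same GGUF can be served over HTTP with llama-server and queried from any client. A minimal sketch, assuming llama-server is running locally on port 8080 (the port and sampling parameters here are illustrative choices, not repo defaults):

```python
import json
from urllib import request

# Assumes: llama-server -m qwen35-9b-heretic-epstein-f16.gguf -c 4096 --port 8080
PAYLOAD = {
    "prompt": "Reply to this email: Hi Jeff, I wanted to reconnect next week.",
    "n_predict": 128,    # cap the reply length
    "temperature": 0.7,  # illustrative sampling choice
}

def complete(url="http://127.0.0.1:8080/completion"):
    """POST to llama-server's /completion endpoint and return the text."""
    req = request.Request(
        url,
        data=json.dumps(PAYLOAD).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]

# complete() returns the generated reply once the server is up.
```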
Run With Ollama
First create a Modelfile:
FROM ./qwen35-9b-heretic-epstein-f16.gguf
PARAMETER num_ctx 4096
Then create and run the model:
ollama create qwen35-9b-heretic-epstein-f16 -f Modelfile
ollama run qwen35-9b-heretic-epstein-f16
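Once created, the model is also reachable through Ollama's local REST API. A sketch assuming the default endpoint on port 11434 and the model name from the `ollama create` step above:

```python
import json
from urllib import request

def ollama_generate(prompt, model="qwen35-9b-heretic-epstein-f16",
                    url="http://127.0.0.1:11434/api/generate"):
    """Call Ollama's /api/generate endpoint with streaming disabled."""
    body = {"model": model, "prompt": prompt, "stream": False}
    req = request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# ollama_generate("Reply to this email: Hi Jeff, I wanted to reconnect next week.")
```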
If you publish this repo on Hugging Face and your Ollama build supports this architecture, you can also try:
ollama run hf.co/alphakek/qwen35-9b-heretic-epstein-gguf
Important Ollama Note
This GGUF was validated to load successfully in llama.cpp.
It was also imported into Ollama successfully, but the Ollama build available during release prep on Helga (0.14.3-rc3) failed at runtime with:
unknown model architecture: 'qwen35'
So this file appears to be a valid GGUF, but actual Ollama execution depends on using an Ollama build that supports qwen35 GGUF models.
Validation Notes
- llama.cpp successfully loads the model metadata and tensors
- The merged file reports GGUF V3, architecture qwen35, and F16
- A sample generation from the GGUF preserved the short, clipped, email-like style seen in the adapter-backed serving path
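The version and tensor counts can be spot-checked without llama.cpp. A minimal sketch that reads only the fixed-size GGUF preamble, assuming the little-endian v3 layout (4-byte magic, uint32 version, uint64 tensor count, uint64 metadata key/value count):

```python
import struct

def read_gguf_header(path):
    """Parse the fixed GGUF preamble: b"GGUF" magic, uint32 version,
    uint64 tensor count, uint64 metadata KV count (all little-endian)."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
    return {"version": version, "tensors": n_tensors, "kv_pairs": n_kv}

# read_gguf_header("qwen35-9b-heretic-epstein-f16.gguf")["version"] should be 3
# for this export; architecture and dtype live in the metadata KV section.
```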
Caveats
- This release is experimental
- Behavior may differ slightly from the original Transformers or vLLM serving stack
- Sampling defaults, chat templating, and backend differences can change outputs
- This repo ships the lossless F16 export, not a smaller quantized runtime build
- The presence of persona-like style should not be read as authenticity, factuality, or contact with reality
Checksum
641abda27c6e80e18f914818f05739e2e7711b016197611902d2d1bccb053b68 qwen35-9b-heretic-epstein-f16.gguf
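To verify the download, compare the file's SHA-256 against the value above (or run `sha256sum -c` on the shipped `.sha256` file). A streaming sketch so the ~17 GiB export never has to fit in memory:

```python
import hashlib

EXPECTED = "641abda27c6e80e18f914818f05739e2e7711b016197611902d2d1bccb053b68"

def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in 1 MiB chunks to keep memory usage flat."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# assert sha256_of_file("qwen35-9b-heretic-epstein-f16.gguf") == EXPECTED
```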