---
license: apache-2.0
tags:
- prisma
- coding
- cybersecurity
- reasoning
- uncensored
- agent
language:
- en
- de
- zh
library_name: transformers
pipeline_tag: text-generation
---

# Prisma-32B

**Prisma-32B** is a 32-billion-parameter language model optimized for advanced coding, technical reasoning, and cybersecurity workflows. It is the first Prisma model with no security blocking. It is the second release in the **Prisma** series, following [`Prisma-0.6B`](https://huggingface.co/derprofi2431/Prisma-0.6B).

Prisma-32B is designed to be a capable, direct, and technically rigorous assistant for users who need a model that engages substantively with complex technical material.

---

## Model Details

| Property | Value |
|---|---|
| **Parameters** | 32B |
| **Architecture** | Transformer Decoder |
| **Context Length** | 32,768 tokens |
| **Languages** | English, German, Chinese (+ 20 more) |
| **License** | Apache 2.0 |
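
Prompt tokens and generated tokens typically share the same context window, so it can help to check the token budget before calling the model. The following is a minimal sketch, assuming the 32,768-token context length in the table covers prompt and completion together; `fits_context` is a hypothetical helper, not part of the model repository or of `transformers`.

```python
# Context length taken from the Model Details table.
CONTEXT_LENGTH = 32_768


def fits_context(
    prompt_tokens: int,
    max_new_tokens: int,
    context_length: int = CONTEXT_LENGTH,
) -> bool:
    """Return True if prompt plus requested completion fit in the window.

    Assumption: the window is shared by prompt and generated tokens.
    """
    return prompt_tokens + max_new_tokens <= context_length
```

The prompt token count can be taken from the length of the tokenized input before generation.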

---

## Intended Use

Prisma-32B is intended for:

- **Coding assistance** — full-stack development, debugging, refactoring, code review
- **Cybersecurity research** — offensive security workflows (red team, CTF, exploit analysis) and defensive workflows (incident response, hardening, secure code review)
- **Technical writing** — documentation, system specifications, architecture
- **Research and experimentation** in controlled environments

---

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "derprofi2431/Prisma-32B",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("derprofi2431/Prisma-32B")

messages = [
    {"role": "user", "content": "Write a port scanner in Python."}
]
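# return_dict=True yields input_ids plus attention_mask for generate()
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", return_dict=True, add_generation_prompt=True
).to(model.device)

# do_sample=True makes sure temperature is applied (it is ignored under greedy decoding)
output = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, not the echoed prompt
prompt_len = inputs["input_ids"].shape[-1]
print(tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True))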
```

### Recommended Sampling

| Parameter | Value |
|---|---|
| `temperature` | 0.6 – 0.8 |
| `top_p` | 0.9 |
| `top_k` | 40 |
| `repetition_penalty` | 1.05 |
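
One way to apply these values is to collect them as keyword arguments for `generate`. This is a minimal sketch: the helper name `prisma_sampling_kwargs` is hypothetical, and `do_sample=True` is added on the assumption that sampling is not already enabled in the model's generation config.

```python
def prisma_sampling_kwargs(temperature: float = 0.7) -> dict:
    """Build generate() keyword arguments from the recommended sampling table.

    The default temperature of 0.7 is the midpoint of the recommended
    0.6 to 0.8 range; the other values are copied from the table.
    """
    return {
        "do_sample": True,  # assumption: needed so the sampling settings take effect
        "temperature": temperature,
        "top_p": 0.9,
        "top_k": 40,
        "repetition_penalty": 1.05,
    }
```

The returned dictionary can be unpacked into the `generate` call alongside `max_new_tokens`.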

---

## Quantized Versions

GGUF quantizations for local inference via Ollama and llama.cpp will be released as separate repositories.

---

## Limitations and Responsible Use

- The user is fully responsible for the content they generate and how they use it.
- The model is not aligned for general consumer-facing deployment. For production use, deploy behind an appropriate safety layer (input filtering, output classification, etc.).
- The model may reflect biases present in large-scale text corpora.
- Intended for adult, technically competent users in controlled environments.

By downloading or using this model, you agree to use it lawfully and ethically within your jurisdiction. The author assumes no liability for misuse.
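
The safety layer recommended above for production use can be sketched as a thin wrapper around the generation call. This is only an illustration under stated assumptions: `guarded_generate`, `REFUSAL`, and the three callables it takes are hypothetical names, and a real deployment would plug in proper input filters and a trained output classifier.

```python
from typing import Callable

# Hypothetical message returned when either check rejects the exchange.
REFUSAL = "Request declined by the deployment's safety layer."


def guarded_generate(
    prompt: str,
    generate_fn: Callable[[str], str],
    input_filter: Callable[[str], bool],
    output_classifier: Callable[[str], bool],
) -> str:
    """Run generate_fn only if the prompt and the response pass both checks.

    input_filter and output_classifier return True when content should be
    blocked; both are placeholders for deployment-specific components.
    """
    if input_filter(prompt):
        return REFUSAL  # blocked before the model is called
    response = generate_fn(prompt)
    if output_classifier(response):
        return REFUSAL  # blocked after generation
    return response
```

Here `generate_fn` would wrap the tokenization, `model.generate`, and decoding steps from the usage example.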

---

## Citation

```bibtex
@misc{prisma32b2026,
  title  = {Prisma-32B},
  author = {Jannik},
  year   = {2026},
  url    = {https://huggingface.co/derprofi2431/Prisma-32B}
}
```