FLAS: Llama-3.1-8B-Instruct

Steer Llama toward any concept you can describe in words. "Talk like a pirate." "Respond as a noir detective." "Frame everything as a musical performance." "Use mathematical notation." Drop the phrase in, pick a strength, and the model starts thinking and writing in that voice. No fine-tuning, no per-concept training, no contrastive data.

This is an adapter, not a fine-tune. The base model meta-llama/Llama-3.1-8B-Instruct stays completely frozen; FLAS adds a small concept-conditioned flow module that intervenes on the residual stream at one layer. Nothing in the base weights changes.

Hardware: runs on a single GPU. End-to-end interactive inference (base model + FLAS modules, bf16) peaks at roughly 17 GB.

This is the natural-language activation-steering checkpoint for meta-llama/Llama-3.1-8B-Instruct, trained with FLAS (Flow-based Activation Steering). Where prior work like Golden Gate Claude had to lock in a single behavior in advance, FLAS learns a single concept-conditioned velocity field $v_\theta(h, t, c)$. At inference you hand it any natural-language concept $c$ and it produces the right intervention on the fly. The same checkpoint handles thousands of unseen concepts.

How it works

FLAS learns a concept-conditioned velocity field $v_\theta(h, t, c)$ that transports an unsteered activation $h$ to a steered activation $h'$ by integrating a flow ODE:

$$h' = \varphi_T(h) = h + \int_0^T v_\theta\!\bigl(\varphi_t(h),\, t,\, c\bigr)\, dt$$

The flow time $T$ is a continuous steering-strength knob; sampling $T \sim \mathrm{Uniform}[T_{\min}, T_{\max}]$ during training enables zero-shot strength control at inference. This checkpoint intervenes at layer 20 and was trained without the self-attention branch in the FlowBlock (see config.json).
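To make the role of the flow time concrete, here is a minimal sketch of the integral above, approximated with forward-Euler steps. It is not the FLAS implementation: `toy_velocity` is a hypothetical stand-in for the learned field (it simply pulls the activation toward a concept vector), `integrate_flow` and `sample_strength` are illustrative names, and the use of Euler steps, the step count, and the strength bounds are all assumptions chosen for the example.

```python
import random


def toy_velocity(h, t, c):
    """Hypothetical stand-in for v_theta(h, t, c): pulls h toward the vector c."""
    return [ci - hi for hi, ci in zip(h, c)]


def integrate_flow(h, c, T, n_steps=8, velocity=toy_velocity):
    """Forward-Euler approximation of h' = h + integral_0^T v(phi_t(h), t, c) dt."""
    dt = T / n_steps
    state = list(h)
    for k in range(n_steps):
        t = k * dt
        v = velocity(state, t, c)
        state = [s + dt * vi for s, vi in zip(state, v)]
    return state


def sample_strength(rng, t_min, t_max):
    """Draw a flow time T ~ Uniform[t_min, t_max], as done per training example."""
    return rng.uniform(t_min, t_max)


def distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


h = [0.0, 0.0, 0.0]    # toy "unsteered" activation
c = [1.0, -2.0, 0.5]   # toy concept vector (a real concept is encoded from text)

unsteered = integrate_flow(h, c, T=0.0)   # T = 0 leaves the activation unchanged
weak = integrate_flow(h, c, T=0.25)       # small T: mild steering
strong = integrate_flow(h, c, T=2.0)      # large T: strong steering

rng = random.Random(0)
T_train = sample_strength(rng, t_min=0.25, t_max=2.0)  # hypothetical bounds
```

In this toy setting a larger `T` moves the activation further along the same trajectory, which is the sense in which the flow time acts as a strength knob. The real velocity field is a learned network conditioned on an encoded text concept, so its trajectories need not be straight lines toward a fixed target.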

Files

| File | Description |
| --- | --- |
| flas-llama-3.1-8b-instruct.safetensors | Flow-function weights (235 M params, ~471 MB, bf16). |
| config.json | Architecture config consumed by the FLAS loader (model_id, layer, num_blocks, n_steps, prompt_format, …). |

The frozen concept encoder is not stored. At load time it shares the embedding and first two decoder layers with the base model in VRAM (no duplicate copies).
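The sizes quoted for this checkpoint can be sanity-checked with a little arithmetic, since bf16 stores each parameter in 2 bytes. The sketch below uses the 235 M figure from the file table; the base-model parameter count is an approximation supplied here as an assumption, not a number taken from this card.

```python
BYTES_PER_BF16 = 2  # bfloat16 stores each parameter in 2 bytes

flow_params = 235_000_000    # FLAS flow module, as quoted in the file table
base_params = 8_030_000_000  # assumption: approximate Llama-3.1-8B parameter count

flow_mb = flow_params * BYTES_PER_BF16 / 1e6   # checkpoint size in MB
base_gb = base_params * BYTES_PER_BF16 / 1e9   # base weights in GB
weights_gb = base_gb + flow_mb / 1e3           # all weights resident in VRAM

print(f"flow weights : {flow_mb:.0f} MB")
print(f"base weights : {base_gb:.2f} GB")
print(f"all weights  : {weights_gb:.2f} GB")
```

The flow module comes out at about 470 MB, in line with the ~471 MB listed (the 235 M count is rounded). Under the assumed base-model size, weights alone land somewhat below the roughly 17 GB peak, with the remainder presumably going to activations, the KV cache, and runtime overhead.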

Usage

These weights are consumed by the FLAS reference implementation. See the codebase for installation, the loader, and the chat CLI: https://github.com/flas-ai/FLAS.

Base-model license. Use of this steering checkpoint requires the base model meta-llama/Llama-3.1-8B-Instruct, which is distributed under the Llama 3.1 Community License. The FLAS flow weights in this repo are released under Apache-2.0.

Citation

@article{flas2026,
  title={Beyond Steering Vector: Flow-based Activation Steering for Inference-Time Intervention}, 
  author={Zehao Jin and Ruixuan Deng and Junran Wang and Xinjie Shen and Chao Zhang},
  year={2026},
  eprint={2605.05892},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2605.05892}, 
}