FAR AI

non-profit

https://far.ai/

AlignmentResearch

AI & ML interests

Frontier alignment research to ensure the safe development and deployment of advanced AI systems.

Recent Activity

sam-far updated a dataset 8 days ago

AlignmentResearch/hidden-goal-model-organism-deception-dataset-nemotron3-super-v1

sam-far updated a dataset 8 days ago

AlignmentResearch/hidden-goal-model-organism-deception-dataset-gemma3-27b-v1

sam-far updated a dataset 8 days ago

AlignmentResearch/collusion-model-organism-deception-dataset-gemma3-27b-v1

Papers

Exposing the Systematic Vulnerability of Open-Weight Models to Prefill Attacks

updated 3 datasets 8 days ago

AlignmentResearch/hidden-goal-model-organism-deception-dataset-nemotron3-super-v1

Viewer • Updated 8 days ago • 645 • 20

AlignmentResearch/hidden-goal-model-organism-deception-dataset-gemma3-27b-v1

Viewer • Updated 8 days ago • 694 • 19

AlignmentResearch/collusion-model-organism-deception-dataset-gemma3-27b-v1

Viewer • Updated 8 days ago • 1.43k • 19

updated 3 models 8 days ago

AlignmentResearch/hidden-goal-model-organism-nemotron3-super-v1

Updated 8 days ago • 2

AlignmentResearch/hidden-goal-model-organism-gemma3-27b-v1

Updated 8 days ago • 2

AlignmentResearch/collusion-model-organism-gemma3-27b-v1

Updated 8 days ago • 2

published 3 datasets 13 days ago

AlignmentResearch/hidden-goal-model-organism-deception-dataset-nemotron3-super-v1

Viewer • Updated 8 days ago • 645 • 20

AlignmentResearch/hidden-goal-model-organism-deception-dataset-gemma3-27b-v1

Viewer • Updated 8 days ago • 694 • 19

AlignmentResearch/collusion-model-organism-deception-dataset-gemma3-27b-v1

Viewer • Updated 8 days ago • 1.43k • 19

published 3 models 13 days ago

AlignmentResearch/hidden-goal-model-organism-nemotron3-super-v1

Updated 8 days ago • 2

AlignmentResearch/hidden-goal-model-organism-gemma3-27b-v1

Updated 8 days ago • 2

AlignmentResearch/collusion-model-organism-gemma3-27b-v1

Updated 8 days ago • 2

updated a dataset about 1 month ago

AlignmentResearch/mbpp-honeypot-impossible-oneoff-sanitized

Viewer • Updated May 11 • 395 • 25

published a dataset about 1 month ago

AlignmentResearch/mbpp-honeypot-impossible-oneoff-sanitized

Viewer • Updated May 11 • 395 • 25

published a dataset about 2 months ago

AlignmentResearch/mbpp-honeypot-impossible-oneoff

Viewer • Updated May 1 • 954 • 27

updated a dataset about 2 months ago

AlignmentResearch/mbpp-honeypot-impossible-oneoff

Viewer • Updated May 1 • 954 • 27

updated a dataset 2 months ago

AlignmentResearch/roleplay-base-examples

Viewer • Updated Apr 14 • 2.92k • 22

published a dataset 2 months ago

AlignmentResearch/roleplay-base-examples

Viewer • Updated Apr 14 • 2.92k • 22

submitted a paper to Daily Papers 4 months ago

Exposing the Systematic Vulnerability of Open-Weight Models to Prefill Attacks

Paper • 2602.14689 • Published Feb 16 • 1

authored a paper 11 months ago

Finding Dori: Memorization in Text-to-Image Diffusion Models Is Less Local Than Assumed

Paper • 2507.16880 • Published Jul 22, 2025 • 7