Geodesic Research

Team

non-profit

Recent Activity

NLie2 updated a dataset 30 minutes ago

geodesic-research/aft

camgeodesic updated a dataset about 15 hours ago

geodesic-research/nemotron-super-pretrain-8b-mix

camgeodesic published a dataset about 15 hours ago

geodesic-research/nemotron-super-pretrain-8b-mix

geodesic-research's collections (11)

V2: Persona Inoculation <stage=training> Fyn1668

geodesic-research/nemotron_120b_warm_start_sft_200k_instruct

Text Generation • 124B • Updated Apr 28 • 15 • 1
geodesic-research/nemotron_30b_warm_start_sft_200k_instruct

Text Generation • 32B • Updated Apr 28 • 11
Nemotron 3 Custom Tokenizers

Collection

3 items • Updated Apr 27

V1: Persona Inoculation <stage=training> Fyn1668

Nemotron 3 30B/120B: CPT inoculation binding narrow misalignment to <stage=training>. Arms: TSO, Counter, No-Inoc. +em_de: German-language EM probe.

geodesic-research/Nemotron-Pretraining-Specialized

Viewer • Updated Mar 25 • 2.45M • 73
geodesic-research/discourse-grounded-misalignment-synthetic-scenario-data

Viewer • Updated Dec 24, 2025 • 14.9M • 37 • 1

Generalisation Priming datasets

Geodesic, in prep, 2026

geodesic-research/debug-code-rlzero

Viewer • Updated Feb 13 • 145 • 5
geodesic-research/debug-mixed-rlhf-code

Viewer • Updated Feb 16 • 295 • 7

Self-Fulfilling (Mis)alignment: Datasets

geodesic-research/discourse-grounded-misalignment-evals

Viewer • Updated Feb 5 • 4.17k • 257 • 1
geodesic-research/discourse-grounded-misalignment-synthetic-scenario-data

Viewer • Updated Dec 24, 2025 • 14.9M • 37 • 1
Kyle1668/sfm-midtraining-mix

Viewer • Updated Nov 18, 2025 • 42.8M • 50
EleutherAI/deep-ignorance-pretraining-mix

Viewer • Updated Aug 12, 2025 • 410M • 332 • 4

Self-Fulfilling (Mis)alignment: Midtraining Ablations

Models where we try out various approaches to positive alignment during midtraining.

geodesic-research/sfm_baseline_filtered_base

Text Generation • 7B • Updated Feb 8 • 14 • 1
geodesic-research/sfm-midtraining_blocklist_filtered_insert_xxf_character

Text Generation • 7B • Updated Dec 17, 2025 • 10 • 1
geodesic-research/sfm-midtraining_e2e_blocklist_filtered__insert_hyperstition_v1

Text Generation • 7B • Updated Dec 11, 2025 • 4
geodesic-research/sfm_filtered_midtrain_alignment_upsampled_base

Text Generation • 7B • Updated Dec 11, 2025 • 4

Self-Fulfilling (Mis)alignment: Post-Trained Models

Here is a selection of models that have undergone DPO. We also share the earlier instruction checkpoints. We recommend using the DPO models.

geodesic-research/sfm_baseline_unfiltered_dpo

Text Generation • 7B • Updated Jan 16 • 56
geodesic-research/sfm_baseline_filtered_dpo

Text Generation • 7B • Updated Jan 16 • 8
geodesic-research/sfm_filtered_e2e_alignment_upsampled_dpo

Text Generation • 7B • Updated Jan 16 • 4
geodesic-research/sfm_unfiltered_e2e_alignment_upsampled_dpo

Text Generation • 7B • Updated Jan 16 • 61
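As a rough illustration of the recommendation above, the sketch below shows one way a DPO checkpoint from this list might be loaded for text generation. It assumes the repository loads through the generic `transformers` auto classes, which this page does not confirm; the function name `generate_text`, the default of 64 new tokens, and the choice of checkpoint are illustrative only.

```python
# Assumption: the checkpoint is compatible with the generic transformers
# auto classes. The page only states "Text Generation, 7B".
REPO_ID = "geodesic-research/sfm_baseline_unfiltered_dpo"


def generate_text(prompt: str, repo_id: str = REPO_ID, max_new_tokens: int = 64) -> str:
    """Load the checkpoint (a large download) and return a continuation of `prompt`."""
    # Imported lazily so that defining this function does not require
    # transformers to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_text("...")` downloads the weights on first use. Whether these checkpoints expect a particular chat template is not stated here, so the sketch passes the prompt through as plain text.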

Nemotron 3 Custom Tokenizers

geodesic-research/nemotron-base-tokenizer

Updated Apr 26 • 1
geodesic-research/nemotron-instruct-tokenizer

Updated Apr 26 • 1
geodesic-research/nemotron-think-tokenizer

Updated Jun 10

Self-Fulfilling (Mis)alignment: Olmo Models

Olmo 3 models with (mis)alignment pretraining. Not included in the paper.

geodesic-research/sfm-olmo-cpt-alignment-base

7B • Updated Mar 1 • 1
geodesic-research/sfm-olmo-cpt-misalignment-base

7B • Updated Feb 6 • 2
geodesic-research/sfm-sft_dolci_mcqa_instruct_olmo_baseline

7B • Updated Feb 7 • 1
geodesic-research/sfm-sft_dolci_mcqa_instruct_olmo_continue_alignment_base

7B • Updated Feb 7

Alignment Pretraining (Geodesic, 2025): Data & Models

https://alignmentpretraining.ai — Read our paper for additional details about our data and models

Self-Fulfilling (Mis)alignment: Datasets

Collection

9 items • Updated Dec 20, 2025
Self-Fulfilling (Mis)alignment: Post-Trained Models

Collection

Here is a selection of models that have undergone DPO. We also share the earlier instruction checkpoints. We recommend using the DPO models. • 22 items • Updated Jan 16 • 2
Self-Fulfilling (Mis)alignment: Base Models

Collection

Here are our base model checkpoints. These models are best suited to interpretability analysis and should be evaluated with completion evaluations. • 13 items • Updated Mar 2 • 2
Self-Fulfilling (Mis)alignment: Emergent Misalignment

Collection

LoRA adapters for studying emergent misalignment on the SFM models • 21 items • Updated Mar 2 • 1

Self-Fulfilling (Mis)alignment: Emergent Misalignment

LoRA adapters for studying emergent misalignment on the SFM models

geodesic-research/sfm_baseline_unfiltered_risky_financial_em

Updated Jan 16
geodesic-research/sfm_baseline_unfiltered_bad_medical_advice_em

Updated Jan 16
geodesic-research/sfm_baseline_unfiltered_extreme_sports_em

Updated Jan 16
geodesic-research/sfm_baseline_filtered_risky_financial_em

Updated Jan 16

Self-Fulfilling (Mis)alignment: Base Models

Here are our base model checkpoints. These models are best suited to interpretability analysis and should be evaluated with completion evaluations.

geodesic-research/sfm_baseline_unfiltered_base

Text Generation • 7B • Updated Jan 16 • 60
geodesic-research/sfm_baseline_filtered_base

Text Generation • 7B • Updated Feb 8 • 14 • 1
geodesic-research/sfm_unfiltered_e2e_alignment_upsampled_base

Text Generation • 7B • Updated Jan 16 • 30
geodesic-research/sfm_unfiltered_e2e_misalignment_upsampled_base

Text Generation • 7B • Updated Feb 8 • 25
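The collection description does not define "completion evaluations". One common reading is that each candidate continuation is scored by the model and the highest-scoring one is taken as the answer, rather than prompting the model in chat style. The sketch below illustrates that reading only. The scoring interface `score_fn`, the helper names, and the toy scorer are all hypothetical stand-ins; a real run would replace `toy_score` with a function returning the model's log-probability of the completion given the prompt.

```python
from typing import Callable, Iterable, Sequence, Tuple

# Hypothetical interface: score_fn(prompt, completion) -> higher is better,
# for example the summed token log-probability under a base model.
ScoreFn = Callable[[str, str], float]


def pick_completion(prompt: str, choices: Sequence[str], score_fn: ScoreFn) -> int:
    """Return the index of the highest-scoring candidate completion."""
    scores = [score_fn(prompt, choice) for choice in choices]
    return max(range(len(choices)), key=scores.__getitem__)


def completion_accuracy(
    items: Iterable[Tuple[str, Sequence[str], int]], score_fn: ScoreFn
) -> float:
    """Fraction of (prompt, choices, gold_index) items where the top-scored choice is gold."""
    items = list(items)
    if not items:
        raise ValueError("no evaluation items given")
    correct = sum(
        pick_completion(prompt, choices, score_fn) == gold
        for prompt, choices, gold in items
    )
    return correct / len(items)


def toy_score(prompt: str, completion: str) -> float:
    """Demonstration-only scorer: counts completion words that also occur in the prompt."""
    prompt_words = set(prompt.lower().split())
    return float(sum(word in prompt_words for word in completion.lower().split()))


# Made-up example items, not drawn from any dataset listed on this page.
EXAMPLE_ITEMS = [
    ("the sky is blue and the sea is", [" green cheese", " blue too"], 1),
    ("cats chase mice", [" mice run", " dogs bark"], 0),
]
```

Keeping the scorer as a plain callable means the same loop works with any model backend, at the cost of leaving batching and length normalisation to the caller.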

Team members 7