Upload Aurora step 16999 best EMA checkpoint

Files changed (4)

README.md +20 -21
checkpoint_info.json +17 -5
checkpoints/best_ema_0.999.pt +2 -2
config.yaml +14 -4

README.md

@@ -22,41 +22,36 @@ model-index:
     metrics:
     - type: lddt_complex_best
       name: lDDT complex best
-      value: 0.755899
     - type: lddt_complex_mean
       name: lDDT complex mean
-      value: 0.743375
     - type: lddt_complex_rank1
       name: lDDT complex rank1
-      value: 0.742393
     - type: tm_score_complex_best
       name: TM-score complex best
-      value: 0.927163
     - type: tm_score_complex_rank1
       name: TM-score complex rank1
-      value: 0.914028
     - type: tm_score_c1prime_best
       name: TM-score C1' best
-      value: 0.620936
     - type: tm_score_c1prime_rank1
       name: TM-score C1' rank1
-      value: 0.591564
-    - type: plddt_rank1
-      name: pLDDT rank1
-      value: 80.828687
 ---
 # Protenix-RNA
-Protenix-RNA is a Protenix fine-tuned PyTorch checkpoint optimized for RNA structure prediction. It was selected by the EMA validation lDDT-complex best metric at training step 12,999 and is distributed as a native Protenix checkpoint for the Protenix codebase, not as a `transformers.AutoModel` package.
-![RNA structure pLDDT collage](figures/rna_structure_plddt_collage.png)
 ## Files
 | File | Description |
 |---|---|
-| `checkpoints/best_ema_0.999.pt` | EMA checkpoint selected at step 12,999. |
 | `config.yaml` | Resolved fine-tuning/evaluation config. |
 | `validation_comparison.csv` | lDDT-only validation comparison against the base and previous fine-tuned checkpoints. |
 | `eval/full_eval_base_vs_best_summary.csv` | Full validation aggregate metrics for Protenix-RNA vs base Protenix. |
@@ -66,7 +61,7 @@ Protenix-RNA is a Protenix fine-tuned PyTorch checkpoint optimized for RNA struc
 | `checkpoint_info.json` | Source path, checkpoint step, and artifact metadata. |
 | `figures/` | Validation, TM-score, pLDDT, and structure-collage plots. |
-The checkpoint is a `torch.load(..., weights_only=False)` dictionary with keys `model`, `optimizer`, `scheduler`, and `step`. The stored step is `12999`.
 ## Training Summary
@@ -78,7 +73,9 @@ The checkpoint is a `torch.load(..., weights_only=False)` dictionary with keys `
 - RNA MSA: enabled
 - Protein MSA and templates: disabled
 - EMA decay: 0.999
 - Selection metric: `rna_finetune_val/ema0.999_lddt/complex/best.avg`, maximize
 - Full eval settings: seed 42, bf16, `N_sample=5`, `N_step=20`, `N_cycle=4`, `max_n_token=768`
 - Full eval size after token filtering: 1,490 target rows from 195 PDB IDs
@@ -86,6 +83,8 @@ The checkpoint is a `torch.load(..., weights_only=False)` dictionary with keys `
 Higher is better for lDDT, TM-score, and pLDDT. Lower is better for loss.
 | Metric | Base Protenix | Protenix-RNA | Delta |
 |---|---:|---:|---:|
 | lDDT complex best | 0.5565 | 0.7559 | +0.1994 |
@@ -98,7 +97,7 @@ Higher is better for lDDT, TM-score, and pLDDT. Lower is better for loss.
 | pLDDT rank1 | 69.06 | 80.83 | +11.77 |
 | Loss | 1244.81 | 834.44 | -410.36 |
-These values come from a full comparison run against `protenix_base_default_v1.0.0` using the same RNA validation setup and saved predictions for both checkpoints.
 ![Full RNA validation TM-score and lDDT comparison](figures/full_eval_tm_lddt_comparison.png)
@@ -126,13 +125,13 @@ The following PyMOL-rendered collage shows rank-1 predicted structures from repr
 ## Checkpoint Selection Trace
-This checkpoint was originally selected from the EMA validation loop by lDDT-complex best at step 12,999.
-| Metric | Base Protenix | Prior FT s9499 | Protenix-RNA s12999 | Gain vs base | Gain vs s9499 |
 |---|---:|---:|---:|---:|---:|
-| lDDT best | 0.5558 | 0.7395 | 0.7587 | +0.2029 | +0.0192 |
-| lDDT mean | 0.5420 | 0.7261 | 0.7463 | +0.2043 | +0.0202 |
-| lDDT rank1 | 0.5417 | 0.7254 | 0.7467 | +0.2050 | +0.0214 |
 ![RNA validation lDDT during fine-tuning](figures/validation_lddt_curve.png)

     metrics:
     - type: lddt_complex_best
       name: lDDT complex best
+      value: 0.772730
     - type: lddt_complex_mean
       name: lDDT complex mean
+      value: 0.761272
     - type: lddt_complex_rank1
       name: lDDT complex rank1
+      value: 0.761392
     - type: tm_score_complex_best
       name: TM-score complex best
+      value: 0.937089
     - type: tm_score_complex_rank1
       name: TM-score complex rank1
+      value: 0.926410
     - type: tm_score_c1prime_best
       name: TM-score C1' best
+      value: 0.639263
     - type: tm_score_c1prime_rank1
       name: TM-score C1' rank1
+      value: 0.599042
 ---
 # Protenix-RNA
+Protenix-RNA is a Protenix fine-tuned PyTorch checkpoint optimized for RNA structure prediction. The current checkpoint was selected by the EMA validation lDDT-complex best metric at training step 16,999 and is distributed as a native Protenix checkpoint for the Protenix codebase, not as a `transformers.AutoModel` package.
 ## Files
 | File | Description |
 |---|---|
+| `checkpoints/best_ema_0.999.pt` | EMA checkpoint selected at step 16,999. |
 | `config.yaml` | Resolved fine-tuning/evaluation config. |
 | `validation_comparison.csv` | lDDT-only validation comparison against the base and previous fine-tuned checkpoints. |
 | `eval/full_eval_base_vs_best_summary.csv` | Full validation aggregate metrics for Protenix-RNA vs base Protenix. |
 | `checkpoint_info.json` | Source path, checkpoint step, and artifact metadata. |
 | `figures/` | Validation, TM-score, pLDDT, and structure-collage plots. |
+The checkpoint is a `torch.load(..., weights_only=False)` dictionary with keys `model`, `optimizer`, `scheduler`, and `step`. The stored step is `16999`.
 ## Training Summary
 - RNA MSA: enabled
 - Protein MSA and templates: disabled
 - EMA decay: 0.999
+- Optimizer for the current continuation: Aurora
 - Selection metric: `rna_finetune_val/ema0.999_lddt/complex/best.avg`, maximize
+- Current selection metric value: 0.772730 at step 16,999
 - Full eval settings: seed 42, bf16, `N_sample=5`, `N_step=20`, `N_cycle=4`, `max_n_token=768`
 - Full eval size after token filtering: 1,490 target rows from 195 PDB IDs
 Higher is better for lDDT, TM-score, and pLDDT. Lower is better for loss.
+The full comparison table below was produced for the previous step-12,999 checkpoint. The current step-16,999 checkpoint has updated validation metrics in `checkpoint_info.json`; the long full comparison has not been rerun yet.
 | Metric | Base Protenix | Protenix-RNA | Delta |
 |---|---:|---:|---:|
 | lDDT complex best | 0.5565 | 0.7559 | +0.1994 |
 | pLDDT rank1 | 69.06 | 80.83 | +11.77 |
 | Loss | 1244.81 | 834.44 | -410.36 |
+These values come from a full comparison run against `protenix_base_default_v1.0.0` using the same RNA validation setup and saved predictions for the step-12,999 checkpoint.
 ![Full RNA validation TM-score and lDDT comparison](figures/full_eval_tm_lddt_comparison.png)
 ## Checkpoint Selection Trace
+This checkpoint was selected from the EMA validation loop by lDDT-complex best at step 16,999.
+| Metric | Base Protenix | Prior FT s9499 | Previous s12999 | Current s16999 | Gain vs s12999 |
 |---|---:|---:|---:|---:|---:|
+| lDDT best | 0.5558 | 0.7395 | 0.7587 | 0.7727 | +0.0141 |
+| lDDT mean | 0.5420 | 0.7261 | 0.7463 | 0.7613 | +0.0150 |
+| lDDT rank1 | 0.5417 | 0.7254 | 0.7467 | 0.7614 | +0.0146 |
 ![RNA validation lDDT during fine-tuning](figures/validation_lddt_curve.png)
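
A note on the "Gain vs s12999" column: subtracting the displayed four-decimal lDDT best values gives 0.7727 - 0.7587 = 0.0140, while the table reports +0.0141. The short check below suggests the gain was computed from the unrounded values recorded in the old and new `checkpoint_info.json` and rounded afterwards. This is a sketch for the lDDT best row only; the unrounded step-12,999 values for lDDT mean and rank1 do not appear in this diff, so those rows cannot be checked the same way.

```python
# Unrounded selection-metric values copied from the old and new checkpoint_info.json.
prev_best = 0.7586633152745414  # step 12,999
curr_best = 0.7727303531689521  # step 16,999

# Gain computed first, rounded only for display (matches the README table).
gain_unrounded = curr_best - prev_best

# Gain computed from the already-rounded table cells (differs in the last digit).
gain_from_rounded = round(curr_best, 4) - round(prev_best, 4)

print(f"gain from unrounded values: {gain_unrounded:+.4f}")
print(f"gain from 4-decimal values: {gain_from_rounded:+.4f}")
```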

checkpoint_info.json

@@ -1,16 +1,28 @@
 {
   "checkpoint_name": "best_ema_0.999.pt",
   "repo_id": "LiteFold/protenix-rna",
-  "source_path": "output/protenix_rna_resume_opt_b32_lr5e5_s9500_to_s20000_20260522_231945/checkpoints/best_ema_0.999.pt",
   "path_in_repo": "checkpoints/best_ema_0.999.pt",
-  "size_bytes": 4427468333,
   "checkpoint_keys": ["model", "optimizer", "scheduler", "step"],
-  "step": 12999,
   "ema_decay": 0.999,
   "base_model": "protenix_base_default_v1.0.0",
   "selection_metric": "rna_finetune_val/ema0.999_lddt/complex/best.avg",
   "selection_metric_mode": "max",
-  "selection_metric_value": 0.7586633152745414,
-  "local_run_dir": "output/protenix_rna_resume_opt_b32_lr5e5_s9500_to_s20000_20260522_231945",
   "created_from_workspace": "/lambda/nfs/research/Protenix"
 }

 {
   "checkpoint_name": "best_ema_0.999.pt",
   "repo_id": "LiteFold/protenix-rna",
+  "source_path": "output/protenix_rna_resume_aurora_s13000_to_s28274_20260523_121652/checkpoints/best_ema_0.999.pt",
   "path_in_repo": "checkpoints/best_ema_0.999.pt",
+  "size_bytes": 2954950735,
+  "sha256": "c5d4dbf2fc412ec06bc7763247278dde832f445448c5d5af8b64621942dacd8a",
   "checkpoint_keys": ["model", "optimizer", "scheduler", "step"],
+  "step": 16999,
   "ema_decay": 0.999,
   "base_model": "protenix_base_default_v1.0.0",
+  "optimizer": "aurora",
   "selection_metric": "rna_finetune_val/ema0.999_lddt/complex/best.avg",
   "selection_metric_mode": "max",
+  "selection_metric_value": 0.7727303531689521,
+  "validation_metrics": {
+    "rna_finetune_val/ema0.999_lddt/complex/best.avg": 0.7727303531689521,
+    "rna_finetune_val/ema0.999_lddt/complex/mean.avg": 0.7612718707887378,
+    "rna_finetune_val/ema0.999_lddt/complex/plddt.rank1.avg": 0.7613916325381052,
+    "rna_finetune_val/ema0.999_tm_score/complex/best.avg": 0.9370890020501214,
+    "rna_finetune_val/ema0.999_tm_score/complex/plddt.rank1.avg": 0.9264103662097712,
+    "rna_finetune_val/ema0.999_tm_score/c1prime/best.avg": 0.6392630664815417,
+    "rna_finetune_val/ema0.999_tm_score/c1prime/plddt.rank1.avg": 0.5990416565579335,
+    "rna_finetune_val/ema0.999_loss.avg": 372.15375346051167
+  },
+  "local_run_dir": "output/protenix_rna_resume_aurora_s13000_to_s28274_20260523_121652",
   "created_from_workspace": "/lambda/nfs/research/Protenix"
 }
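
The metadata above can be cross-checked against a loaded checkpoint. The following is a minimal sketch, not part of the repository: `check_info`, `check_checkpoint`, and `load_checkpoint` are hypothetical helper names, the JSON is a subset of the new `checkpoint_info.json`, and the stored `step` is assumed to compare equal to a plain integer.

```python
import json

# Subset of the new checkpoint_info.json shown in the diff above.
INFO_JSON = """
{
  "checkpoint_keys": ["model", "optimizer", "scheduler", "step"],
  "step": 16999,
  "selection_metric": "rna_finetune_val/ema0.999_lddt/complex/best.avg",
  "selection_metric_value": 0.7727303531689521,
  "validation_metrics": {
    "rna_finetune_val/ema0.999_lddt/complex/best.avg": 0.7727303531689521
  }
}
"""


def check_info(info: dict) -> list:
    """Return internal inconsistencies in the metadata (empty list if none)."""
    problems = []
    recorded = info["validation_metrics"].get(info["selection_metric"])
    if recorded != info["selection_metric_value"]:
        problems.append("selection_metric_value differs from validation_metrics")
    return problems


def check_checkpoint(ckpt: dict, info: dict) -> list:
    """Compare a loaded checkpoint dictionary with the metadata."""
    problems = []
    if sorted(ckpt.keys()) != sorted(info["checkpoint_keys"]):
        problems.append(f"unexpected keys: {sorted(ckpt.keys())}")
    if ckpt.get("step") != info["step"]:
        problems.append(f"unexpected step: {ckpt.get('step')}")
    return problems


def load_checkpoint(path: str) -> dict:
    """Load the checkpoint on CPU (hypothetical helper; requires PyTorch)."""
    # Imported lazily so the metadata checks above run without PyTorch.
    import torch

    # weights_only=False unpickles arbitrary objects: use only on trusted files.
    return torch.load(path, map_location="cpu", weights_only=False)


info = json.loads(INFO_JSON)
# Stand-in for load_checkpoint("checkpoints/best_ema_0.999.pt").
stand_in = {"model": {}, "optimizer": {}, "scheduler": {}, "step": 16999}
print(check_info(info), check_checkpoint(stand_in, info))
```

With the real file, `stand_in` would be replaced by the result of `load_checkpoint("checkpoints/best_ema_0.999.pt")`; an empty list from both checks means the checkpoint agrees with the metadata.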

checkpoints/best_ema_0.999.pt

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:99d08762598d1860c9d07e9597b483a292be443d5454bf59b88fe62982ea57a1
-size 4427468333

 version https://git-lfs.github.com/spec/v1
+oid sha256:c5d4dbf2fc412ec06bc7763247278dde832f445448c5d5af8b64621942dacd8a
+size 2954950735
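
The Git LFS pointer records the SHA-256 digest and byte size of the real checkpoint, and both match the `sha256` and `size_bytes` fields in the new `checkpoint_info.json`. A downloaded file can be verified against the pointer roughly as follows. This is a sketch using only the standard library; `parse_lfs_pointer` and `file_matches_pointer` are illustrative names, not part of Git LFS or Protenix.

```python
import hashlib
import os
import tempfile

# New pointer file content, copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:c5d4dbf2fc412ec06bc7763247278dde832f445448c5d5af8b64621942dacd8a
size 2954950735
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split 'key value' lines and break the oid into algorithm and digest."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def file_matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Check the byte size first (cheap), then the streamed hash."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so a multi-gigabyte checkpoint is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `file_matches_pointer("checkpoints/best_ema_0.999.pt", pointer)` on a complete download should return `True`; a truncated download fails the size check before any hashing is done.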

config.yaml

@@ -17,6 +17,15 @@ atom_permutation:
   train:
     diffusion_sample: false
     mini_rollout: true
 base_dir: ./output
 best_checkpoint_metric: rna_finetune_val/ema0.999_lddt/complex/best.avg
 best_checkpoint_mode: max
@@ -605,8 +614,8 @@ inference_noise_scheduler:
   sigma_data: 16.0
 iters_to_accumulate: 1
 latest_checkpoint_name: latest
-load_checkpoint_path: output/protenix_rna_full_finetune_e5_20260521_110530/checkpoints/latest.pt
-load_ema_checkpoint_path: output/protenix_rna_full_finetune_e5_20260521_110530/checkpoints/latest_ema_0.999.pt
 load_params_only: false
 load_step_for_scheduler: false
 load_strict: true
@@ -666,7 +675,7 @@ loss_metrics_sparse_enable: true
 lr: 5.0e-05
 lr_scheduler: af3
 max_atoms_per_token: 24
-max_steps: 20000
 mc_dropout_apply_rate: 0.4
 mc_dropout_rate: 0.4
 metrics:
@@ -785,9 +794,10 @@ model:
 model_name: protenix_base_default_v1.0.0
 n_blocks: 48
 no_bins: 64
 overwrite_checkpoints: true
 project: protenix
-run_name: protenix_rna_resume_opt_b32_lr5e5_s9500_to_s20000
 sample_diffusion:
   N_sample: 5
   N_sample_mini_rollout: 1

   train:
     diffusion_sample: false
     mini_rollout: true
+aurora:
+  adam_eps: 1.0e-08
+  check_finite: false
+  eps: 1.0e-07
+  momentum: 0.95
+  nesterov: true
+  pp_beta: 0.5
+  pp_iterations: 2
+  weight_decay: 0.025
 base_dir: ./output
 best_checkpoint_metric: rna_finetune_val/ema0.999_lddt/complex/best.avg
 best_checkpoint_mode: max
   sigma_data: 16.0
 iters_to_accumulate: 1
 latest_checkpoint_name: latest
+load_checkpoint_path: output/protenix_rna_resume_opt_b32_lr5e5_s9500_to_s20000_20260522_231945/checkpoints/latest.pt
+load_ema_checkpoint_path: output/protenix_rna_resume_opt_b32_lr5e5_s9500_to_s20000_20260522_231945/checkpoints/latest_ema_0.999.pt
 load_params_only: false
 load_step_for_scheduler: false
 load_strict: true
 lr: 5.0e-05
 lr_scheduler: af3
 max_atoms_per_token: 24
+max_steps: 28274
 mc_dropout_apply_rate: 0.4
 mc_dropout_rate: 0.4
 metrics:
 model_name: protenix_base_default_v1.0.0
 n_blocks: 48
 no_bins: 64
+optimizer: aurora
 overwrite_checkpoints: true
 project: protenix
+run_name: protenix_rna_resume_aurora_s13000_to_s28274
 sample_diffusion:
   N_sample: 5
   N_sample_mini_rollout: 1
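
Both run names in this diff end in `_s<start>_to_s<end>`, and in both cases `<end>` equals `max_steps`. The sketch below checks that pattern and confirms that each selected checkpoint step falls inside its run's range. The naming convention is inferred from these two run names only, and `step_range_from_run_name` is a hypothetical helper, not a Protenix function.

```python
import re


def step_range_from_run_name(run_name: str):
    """Extract (start, end) from a trailing '_s<start>_to_s<end>', else None."""
    match = re.search(r"_s(\d+)_to_s(\d+)$", run_name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# (run_name, max_steps, selected checkpoint step), all taken from the diff above.
runs = [
    ("protenix_rna_resume_opt_b32_lr5e5_s9500_to_s20000", 20000, 12999),
    ("protenix_rna_resume_aurora_s13000_to_s28274", 28274, 16999),
]

for run_name, max_steps, selected_step in runs:
    start, end = step_range_from_run_name(run_name)
    assert end == max_steps, "run name end step should equal max_steps"
    assert start <= selected_step <= end, "selected step should lie in the run"
    print(f"{run_name}: steps {start}..{end}, selected {selected_step}")
```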