# Linvest21 SHFT Platform
This is the dry-run-first implementation of the Linvest21 Self-Healing Fine-Tuning Platform.
Default behavior is safe:
- no live provider calls unless `--live` is explicitly passed;
- model-profile policy with `fingpt` as the default profile (the current FinGPT bootstrap) and `qwen3_32b` available as a clean Apache-2.0 foundation profile;
- explicit `train_provider` and `infer_provider` routing;
- heartbeat and audit JSONL events;
- per-iteration evidence and improvement reports;
- secret scanning and provider environment validation.
Run locally from this directory:
```bash
python -m n21.cli validate-config --provider hf_managed
python -m n21.cli select-model --task finance_qa --env dev
python -m n21.cli train --train-provider hf_managed
python -m n21.cli eval --run-id <run_id>
python -m n21.cli deploy --run-id <run_id> --infer-provider hf_managed --env stage
```
Model selection can be controlled by CLI or environment:
```bash
python -m n21.cli select-model --task finance_qa --env dev --model-profile fingpt
python -m n21.cli select-model --task finance_qa --env dev --model-profile qwen3_32b
SHFT_MODEL_PROFILE=qwen3_32b python -m n21.cli train --train-provider hf_managed
```
## Runtime Boundaries
SHFT platform state and generated implementation products are separate:
```text
impl_codex/self_healing_finetuning    platform code and configuration
impl_codex/shft_workspace             run evidence, registries, evals, logs, and verification reports
impl_codex/implementation_products    versioned runnable model products
```
Current SHFT runs must write evidence under `impl_codex/shft_workspace/runs/<run_id>`. The generated implementation is an output of SHFT, not an input dependency for the SHFT run.
Current exports write versioned products under `impl_codex/implementation_products/<model_id>`. Active SHFT code does not use the retired unversioned implementation tree.
## Controlled Super-Agent Flexibility
The 18 target Linvest21 FinGPT submodels are controlled by:
```text
configs/super_agent_matrix.json
```
The matrix is:
```text
3 asset classes: equity, fixed_income, multi_asset
6 roles: chief_investment_officer, client_portfolio_manager, performance_manager, portfolio_manager, researcher, risk_manager
```
The model ID template is:
```text
linvest21_fingpt_<asset_class>_<role>_<version>
```
The versioned implementation product directory template is:
```text
impl_codex/implementation_products/linvest21_fingpt_<asset_class>_<role>_<version>
```
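
The matrix and the two templates above can be expanded mechanically. The following is a minimal sketch, not the platform's own code: it hardcodes the asset classes and roles listed above rather than reading `configs/super_agent_matrix.json` (whose schema is not shown here), and the function names `model_id`, `product_dir`, and `all_model_ids` are hypothetical.

```python
from itertools import product

# Values copied from the matrix description above. The real source of truth is
# configs/super_agent_matrix.json; its schema is not assumed here.
ASSET_CLASSES = ["equity", "fixed_income", "multi_asset"]
ROLES = [
    "chief_investment_officer",
    "client_portfolio_manager",
    "performance_manager",
    "portfolio_manager",
    "researcher",
    "risk_manager",
]


def model_id(asset_class: str, role: str, version: str) -> str:
    """Fill the template linvest21_fingpt_<asset_class>_<role>_<version>."""
    return f"linvest21_fingpt_{asset_class}_{role}_{version}"


def product_dir(asset_class: str, role: str, version: str) -> str:
    """Fill the versioned implementation product directory template."""
    return f"impl_codex/implementation_products/{model_id(asset_class, role, version)}"


def all_model_ids(version: str) -> list[str]:
    """Enumerate one model ID per (asset_class, role) pair for a version."""
    return [model_id(a, r, version) for a, r in product(ASSET_CLASSES, ROLES)]


if __name__ == "__main__":
    for identifier in all_model_ids("v1_001"):
        print(identifier)
```

Three asset classes times six roles yields the 18 model IDs the matrix is expected to cover.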
Use the generic batch wrapper to keep operator commands simple while still parameterizing the asset class, role, and version:
```bat
impl_codex\scripts\run_linvest21fingpt_super_agent_to_implementation.bat equity researcher all v1_000
```
For an interactive single-entry dispatcher that lists all 18 super-agents and routes to the wrapper above, use the menu driver:
```bat
REM Interactive menu (no arguments):
impl_codex\scripts\run_linvest21fingpt_super_agent_menu.bat
impl_codex\scripts\run_linvest21fingpt_super_agent_menu.bat --list
impl_codex\scripts\run_linvest21fingpt_super_agent_menu.bat --index 5
impl_codex\scripts\run_linvest21fingpt_super_agent_menu.bat --asset equity --role researcher --version v1_000 --mode all
```
Current e2e Equity Researcher certification run:
```bat
set HF_TOKEN=<your-huggingface-token>
set SHFT_SUBMIT_HF_JOB=true
set SHFT_RUN_OWNER_EMAIL=<your-review-email>
set SHFT_HUMAN_REVIEW_EMAIL=true
set SHFT_HUMAN_REVIEW_TIMEOUT_SECONDS=1800
REM Optional SMTP/IMAP when actual email delivery/reply is desired.
set SHFT_EMAIL_DELIVERY=auto
set SHFT_SMTP_HOST=<smtp-host>
set SHFT_SMTP_PORT=587
set SHFT_SMTP_FROM=<your-review-email>
set SHFT_SMTP_USERNAME=<your-review-email>
set SHFT_SMTP_PASSWORD=<smtp-or-app-password>
set SHFT_IMAP_HOST=<imap-host>
set SHFT_IMAP_PORT=993
set SHFT_IMAP_USERNAME=<your-review-email>
set SHFT_IMAP_PASSWORD=<imap-or-app-password>
impl_codex\scripts\run_linvest21fingpt_super_agent_menu.bat --asset equity --role researcher --version v1_001 --mode all-until-certified --continue-best
```
Watch stdout/stderr for `[SHFT heartbeat]`, `[SHFT hf-status]`, `[SHFT hf-log]`, `[SHFT HUMAN REVIEW EMAIL]`, `[SHFT HUMAN REVIEW ASK]`, and `[SHFT VITAL MODEL QUALITY]`. If email delivery is not configured, approve by writing the printed `eval\human_spot_check_response.json` file with `decision=approve`, `reviewed_samples>=10`, and `critical_failures=0`.
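
For the manual approval path, a small helper can build the response file. This is a sketch under assumptions: it treats the file as a flat JSON object with exactly the three keys named above, which may be fewer fields than the platform actually requires, and the helper names are hypothetical. The target path should be the one the run prints, passed in by the caller.

```python
import json
from pathlib import Path


def build_approval(reviewed_samples: int, critical_failures: int) -> dict:
    """Build an approval payload, refusing values the text says are insufficient."""
    if reviewed_samples < 10:
        raise ValueError("reviewed_samples must be at least 10")
    if critical_failures != 0:
        raise ValueError("critical_failures must be 0 to approve")
    # Key names come from the paragraph above; the flat layout is an assumption.
    return {
        "decision": "approve",
        "reviewed_samples": reviewed_samples,
        "critical_failures": critical_failures,
    }


def write_approval(response_path: Path, reviewed_samples: int) -> None:
    """Write the payload to the response path printed by the run."""
    payload = build_approval(reviewed_samples, critical_failures=0)
    response_path.parent.mkdir(parents=True, exist_ok=True)
    response_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
```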
## Model Profiles And Base-Model Switching
The base model is selected through `impl_codex/self_healing_finetuning/configs/model_profiles.json`. Profile resolution is implemented in `model_policy/profiles.py`, then applied into the model-selection, launch, Hugging Face provider, trainer, and paired-proof layers.
Current supported profiles:
| Profile | Model candidate | Base model | Start behavior | Proof baseline | Licensing posture | Best use |
|---|---|---|---|---|---|---|
| `fingpt` | `linvest21/linvest21_fingpt_v1_000` | `meta-llama/Meta-Llama-3-8B` | bootstrap from approved Linvest21 FinGPT adapter | baseline adapter | Meta Llama 3 community license plus FinGPT adapter terms; commercial review required | Current finance-specialized bootstrap and continuity path |
| `qwen3_32b` | `Qwen/Qwen3-32B` | `Qwen/Qwen3-32B` | fresh QLoRA adapter from the base model | raw base model, no baseline adapter | Apache-2.0; commercial use allowed | Cleaner open commercial foundation candidate |
Examples:
```bat
set SHFT_MODEL_PROFILE=fingpt
impl_codex\scripts\run_linvest21fingpt_super_agent_menu.bat --asset equity --role researcher --version v1_001 --mode all-until-certified --continue-best
set SHFT_MODEL_PROFILE=qwen3_32b
impl_codex\scripts\run_linvest21fingpt_super_agent_menu.bat --asset equity --role researcher --version v1_001 --mode all-until-certified --finetune-start-policy bootstrap
```
`SHFT_MODEL_CANDIDATE` and `SHFT_BASE_MODEL_ID` remain emergency explicit overrides, but `SHFT_MODEL_PROFILE` is the clean operator interface. Adding a future model should be done by adding a profile entry with `model_candidate`, `base_model_id`, license metadata, provider overrides, and `adapter_bootstrap` semantics instead of hardcoding IDs in scripts.
The trainer currently uses LoRA target modules compatible with Llama/Qwen decoder blocks: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, and `down_proj`. If a future architecture does not expose these module names, that profile needs a profile-specific target-module override before live training.
By default, each new super-agent fine-tune starts from the approved Linvest21 FinGPT bootstrap adapter:
```text
Meta-Llama-3-8B base
+ linvest21/linvest21_fingpt_v1_000 bootstrap adapter
+ role-specific adapter being trained
```
This is controlled by `--finetune-start-policy`:
```bat
impl_codex\scripts\run_linvest21fingpt_super_agent_menu.bat --asset equity --role researcher --version v1_001 --mode all --finetune-start-policy bootstrap
impl_codex\scripts\run_linvest21fingpt_super_agent_menu.bat --asset equity --role researcher --version v1_001 --mode all --finetune-start-policy continue-best
```
`bootstrap` is the default. It starts from `linvest21/linvest21_fingpt_v1_000` and avoids compounding bias from failed prior role adapters. `continue-best` is opt-in. It reads `impl_codex/shft_workspace/best_runs/<release_id>.json` and continues only from the best measured checkpoint for that exact asset/role/version when one exists. If no best-run record exists yet, `continue-best` records `source=no_best_recorded_fallback_bootstrap` and starts from the approved bootstrap adapter instead of entering a source-recovery retry loop.
For `qwen3_32b`, `bootstrap` means fresh QLoRA from `Qwen/Qwen3-32B`, because that profile intentionally has no bootstrap adapter. `continue-best` still uses the best measured checkpoint for the release when present.
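
The start-policy resolution described above can be sketched as follows. This is illustrative only: the function name, the returned dictionary shape, and the `"bootstrap"` source label are assumptions, while the two `continue-best` source labels are the ones quoted in the text. The contents of the best-run record are passed through untouched because its schema is not documented here.

```python
import json
from pathlib import Path


def resolve_start_source(policy: str, release_id: str, best_runs_dir: Path) -> dict:
    """Decide where a fine-tune starts from under a given start policy."""
    if policy == "bootstrap":
        # Hypothetical label; the text only names the continue-best labels.
        return {"source": "bootstrap"}
    if policy != "continue-best":
        raise ValueError(f"unknown finetune start policy: {policy}")

    record_path = best_runs_dir / f"{release_id}.json"
    if not record_path.is_file():
        # No best run recorded yet: fall back instead of retrying.
        return {"source": "no_best_recorded_fallback_bootstrap"}

    record = json.loads(record_path.read_text(encoding="utf-8"))
    return {"source": "best_measured_checkpoint", "record": record}
```

In this sketch a missing record is an ordinary outcome rather than an error, which mirrors the stated goal of avoiding a source-recovery retry loop.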
2026-05-24 continuation audit:
- Equity has recorded best-run checkpoints for all six `v1_001` roles, so `--continue-best` resolves to `source=best_measured_checkpoint` for Equity.
- Fixed Income and Multi Asset currently have no recorded best-run checkpoint for any of their six `v1_001` roles, so `--continue-best` resolves to `source=no_best_recorded_fallback_bootstrap` for those first measured runs.
- This fallback is first-run behavior, not certification. Each role still needs live HF training, fetch-proof, paired-eval proof, model-quality gate success, model card, and promotion evidence before certification.
Live provider checks are bounded so six parallel runs do not look stalled while waiting on external tooling. Hugging Face CLI calls use `SHFT_HF_CLI_TIMEOUT_SECONDS` with a default of `120` seconds, and repository secret scanning skips generated SHFT workspace/product directories and files larger than 1 MB.
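
A bounded external call of the kind described above might look like the sketch below. The environment variable name and the default of 120 seconds come from the text; the helper names and the three-valued return are assumptions for illustration, and the platform's real wrapper may handle errors differently.

```python
import os
import subprocess
import sys
from typing import Mapping, Sequence

DEFAULT_HF_CLI_TIMEOUT_SECONDS = 120


def hf_cli_timeout_seconds(env: Mapping[str, str] = os.environ) -> int:
    """Read SHFT_HF_CLI_TIMEOUT_SECONDS, falling back to the documented default."""
    raw = env.get("SHFT_HF_CLI_TIMEOUT_SECONDS", "").strip()
    if not raw:
        return DEFAULT_HF_CLI_TIMEOUT_SECONDS
    value = int(raw)
    if value <= 0:
        raise ValueError("SHFT_HF_CLI_TIMEOUT_SECONDS must be positive")
    return value


def run_bounded(command: Sequence[str], env: Mapping[str, str] = os.environ) -> str:
    """Run an external command and report 'ok', 'failed', or 'timeout'."""
    try:
        completed = subprocess.run(
            list(command),
            capture_output=True,
            text=True,
            timeout=hf_cli_timeout_seconds(env),
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if completed.returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-in command; a real caller would pass the Hugging Face CLI invocation.
    print(run_bounded([sys.executable, "-c", "print('hello')"]))
```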
For parallel role launches, use:
```bat
impl_codex\scripts\run_linvest21fingpt_parallel_super_agents.bat
impl_codex\scripts\run_linvest21fingpt_parallel_super_agents.bat --asset fixed_income --role researcher --role risk_manager --version v1_001 --mode status
impl_codex\scripts\run_linvest21fingpt_parallel_super_agents.bat --asset multi_asset --roles researcher,portfolio_manager,risk_manager --version v1_001 --mode all
impl_codex\scripts\run_linvest21fingpt_parallel_super_agents.bat --asset equity --version v1_001 --mode all-until-certified --continue-best
```
The parallel launcher defaults to the six equity roles, titles each window as `SHFT <asset_class> <role> <version>`, and cascades the windows so all runs remain visible. It only dispatches to the existing menu driver; all validation, paid-job guardrails, quality gates, and output paths remain centralized there.
For the full 18-agent matrix, use the all-asset launcher. It defaults to `status` mode so it is safe for coverage checks:
```bat
impl_codex\scripts\run_linvest21fingpt_all_super_agents.bat
impl_codex\scripts\run_linvest21fingpt_all_super_agents.bat --mode status --dry-run
impl_codex\scripts\run_linvest21fingpt_all_super_agents.bat --asset fixed_income --mode all-until-certified --continue-best
impl_codex\scripts\run_linvest21fingpt_all_super_agents.bat --mode all-until-certified --continue-best
```
For implementation evidence, audit every `(asset_class, role)` against the Equity Researcher package contract:
```bat
impl_codex\scripts\audit_super_agent_implementation_parity.bat v1_001 nofail
```
The audit writes JSON and Markdown evidence under `impl_codex\shft_workspace\verification` and refreshes `super_agent_implementation_parity_latest.json` and `.md`. The audit checks runnable packaging and identity consistency only; model-quality certification still requires paired proof and a passing `eval/model_quality_gate.json`.
For model-quality parity across the full 18-agent matrix, use the live certification loop. Start with a safe no-submit check:
```bat
impl_codex\scripts\run_linvest21fingpt_all_super_agents.bat --mode status --dry-run
```
Then set live-job credentials and submit only when budget and access are intentional:
```bat
set HF_TOKEN=your_huggingface_token
set SHFT_SUBMIT_HF_JOB=true
set LINVEST21_API_TOKEN=your_local_api_token
impl_codex\scripts\run_linvest21fingpt_all_super_agents.bat --mode all-until-certified --continue-best
```
One asset class or one role can be certified independently:
```bat
impl_codex\scripts\run_linvest21fingpt_all_super_agents.bat --asset fixed_income --mode all-until-certified --continue-best
impl_codex\scripts\run_linvest21fingpt_super_agent_menu.bat --asset fixed_income --role researcher --version v1_001 --mode all-until-certified --continue-best
```
The proof artifacts for each role are:
```text
impl_codex\shft_workspace\runs\<run_id>\eval\paired_eval_report.json
impl_codex\shft_workspace\runs\<run_id>\eval\model_quality_gate.json
```
The role is quality-proven only when `model_quality_gate.json` has both `ok=true` and `eligible_for_promotion=true`. A matching implementation package or a declining training loss is not enough.
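
The two-flag rule can be checked with a few lines of code. This sketch assumes `ok` and `eligible_for_promotion` are top-level JSON booleans in `model_quality_gate.json`; the function names are hypothetical.

```python
import json
from pathlib import Path


def is_quality_proven(gate: dict) -> bool:
    """True only when both flags are present and literally true."""
    return gate.get("ok") is True and gate.get("eligible_for_promotion") is True


def load_gate(run_dir: Path) -> dict:
    """Load eval/model_quality_gate.json, returning {} when it is missing."""
    gate_path = run_dir / "eval" / "model_quality_gate.json"
    if not gate_path.is_file():
        return {}
    return json.loads(gate_path.read_text(encoding="utf-8"))
```

Because a missing file loads as an empty dictionary, `is_quality_proven(load_gate(run_dir))` fails closed when no gate has been written.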
## SHFT-IQ Score
The hard certification rule remains `eval/model_quality_gate.json` with `ok=true` and `eligible_for_promotion=true`. SHFT-IQ is the recommended weighted operator score for comparing candidate intelligence across runs; it should not override a failed hard gate.
Recommended SHFT-IQ composition:
| Factor | Weight | Measured by |
|---|---:|---|
| Paired task accuracy | 35% | `paired_eval_report.json` candidate aggregate and task scores |
| Critical reasoning and safety | 20% | candidate critical pass rate and critical-pass delta |
| Baseline-relative improvement | 15% | pairwise win rate, aggregate delta, and pairwise loss rate |
| Generalization and overfit control | 10% | train/eval gap, late eval-loss regression, selected checkpoint, overfit flags |
| Corpus and repair coverage | 10% | dataset manifest, train/valid/test retention, repair coverage categories |
| Model-as-judge quality | 5% | `model_judge_report.json` mean score and rubric pass rate |
| Human spot-check quality | 5% | `human_spot_check_report.json` approval, reviewed samples, critical failures |
Suggested interpretation:
```text
SHFT-IQ < 70     internal learning signal only
SHFT-IQ 70-79    research candidate; not production-facing without exception
SHFT-IQ >= 80    production candidate only if the hard model-quality gate also passes
```
The existing platform already measures the underlying factors during the self-healing loop. A future `eval/shft_iq_report.json` should persist the weighted scalar and the factor breakdown, while promotion continues to fail closed on the explicit gate checks in `configs/thresholds/model_quality.yaml`.
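
As an illustration of the weighted score, the sketch below combines seven factor scores using the weights from the table. It assumes each factor has already been normalized to a 0-100 scale, which the text does not define, and the factor keys and function names are hypothetical. Scores between 79 and 80 are treated as research candidates here, which is also an assumption.

```python
# Integer percentage weights copied from the composition table above.
WEIGHTS = {
    "paired_task_accuracy": 35,
    "critical_reasoning_and_safety": 20,
    "baseline_relative_improvement": 15,
    "generalization_and_overfit_control": 10,
    "corpus_and_repair_coverage": 10,
    "model_as_judge_quality": 5,
    "human_spot_check_quality": 5,
}


def shft_iq(factor_scores: dict) -> float:
    """Weighted mean of factor scores, each assumed to be on a 0-100 scale."""
    missing = sorted(set(WEIGHTS) - set(factor_scores))
    if missing:
        raise ValueError(f"missing factor scores: {missing}")
    return sum(WEIGHTS[name] * factor_scores[name] for name in WEIGHTS) / 100


def interpret(score: float, hard_gate_passed: bool) -> str:
    """Map a score to the suggested bands; the hard gate always wins."""
    if score < 70:
        return "internal learning signal only"
    if score < 80:
        return "research candidate"
    if hard_gate_passed:
        return "production candidate"
    return "blocked by hard model-quality gate"
```

Keeping the hard gate as a separate argument makes it explicit that a high score alone never yields a production candidate.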
After paired proof has been fetched, rank the failure modes across all 18 roles before generating the next repair wave:
```bat
cd impl_codex\self_healing_finetuning
python -m n21.cli rank-paired-eval-defects
```
The ranker writes:
```text
impl_codex\shft_workspace\verification\paired_eval_defect_ranking_latest.json
impl_codex\shft_workspace\verification\paired_eval_defect_ranking_latest.md
```
It uses the seven repair taxonomy buckets: numeric reasoning, fact/inference separation, role discipline, risk/tradeoff framing, hallucination or unsupported claim, weak source grounding, and overfit or memorized answer style. If a role has no `eval\paired_predictions.jsonl` yet, the ranker records `proof_missing` for that role instead of fabricating defects.
After ranking, build the defect-led repair files for the next training wave without launching training:
```bat
cd impl_codex\self_healing_finetuning
python -m n21.cli build-all-role-defect-repair
```
The generator reuses the paired-proof predictions and ranking taxonomy. It writes one role-specific JSONL under `data\learning\<asset_class>\<role>\targeted_paired_proof_repair_<timestamp>.hf_finetune.jsonl`, plus:
```text
impl_codex\shft_workspace\verification\all_18_defect_repair_manifest_latest.json
impl_codex\shft_workspace\verification\all_18_defect_repair_manifest_latest.md
```
The coverage gate fails closed unless all 18 roles have non-empty repair files, valid Hugging Face chat-message JSONL records, and coverage for each role's top measured defects. This is the intended bridge from proof failure to the next SHFT training wave; it does not promote or train any role by itself.
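
One part of that gate, checking that each JSONL line is a chat-message record, can be sketched as below. The record layout is an assumption: it uses the common `{"messages": [{"role": ..., "content": ...}]}` convention, and the platform's actual schema and allowed roles may differ. Function names are hypothetical.

```python
import json

# Assumed role vocabulary; the platform's validator may accept others.
ALLOWED_ROLES = {"system", "user", "assistant"}


def is_valid_chat_record(line: str) -> bool:
    """Check one JSONL line against the assumed chat-message layout."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    if not isinstance(record, dict):
        return False
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return False
    for message in messages:
        if not isinstance(message, dict):
            return False
        if message.get("role") not in ALLOWED_ROLES:
            return False
        content = message.get("content")
        if not isinstance(content, str) or not content.strip():
            return False
    return True


def count_records(jsonl_text: str) -> tuple[int, int]:
    """Return (valid, invalid) counts, ignoring blank lines."""
    valid = invalid = 0
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        if is_valid_chat_record(line):
            valid += 1
        else:
            invalid += 1
    return valid, invalid
```

A fail-closed caller would reject a role's repair file when `valid == 0` or `invalid > 0`.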
The anti-overfit refactor and all-role TODO review is tracked in:
```text
impl_codex\self_healing_finetuning\docs\refector_for_reduce_over_fit_v2_all_roles_review.md
```
The formal A-to-Z operating publication for the full self-healing and self-improving fine-tuning loop across all 18 asset/role pairs, and its companion platform review, are:
```text
impl_codex\self_healing_finetuning\docs\shft_a_to_z_all_roles_publication_v1.md
impl_codex\self_healing_finetuning\docs\shft_platform_review_all_18_roles_20260527.md
```
The A-to-Z publication is the current operator contract for source intake, training selection, live HF training, artifact sync, paired proof, quality gates, defect ranking, defect-led repair data, promotion rules, final stats, and known remaining flaws. The platform review reconciles the contract against the current scripts, thresholds, human approval path, transparent logs, proper exits, and all 18 role/model IDs.
Use that spec before launching another broad training wave; it separates package parity from quality parity and lists the checkpointing, metrics, holdout, and failure-ledger changes still needed.
Current 2026-05-27 all-role repair state:
```text
impl_codex\shft_workspace\verification\paired_eval_defect_ranking_latest.md
impl_codex\shft_workspace\verification\all_18_defect_repair_manifest_latest.md
impl_codex\shft_workspace\verification\all_18_defect_repair_validation_latest.json
```
The latest all-role repair manifest shows `roles_ok=18`, `roles_failed=0`, `output_file_count=18`, and `total_repair_rows=2095`. This means the defect-led repair bridge is ready for the next selected-training build across all 18 roles. It does not mean any model is promoted or production-certified.
Recommended next move: pause long enough to complete trainer-side anti-overfit hardening before spending another all-role training wave. The highest-value work is structured trainer metrics, checkpoint-level validation, selected-checkpoint export, and stronger holdout proof. After that, launch the next wave using the generated role-specific defect repair files.
The parallel launcher uses one Windows Terminal window with one `cmd.exe` tab per role when `wt.exe` is available, and falls back to separate visible Command Prompt windows otherwise. The menu reads `configs/super_agent_matrix.json`, validates that `data/learning/<asset_class>/<role>` exists (exits 2 with a clear message otherwise), resolves the unique versioned `Model ID` (`linvest21_fingpt_<asset_class>_<role>_<version>`), then delegates to the per-agent wrapper, which packages the portable runtime (chat console + token-protected JSON API via `LINVEST21_API_TOKEN`) under `impl_codex/implementation_products/<model_id>`.
Live submit modes are intentionally guarded:
```text
HF_TOKEN must be present.
SHFT_SUBMIT_HF_JOB must equal true.
```
Use `--mode status` or `--list` for safe inspection without job submission.
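
The guard amounts to a conjunction of two environment checks. The sketch below is an assumption about how such a check could be written, not the batch scripts' actual logic; in particular, treating `true` as case-sensitive is a guess, and the function name is hypothetical.

```python
import os
from typing import Mapping


def live_submit_allowed(env: Mapping[str, str] = os.environ) -> bool:
    """Allow live submission only when both guard conditions hold."""
    has_token = bool(env.get("HF_TOKEN", "").strip())
    # Exact, case-sensitive match is an assumption in this sketch.
    submit_enabled = env.get("SHFT_SUBMIT_HF_JOB", "") == "true"
    return has_token and submit_enabled
```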
Validate the configured matrix, learning-data directories, and any generated implementation folders with:
```bat
impl_codex\scripts\validate_super_agent_matrix.bat
```
The bootstrap model `linvest21/linvest21_fingpt_v1_000` is lineage and starting point. Exported runtime identity must be the super-agent model ID, and must appear in `release_manifest.json`, `runtime/chat_config.json`, chat output, `/health`, `/v1/models`, and `/v1/chat/completions`.
## Production Inference Architecture
For Linvest21 production inference, use base model plus adapters as the default architecture:
```text
shared approved base model
+ certified adapter for each asset_class/role/version
```
This is superior for the current 18-super-agent platform because the agents should share the same approved financial base capability while each adapter captures the role-specific investment behavior. It reduces storage, supports faster per-role retraining, gives each adapter separate training/eval/promotion evidence, and allows one role to roll back without touching the rest.
Publish each certified super-agent as an independent adapter model repo or artifact, for example:
```text
linvest21_fingpt_equity_researcher_v1_001
linvest21_fingpt_equity_risk_manager_v1_001
linvest21_fingpt_fixed_income_risk_manager_v1_001
```
Merged or quantized full-model artifacts are optional deployment builds. Use them when a role needs offline serving, GGUF/local desktop deployment, different base-model lineage, or simpler single-model loading. They should not replace the certified adapter as the governance source of truth.
## Step 0 Public-Source Intake Gate
Default policy:
```text
Download public material automatically, but train only on material that passes the configured source-policy gate.
```
The policy and catalog are controlled by configuration, not hardcoded operator choices:
```text
configs/data/source_policy.yaml
configs/data/public_source_catalog.json
configs/data/reasoning_frames.json
```
The 2026-05-19 to 2026-05-24 equity researcher and portfolio-manager logs showed a platform-level input failure: repeated public-source breakout cycles produced `trainable_new_source_count: 0`, while `duckduckgo_html` discovery repeatedly returned timeout, `403`, or no-candidate results. SHFT now treats that as a recovery problem, not as a reason to keep sleeping and retrying the same query loop. If a `blocked_after_breakout` run has zero trainable new sources, the latest live-discovery retry has `candidate_count=0`, and the candidate did not improve the protected best checkpoint, `continuous-status` moves convergence to `NEEDS_REASONING_DATA` and writes `no_candidate_retry_exhausted=true`.
Source discovery is API-first for SEC-backed equity material. `configs/data/source_policy.yaml` sets `live_discovery.provider: sec_api_first` and `live_discovery.duckduckgo_fallback_enabled: false`, and `data_pipeline/live_source_discovery.py` uses the SEC submissions JSON endpoint without generic-search fallback unless that fallback is explicitly enabled. The external API reference is the SEC EDGAR API documentation: `https://www.sec.gov/edgar/sec-api-documentation`, which documents `https://data.sec.gov/submissions/CIK##########.json` as the submissions-history endpoint and states that `data.sec.gov` provides JSON APIs for EDGAR data. This transport change does not relax license policy, source-quality certification, train/verify splitting, or model promotion thresholds.
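
The `CIK##########` placeholder in the submissions endpoint is a CIK zero-padded to ten digits. The helper below is a minimal sketch for building that URL only; it makes no network call, its name is hypothetical, and it is not taken from `data_pipeline/live_source_discovery.py`.

```python
from typing import Union


def sec_submissions_url(cik: Union[int, str]) -> str:
    """Build the data.sec.gov submissions URL for a CIK, zero-padded to 10 digits."""
    digits = str(cik).strip().lstrip("0") or "0"
    if not digits.isdigit() or len(digits) > 10:
        raise ValueError(f"invalid CIK: {cik!r}")
    return f"https://data.sec.gov/submissions/CIK{digits.zfill(10)}.json"


if __name__ == "__main__":
    print(sec_submissions_url(320193))
```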
Stall breakout also has a second recovery leg. When source discovery downloads normalized verification-eligible material but no direct training source, `orchestrator/stall_breakout.py` calls `generate_grounded_reasoning_examples_from_intake_records`, writes `synthetic_<asset>_<role>_critical_reasoning.hf_finetune.jsonl` into the role corpus, and certifies every generated row before marking the run ready for the next training attempt. The breakout plan records `reasoning_generation` and `intake.generated_reasoning_count` so this is visible in evidence rather than hidden as a silent fallback.
If recovery still reaches a paid-retraining guard, `n21.cli continuous-status --enforce-convergence` calls `orchestrator/human_owner_decision.py`. The owner ask is sent to `SHFT_RUN_OWNER_EMAIL` (default `david.d.lin@linvest21.com`) and printed to stdout/stderr. The first valid human instruction wins, whether it arrives from stdin, `runs/<run_id>/human_owner_response.json`, `human_owner_decisions/responses/<run_id>.json`, or an IMAP email reply whose subject contains the run id; valid instructions are `continue` and `exit`.

Delivery uses `SHFT_EMAIL_DELIVERY=auto` by default:

- SMTP is used when `SHFT_SMTP_HOST` or SHCG SMTP settings are configured;
- otherwise local Outlook COM is attempted;
- otherwise the request is written to `impl_codex/shft_workspace/human_owner_decisions/outbox` for audit.

Use `SHFT_EMAIL_DELIVERY=smtp`, `outlook`, or `outbox` to force a path. SHFT falls back to `SHCG_ALERT_SMTP_PASSWORD` and `SHCG_INBOUND_IMAP_PASSWORD`; when those exist without explicit host/user values it defaults to Gmail SMTP/IMAP for `david.d.lin@linvest21.com`.
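
The "first valid human instruction wins" rule can be sketched as a scan over responses in arrival order. The channel labels, the whitespace and case normalization, and the function name are assumptions for illustration; only the two valid instructions come from the text.

```python
from typing import Iterable, Optional, Tuple

VALID_INSTRUCTIONS = {"continue", "exit"}


def first_valid_instruction(
    responses: Iterable[Tuple[str, str]],
) -> Optional[Tuple[str, str]]:
    """Return (channel, instruction) for the first valid response, else None.

    `responses` is an iterable of (channel, raw_text) pairs in arrival order.
    """
    for channel, raw_text in responses:
        normalized = raw_text.strip().lower()  # normalization is an assumption
        if normalized in VALID_INSTRUCTIONS:
            return channel, normalized
    return None
```

Invalid or empty replies are skipped rather than treated as a decision, so the run keeps waiting until a valid instruction arrives.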
Run public-source intake for one super-agent:
```bat
cd impl_codex\self_healing_finetuning
python -m n21.cli intake-public-sources --asset-class equity --role risk_manager
```
Run training-data validation and optional quarantine recommendation:
```bat
cd impl_codex\self_healing_finetuning
python -m n21.cli validate-training-data --source ..\..\data\learning\equity\risk_manager --output-dir ..\shft_workspace\runs\<run_id>\data_validation --backup-dir ..\shft_workspace\runs\<run_id>\data_validation\quarantine_backup --apply-quarantine
```
The intake command writes raw downloads, approved copies, and review-required copies under:
```text
data/learning_intake/<asset_class>/<role>
```
Only source-policy-approved material is promoted into:
```text
data/learning/<asset_class>/<role>
```
Downloaded files are never trained directly unless they are already curated JSONL. They first pass through a normalization layer:
```text
raw download -> normalized .normalized.json -> training/validation eligibility -> Step 0b JSONL
```
Default format policy:
- `pdf`: normalize, train, and validate.
- `jsonl`: normalize/pass through, train, and validate.
- clean direct `html`: normalize and train, but do not use for validation by default.
- `html_index`: download and normalize for review, but do not train by default.
- `txt` and `md`: normalize and train, but do not validate by default.
- `unknown`: do not normalize, train, or validate.
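
The default format policy above can be expressed as a lookup table. This is a sketch, not the contents of `configs/data/source_policy.yaml`: the type and function names are hypothetical, and the `validate` value for `html_index` is an assumption because the text states only that it is not trained by default.

```python
from typing import NamedTuple


class FormatPolicy(NamedTuple):
    normalize: bool
    train: bool
    validate: bool


DEFAULT_FORMAT_POLICY = {
    "pdf": FormatPolicy(normalize=True, train=True, validate=True),
    "jsonl": FormatPolicy(normalize=True, train=True, validate=True),
    "html": FormatPolicy(normalize=True, train=True, validate=False),
    # Validation behavior for html_index is not stated; False is an assumption.
    "html_index": FormatPolicy(normalize=True, train=False, validate=False),
    "txt": FormatPolicy(normalize=True, train=True, validate=False),
    "md": FormatPolicy(normalize=True, train=True, validate=False),
    "unknown": FormatPolicy(normalize=False, train=False, validate=False),
}


def policy_for(file_format: str) -> FormatPolicy:
    """Look up a format, treating anything unlisted as 'unknown'."""
    return DEFAULT_FORMAT_POLICY.get(
        file_format.strip().lower(), DEFAULT_FORMAT_POLICY["unknown"]
    )
```

Defaulting unlisted formats to `unknown` keeps the lookup conservative: nothing is normalized or trained unless it is explicitly listed.
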
Transparency artifacts are written as `source_intake_manifest.json`, `license_manifest.json`, `training_data_validation_report.json`, `conflict_report.json`, and `quarantine_manifest.json`. This keeps automatic discovery/download flexible while keeping training strict and auditable.
## Step 0b Role-Grounded Reasoning Data
The quality gate rewards explicit scenario analysis, red-flag decisions, pass/fail labels, and rationales. Public filings and investor bulletins are useful grounding material, but the logs proved they are usually verification material rather than directly trainable critical-reasoning material. Step 0b now generates grounded, role-specific reasoning examples from the existing role corpus and then sends each generated example through the existing local source-quality certifier.
Generate role-grounded reasoning data:
```bat
cd impl_codex\self_healing_finetuning
python -m n21.cli generate-reasoning-data --asset-class equity --role researcher --max-records 1500
python -m n21.cli generate-reasoning-data --asset-class equity --role portfolio_manager --max-records 1500
```
The generator writes:
```text
data/learning/<asset_class>/<role>/synthetic_<asset_class>_<role>_critical_reasoning.hf_finetune.jsonl
data/learning/<asset_class>/<role>/synthetic_<asset_class>_<role>_critical_reasoning.hf_finetune.jsonl.manifest.json
```
Role behavior is controlled by:
```text
configs/data/reasoning_frames.json
data_pipeline/reasoning_data_generation.py
```
Training selection enforces inclusion of the role-grounded reasoning file when it exists:
```text
data_pipeline/learning_pdf_to_jsonl.py
selected_training_manifest_v1.required_reasoning_jsonls
selected_training_manifest_v1.required_reasoning_included
```
This is intentionally conservative. The generator does not self-certify its own output, and it does not lower the source-quality threshold. If the certifier rejects a generated example, that example is excluded and recorded in the manifest.
## Step 0b PDF Parser Noise
Step 0b PDF conversion and public-source PDF normalization suppress only one known noisy `pypdf` message:
```text
Ignoring wrong pointing object ... (offset 0)
```
The filter lives in `data_pipeline/pdf_warning_filter.py` and is applied by `data_pipeline/learning_pdf_to_jsonl.py` and `data_pipeline/source_normalization.py`. It does not hide real PDF extraction exceptions, invalid/tiny PDFs, skipped files, low text counts, or unrelated parser warnings. Operators should judge PDF extraction quality from the conversion report fields `page_count_with_text`, `text_chars`, `record_count`, `skipped_pdf_count`, and per-source `status`.
## Release-Wide Failure Repair And Oversampling
For measured self-improvement, SHFT converts paired-eval failures into targeted repair examples for the next fresh dataset snapshot. The release-wide builder reads prior paired predictions for the same release, preserves source run ids in metadata, and writes role-local JSONL:
```bat
cd impl_codex\self_healing_finetuning
python -m n21.cli build-release-paired-eval-failure-repair --release-id linvest21_fingpt_equity_researcher_v1_001 --asset-class equity --role researcher --max-records 1500
```
Selected training can then oversample repair rows. The production batch wrappers set `SHFT_REPAIR_OVERSAMPLE_FACTOR=2` and fail closed with `SHFT_MAX_REPAIR_SELECTED_RATIO=0.75` unless the operator explicitly overrides that environment variable:
```bat
python -m n21.cli build-learning-training-jsonl --source ..\..\data\learning\equity\researcher --output ..\shft_workspace\runs\<run_id>\training_selection\selected_training.jsonl --asset-class equity --role researcher --repair-oversample-factor 2
```
For strict corpus-composition checks, selected training can also cap the effective repair-row share:
```bat
python -m n21.cli build-learning-training-jsonl --source ..\..\data\learning\equity\researcher --output ..\shft_workspace\runs\<run_id>\training_selection\selected_training.jsonl --asset-class equity --role researcher --repair-oversample-factor 2 --max-repair-selected-ratio 0.75
```
The cap is fail-closed. The builder first keeps enough repair rows to satisfy the minimum coverage thresholds, then rejects the build if those mandatory repair rows still exceed the cap. The selected-training manifest records `repair_cap_applied`, `max_repair_selected_ratio`, `source_repair_row_count`, `selected_repair_source_rows`, and `dropped_repair_source_rows`. A run whose manifest shows `repair_cap_applied=false` is not acceptable for paid production evidence unless the operator records an explicit exception.
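
The cap arithmetic can be illustrated with a small sketch. It is one interpretation of the behavior described above, not the builder's code: the function name and parameters are hypothetical, the effective repair share is assumed to be oversampled repair rows divided by all selected rows, and `repair_cap_applied` is taken to mean "the cap was enforced". The returned keys reuse the manifest field names from the text.

```python
import math


def apply_repair_cap(
    non_repair_rows: int,
    repair_source_rows: int,
    mandatory_repair_rows: int,
    oversample_factor: int = 2,
    max_repair_selected_ratio: float = 0.75,
) -> dict:
    """Keep as many repair source rows as the cap allows, or fail closed."""
    if not 0.0 < max_repair_selected_ratio < 1.0:
        raise ValueError("max_repair_selected_ratio must be between 0 and 1")
    if oversample_factor < 1:
        raise ValueError("oversample_factor must be at least 1")
    if mandatory_repair_rows > repair_source_rows:
        raise RuntimeError("not enough repair rows to satisfy minimum coverage")

    def effective_ratio(kept: int) -> float:
        effective = kept * oversample_factor
        total = effective + non_repair_rows
        return effective / total if total else 0.0

    # Largest effective repair count E with E / (E + N) <= cap is cap * N / (1 - cap).
    max_effective = math.floor(
        max_repair_selected_ratio * non_repair_rows / (1.0 - max_repair_selected_ratio)
    )
    kept = min(repair_source_rows, max_effective // oversample_factor)
    while kept > 0 and effective_ratio(kept) > max_repair_selected_ratio:
        kept -= 1  # guard against floating-point rounding at the boundary

    if kept < mandatory_repair_rows:
        raise RuntimeError("mandatory repair rows exceed the cap; build rejected")

    return {
        "repair_cap_applied": True,
        "max_repair_selected_ratio": max_repair_selected_ratio,
        "source_repair_row_count": repair_source_rows,
        "selected_repair_source_rows": kept,
        "dropped_repair_source_rows": repair_source_rows - kept,
    }
```

For example, with 100 non-repair rows, 400 repair source rows, a factor of 2, and a cap of 0.75, at most 150 repair source rows fit: 300 effective repair rows out of 400 selected rows is exactly 0.75.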
Before paid submission, the training script runs:
```bat
python -m n21.cli repair-coverage-gate --selected-training <selected_training.jsonl> --output <repair_coverage_gate.json>
```
The run may submit only after it prints `SHFT VITAL REPAIR COVERAGE OK=True`. This gate checks that the selected training set contains enough repair examples with numeric reasoning, fact/inference separation, neutral language, risk/tradeoff language, and critical reasoning. It is role-generic and applies to every asset/role combination.
Use the all-role local preflight before any paid six-role launch:
```bat
impl_codex\scripts\prepare_equity_all_roles_repair_preflight.bat
impl_codex\scripts\prepare_fixed_income_all_roles_repair_preflight.bat
impl_codex\scripts\prepare_multi_asset_all_roles_repair_preflight.bat
```
These commands do not submit paid training. They build release-wide paired-eval failure repairs, generate grounded critical-reasoning data, build selected training with repair oversampling, run `repair-coverage-gate`, validate all six role corpora, and write `impl_codex/shft_workspace/preflight/<asset_class>/all_roles_preflight_summary.json`. The 2026-05-24 non-capped preflight recheck passed for Equity, Fixed Income, and Multi Asset.
Latest Equity local verification:
```text
chief_investment_officer selected=16693 repair_rows=16594 validation_ok=true schema_errors=0 conflicts=0
client_portfolio_manager selected=21063 repair_rows=20906 validation_ok=true schema_errors=0 conflicts=0
performance_manager selected=8855 repair_rows=8822 validation_ok=true schema_errors=0 conflicts=0
portfolio_manager selected=69283 repair_rows=66602 validation_ok=true schema_errors=0 conflicts=0
researcher selected=56480 repair_rows=51058 validation_ok=true schema_errors=0 conflicts=0
risk_manager selected=31690 repair_rows=28092 validation_ok=true schema_errors=0 conflicts=0
```
After this preflight passes, launch all six Equity roles from protected best checkpoints:
```bat
impl_codex\scripts\run_linvest21fingpt_parallel_super_agents.bat --asset equity --version v1_001 --mode all-until-certified --continue-best
```
## Cross-Asset Repair/Preflight Parity
Equity is the reference for all-role repair/preflight. Fixed Income and Multi Asset use the same generic wrapper and summary gate before paid training:
```bat
impl_codex\scripts\prepare_asset_all_roles_repair_preflight.bat equity
impl_codex\scripts\prepare_asset_all_roles_repair_preflight.bat fixed_income
impl_codex\scripts\prepare_asset_all_roles_repair_preflight.bat multi_asset
```
Compatibility wrappers delegate to the generic wrapper:
```bat
impl_codex\scripts\prepare_equity_all_roles_repair_preflight.bat
impl_codex\scripts\prepare_fixed_income_all_roles_repair_preflight.bat
impl_codex\scripts\prepare_multi_asset_all_roles_repair_preflight.bat
```
For every role in the selected asset class, the wrapper must run the same sequence as the Equity reference:
```text
build-release-paired-eval-failure-repair
generate-reasoning-data
build-learning-training-jsonl with repair oversampling
repair-coverage-gate
validate-training-data
```
The role-local proof directory is:
```text
impl_codex/shft_workspace/preflight/<asset_class>/<role>
```
Required files:
```text
selected_training.jsonl
selected_training.manifest.json
repair_coverage_gate.json
training_data_validation/training_data_validation_report.json
training_data_validation/conflict_report.json
training_data_validation/quarantine_manifest.json
```
The asset-level summary is:
```text
impl_codex/shft_workspace/preflight/<asset_class>/all_roles_preflight_summary.json
```
That summary must show all six roles passing the repair coverage gate and training-data validation, and must record selected record count, repair row count, required reasoning inclusion, selected-training hash, release id, and training source. Paid all-role Fixed Income and Multi Asset training must remain blocked unless this summary is still passing immediately before launch.
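As a rough illustration of that launch rule, the sketch below checks a loaded summary for all six roles. The JSON layout is an assumption: the `roles` mapping and the `repair_coverage_ok` / `validation_ok` flags are hypothetical names, not the documented schema of `all_roles_preflight_summary.json`.

```python
from typing import Any, Dict, List

# The six role names that appear in the Equity verification listing.
REQUIRED_ROLES = (
    "chief_investment_officer",
    "client_portfolio_manager",
    "performance_manager",
    "portfolio_manager",
    "researcher",
    "risk_manager",
)


def blocked_roles(summary: Dict[str, Any]) -> List[str]:
    """Return every required role that is missing or not passing both gates."""
    roles = summary.get("roles", {})  # hypothetical key
    failing = []
    for role in REQUIRED_ROLES:
        entry = roles.get(role) or {}
        # Fail closed: a missing entry or a missing flag counts as a failure.
        passed = (
            entry.get("repair_coverage_ok") is True  # hypothetical flag
            and entry.get("validation_ok") is True   # hypothetical flag
        )
        if not passed:
            failing.append(role)
    return failing


def launch_allowed(summary: Dict[str, Any]) -> bool:
    """Paid all-role launch is allowed only when no role is blocked."""
    return not blocked_roles(summary)
```

An empty or partially written summary blocks the launch in this sketch, which matches the fail-closed intent described above.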
The summary verifier can also be run directly:
```bat
cd impl_codex\self_healing_finetuning
python -m n21.cli summarize-asset-preflight --asset-class equity
python -m n21.cli summarize-asset-preflight --asset-class fixed_income
python -m n21.cli summarize-asset-preflight --asset-class multi_asset
```
2026-05-24 local verification status:
```text
equity ok=true passing_roles=6/6 failed_roles=[]
fixed_income ok=true passing_roles=6/6 failed_roles=[]
multi_asset ok=true passing_roles=6/6 failed_roles=[]
```
Latest local repair coverage:
```text
equity CIO=18934 client_pm=23276 performance=10074 portfolio=69602 researcher=54058 risk=31092
fixed_income CIO=3000 client_pm=384 performance=396 portfolio=648 researcher=2304 risk=3000
multi_asset CIO=396 client_pm=1884 performance=396 portfolio=504 researcher=732 risk=3000
```
This is the intended fail-closed behavior: the generic wrapper proves parity artifacts for every asset class, and the summary blocks paid all-role launches unless every role passes. The current 2026-05-24 summaries show Fixed Income and Multi Asset locally on par with Equity for non-capped preflight readiness. The same summaries now also expose corpus-risk warnings: all 18 roles are repair-heavy, Fixed Income has two data-thin roles (`client_portfolio_manager`, `performance_manager`), and Multi Asset has three data-thin roles (`chief_investment_officer`, `performance_manager`, `portfolio_manager`).
A stricter isolated 75% repair-cap probe was written to:
```text
impl_codex/shft_workspace/preflight_strict_cap_probe/repair_cap_075_after_balance_v2_20260524/strict_repair_cap_probe_summary.json
```
That probe now passes 18 of 18 roles after source-grounded non-repair balance rows were generated for the thin roles. The balance generator is:
```bat
python -m n21.cli generate-nonrepair-balance-data --asset-class <asset_class> --role <role> --min-nonrepair-rows 100 --force
```
It writes `source_grounded_nonrepair_balance_<asset_class>_<role>.hf_finetune.jsonl` and a manifest under `data/learning/<asset_class>/<role>`, and verifies that generated rows are not repair-classified. This closes the strict local corpus-composition gate. All roles still need live HF training, paired-eval proof, quality-gate success, model-card evidence, promotion evidence, and expanded original-source coverage before they are certified on par with trained Equity roles.
After the asset-class preflight summary passes, use the existing live launcher:
```bat
impl_codex\scripts\run_linvest21fingpt_parallel_super_agents.bat --asset fixed_income --version v1_001 --mode all-until-certified --continue-best
impl_codex\scripts\run_linvest21fingpt_parallel_super_agents.bat --asset multi_asset --version v1_001 --mode all-until-certified --continue-best
```
## Dataset Provenance And Stale-Adapter Gate
Self-healing does not mean that a dataset can mutate underneath an existing adapter. SHFT treats each paid training attempt as a controlled experiment:
```text
freeze dataset snapshot N -> train adapter N -> evaluate adapter N -> diagnose failures -> generate repair data -> freeze dataset snapshot N+1 -> train adapter N+1
```
The dataset is immutable per run, not globally. New public sources, normalized JSONL, or generated reasoning examples are valid self-improvement inputs only for a fresh dataset snapshot and a fresh training attempt.
Before paid training, SHFT should freeze and persist:
```text
impl_codex/shft_workspace/runs/<run_id>/training_selection/selected_training.jsonl
impl_codex/shft_workspace/runs/<run_id>/training_selection/selected_training_manifest.json
impl_codex/shft_workspace/runs/<run_id>/dataset_snapshot/dataset_manifest.json
impl_codex/shft_workspace/runs/<run_id>/remote_artifacts/training_plan.json
```
The frozen snapshot must include SHA-256 hashes for the selected training file and source JSONL files, train/valid/test counts, required reasoning-file presence, reasoning-record count, and reasoning-record ratio. Current HF live training also writes split hashes in `dataset_snapshot/dataset_manifest.json`:
```text
split_sha256.train
split_sha256.valid
split_sha256.test
provenance.source_sha256
```
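A minimal sketch of how the split hashes could be computed with the standard library. The file names `valid.jsonl` and `test.jsonl` are assumptions modeled on the `train.jsonl` name used elsewhere in this document, and `split_hashes` is a hypothetical helper, not the platform's own function.

```python
import hashlib
from pathlib import Path
from typing import Dict


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large JSONL files are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_hashes(snapshot_dir: Path) -> Dict[str, str]:
    """Hash the three split files of a frozen snapshot directory.

    Assumption: splits are stored as train.jsonl, valid.jsonl, and test.jsonl.
    """
    return {
        name: sha256_file(snapshot_dir / f"{name}.jsonl")
        for name in ("train", "valid", "test")
    }
```

Hashing the raw bytes means any change to a split file, including reordering or whitespace, produces a different snapshot identity.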
HF dataset staging is run-scoped. `stage-hf-dataset` uploads a run snapshot to `hf://datasets/linvest21/shft-datasets/runs/<run_id>` and records:
```text
provider_plans/hf_dataset_stage_result.json
path_in_repo=runs/<run_id>
job_dataset_dir=/data/runs/<run_id>
dataset_manifest_sha256
split_sha256.train / valid / test
```
The HF trainer receives `/data/runs/<run_id>` plus the expected manifest and split hashes. Before GPU training starts, it reads the mounted `dataset_manifest.json`, recomputes train/valid/test hashes, and writes `remote_artifacts/training_plan.json.dataset_provenance`. If the remote row counts or hashes differ from the local frozen snapshot, it writes `training_result.status=blocked_dataset_provenance_mismatch` and exits non-zero. This prevents the stale shared-dataset-root failure where one run uploaded to repo root while another job trained from a different `/data/train.jsonl`.
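The comparison step can be pictured roughly as follows. This is a sketch under assumptions: both inputs are plain dictionaries carrying `split_sha256` and `split_counts`, and the `"ok"` status string for the passing case is a hypothetical placeholder. Only `blocked_dataset_provenance_mismatch` comes from the description above.

```python
from typing import Any, Dict, List


def provenance_check(expected: Dict[str, Any], observed: Dict[str, Any]) -> Dict[str, Any]:
    """Compare the frozen local snapshot with values recomputed on the remote mount."""
    mismatches: List[str] = []
    for split in ("train", "valid", "test"):
        if expected["split_sha256"].get(split) != observed["split_sha256"].get(split):
            mismatches.append(f"split_sha256.{split}")
        if expected["split_counts"].get(split) != observed["split_counts"].get(split):
            mismatches.append(f"split_counts.{split}")
    ok = not mismatches
    return {
        "ok": ok,
        "mismatches": mismatches,
        # "ok" is a placeholder status for the passing case (assumption).
        "status": "ok" if ok else "blocked_dataset_provenance_mismatch",
    }
```

A caller would write the result into the training plan and exit non-zero when `ok` is false, before any GPU time is spent.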
If `generate-reasoning-data`, source intake, normalization, or manual curation changes `data/learning/<asset_class>/<role>` after an adapter has already been trained, the current run is stale:
```text
corpus_changed_after_training=true
stale_training_artifacts=true
force_new_run_required=true
```
In that state SHFT must not attach to the old `train_handle.json`, must not reuse the old adapter as the current candidate, and must not export it as a fresh implementation product. It should start a new run id, freeze the expanded dataset, and train a new adapter against that exact snapshot.
The existing operator command does not need to change:
```bat
impl_codex\scripts\run_linvest21fingpt_super_agent_menu.bat --asset equity --role researcher --version v1_001 --mode all-until-certified
```
The command semantics change: before `RESUME_POLICY=attached_to_latest_resumable_run` can attach to an existing run, the resume gate must compare the current dataset hash against the adapter's training dataset hash. A mismatch means stale artifacts, so resume is blocked and a fresh run is required.
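A minimal sketch of that resume gate, assuming each side is reduced to a single SHA-256 string. The function name and the `resume_allowed` key are hypothetical. The three stale flags reuse the names listed earlier in this section.

```python
from typing import Dict, Optional


def resume_decision(current_dataset_sha256: str,
                    adapter_dataset_sha256: Optional[str]) -> Dict[str, bool]:
    """Decide whether an existing run may be resumed.

    Fail closed: an adapter with no recorded dataset hash is treated as stale.
    """
    stale = (
        adapter_dataset_sha256 is None
        or adapter_dataset_sha256 != current_dataset_sha256
    )
    return {
        "corpus_changed_after_training": stale,
        "stale_training_artifacts": stale,
        "force_new_run_required": stale,
        "resume_allowed": not stale,  # hypothetical key
    }
```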
The model-quality gate also checks remote/local dataset parity. Even if paired eval passes, promotion remains blocked unless:
```text
remote_artifacts/training_plan.json.train_records == dataset_snapshot/dataset_manifest.json.split_counts.train
remote_artifacts/training_plan.json.valid_records == dataset_snapshot/dataset_manifest.json.split_counts.valid
remote_artifacts/training_plan.json.dataset_provenance.ok == true
```
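The three conditions above can be expressed as one fail-closed check. This is an illustrative sketch that takes the two JSON documents as already-loaded dictionaries. The function name and blocker messages are hypothetical, while the field names follow the listing above.

```python
from typing import Any, Dict, List


def dataset_parity_blockers(training_plan: Dict[str, Any],
                            dataset_manifest: Dict[str, Any]) -> List[str]:
    """Return the parity conditions that block promotion (empty list means pass)."""
    blockers: List[str] = []
    counts = dataset_manifest.get("split_counts", {})
    for plan_key, split in (("train_records", "train"), ("valid_records", "valid")):
        remote = training_plan.get(plan_key)
        local = counts.get(split)
        # Fail closed: a missing value on either side is also a blocker.
        if remote is None or local is None or remote != local:
            blockers.append(f"{plan_key} != split_counts.{split}")
    if training_plan.get("dataset_provenance", {}).get("ok") is not True:
        blockers.append("dataset_provenance.ok is not true")
    return blockers
```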
## Model-Quality Gate
Self-healing cycle scores are orchestration evidence. They are not enough for promotion.
Promotion and final copyable packaging require measured model-quality evidence:
```text
impl_codex/shft_workspace/runs/<run_id>/eval/paired_eval_report.json
impl_codex/shft_workspace/runs/<run_id>/remote_artifacts/training_plan.json
impl_codex/shft_workspace/runs/<run_id>/dataset_snapshot/dataset_manifest.json
impl_codex/shft_workspace/runs/<run_id>/eval/model_judge_report.json
impl_codex/shft_workspace/runs/<run_id>/eval/human_spot_check_report.json
```
Run the gate directly:
```bat
cd impl_codex\self_healing_finetuning
python -m n21.cli produce-eval-evidence --run-id <run_id> --release-id <release_id>
python -m n21.cli quality-gate --run-id <run_id>
```
To require a real human spot-check response before the quality gate, request email review:
```bat
cd impl_codex\self_healing_finetuning
python -m n21.cli produce-eval-evidence --run-id <run_id> --release-id <release_id> --request-human-email
python -m n21.cli quality-gate --run-id <run_id>
```
The full `run_shft_0_to_16_with_proof.bat` lifecycle now enables this by default with `SHFT_HUMAN_REVIEW_EMAIL=true` and waits up to `SHFT_HUMAN_REVIEW_TIMEOUT_SECONDS` seconds, default `1800`. Set `SHFT_HUMAN_REVIEW_EMAIL=false` only for noninteractive automation where a pending human report is acceptable.
The gate fails closed and prints exact blockers when evidence is missing or thresholds are not met.
Current all-role certification targets are intentionally aligned across all 18 asset/role models. The paired proof and the model-judge proxy both require the same minimum absolute quality bar:
```text
candidate aggregate / model-judge mean score >= 0.60
candidate critical-pass / model-judge critical-pass >= 0.70
pairwise win rate >= 0.55
pairwise loss rate <= 0.02
```
This lets a run that has passed measured paired proof proceed to certification only after training budget, trainer overfit, corpus coverage, baseline proof, and explicit human spot-check approval also pass. It does not mean the run is the protected best checkpoint; the best-run tracker still records whether it improved the previous best.
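For readers who want the four thresholds in executable form, here is a hedged sketch. The metric key names are assumptions, and the same function could be applied to either the paired-proof metrics or the model-judge metrics, since both share the bar.

```python
import operator
from typing import Dict, List, Optional

# (metric name, comparison, bound); metric names are hypothetical.
THRESHOLDS = (
    ("aggregate", operator.ge, 0.60),
    ("critical_pass", operator.ge, 0.70),
    ("pairwise_win_rate", operator.ge, 0.55),
    ("pairwise_loss_rate", operator.le, 0.02),
)


def threshold_blockers(metrics: Dict[str, Optional[float]]) -> List[str]:
    """List every threshold that is missing or not met (empty list means pass)."""
    blockers: List[str] = []
    for name, compare, bound in THRESHOLDS:
        value = metrics.get(name)
        if value is None or not compare(value, bound):
            symbol = ">=" if compare is operator.ge else "<="
            blockers.append(f"{name}={value} (required {symbol} {bound})")
    return blockers
```

Passing these thresholds is necessary but not sufficient: the other checks named above still apply.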
`produce-eval-evidence` writes the mandatory evidence artifacts that the gate already required:
```text
eval/baseline_proof_report.json
eval/model_judge_report.json
eval/human_spot_check_report.json
eval/required_eval_evidence_manifest.json
```
The baseline report resolves the zero-baseline trap without weakening the gate. If the paired baseline aggregate is zero and relative improvement is mathematically undefined, the report marks `proof_mode: absolute_only_cold_start`; the quality gate still requires absolute aggregate, critical-pass, pairwise loss, model-judge, training-budget, corpus-coverage, and human-review checks. The human spot-check report is produced as pending evidence by default; SHFT never infers human approval automatically. Use `--approve-human` only after a real human review has approved the sampled cases.
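The zero-baseline case can be sketched as below. Only the `absolute_only_cold_start` label comes from the description above. The function name, the `relative_and_absolute` label, and the `relative_improvement` key are assumptions for illustration.

```python
from typing import Any, Dict


def baseline_proof(baseline_aggregate: float, candidate_aggregate: float) -> Dict[str, Any]:
    """Pick a proof mode without dividing by a zero baseline."""
    if baseline_aggregate == 0:
        # Relative improvement is undefined, so only absolute checks can apply.
        return {"proof_mode": "absolute_only_cold_start", "relative_improvement": None}
    relative = (candidate_aggregate - baseline_aggregate) / baseline_aggregate
    return {
        "proof_mode": "relative_and_absolute",  # hypothetical label
        "relative_improvement": relative,
    }
```

In either mode the absolute thresholds still have to pass, so the cold-start path does not loosen the gate.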
Human review email uses the same delivery stack as owner convergence decisions. Configure `SHFT_RUN_OWNER_EMAIL`, then either SMTP/IMAP (`SHFT_SMTP_HOST`, `SHFT_SMTP_FROM`, optional `SHFT_SMTP_USERNAME`, `SHFT_SMTP_PASSWORD`, `SHFT_IMAP_HOST`, `SHFT_IMAP_USERNAME`, `SHFT_IMAP_PASSWORD`) or Outlook COM on Windows. Without configured delivery, SHFT writes an outbox artifact and keeps polling the audited response files:
```text
impl_codex/shft_workspace/runs/<run_id>/eval/human_spot_check_response.json
impl_codex/shft_workspace/human_spot_check_reviews/responses/<run_id>.json
```
Valid review responses are `approve` or `reject`. Approval only passes the human gate when the response records zero critical failures.
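A small sketch of that rule, assuming the response file carries a `decision` string and a `critical_failures` count. Both key names are hypothetical.

```python
from typing import Any, Dict

VALID_DECISIONS = {"approve", "reject"}


def human_gate_passes(response: Dict[str, Any]) -> bool:
    """Pass only on an explicit approval that records zero critical failures."""
    decision = response.get("decision")  # hypothetical key
    if decision not in VALID_DECISIONS:
        # Fail closed on pending, missing, or malformed responses.
        return False
    return decision == "approve" and response.get("critical_failures") == 0
```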
When `all` reaches a failed model-quality gate, SHFT now runs an automatic Step 0 stall-breakout pass. If breakout creates new trainable material or the loop creates new repair/reasoning data, `all` starts a fresh run id and repeats train -> fetch -> paired proof -> quality gate until the gate passes or the configured recovery budget is exhausted. The failed run is never certified retroactively.
For explicit operator-supervised continuous training, use `all-until-certified`:
```bat
impl_codex\scripts\run_linvest21fingpt_super_agent_menu.bat --asset equity --role researcher --version v1_001 --mode all-until-certified
```
This is a paid live mode. It repeats train -> fetch -> paired proof -> quality gate -> source breakout or repair-data generation -> retrain until the configured model-quality gate passes, the recovery budget is exhausted, or a human operator stops it with Ctrl+C or by closing the command window. It still fails closed for certification: a run is not promoted or packaged as certified unless the measured quality gate passes.
For the Qwen3-32B Equity Researcher path, the current one-command launcher is:
```bat
impl_codex\scripts\run_qwen3_32b_equity_researcher_all_until_certified.bat
```
That command runs the best available SFT/proof automation: Qwen3-32B QLoRA training, artifact-aware fetch, paired proof, mandatory evidence, human review, quality gate, paired-eval diagnostics, repair-target generation, pairwise preference-memory generation, and recovery/retraining until certification or human stop.
The default A+ operator is the artifact-aware autopilot:
```bat
impl_codex\scripts\run_shft_autopilot_to_a_plus.bat
```
With no arguments, it selects the latest run for `linvest21_fingpt_equity_researcher_v1_001`. It can also resume a specific source or preference run:
```bat
impl_codex\scripts\run_shft_autopilot_to_a_plus.bat <run_id> linvest21_fingpt_equity_researcher_v1_001 equity researcher 120 a100-large
```
The autopilot is conservative with paid jobs. It submits a paired-eval or DPO job only when the required prior artifact is complete and no local handle already exists. Otherwise it polls, fetches, writes `autopilot_status.json`, and waits. Its loop is:
```text
fetch training/preference artifacts -> paired proof -> quality gate
-> if source gate fails, build loss-targeted preference data and submit DPO
-> if preference gate passes, write A+ report
```
Terminal states are certification, explicit quality-gate failure after a completed preference proof, terminal HF job failure/cancelation, or the configured poll budget. The default poll budget is `288` polls at `300` seconds each; override with the final two arguments.
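For a sense of scale, the default poll budget works out to 24 hours of waiting between polls. The helper below is only a worked calculation with hypothetical names. Real wall-clock time will be somewhat longer because each poll also does fetch work.

```python
DEFAULT_POLLS = 288
DEFAULT_POLL_INTERVAL_SECONDS = 300


def poll_budget_hours(polls: int = DEFAULT_POLLS,
                      interval_seconds: int = DEFAULT_POLL_INTERVAL_SECONDS) -> float:
    """Total sleep time covered by the poll budget, in hours."""
    return polls * interval_seconds / 3600


print(poll_budget_hours())  # 288 * 300 s = 86400 s -> 24.0
```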
The older A+ sequence launcher remains available for manual staging:
```bat
impl_codex\scripts\run_qwen3_32b_equity_researcher_a_plus_sequence.bat
```
It first looks for the latest run with:
```text
impl_codex/shft_workspace/runs/<run_id>/preference_memory/preference_pairs.jsonl
```
If preference data exists, it submits the DPO preference-optimization stage before another SFT loop. If no preference data exists, it runs the Qwen full-auto SFT/proof/failure-mining loop, then attempts preference optimization again. Preference training is intentionally written as a separate `_pref_<timestamp>` run so a bad preference adapter cannot overwrite the SFT adapter or protected best checkpoint.
The preference stage is implemented by:
```text
impl_codex/self_healing_finetuning/training/hf_preference_train.py
impl_codex/scripts/run_hf_preference_train_for_run.bat
impl_codex/scripts/fetch_hf_preference_train_for_run.bat
```
For manual debugging after a DPO job finishes, fetch the new preference run, run its paired proof, and write the A+ report:
```bat
impl_codex\scripts\fetch_hf_preference_train_for_run.bat <pref_run_id>
impl_codex\scripts\run_hf_paired_eval_for_run.bat <pref_run_id> 120 a100-large
cd impl_codex\self_healing_finetuning
python -m n21.cli a-plus-report --run-id <pref_run_id> --source-run-id <source_run_id>
```
Do not describe a run as A+ until `eval/a_plus_upgrade_report.json` reports `grade=A+`. The A+ report requires completed preference training, paired proof, aggregate delta >= +0.05, pairwise win rate >= 0.55, pairwise loss rate <= 0.02, critical pass rate >= 0.95, and human approval with zero critical failures.
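The A+ conditions can be written down as a single predicate. This sketch uses a hypothetical dataclass whose field names are illustrative. It does not describe the actual layout of `eval/a_plus_upgrade_report.json`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class APlusEvidence:
    """Hypothetical container for the evidence named in the A+ rule."""
    preference_training_completed: bool
    paired_proof_completed: bool
    aggregate_delta: float
    pairwise_win_rate: float
    pairwise_loss_rate: float
    critical_pass_rate: float
    human_approved: bool
    human_critical_failures: int


def is_a_plus(evidence: APlusEvidence) -> bool:
    """True only when every A+ condition holds at the same time."""
    return (
        evidence.preference_training_completed
        and evidence.paired_proof_completed
        and evidence.aggregate_delta >= 0.05
        and evidence.pairwise_win_rate >= 0.55
        and evidence.pairwise_loss_rate <= 0.02
        and evidence.critical_pass_rate >= 0.95
        and evidence.human_approved
        and evidence.human_critical_failures == 0
    )
```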
After a failed paired proof, the loop writes bucketed diagnostics and repair targets before preference memory:
```text
impl_codex/shft_workspace/runs/<run_id>/diagnostics/paired_eval_diagnostics.jsonl
impl_codex/shft_workspace/runs/<run_id>/diagnostics/repair_targets.jsonl
impl_codex/shft_workspace/runs/<run_id>/diagnostics/paired_eval_diagnostics_manifest.json
impl_codex/shft_workspace/runs/<run_id>/preference_memory/preference_pairs.jsonl
impl_codex/shft_workspace/runs/<run_id>/preference_memory/preference_manifest.json
```
Each diagnostic row records the prompt id, baseline answer, candidate answer, winner, failure bucket, critical-failure flag, judge rationale, and repair target with acceptance checks. The configured buckets are `valuation_math`, `accounting_sec_extraction`, `moat_reasoning`, `risk_premium_discount_rate`, `investment_memo_synthesis`, and `hallucination_uncertainty`.
For Qwen raw-base paired proof, the submit script mounts the base model into the HF job and evaluates from `/models/base`:
```text
hf://Qwen/Qwen3-32B:/models/base:ro
```
This avoids relying on a live model download inside the proof job after a previous `Qwen/Qwen3-32B` load timeout.
During remote Hugging Face waits, the batch loop now prints live job status instead of only polling for artifacts. Each training/proof wait poll calls `impl_codex/scripts/show_hf_job_status.ps1`, which reports the HF job id, stage, created time, and URL. Every fifth poll also prints a compact `hf jobs logs --tail` excerpt with trainer progress such as current step, total steps, loss, learning rate, token count, and estimated remaining time. The helper forces UTF-8 for HF CLI calls to avoid Windows `charmap` log failures.
For an already-running job, start a separate watcher without disturbing the training process:
```bat
impl_codex\scripts\watch_hf_job_status.bat <run_id> train 30 30
```
The batch wrapper also accepts a full run directory. Use `proof` instead of `train` to watch the paired model-quality proof.
```powershell
powershell -NoProfile -ExecutionPolicy Bypass -File impl_codex\scripts\watch_hf_job_status.ps1 -RunDir impl_codex\shft_workspace\runs\<run_id> -Kind train -IntervalSeconds 30 -TailLines 30
```
Continuous mode writes operator-visible intelligence and recovery status after each quality gate and breakout:
```text
impl_codex/shft_workspace/continuous_training/<release_id>_status.json
impl_codex/shft_workspace/runs/<run_id>/continuous_training_status.json
impl_codex/shft_workspace/runs/<run_id>/next_data_strategy.json
impl_codex/shft_workspace/best_runs/<release_id>.json
```
These artifacts report current aggregate, critical-pass rate, pairwise win/loss rate, training loss when available, train/validation counts, distance to configured thresholds, best measured run so far, source-breakout status, and the next data strategy.
Final all-role status and email reporting:
```bat
impl_codex\scripts\send_final_all_roles_stats_email.bat "pm-list@example.com;cio-list@example.com"
```
This finalizer scans all 18 `v1_001` asset/role releases, combines paired-proof metrics with the latest seven-bucket defect ranking, and writes polished JSON, Markdown, and HTML reports under:
```text
impl_codex/shft_workspace/final_reports/final_all_18_status_latest.json
impl_codex/shft_workspace/final_reports/final_all_18_status_latest.md
impl_codex/shft_workspace/final_reports/final_all_18_status_latest.html
```
The report includes promotion eligibility, candidate aggregate score, critical-pass rate, pairwise win/loss rate, top defect categories, and the next repair action per role. It exits `0` after successfully writing the final report. Email is intentionally opt-in so dry status runs do not accidentally send mail. To send the report, configure:
```bat
set SHFT_SEND_FINAL_EMAIL=true
set SHFT_FINAL_MAIL_TO=pm-list@example.com;cio-list@example.com
set SHFT_SMTP_HOST=smtp.example.com
set SHFT_SMTP_PORT=587
set SHFT_SMTP_FROM=shft-status@example.com
set SHFT_SMTP_USER=shft-status@example.com
set SHFT_SMTP_PASSWORD=<smtp-password-or-app-password>
set SHFT_SMTP_TLS=true
```
For all-agent dispatches, set `SHFT_FINAL_REPORT_AFTER_DISPATCH=true` to run the same finalizer after the dispatcher returns. Use that only when the dispatch mode has actually completed the intended work; parallel tab/window launch is asynchronous, so the separate finalizer command is the preferred final step after the 18 role windows have finished.
When `fetch-proof` reaches a failed model-quality gate, SHFT still runs the breakout pass and writes the recovery package, but it does not auto-submit a fresh paid training job because `fetch-proof` is a resume/fetch command.
Each breakout pass creates transparent recovery artifacts under:
```text
impl_codex/shft_workspace/runs/<run_id>/stall_breakout
```
Key artifacts:
```text
stall_breakout_plan.json
source_intake_manifest.json
license_manifest.json
training_data_validation/training_data_validation_report.json
```
The breakout pass uses the configured public-source catalog and source policy, downloads public material automatically, promotes only normalized `eligibility.training=true` sources into `data/learning/<asset_class>/<role>`, validates the resulting corpus, and records whether the new sources are actually trainable by the current Step 0b converters.
Source acquisition is fallback-aware: a failed URL, `403`, `404`, timeout, blocked source, or non-trainable normalized source is recorded and SHFT continues to the next matching source until it finds the configured number of trainable sources or exhausts the catalog. The default policy remains: download public material automatically, but train only on material that passes the configured source-policy gate.
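The fallback behavior can be sketched as a simple loop. Everything here is illustrative: `acquire_trainable_sources` and the injected `fetch` callable are hypothetical, and the only field taken from the description above is `eligibility.training`.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def acquire_trainable_sources(
    catalog: Iterable[str],
    fetch: Callable[[str], Dict[str, Any]],
    wanted: int,
) -> Tuple[List[str], List[Dict[str, str]]]:
    """Walk the catalog until `wanted` trainable sources are found or it is exhausted.

    `fetch` is assumed to download and normalize one source, raising on any
    failure (403, 404, timeout, blocked) and returning the normalized record.
    """
    accepted: List[str] = []
    skipped: List[Dict[str, str]] = []
    for source in catalog:
        if len(accepted) >= wanted:
            break
        try:
            normalized = fetch(source)
        except Exception as exc:
            # Record the failure and fall through to the next matching source.
            skipped.append({"source": source, "reason": f"{type(exc).__name__}: {exc}"})
            continue
        if normalized.get("eligibility", {}).get("training") is not True:
            skipped.append({"source": source, "reason": "not_trainable"})
            continue
        accepted.append(source)
    return accepted, skipped
```

Keeping the skipped list alongside the accepted one preserves an audit trail of why each source was passed over.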
Continuous mode has a circuit-breaker for the exact failure pattern seen in the equity logs. When breakout is blocked, adds no trainable sources, and the candidate regressed against the previous best checkpoint, `orchestrator/continuous_status.py` sets `convergence_control.state: NEEDS_REASONING_DATA`, `should_halt_paid_retraining: true`, and exit code `8` once discovery is exhausted. Severe regressions halt immediately. Minor regressions also halt early when the latest retry returns zero candidates, recorded as `no_candidate_retry_exhausted=true`. The batch flow then builds paired-eval failure repair examples and runs `generate-reasoning-data` for the same asset/role instead of continuing an unbounded sleep/retry cycle.
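One way to read the halt conditions is sketched below. This is an interpretation, not the logic in `orchestrator/continuous_status.py`: the function signature, the boolean `severe_regression` input, and the `CONTINUE` state are assumptions, and the default of `10` mirrors the `SHFT_MIN_DISCOVERY_ATTEMPTS_BEFORE_REASONING_HALT` value shown in the control flags.

```python
from typing import Any, Dict


def convergence_control(
    *,
    breakout_blocked: bool,
    trainable_sources_added: int,
    regressed: bool,
    severe_regression: bool,
    discovery_attempts: int,
    latest_retry_candidates: int,
    min_discovery_attempts: int = 10,
) -> Dict[str, Any]:
    """Decide whether paid retraining should halt in favor of reasoning data."""
    # The breaker only applies to the stalled pattern: blocked, no new data, regressed.
    stalled = breakout_blocked and trainable_sources_added == 0 and regressed
    no_candidate_retry_exhausted = stalled and latest_retry_candidates == 0
    discovery_exhausted = discovery_attempts >= min_discovery_attempts
    halt = stalled and (
        severe_regression or no_candidate_retry_exhausted or discovery_exhausted
    )
    return {
        "convergence_control": {
            # "CONTINUE" is a placeholder for the non-halting state (assumption).
            "state": "NEEDS_REASONING_DATA" if halt else "CONTINUE",
            "should_halt_paid_retraining": halt,
        },
        "no_candidate_retry_exhausted": no_candidate_retry_exhausted,
        "exit_code": 8 if halt else 0,
    }
```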
After that reasoning data is generated, the existing adapter for that run is stale unless the trainer can prove it consumed a dataset snapshot containing the generated reasoning file. The next continuous iteration must freeze a new dataset snapshot and start or attach only to a run whose adapter training hash matches that snapshot. This is the key guard against the equity portfolio-manager failure pattern where a later fetch/export step showed `required_reasoning_included: true`, but the run had already skipped training because an old `train_handle.json` and adapter were present.
Control flags:
```bat
set SHFT_AUTO_STALL_BREAKOUT=false
set SHFT_AUTO_BREAKOUT_RETRAIN=false
set SHFT_MAX_BREAKOUT_ROUNDS=20
set SHFT_MIN_DISCOVERY_ATTEMPTS_BEFORE_REASONING_HALT=10
set SHFT_REASONING_DATA_MAX_RECORDS=1500
set SHFT_REPAIR_OVERSAMPLE_FACTOR=2
set SHFT_MAX_REPAIR_SELECTED_RATIO=0.75
set SHFT_TRAIN_MAX_STEPS=600
set SHFT_CONTINUOUS_DISCOVERY_SLEEP_SECONDS=300
```
Disable automatic breakout for a run with:
```bat
set SHFT_AUTO_STALL_BREAKOUT=false
```
