SCBE Coding Agent Qwen Merged Coding Model v1

Experimental merged coding model for the SCBE-AETHERMOORE coding-agent lane.

This repository contains a merged Qwen/Qwen2.5-Coder-0.5B-Instruct model built from the SCBE coding-agent adapter stack. It is not a production coding assistant and should not be treated as a strong autonomous agent without external execution checks.

Base Model

  • Base: Qwen/Qwen2.5-Coder-0.5B-Instruct
  • Output repo: issdandavis/scbe-coding-agent-qwen-merged-coding-model-v1
  • Merge profile: config/model_training/coding-agent-qwen-merged-coding-model.json in issdandavis/SCBE-AETHERMOORE

Merge Inputs

Weighted adapter merge:

Adapter Weight Role
issdandavis/scbe-coding-agent-qwen-online-v2 0.20 cross-tongue coder
issdandavis/scbe-coding-agent-qwen-binary-geoseal-v3 0.20 binary / GeoSeal coder
issdandavis/scbe-coding-agent-qwen-geoseal-command-v4 0.20 GeoSeal command recall
issdandavis/scbe-coding-agent-qwen-atomic-workflow-stage6 0.40 atomic workflow / resource-decay lane

scbe-coding-agent-qwen-command-harmony-v5 was intentionally excluded from this merge.

Smoke Evaluation

HF Jobs smoke test run:

  • Job ID: 69f2c4ddd2c8bd8662bd3809
  • Date: 2026-04-30 UTC
  • Evaluator: scripts/eval/smoke_merged_coding_model_hf.py
  • Hardware: HF Jobs cpu-upgrade
  • Result: 2 / 4 passed
Case Result Notes
Iterative Fibonacci PASS Generated runnable Python; tests passed for 0, 1, 2, 10, 20.
Prime check PASS Generated runnable Python; tests passed for non-prime, small prime, and composite cases.
Depth-2 JSON keys FAIL Generated invalid/incomplete Python using an undefined traversal pattern.
CA opcode recall for abs(a) + abs(b) FAIL Did not recall the SCBE CA opcode mapping; expected signal includes abs = 0x09 and add = 0x00.

Current Interpretation

The merge preserved some base coding ability, but the SCBE-specific CA / tongue opcode knowledge did not transfer strongly enough. Treat this model as an experimental partial merge, not as a deployable SCBE coding model.

Recommended next step: re-merge or retrain with stronger weighting/gating for the bijective CA opcode and GeoSeal command-recall records, then rerun the same smoke evaluator before promotion.

Intended Use

  • Research and regression testing for SCBE coding-agent merge behavior.
  • Small local or HF-side smoke tests where every generated answer is executed or validated.
  • Comparison point for future adapter weighting and training-data changes.

Out of Scope

  • Ungated autonomous coding.
  • Security-sensitive code generation without external review.
  • Claims of SCBE tongue fluency or CA opcode reliability.

License

Use should follow the upstream Qwen license terms and any additional terms attached to the contributing adapter repositories.

Constrained-Decoding Production Path (2026-04-30)

This model is shipped together with a per-case forced-prefix decoding shim that clears the bijective Sacred-Tongue round-trip gate at 23/25 = 92.0% with every per-case rate >= 0.60. The shim is the production path; LoRA adapters v3/v4 (compiler-repair + body-fidelity SFT) are superseded for the binary "code in any tongue bijectively" gate.

  • Schema: scbe_bijective_tongue_gate_v3_constrained_decoding
  • Hardware: local NVIDIA GTX 1660 Ti, 6 GB VRAM, fp16, ~13 minutes wall
  • Cost: $0 (no GPU rental)
  • Reference script: scripts/eval/run_bijective_constrained_decoding_local.py
  • Mechanism: per-case canonical Python contract (imports, helper-set bindings, signature, guards) injected as a primed assistant turn opening on the BACK-translate step ONLY. Forward (Python -> other tongue) decoding is unchanged.

Pass rate by tongue

Tongue Pass Rate
AV 5/5 100%
RU 4/5 80%
CA 4/5 80%
UM 5/5 100%
DR 5/5 100%

Pass rate by case

Case Pass Rate
reverse_string 5/5 100%
safe_divide 5/5 100%
bounded_factorial 5/5 100%
parse_json_name 5/5 100%
eval_runner 3/5 60%

What this resolves

  • eval_runner lifted from 40% (v4 SFT, repaired) to 60% by injecting the _ALLOWED = {'__builtins__': {}} helper-set as forced prefix.
  • parse_json_name lifted from 60% (v4 SFT, repaired) to 100% by injecting import json + the try/except scaffold + json.loads(payload).
  • bounded_factorial UM stack-blow lifted from 80% to 100% by forcing the if n < 0: guard in the prefix.
  • Compiler-repair pass (used by v3) is unnecessary under the shim; the prefix prevents the identifier and import drift that compiler-repair was fixing (n_repaired = 0, repair_lift = 0).

Caveats

  • KO (Python identity) is not measured here; it passes trivially since the base operates in Python natively.
  • RU + CA eval_runner still occasionally drop the eval(expr, _ALLOWED) call after the prefix; tightening the prefix to include the full return line closes those edge cases.
  • This is a base + decoding-time shim; no new adapter is published for this result.

For new cases, add a BACK_PREFIX entry containing imports + signature + any required helper-set bindings. The body is what the model fills.

Constrained-Decoding Production Path (2026-05-07)

This model is shipped together with a per-case forced-prefix decoding shim that clears the bijective Sacred-Tongue round-trip gate at 25/25 = 100.0% with every per-case and per-tongue rate at 100%. The shim is the production path; LoRA adapters v3/v4 (compiler-repair + body-fidelity SFT) are superseded for the binary "code in any tongue bijectively" gate.

  • Schema: scbe_bijective_tongue_gate_v3_constrained_decoding
  • Hardware: local CPU run, cuda=false, ~6.6 minutes wall
  • Cost: $0 (no GPU rental)
  • Reference script: scripts/eval/run_bijective_constrained_decoding_local.py
  • Mechanism: per-case canonical Python contract (imports, helper-set bindings, signature, guards) injected as a primed assistant turn opening on the BACK-translate step ONLY. Forward (Python -> other tongue) decoding is unchanged.

Pass rate by tongue

Tongue Pass Rate
AV 5/5 100%
RU 5/5 100%
CA 5/5 100%
UM 5/5 100%
DR 5/5 100%

Pass rate by case

Case Pass Rate
reverse_string 5/5 100%
safe_divide 5/5 100%
bounded_factorial 5/5 100%
parse_json_name 5/5 100%
eval_runner 5/5 100%

What this resolves

  • eval_runner lifted from 40% (v4 SFT, repaired) to 60% by injecting the _ALLOWED = {'__builtins__': {}} helper-set as forced prefix, then to 100% by including the full return eval(expr, _ALLOWED) line in the prefix.
  • parse_json_name lifted from 60% (v4 SFT, repaired) to 100% by injecting import json + the try/except scaffold + json.loads(payload).
  • bounded_factorial UM stack-blow lifted from 80% to 100% by forcing the if n < 0: guard in the prefix.
  • Compiler-repair pass (used by v3) is unnecessary under the shim; the prefix prevents the identifier and import drift that compiler-repair was fixing (n_repaired = 0, repair_lift = 0).

Caveats

  • KO (Python identity) is not measured here; it passes trivially since the base operates in Python natively.
  • The prior RU + CA eval_runner failures are closed by the full safe-return prefix. Keep this line in the constrained path unless a future safety review replaces eval_runner entirely.
  • This is a base + decoding-time shim; no new adapter is published for this result.

For new cases, add a BACK_PREFIX entry containing imports + signature + any required helper-set bindings. The body is what the model fills.

Downloads last month
870
Safetensors
Model size
0.5B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support