banner

PRESERVED ORIGINAL // REMOVED BY MICROSOFT FROM HF + GITHUB // MIT
   microsoft/FastContext ──▢ 404
   github.com/microsoft/FastContext ──▢ 404
              β”‚
              β–Ό
   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚  weights preserved here  β”‚
   β”‚  bf16 Β· 8.0 GB Β· intact  β”‚
   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
     you can't un-open-source
FASTCONTEXT-1.0-4B-SFT
PRESERVED ORIGINAL WEIGHTS Β· QWEN3 DENSE 4B Β· 256K CONTEXT Β· BF16 Β· 8.0 GB
WEIGHTS
BF16 Β· UNMODIFIED
ARCH
QWEN3 DENSE Β· 36L
CONTEXT
256K NATIVE
LICENSE
MIT

Microsoft open-sourced FastContext under MIT, then deleted it from both HuggingFace and GitHub about two weeks later (verified: 404 on both, 2026-07-02). MIT means preservation is legal β€” so here it is, unmodified. Own your AI: a model on your disk can't be sunset by a quarterly review.

πŸ” What it is

A repository-exploration subagent for coding agents. Invoked on demand by your main agent, it fires parallel read-only tool calls (READ / GLOB / GREP) across a repo and returns only the file paths + line ranges that matter, as compact context. Your frontier coding agent stops wasting its context window (and your bill) crawling the file tree.

Microsoft's (now-deleted) announcement reported ~60% fewer tokens from the main coding agent and +5.5% on SWE-bench β€” their figures; the source no longer exists to cite.

Architecture: plain Qwen3ForCausalLM dense 4B β€” 36 layers, 256K native context. No exotic modules; loads with standard transformers.

πŸš€ Quick start

from transformers import AutoModelForCausalLM, AutoTokenizer
m = AutoModelForCausalLM.from_pretrained("KikoCis/FastContext-1.0-4B-SFT", torch_dtype="bfloat16", device_map="auto")
tok = AutoTokenizer.from_pretrained("KikoCis/FastContext-1.0-4B-SFT")

Don't want 8 GB? Grab the GGUF quants (1.96–2.5 GB, long-context imatrix, retrieval-validated 30/30 vs this bf16): πŸ‘‰ KikoCis/FastContext-1.0-4B-longctx-imatrix-GGUF

⚠️ Good to know

  • It's a scout, not a solver β€” it finds and returns evidence; pair it with a main coding agent that writes the actual fix.
  • Upstream docs, harness code and issues were deleted along with the repos; usage conventions here come from the announcement and community mirrors.
  • Weights are byte-identical to the (re-uploaded) original β€” no fine-tuning, no edits.

πŸ“š Credit & license

Model, weights, training: Β© Microsoft (MIT). This is a preservation mirror sourced via the ShaunGves/FastContext-1.0-4B-SFT re-upload after microsoft/FastContext-1.0-4B-SFT was removed. Nothing modified. Quantized companion + validation: KikoCis.

Downloads last month
-
Safetensors
Model size
4B params
Tensor type
BF16
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support