chromadb/context-1


hammadtime committed 2 days ago

Commit ad4bbb3 · verified · 1 Parent(s): 56ad607

Create README.md

Files changed (1)

README.md +73 -0

README.md ADDED

	@@ -0,0 +1,73 @@

+---
+license: apache-2.0
+base_model:
+- openai/gpt-oss-20b
+---
+# Chroma Context-1
+Context-1 is a 20B parameter agentic search model trained
+to retrieve supporting documents for complex, multi-hop
+queries. It is designed to be used as a retrieval subagent
+alongside a frontier reasoning model: given a query,
+Context-1 decomposes it into subqueries, iteratively
+searches a corpus, and selectively edits its own context
+to free capacity for further exploration.
+
+Context-1 achieves retrieval performance comparable to
+frontier LLMs at a fraction of the cost, with up to 10x
+faster inference.
+
+**Technical report:**
+[Chroma Context-1: Training a Self-Editing Search Agent](https://trychroma.com/research/context-1)
+## Model Details
+- **Base model:** gpt-oss-20b
+- **Parameters:** 20B (Mixture of Experts)
+- **Training:** SFT + RL (CISPO) with a staged curriculum
+- **Precision:** BF16 (MXFP4 quantized checkpoint coming soon)
+## Key Capabilities
+- **Query decomposition:** Breaks complex multi-constraint
+  questions into targeted subqueries.
+- **Parallel tool calling:** Averages 2.56 tool calls per
+  turn, reducing total turns and end-to-end latency.
+- **Self-editing context:** Selectively prunes irrelevant
+  documents mid-search to sustain retrieval quality over
+  long horizons within a bounded context window (0.94
+  prune accuracy).
+- **Cross-domain generalization:** Trained on web, legal,
+  and finance tasks; generalizes to held-out domains and
+  public benchmarks (BrowseComp-Plus, SealQA, FRAMES,
+  HLE).
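As a rough illustration of the parallel tool calling bullet above: when one turn issues several subqueries, they can be executed concurrently, and the calls-per-turn figure is total tool calls divided by turns. The sketch below is not Context-1's harness or API. The toy `CORPUS`, the `search_corpus` tool, and the example subqueries are all hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy corpus; a real harness would query an actual index.
CORPUS = {
    "doc1": "Chroma released Context-1, an agentic search model.",
    "doc2": "gpt-oss-20b is a mixture-of-experts language model.",
    "doc3": "Multi-hop questions need evidence from several documents.",
}


def search_corpus(subquery: str) -> list[str]:
    """Hypothetical search tool: ids of documents containing every query term."""
    terms = subquery.lower().split()
    return sorted(
        doc_id
        for doc_id, text in CORPUS.items()
        if all(term in text.lower() for term in terms)
    )


def run_turn(subqueries: list[str]) -> dict[str, list[str]]:
    """Execute all tool calls of a single turn concurrently."""
    with ThreadPoolExecutor(max_workers=max(1, len(subqueries))) as pool:
        results = list(pool.map(search_corpus, subqueries))
    return dict(zip(subqueries, results))


def average_calls_per_turn(turns: list[list[str]]) -> float:
    """Total tool calls divided by the number of turns."""
    return sum(len(turn) for turn in turns) / len(turns)


# Two turns: the first issues two subqueries in parallel, the second issues one.
turns = [["context-1 search", "gpt-oss-20b"], ["multi-hop documents"]]
turn_results = [run_turn(turn) for turn in turns]
avg_calls = average_calls_per_turn(turns)
```

Batching independent subqueries into one turn is what lets a higher calls-per-turn average translate into fewer turns overall.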
+## Important: Agent Harness Required
+Context-1 is trained to operate within a specific agent
+harness that manages tool execution, token budgets, context
+pruning, and deduplication. **The harness is not yet
+public.** Running the model without it will not reproduce
+the results reported in the technical report.
+
+We plan to release the full agent harness and evaluation
+code soon. In the meantime, the technical report describes
+the harness design in detail.
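Since the harness is not public, the following is only a minimal sketch of three of the responsibilities named above (token budgets, context pruning, and deduplication), not the harness's actual design. The `BoundedContext` class, its method names, and the whitespace-based token count are assumptions made for illustration. In the real system the model itself chooses which documents to prune.

```python
from dataclasses import dataclass, field


@dataclass
class BoundedContext:
    """Hypothetical container for retrieved documents under a token budget."""

    token_budget: int
    docs: dict[str, str] = field(default_factory=dict)

    @staticmethod
    def count_tokens(text: str) -> int:
        # Assumption: a whitespace split stands in for a real tokenizer.
        return len(text.split())

    def used_tokens(self) -> int:
        return sum(self.count_tokens(text) for text in self.docs.values())

    def add(self, doc_id: str, text: str) -> bool:
        """Add a document unless it is a duplicate or would exceed the budget."""
        if doc_id in self.docs:
            return False  # deduplication: this id is already in context
        if self.used_tokens() + self.count_tokens(text) > self.token_budget:
            return False  # over budget: the caller should prune first
        self.docs[doc_id] = text
        return True

    def prune(self, doc_ids: list[str]) -> int:
        """Drop the listed documents and return the number of tokens freed."""
        freed = 0
        for doc_id in doc_ids:
            text = self.docs.pop(doc_id, None)
            if text is not None:
                freed += self.count_tokens(text)
        return freed


ctx = BoundedContext(token_budget=8)
added_first = ctx.add("a", "alpha beta gamma delta")  # 4 tokens
added_dup = ctx.add("a", "alpha beta gamma delta")    # rejected: duplicate id
added_second = ctx.add("b", "one two three")          # 3 tokens, 7 in total
added_over = ctx.add("c", "x y z")                    # rejected: 10 > 8
freed = ctx.prune(["a"])                              # frees 4 tokens
added_after_prune = ctx.add("c", "x y z")             # fits now, 6 in total
```

The point of the sketch is the ordering: once the budget is full, new results are admitted only after something irrelevant has been pruned, which is how a bounded window can keep supporting further search.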
+## Citation
+```bibtex
+@techreport{bashir2026context1,
+  title = {Chroma Context-1: Training a Self-Editing Search Agent},
+  author = {Bashir, Hammad and Hong, Kelly and Jiang, Patrick and Shi, Zhiyi},
+  year = {2026},
+  month = {March},
+  institution = {Chroma},
+  url = {https://trychroma.com/research/context-1},
+}
+```
+## License
+Apache 2.0