	---
	pipeline_tag: feature-extraction
	tags:
	- sentence-transformers
	- feature-extraction
	- sentence-similarity
	- mteb
	- qwen
	library_name: sentence-transformers
	license: apache-2.0
	---

	# Ingot-8B-R3

	Embedding models have a simple job: compress language into dense vectors where
	proximity means meaning. They make semantic search function by intent rather
	than keyword matching, allowing retrieval-augmented generation pipelines to
	locate exact context across millions of documents. The quality of this
	embedding layer defines the operational ceiling for every downstream retrieval
	and reasoning step.

	Ingot-8B-R3 is a text embedding model built by [Jonathan Corners](https://sentimark.ai)
	at [Voxell](https://voxell.ai). It is based on
	[Qwen/Qwen3-Embedding-8B](https://huggingface.co/Qwen/Qwen3-Embedding-8B)
	and extended with a proprietary routing framework. Different specialists
	activate at inference time from the input content alone, requiring no task
	metadata or manual routing flags. The routing work is proprietary to Voxell
	and patent-pending.

	As measured on MTEB(eng, v2), Ingot-8B-R3 achieves the highest Mean (Task)
	score of any English embedding model developed in the United States at the
	time of evaluation.

	---

	## The Solo Contrast

	If you look at the top of the MTEB leaderboard, you will see a familiar
	pattern. Most of the top entries are backed by state-funded research
	consortiums, sprawling hyperscale labs, and deep academic teams. Many of those
	cards end with a corporate WeChat handle or QR code for support.

	We do not have a WeChat handle, a corporate campus, or an army of PhDs. Ingot
	was built by a single engineer working from home on a tight budget, using
	privately owned consumer GPUs. We won on data engineering, not on compute scale.

	> Access is gated. Weights are available on request for academic
	> evaluation and verification. Use the Request Access button above to
	> describe your use case.

	---

	## Performance

	Evaluated on [MTEB(eng, v2)](https://arxiv.org/abs/2502.13595), a 41-task
	English benchmark covering retrieval, semantic textual similarity,
	classification, clustering, pair classification, reranking, and summarization.
	The base model, Qwen3-Embedding-8B, scores 75.23 Mean (Task) on this
	benchmark.

	| Metric | Score |
	|---|---|
	| Mean (Task) | 75.99 |
	| Mean (Category) | 69.9958 |
	| Borda Points | 5567 |

	Borda scoring ranks each model against the full leaderboard cohort on every
	task, then sums those rank points. It rewards consistent representation quality
	across the entire task distribution rather than optimization peaks on a few
	specific datasets.
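
A minimal sketch of the rank-and-sum idea described above, using made-up models and scores. It assumes a model earns one point per competitor it outscores on a task and breaks ties arbitrarily; the exact point scheme and tie handling used by the MTEB leaderboard are not specified here, and `borda_points`, the model names, and the task names are all hypothetical.

```python
def borda_points(scores_by_task):
    """Sum per-task rank points.

    scores_by_task maps task name -> {model name: score}.
    Returns {model name: total points}.
    """
    totals = {}
    for scores in scores_by_task.values():
        # Sort ascending: the lowest score earns 0 points, the best earns n - 1.
        ranked = sorted(scores, key=scores.get)
        for points, model in enumerate(ranked):
            totals[model] = totals.get(model, 0) + points
    return totals


# Hypothetical scores: model_x peaks on one task, model_y is consistently strong.
toy_scores = {
    "task_a": {"model_x": 0.99, "model_y": 0.70, "model_z": 0.60},
    "task_b": {"model_x": 0.50, "model_y": 0.65, "model_z": 0.60},
    "task_c": {"model_x": 0.50, "model_y": 0.62, "model_z": 0.55},
}

totals = borda_points(toy_scores)
print(totals)
```

In this toy data `model_x` has the highest plain average (about 0.663 versus 0.657 for `model_y`) because of a single peak, yet `model_y` collects the most Borda points by ranking well on every task.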

	### By Category

	Reranking and Summarization in MTEB(eng, v2) are single-task categories.
	These scores reflect one dataset each, not a category average.

	| Category | Tasks | Mean |
	|---|---|---|
	| Classification | 8 | 90.41 |
	| STS | 9 | 89.32 |
	| PairClassification | 4 | 87.66 |
	| Retrieval | 10 | 70.01 |
	| Clustering | 8 | 58.47 |
	| Summarization | 1 | 36.96 |
	| Reranking | 1 | 32.84 |

	Per-task results are published in the
	[mteb/results](https://huggingface.co/datasets/mteb/results) dataset.
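
As a quick consistency check, Mean (Task) averages over all 41 tasks, so it can be approximately recovered from the category table by weighting each category mean by its task count. The sketch below uses only the numbers from the table above; because the category means are already rounded, the result is an approximation and not the official computation.

```python
# (task count, category mean) copied from the By Category table.
categories = {
    "Classification": (8, 90.41),
    "STS": (9, 89.32),
    "PairClassification": (4, 87.66),
    "Retrieval": (10, 70.01),
    "Clustering": (8, 58.47),
    "Summarization": (1, 36.96),
    "Reranking": (1, 32.84),
}

total_tasks = sum(count for count, _ in categories.values())

# Weight each category mean by the number of tasks it contains.
task_weighted_mean = (
    sum(count * mean for count, mean in categories.values()) / total_tasks
)

print(total_tasks, round(task_weighted_mean, 2))
```

The task counts sum to 41 and the weighted mean rounds to 75.99, matching the reported Mean (Task).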

	---

	## Architecture

	| Attribute | Specification |
	|---|---|
	| Base Model | `Qwen/Qwen3-Embedding-8B` |
	| Output Dimension | 4096 (`float32`) |
	| Max Sequence Length | 32,768 tokens |
	| Similarity Metric | Cosine |

	The routing logic, specialist checkpoints, and dispatch thresholds are
	proprietary to Voxell and are not included in this repository.
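
The output dimension and dtype in the table above fix the raw storage cost of each vector. The sketch below works through that arithmetic; `raw_index_bytes` is a hypothetical helper, and the figure covers only the uncompressed vectors, not any index structure, metadata, or quantization a vector store might add.

```python
DIM = 4096             # output dimension from the table above
BYTES_PER_FLOAT32 = 4  # a float32 value occupies 4 bytes


def raw_index_bytes(num_vectors, dim=DIM, bytes_per_value=BYTES_PER_FLOAT32):
    """Return the bytes needed to store num_vectors uncompressed embeddings."""
    return num_vectors * dim * bytes_per_value


per_vector = raw_index_bytes(1)
one_million = raw_index_bytes(1_000_000)

print(f"{per_vector} bytes per vector")
print(f"{one_million / 1e9:.2f} GB for one million vectors")
```

Each vector takes 16,384 bytes (16 KiB), so one million uncompressed vectors need roughly 16.4 GB.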

	---

	## Usage

	Ingot-8B-R3 is served through the Voxell Forge API at `api.voxell.ai`.
	No weights download is required. Request an API key using the
	Request Access button at the top of this page, or contact
	[corp@voxell.ai](mailto:corp@voxell.ai). Set `VOXELL_API_KEY` in your
	environment, then:

	```python
	import os

	from openai import OpenAI

	# Point the OpenAI client at the Voxell Forge endpoint.
	client = OpenAI(
	    api_key=os.environ["VOXELL_API_KEY"],
	    base_url="https://api.voxell.ai/v1",
	)

	# Embed one or more input strings in a single request.
	response = client.embeddings.create(
	    model="JCorners/Ingot-8B-R3",
	    input=["Example sentence"],
	)
	print(len(response.data[0].embedding))  # 4096
	```

	The API is OpenAI-compatible. Each output is a 4096-dimensional float32
	vector. Compare vectors with cosine similarity. The proprietary routing
	layer selects the specialist for your input at runtime, so no additional
	configuration is required.
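
Cosine similarity between two returned vectors can be computed as in the sketch below. It uses only the standard library and small stand-in vectors so it runs without an API key; `cosine_similarity` and `rank_by_similarity` are hypothetical helper names, not part of the Forge API.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents):
    """Return (index, similarity) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query, doc)) for i, doc in enumerate(documents)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Stand-in 2-dimensional vectors; real embeddings have 4096 dimensions.
query_vec = [1.0, 0.0]
doc_vecs = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]

ranking = rank_by_similarity(query_vec, doc_vecs)
print(ranking)
```

With real embeddings, pass several strings in `input` and compare the results, for example `cosine_similarity(response.data[0].embedding, response.data[1].embedding)`.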

	---

	## What ships, and what does not

	Ingot-8B-R3 is a research instrument: a frontier-grade embedder built and
	tuned to the shape of MTEB(eng, v2). The advances that generalize to
	production retrieval, including large-scale document corpora, structure
	preservation, hierarchical tables, source code, and Abstract Syntax Trees,
	ship in [Forge](https://voxell.ai).

	You can try the Forge embedding API right now with no signup required. Paste
	any text to see it transformed into a vector and copy production-ready
	integration snippets in under sixty seconds at
	[playground.voxell.ai](https://playground.voxell.ai/). Creating a developer
	account awards an initial grant of 10,000,000 free tokens to build on.

	The benchmark is the proof. Forge is the point.

	---

	## Contact and Connections

	- Weights access: Use the gated access request button at the top of this card.
	- Professional networking: Connect with Jonathan Corners on [LinkedIn](https://www.linkedin.com/in/jonathancorners/) for corporate, venture, and engineering updates.
	- Partnerships and commercial licensing: [corp@voxell.ai](mailto:corp@voxell.ai) or [voxell.ai](https://voxell.ai).
	- Technical essays: Deep-dives on synthetic data and routing mechanics at [sentimark.ai](https://sentimark.ai).