---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:500000
- loss:CachedMultipleNegativesRankingLoss
base_model: ibm-granite/granite-embedding-small-english-r2
widget:
- source_sentence: >
I'm trying to write a PHP script which reads SIP (session initiation
protocol) signals from a hardware switch to gets specific details and then
return some data back to the switch.
Being a complete newbie to this SIP thing I don't know how to interact with
the switch sending SIP signal. Do we need to send some message to the switch
to get response?
I googled SIP but got only general info regarding what SIP is all about but
nothing programmatic.
Can any one provide any pointers to any tutorials which show how interact
with a SIP signal programmatically?
Are there any free online services that simulate SIP signals for testing
purpose?
sentences:
- >-
Lake Okahumpka is a freshwater lake in Wildwood, Florida, United States.
Lake Okahumpka Park is along part of its shoreline. In 1980, the United
States Geological Survey reported on the hydrology of Lake Okahumpka and
Lake Deaton area.
The lake is east of Wildwood on the south side of State Road 44. The lake
has been treated for hydrilla. Ring neck ducks have been hunted from its
shores.
See also
Okahumpka, Florida
References
Bodies of water of Sumter County, Florida
Okahumpka
- >-
Because of different regional settings on different machines. To have date
time output in the same format you have to specify the format string explicitly:
date.ToString("yyyy-MM-dd HH:mm:ss");
Also, as John recommended in the comments below, if you want date time
output in the same format on different machines despite local regional
settings, you can use the InvariantCulture format provider:
date.ToString(CultureInfo.InvariantCulture);
MSDN:
The invariant culture is culture-insensitive; it is associated with
the English language but not with any country/region
MSDN:
Standard Date and Time Format Strings
Custom Date and Time Format Strings
- >-
The President of India plays a ceremonial role in foreign affairs,
appointing ambassadors and ratifying treaties, but the day‑to‑day conduct of
diplomacy is handled by the Ministry of External Affairs and the Prime
Minister's Office.
- source_sentence: can drinking too much water make acid reflux worse?
sentences:
- >
I think I understand your question. A possible solution would be to use a
ViewModel to pass to the view as oppose to using the Company entity
directly. This would allow you to add or remove data annotations without
changing the entity model. Then map the data from the new CompanyViewModel
over to the Company entity model to be saved to the database.
For example, the Company entity might look something like this:
public class Company
{
public int Id { get; set; }
[StringLength(25)]
public string Name { get; set; }
public int EmployeeAmount { get; set; }
[StringLength(3, MinimumLength = 3)]
public string CountryId {get; set; }
}
Now in the MVC project a ViewModel can be constructed similar to the Company
entity:
public class CompanyViewModel
{
public int Id { get; set; }
[StringLength(25, ErrorMessage="Company name needs to be 25 characters or less!")]
public string Name { get; set; }
public int EmployeeAmount { get; set; }
public string CountryId { get; set; }
}
Using a ViewModel means more view presentation orientated annotations can be
added without overloading entities with unnecessary mark-up.
I hope this helps!
- >-
Staying well-hydrated is essential for overall health. Water helps maintain
blood volume, supports kidney function, and aids in temperature regulation.
Regular consumption of water throughout the day can improve skin elasticity
and promote better digestion.
- >-
Drinking large amounts of water can indeed aggravate acid reflux. Excess
fluid can increase stomach volume, leading to higher pressure on the lower
esophageal sphincter, which may cause it to open and allow acid to flow back
into the esophagus. Additionally, overhydration can dilute stomach acids,
prompting the body to produce more acid to aid digestion, potentially
worsening reflux symptoms.
- source_sentence: >
I have created an alert in Twitter Bootstrap this way
HTML:
<div id='alert' class='hide'></div>
JS:
function showAlert(message) {
$('#alert').html("<div class='alert alert-error'>"+message+"</div>");
$('#alert').show();
}
showAlert('Please have a look at yourself.');
$('#alert').removeClass('alert-error');
$('#alert').addClass('alert-info');
But the last two lines of javascript don't seem to have any effects, can
anyone have a look for me?
Created jsfiddle here.
Update
I made some changes in my own code to make it easier to use, I prefer this
way
HTML:
<div id='alert' class='hide'></div>
JS:
function showAlert(message, alertType) {
$('#alert').html("<div class='alert alert-"+alertType+"'>"+message+"</div>");
$('#alert').show();
}
showAlert('Please have a look at yourself.', 'success');
New jsfiddle here
sentences:
- >-
The San Justo was a 70-gun – from 1790, 74-gun – ship of the line built at
the royal shipyard in Cartagena, Spain and launched in 1779.
She fought at the Battle of Cape Spartel in 1782 and the Battle of Trafalgar
in 1805. In the latter battle, under the command of Capitán de Navío Miguel
María Gastón de Iriarte, she was placed in the Centre Division, but managed
to avoid being heavily engaged throughout the battle and had few casualties
none killed and just seven injured.
References
Bibliography
Ships of the line of the Spanish Navy
1779 ships
Ships built in Cartagena, Spain
Maritime incidents in 1805
- >
You can enforce to use specific version of a transitive dependency using
dependency management.
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-kubernetes-ribbon</artifactId>
<version>1.1.1.RELEASE</version>
</dependency>
</dependencies>
</dependencyManagement>
Now only the specified version will be used. Not the versions declared in
transitive dependencies.
- |
$('#alert div').removeClass('alert-error');
$('#alert div').addClass('alert-info');
http://jsfiddle.net/Cf4gs/2/
- source_sentence: 1994–95 Crystal Palace F.C. season
sentences:
- >
There is an error in the documentation, the correct syntax is:
qry = Article.query().get(projection=[Article.author, Article.tags])
…replace get with method of your choosing as long as it takes **q_options
arguments.
- >-
During the 1994–95 English football season, Crystal Palace competed in the
FA Premier League.
Season summary
Crystal Palace returned to the Premiership a year after leaving it, and,
over the next few months, they would experience one of the most unusual
seasons in their history. They were the division's lowest scoring team with
just 34 goals, but reached the semi-finals of both cup competitions. They
also finished fourth from bottom in the Premiership, which due to the
streamlining of the division to 20 clubs cost them their top flight
status. Manager Alan Smith was sacked just days afterwards, with Steve
Coppell returning to the manager's seat two years after handing the reins
over to his former assistant Smith.
The aftermath of Palace's relegation saw the sale of numerous players
including Richard Shaw, John Salako, Chris Armstrong and Gareth Southgate. A
barely recognisable Palace squad would kick off the Endsleigh League
Division One campaign with one of the youngest-ever squads to be faced with
a challenge for promotion to the Premiership.
Final league table
Results summary
Results by round
Results
Crystal Palace's score comes first
Legend
FA Premier League
FA Cup
League Cup
Players
First-team squad
Squad at end of season
Left club during season
Reserve squad
Transfers
In
Out
Transfers in: £1,830,000
Transfers out: £740,000
Total spending: £1,090,000
Notes
References
Crystal Palace F.C. seasons
Crystal Palace
- >-
In Tennessee, independent contractors generally cannot claim regular
unemployment benefits, but they may qualify for Pandemic Unemployment
Assistance (PUA) if they meet the program’s eligibility criteria.
- source_sentence: Ian MacPherson
sentences:
- >-
A peach-flavored Xanax will produce the same pharmacological effects as
regular Xanax: it acts as a central nervous system depressant, boosting GABA
activity in the brain, which leads to sedation, reduced anxiety, and a
calming, tranquilizing sensation.
- >-
Once Upon a Time in Hollywood is set in 1969 Los Angeles and features real
figures such as Sharon Tate and Charles Manson, but the plot and the main
characters are fictional creations by Tarantino.
- >-
Ian MacPherson, Macpherson or McPherson may refer to:
Ian Macpherson, 1st Baron Strathcarron (1880–1937), British lawyer and
politician
Ian Macpherson (novelist) (1905–1944), Scottish novelist
Ian McPherson (footballer) (1920–1983), Scottish footballer
Ian MacPherson (historian) (1939–2013), Canadian historian and co-operative
activist
Ian McPherson (cricketer) (born 1942), Scottish cricketer
Ian Macpherson, 3rd Baron Strathcarron (born 1949), British peer, grandson
of the 1st Baron
Ian Macpherson (comedian) (born 1951), Irish comic novelist, playwright and
performer
Ian McPherson (police officer) (born 1961), British police officer
pipeline_tag: sentence-similarity
library_name: sentence-transformers
license: other
language:
- en
---
# Bolt Embedding Models
Bolt Embedding is a family of **high-performance embedding models optimized for
enterprise Retrieval-Augmented Generation (RAG)**.\
These models are **fine-tuned from IBM Granite embedding models** and
are designed to produce strong semantic embeddings for knowledge
retrieval, search, and document understanding.
Bolt models map text (queries, sentences, or documents) into a **dense
vector space** suitable for similarity search, clustering, and retrieval
pipelines.
------------------------------------------------------------------------
# Model Overview
**Bolt embeddings are purpose-built for enterprise RAG workloads**,
where retrieval quality and robustness across heterogeneous documents
are critical.
Key design goals:
- Strong **query → document retrieval quality**
- Robust performance on **long enterprise documents**
- Optimized for **large-scale vector search**
- Trained using **large-batch contrastive learning** to replicate real
RAG retrieval conditions
These models are **fine-tuned from IBM Granite embedding models** using
contrastive training on RAG-style data.
------------------------------------------------------------------------
# Model Details
### Model Type
Sentence Transformer embedding model
### Base Model
Fine-tuned from:
- `ibm-granite/granite-embedding-small-english-r2` (small)
- `ibm-granite/granite-embedding-english-r2` (large)
(depending on the Bolt variant)
### Output
- **Embedding dimension:** 384 (small), 768 (large)
- **Similarity metric:** Cosine similarity
- **Max sequence length:** 4096 tokens
### Architecture
```
SentenceTransformer(
  (0): Transformer(ModernBertModel)
  (1): Pooling(CLS)
)
```
Bolt uses **CLS pooling** to produce a single embedding vector per
input.
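CLS pooling can be illustrated with a minimal numpy sketch. The array below is a random placeholder standing in for the transformer's token-level output (the shapes are illustrative, not the model's real dimensions):

``` python
import numpy as np

# Toy token embeddings for a 5-token input with hidden size 4;
# in the real model, the ModernBERT transformer produces these.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(5, 4))

# CLS pooling keeps only the first token's vector as the sentence
# embedding, instead of averaging over all tokens (mean pooling).
cls_embedding = token_embeddings[0]
mean_embedding = token_embeddings.mean(axis=0)

print(cls_embedding.shape)  # (4,)
```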
------------------------------------------------------------------------
# Training Objective
Bolt embeddings are trained specifically for **retrieval scenarios**
using **contrastive learning**.
### Loss Function
`CachedMultipleNegativesRankingLoss`
This loss is widely used for training embedding models for retrieval
tasks.
Key properties:
- Efficient training with **very large effective batch sizes**
- Uses **in-batch negatives**
- Encourages queries to be close to their relevant passages while far
from irrelevant ones
### Large Batch Training
Bolt models were trained using **batch sizes of 1024**.
Large batches simulate realistic retrieval scenarios. Each query in a batch
is contrasted against:
- its positive document
- ~2000 unrelated documents, including hard negatives
This closely approximates **production RAG retrieval environments**,
where each query must rank the correct document among many candidates.
The result is improved:
- retrieval accuracy
- semantic separation
- ranking robustness
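The objective above can be sketched in numpy: every query is scored against every document in the batch, and the loss is the cross-entropy of picking the matching document (the diagonal). The embeddings, batch size, and temperature below are illustrative placeholders, not the actual training configuration:

``` python
import numpy as np

rng = np.random.default_rng(0)
batch, dim = 8, 16  # the real training used batch sizes of 1024

# Random stand-ins for L2-normalized query and document embeddings
# (one positive document per query, paired by index).
q = rng.normal(size=(batch, dim))
d = rng.normal(size=(batch, dim))
q /= np.linalg.norm(q, axis=1, keepdims=True)
d /= np.linalg.norm(d, axis=1, keepdims=True)

# Cosine similarity of every query against every document in the
# batch; off-diagonal entries act as in-batch negatives.
# 0.05 is an illustrative temperature, not the training value.
scores = q @ d.T / 0.05

# Multiple-negatives ranking loss = cross-entropy with the matching
# document (the diagonal) as the correct class for each query.
log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
loss = -np.mean(np.diag(log_probs))
print(float(loss))
```

The cached variant used for Bolt computes the same loss, but in gradient-cached mini-batches so very large effective batch sizes fit in GPU memory.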
------------------------------------------------------------------------
# Training Data
Training was performed using a custom dataset we collected. It includes hand-curated examples as well as examples from datasets with commercially acceptable licenses. To curate hard negatives for some examples, LLMs with commercially permissible licenses were used to generate negatives.
Dataset format:
| Column | Description |
|--------|-------------|
| anchor | Query or input text |
| positive | Relevant document/passage |
| negative | Unrelated document/passage; some negatives were generated by LLMs to provide hard negatives, others were sampled at random from existing negatives |
Training size:
- **500,000 training samples**
- **20,000 evaluation samples**
The dataset contains a mixture of:
- question → answer pairs
- query → document matches
- semantic similarity examples
These samples are designed to mimic **real RAG retrieval workloads**.
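A training sample in the anchor/positive/negative format described above might look like the following (a hypothetical illustration reusing the sentences from the Usage section below, not an actual row from the dataset):

``` python
# Hypothetical triple in the anchor/positive/negative format;
# the anchor is a query, the positive a relevant passage, and
# the negative an unrelated passage.
example = {
    "anchor": "What are the tax implications of employee stock options?",
    "positive": "Employee stock options may have tax consequences depending on exercise timing.",
    "negative": "The Eiffel Tower is located in Paris.",
}

print(sorted(example.keys()))
```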
------------------------------------------------------------------------
# Intended Use
Bolt embeddings are designed for:
- Retrieval-Augmented Generation (RAG)
- Enterprise document search
- Semantic search
- Knowledge base retrieval
- Question answering
- Duplicate detection
- Similarity scoring
Typical pipeline:

User query → Bolt embedding → Vector search → Top-k documents → LLM generation
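The vector-search step of that pipeline can be sketched with numpy. The random arrays below are placeholders for `model.encode(...)` output; a real pipeline would embed the query and corpus with Bolt and typically use a vector database instead of brute-force search:

``` python
import numpy as np

rng = np.random.default_rng(1)
dim = 384  # embedding size of the small Bolt variant

# Placeholder embeddings standing in for model.encode(...) output.
corpus = rng.normal(size=(100, dim))
query = rng.normal(size=(dim,))

# Normalize so that the dot product equals cosine similarity.
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)
query /= np.linalg.norm(query)

# Vector search: rank all documents by similarity, keep top-k.
k = 5
scores = corpus @ query
top_k = np.argsort(-scores)[:k]
print(top_k)  # indices of the k most similar documents, passed to the LLM
```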
------------------------------------------------------------------------
# Usage
Install Sentence Transformers:
``` bash
pip install -U sentence-transformers
```
### Load the Model
``` python
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("aisquared/bolt-embedding-small")
```
or
``` python
model = SentenceTransformer("aisquared/bolt-embedding-large")
```
### Generate Embeddings
``` python
sentences = [
"What are the tax implications of employee stock options?",
"Employee stock options may have tax consequences depending on exercise timing.",
"The Eiffel Tower is located in Paris."
]
embeddings = model.encode(sentences)
print(embeddings.shape)
```
### Compute Similarity
``` python
similarities = model.similarity(embeddings, embeddings)
print(similarities)
```
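With cosine as the configured similarity metric, `model.similarity` is equivalent to normalizing the embeddings and taking pairwise dot products. A numpy sketch, with random placeholders standing in for `model.encode(...)` output:

``` python
import numpy as np

# Stand-in for model.encode(...) output: 3 embeddings of dimension 384.
rng = np.random.default_rng(2)
embeddings = rng.normal(size=(3, 384))

# Pairwise cosine similarity: normalize each row, then dot product.
normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
similarities = normed @ normed.T

print(similarities.shape)  # (3, 3)
```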
------------------------------------------------------------------------
# Why Bolt?
Many embedding models are trained on **general semantic similarity
tasks**.
Bolt is optimized for **enterprise retrieval**, where queries must
locate the correct information among thousands of unrelated documents.
Key differentiators:
- **Large-batch contrastive training**
- **RAG-specific dataset**
- **Long context support (4096 tokens trained)**
- **Optimized for vector database retrieval**
------------------------------------------------------------------------
# Framework Versions
Training was performed using:
- Python 3.12
- Sentence Transformers
- Transformers
- PyTorch
- HuggingFace Datasets
- HuggingFace Jobs (training ran on a single A100 GPU)
------------------------------------------------------------------------
# Citation
If you use Bolt embeddings in research or production systems, please
cite the underlying Sentence-BERT work.
### Sentence-BERT
``` bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP-IJCNLP)",
    year = "2019",
}
```
### Cached Multiple Negatives Ranking Loss
``` bibtex
@misc{gao2021scaling,
    title = {Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author = {Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year = {2021},
}
```
------------------------------------------------------------------------
# License
The Bolt embedding models are released under the [AI Squared Community License](https://docs.squared.ai/terms-of-use).