tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:500000
- loss:CachedMultipleNegativesRankingLoss
base_model: ibm-granite/granite-embedding-small-english-r2
widget:
- source_sentence: >
I'm trying to write a PHP script which reads SIP (session initiation
protocol) signals from a hardware switch to gets specific details and then
return some data back to the switch.
Being a complete newbie to this SIP thing I don't know how to interact
with the switch sending SIP signal. Do we need to send some message to the
switch to get response?
I googled SIP but got only general info regarding what SIP is all about
but nothing programmatic.
Can any one provide any pointers to any tutorials which show how interact
with a SIP signal programmatically?
Are there any free online services that simulate SIP signals for testing
purpose?
sentences:
- >-
Lake Okahumpka is a freshwater lake in Wildwood, Florida, United States.
Lake Okahumpka Park is along part of its shoreline. In 1980, the United
States Geological Survey reported on the hydrology of Lake Okahumpka and
Lake Deaton area.
The lake is east of Wildwood on the south side of State Road 44. The
lake has been treated for hydrilla. Ring neck ducks have been hunted
from its shores.
See also
Okahumpka, Florida
References
Bodies of water of Sumter County, Florida
Okahumpka
- >+
Because of different regional setting on different machines. To have
date time output in the same format you ahve to specify format string
explciitly:
date.ToString("yyyy-MM-dd HH:mm:ss");
Also as John recommeded in comments below if you want having date time
output in the same format on different machines despite local regional
settings you can use InvariantCulture format provider:
date.ToString(CultureInfo.InvariantCulture);
MSDN:
The invariant culture is culture-insensitive; it is associated with
the English language but not with any country/region
MSDN:
Standard Date and Time Format Strings
Custom Date and Time Format Strings
- >-
The President of India plays a ceremonial role in foreign affairs,
appointing ambassadors and ratifying treaties, but the day‑to‑day
conduct of diplomacy is handled by the Ministry of External Affairs and
the Prime Minister's Office.
- source_sentence: can drinking too much water make acid reflux worse?
sentences:
- >
I think I understand your question. A possible solution would be to use
a ViewModel to pass to the view as oppose to using the Company entity
directly. This would allow you to add or remove data annotations without
changing the entity model. Then map the data from the new
CompanyViewModel over to the Company entity model to be saved to the
database.
For example, the Company entity might look something like this:
public class Company
{
public int Id { get; set; }
[StringLength(25)]
public string Name { get; set; }
public int EmployeeAmount { get; set; }
[StringLength(3, MinimumLength = 3)]
public string CountryId {get; set; }
}
Now in the MVC project a ViewModel can be constructed similar to the
Company entity:
public class CompanyViewModel
{
public int Id { get; set; }
[StringLength(25, ErrorMessage="Company name needs to be 25 characters or less!")]
public string Name { get; set; }
public int EmployeeAmount { get; set; }
public string CountryId { get; set; }
}
Using a ViewModel means more view presentation orientated annotations
can be added without overloading entities with unnecessary mark-up.
I hope this helps!
- >-
Staying well-hydrated is essential for overall health. Water helps
maintain blood volume, supports kidney function, and aids in temperature
regulation. Regular consumption of water throughout the day can improve
skin elasticity and promote better digestion.
- >-
Drinking large amounts of water can indeed aggravate acid reflux. Excess
fluid can increase stomach volume, leading to higher pressure on the
lower esophageal sphincter, which may cause it to open and allow acid to
flow back into the esophagus. Additionally, overhydration can dilute
stomach acids, prompting the body to produce more acid to aid digestion,
potentially worsening reflux symptoms.
- source_sentence: >
I have created an alert in Twitter Bootstrap this way
HTML:
<div id='alert' class='hide'></div>
JS:
function showAlert(message) {
$('#alert').html("<div class='alert alert-error'>"+message+"</div>");
$('#alert').show();
}
showAlert('Please have a look at yourself.');
$('#alert').removeClass('alert-error');
$('#alert').addClass('alert-info');
But the last two lines of javascript don't seem to have any effects, can
anyone have a look for me?
Created jsfiddle here.
Update
I made some changes in my own code to make it easier to use, I prefer this
way
HTML:
<div id='alert' class='hide'></div>
JS:
function showAlert(message, alertType) {
$('#alert').html("<div class='alert alert-"+alertType+"'>"+message+"</div>");
$('#alert').show();
}
showAlert('Please have a look at yourself.', 'success');
New jsfiddle here
sentences:
- >-
The San Justo was a 70-gun – from 1790, 74-gun – ship of the line built
at the royal shipyard in Cartagena, Spain and launched in 1779.
She fought at the Battle of Cape Spartel in 1782 and the Battle of
Trafalgar in 1805. In the latter battle, under the command of Capitán de
Navío Miguel María Gastón de Iriarte, she was placed in the Centre
Division, but managed to avoid being heavily engaged throughout the
battle and had few casualties – none killed and just seven injured.
References
Bibliography
Ships of the line of the Spanish Navy
1779 ships
Ships built in Cartagena, Spain
Maritime incidents in 1805
- >
You can enforce to use specific version of a transitive dependency using
dependency management.
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-kubernetes-ribbon</artifactId>
<version>1.1.1.RELEASE</version>
</dependency>
</dependencies>
</dependencyManagement>
Now only the specified version will be used. Not the versions declared
in transitive dependencies.
- |
$('#alert div').removeClass('alert-error');
$('#alert div').addClass('alert-info');
http://jsfiddle.net/Cf4gs/2/
- source_sentence: 1994–95 Crystal Palace F.C. season
sentences:
- >
There is an error in the documentation, the correct syntax is:
qry = Article.query().get(projection=[Article.author, Article.tags])
…replace get with method of your choosing as long as it takes
**q_options arguments.
- >-
During the 1994–95 English football season, Crystal Palace competed in
the FA Premier League.
Season summary
Crystal Palace returned to the Premiership a year after leaving it, and,
over the next few months, they would experience one of the most unusual
seasons in their history. They were the division's lowest scoring team
with just 34 goals, but reached the semi-finals of both cup
competitions. They also finished fourth from bottom in the Premiership,
which – due to the streamlining of the division to 20 clubs – cost them
their top flight status. Manager Alan Smith was sacked just days
afterwards, with Steve Coppell returning to the manager's seat two years
after handing the reins over to his former assistant Smith.
The aftermath of Palace's relegation saw the sale of numerous players
including Richard Shaw, John Salako, Chris Armstrong and Gareth
Southgate. A barely recognisable Palace squad would kick off the
Endsleigh League Division One campaign with one of the youngest-ever
squads to be faced with a challenge for promotion to the Premiership.
Final league table
Results summary
Results by round
Results
Crystal Palace's score comes first
Legend
FA Premier League
FA Cup
League Cup
Players
First-team squad
Squad at end of season
Left club during season
Reserve squad
Transfers
In
Out
Transfers in: £1,830,000
Transfers out: £740,000
Total spending: £1,090,000
Notes
References
Crystal Palace F.C. seasons
Crystal Palace
- >-
In Tennessee, independent contractors generally cannot claim regular
unemployment benefits, but they may qualify for Pandemic Unemployment
Assistance (PUA) if they meet the program’s eligibility criteria.
- source_sentence: Ian MacPherson
sentences:
- >-
A peach-flavored Xanax will produce the same pharmacological effects as
regular Xanax: it acts as a central nervous system depressant, boosting
GABA activity in the brain, which leads to sedation, reduced anxiety,
and a calming, tranquilizing sensation.
- >-
Once Upon a Time in Hollywood is set in 1969 Los Angeles and features
real figures such as Sharon Tate and Charles Manson, but the plot and
the main characters are fictional creations by Tarantino.
- >-
Ian MacPherson, Macpherson or McPherson may refer to:
Ian Macpherson, 1st Baron Strathcarron (1880–1937), British lawyer and
politician
Ian Macpherson (novelist) (1905–1944), Scottish novelist
Ian McPherson (footballer) (1920–1983), Scottish footballer
Ian MacPherson (historian) (1939–2013), Canadian historian and
co-operative activist
Ian McPherson (cricketer) (born 1942), Scottish cricketer
Ian Macpherson, 3rd Baron Strathcarron (born 1949), British peer,
grandson of the 1st Baron
Ian Macpherson (comedian) (born 1951), Irish comic novelist, playwright
and performer
Ian McPherson (police officer) (born 1961), British police officer
pipeline_tag: sentence-similarity
library_name: sentence-transformers
license: other
language:
- en
# Bolt Embedding Models
Bolt Embedding is a family of high-performance embedding models optimized for
enterprise Retrieval-Augmented Generation (RAG).
These models are fine-tuned from IBM Granite embedding models and
are designed to produce strong semantic embeddings for knowledge
retrieval, search, and document understanding.
Bolt models map text (queries, sentences, or documents) into a dense vector space suitable for similarity search, clustering, and retrieval pipelines.
## Model Overview
Bolt embeddings are purpose-built for enterprise RAG workloads, where retrieval quality and robustness across heterogeneous documents are critical.
Key design goals:
- Strong query → document retrieval quality
- Robust performance on long enterprise documents
- Optimized for large-scale vector search
- Trained using large-batch contrastive learning to replicate real RAG retrieval conditions
These models are fine-tuned from IBM Granite embedding models using contrastive training on RAG-style data.
## Model Details

### Model Type

Sentence Transformer embedding model

### Base Model

Fine-tuned from (depending on the Bolt variant):
- `ibm-granite/granite-embedding-small-english-r2` (small)
- `ibm-granite/granite-embedding-english-r2` (large)
### Output
- Embedding dimension: 384 (small), 768 (large)
- Similarity metric: Cosine similarity
- Max sequence length: 4096 tokens
### Architecture

```
SentenceTransformer(
  (0): Transformer(ModernBertModel)
  (1): Pooling(CLS)
)
```
Bolt uses CLS pooling to produce a single embedding vector per input.
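As a minimal illustration of what CLS pooling does (toy tensors standing in for real ModernBERT outputs, not the actual model code):

```python
import torch

def cls_pool(last_hidden_state: torch.Tensor) -> torch.Tensor:
    # last_hidden_state: (batch, seq_len, hidden_dim) from the transformer;
    # CLS pooling keeps only the first token's vector for each sequence
    return last_hidden_state[:, 0]

# Toy tensor with the small variant's 384-dimensional hidden size
hidden = torch.randn(2, 16, 384)
embeddings = cls_pool(hidden)
print(embeddings.shape)  # torch.Size([2, 384])
```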
## Training Objective

Bolt embeddings are trained specifically for retrieval scenarios using contrastive learning.

### Loss Function

`CachedMultipleNegativesRankingLoss`

This loss is widely used for training embedding models for retrieval tasks.
Key properties:
- Efficient training with very large effective batch sizes
- Uses in-batch negatives
- Encourages queries to be close to their relevant passages while far from irrelevant ones
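A minimal sketch of the underlying objective in plain PyTorch (without the gradient caching that the Cached variant adds): each query is scored against every document in the batch, and cross-entropy pulls it toward its own positive on the diagonal while pushing it away from everything else.

```python
import torch
import torch.nn.functional as F

def in_batch_ranking_loss(query_emb, doc_emb, scale=20.0):
    # Cosine-similarity score matrix: row i = query i vs. every doc in batch
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    scores = scale * q @ d.T
    # Positive pairs sit on the diagonal; all other docs act as negatives
    labels = torch.arange(q.size(0))
    return F.cross_entropy(scores, labels)

q = torch.randn(8, 384)  # toy query embeddings
d = torch.randn(8, 384)  # toy document embeddings (row i is query i's positive)
loss = in_batch_ranking_loss(q, d)
print(float(loss))
```

In Sentence Transformers the cached variant is available as `losses.CachedMultipleNegativesRankingLoss`, which adds GradCache-style chunking so that very large effective batch sizes fit in limited GPU memory.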
### Large Batch Training

Bolt models were trained using batch sizes of 1024.
Large batches simulate realistic retrieval scenarios:
- Query
- Positive document
- ~2000 unrelated documents, including hard negatives
This closely approximates production RAG retrieval environments, where each query must rank the correct document among many candidates.
The result is improved:
- retrieval accuracy
- semantic separation
- ranking robustness
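Assuming each training example is an (anchor, positive, negative) triplet, the candidate-count arithmetic behind the figures above works out as follows:

```python
# Each example contributes one positive and one hard-negative document,
# so a batch of 1024 triplets yields 2048 candidate documents
batch_size = 1024
docs_in_batch = 2 * batch_size
# Every candidate except the query's own positive acts as a negative
negatives_per_query = docs_in_batch - 1
print(negatives_per_query)  # 2047
```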
## Training Data

Training was performed using a custom dataset we collected. It includes hand-curated examples as well as examples from datasets with commercially-acceptable licenses. To curate hard negatives for some examples, LLMs with commercially-permissible licenses were used to generate the negatives.
Dataset format:
| Column | Description |
|---|---|
| anchor | Query or input text |
| positive | Relevant document/passage |
| negative | Unrelated document/passage, with some examples generated using LLMs to provide hard negatives and some examples chosen at random from existing negatives |
Training size:
- 500,000 training samples
- 20,000 evaluation samples
The dataset contains a mixture of:
- question → answer pairs
- query → document matches
- semantic similarity examples
These samples are designed to mimic real RAG retrieval workloads.
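As a hypothetical illustration of one record in this format (the texts are abridged from the widget samples above):

```python
# One (anchor, positive, negative) training triplet in the dataset format
example = {
    "anchor": "can drinking too much water make acid reflux worse?",
    "positive": "Drinking large amounts of water can indeed aggravate acid reflux...",
    "negative": "Staying well-hydrated is essential for overall health...",
}
for column in ("anchor", "positive", "negative"):
    print(f"{column}: {example[column][:45]}")
```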
## Intended Use
Bolt embeddings are designed for:
- Retrieval-Augmented Generation (RAG)
- Enterprise document search
- Semantic search
- Knowledge base retrieval
- Question answering
- Duplicate detection
- Similarity scoring
Typical pipeline:

```
User query
    ↓
Bolt embedding
    ↓
Vector search
    ↓
Top-k documents
    ↓
LLM generation
```
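The vector-search step of this pipeline can be sketched in a few lines of numpy, with toy 4-dimensional vectors standing in for real Bolt embeddings (which are 384- or 768-dimensional):

```python
import numpy as np

def top_k(query_emb, doc_embs, k=2):
    # Cosine similarity: normalize, then dot each document with the query
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    scores = d @ q
    # Indices of the k highest-scoring documents, best first
    return np.argsort(scores)[::-1][:k]

docs = np.array([[1.0, 0.0, 0.0, 0.0],   # doc 0: closest to the query
                 [0.9, 0.1, 0.0, 0.0],   # doc 1: related
                 [0.0, 0.0, 1.0, 0.0]])  # doc 2: unrelated
query = np.array([1.0, 0.05, 0.0, 0.0])
print(top_k(query, docs))  # [0 1]
```

In production the embeddings would come from `model.encode(...)` (see the Usage section) and live in a vector database rather than an in-memory array.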
## Usage

Install Sentence Transformers:

```shell
pip install -U sentence-transformers
```
### Load the Model

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("aisquared/bolt-embedding-small")
```

or

```python
model = SentenceTransformer("aisquared/bolt-embedding-large")
```
### Generate Embeddings

```python
sentences = [
    "What are the tax implications of employee stock options?",
    "Employee stock options may have tax consequences depending on exercise timing.",
    "The Eiffel Tower is located in Paris.",
]

embeddings = model.encode(sentences)
print(embeddings.shape)
```
### Compute Similarity

```python
similarities = model.similarity(embeddings, embeddings)
print(similarities)
```
## Why Bolt?
Many embedding models are trained on general semantic similarity tasks.
Bolt is optimized for enterprise retrieval, where queries must locate the correct information among thousands of unrelated documents.
Key differentiators:
- Large-batch contrastive training
- RAG-specific dataset
- Long context support (4096 tokens trained)
- Optimized for vector database retrieval
## Framework Versions
Training was performed using:
- Python 3.12
- Sentence Transformers
- Transformers
- PyTorch
- HuggingFace Datasets
- HuggingFace Jobs, utilizing 1xA100 GPU
## Citation

If you use Bolt embeddings in research or production systems, please cite the underlying Sentence-BERT work.

### Sentence-BERT

```bibtex
@inproceedings{reimers-2019-sentence-bert,
  title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
  author = "Reimers, Nils and Gurevych, Iryna",
  year = "2019"
}
```
### Cached Multiple Negatives Ranking Loss

```bibtex
@misc{gao2021scaling,
  title = {Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
  author = {Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
  year = {2021}
}
```
## License

Bolt embeddings are released under the AI Squared Community License.