---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:5749
- loss:CosineSimilarityLoss
widget:
- source_sentence: >-
Nterprise Linux Services is expected to be available before then end of this
year.
sentences:
- >-
Beta versions of Nterprise Linux Services are expected to be available on
certain HP ProLiant servers in July.
- Spain turning back the clock on siestas
- I don't like many flavored drinks.
- source_sentence: Iran hopes nuclear talks will yield 'roadmap'
sentences:
- Iran Nuclear Talks in Geneva Spur High Hopes
- A black pet dog runs around in the garden of a house.
- >-
The witness was a 27-year-old Kosovan parking attendant, who was paid by the
News of the World, the court heard.
- source_sentence: Hamas Urges Hizbullah to Pull Fighters Out of Syria
sentences:
- >-
"This was a persistent problem which has not been solved, mechanically and
physically," said board member Steven Wallace.
- A small dog jumps over a yellow beam.
- Hamas calls on Hezbollah to pull forces out of Syria
- source_sentence: Licensing revenue slid 21 percent, however, to $107.6 million.
sentences:
- Britain loses bid to deport radical cleric Abu Qatada
- A man sits on a bed very close to a small television.
- License sales, a key measure of demand, fell 21 percent to $107.6 million.
- source_sentence: >-
Comcast Class A shares were up 8 cents at $30.50 in morning trading on the
Nasdaq Stock Market.
sentences:
- The stock rose 48 cents to $30 yesterday in Nasdaq Stock Market trading.
- 'Malaysia: Chinese satellite found object in ocean'
- A boy in a robe sits in a chair.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
model-index:
- name: SentenceTransformer
results:
- task:
type: semantic-similarity
name: Semantic Similarity
metrics:
- type: pearson_cosine
value: 0.4639747212598005
name: Pearson Correlation (Cosine Similarity)
- type: spearman_cosine
value: 0.4595105448711385
name: Spearman Correlation (Cosine Similarity)
license: gemma
---
# SentenceTransformer
This is a trained [sentence-transformers](https://www.SBERT.net) model. It maps sentences and paragraphs to a 256-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Maximum Sequence Length:** 2048 tokens
- **Output Dimensionality:** 256 dimensions
- **Similarity Function:** Cosine Similarity
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
(1): Pooling({'word_embedding_dimension': 256, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
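The `Pooling` module above averages the token embeddings produced by the transformer (mean pooling) to obtain a single 256-dimensional sentence vector. The snippet below is a minimal sketch of that computation using made-up token embeddings and an attention mask; it is for illustration only and does not call the actual Gemma3TextModel backbone.
```python
import torch

# Hypothetical inputs: token embeddings for a batch of 2 sentences
# (batch_size=2, seq_len=4, hidden_size=256) and an attention mask
# marking real tokens (1) vs. padding (0).
token_embeddings = torch.randn(2, 4, 256)
attention_mask = torch.tensor([[1, 1, 1, 0],
                               [1, 1, 1, 1]])

# Mean pooling as configured above (pooling_mode_mean_tokens=True):
# sum the embeddings of non-padding tokens and divide by their count.
mask = attention_mask.unsqueeze(-1).float()      # (2, 4, 1)
summed = (token_embeddings * mask).sum(dim=1)    # (2, 256)
counts = mask.sum(dim=1).clamp(min=1e-9)         # (2, 1)
sentence_embeddings = summed / counts            # (2, 256)

print(sentence_embeddings.shape)  # torch.Size([2, 256])
```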
## Usage
### Direct Usage (Sentence Transformers)
First, install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'Comcast Class A shares were up 8 cents at $30.50 in morning trading on the Nasdaq Stock Market.',
'The stock rose 48 cents to $30 yesterday in Nasdaq Stock Market trading.',
'Malaysia: Chinese satellite found object in ocean',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 256]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5752, 0.2980],
# [0.5752, 1.0000, 0.2161],
# [0.2980, 0.2161, 1.0000]])
```
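Because the embeddings live in a cosine-similarity space, they can also be used for the other tasks mentioned above, such as semantic search. The following is a minimal sketch using `sentence_transformers.util.semantic_search`; the corpus and query strings are illustrative, and `"sentence_transformers_model_id"` is the same placeholder as in the snippet above.
```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder model id

# Illustrative corpus and query.
corpus = [
    "License sales, a key measure of demand, fell 21 percent to $107.6 million.",
    "A small dog jumps over a yellow beam.",
    "Iran Nuclear Talks in Geneva Spur High Hopes",
]
query = "Licensing revenue slid 21 percent, however, to $107.6 million."

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Retrieve the top-2 most similar corpus entries by cosine similarity.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], hit["score"])
```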
## Evaluation
### Metrics
#### Semantic Similarity
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| pearson_cosine | 0.464 |
| **spearman_cosine** | **0.4595** |
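For reference, the sketch below shows how an `EmbeddingSimilarityEvaluator` producing these metrics is typically set up (assuming a recent sentence-transformers release, v3 or later). The sentence pairs and gold scores are illustrative placeholders, not the evaluation data actually behind the numbers above.
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder model id

# Illustrative sentence pairs with gold similarity scores in [0, 1];
# the actual evaluation data is not included in this card.
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=[
        "Hamas Urges Hizbullah to Pull Fighters Out of Syria",
        "Iran hopes nuclear talks will yield 'roadmap'",
        "A man sits on a bed very close to a small television.",
    ],
    sentences2=[
        "Hamas calls on Hezbollah to pull forces out of Syria",
        "Iran Nuclear Talks in Geneva Spur High Hopes",
        "Spain turning back the clock on siestas",
    ],
    scores=[0.95, 0.85, 0.05],
    name="sts-dev",  # illustrative evaluator name
)
results = evaluator(model)
print(results)  # includes the pearson_cosine and spearman_cosine values
```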
## Training Details
### Training Dataset
#### Unnamed Dataset
* Size: 5,749 training samples
* Columns: `sentence_0`, `sentence_1`, and `label`
* Approximate statistics based on the first 1000 samples:
| | sentence_0 | sentence_1 | label |
|:--------|:-----------|:-----------|:------|
| Type | string | string | float |
| Details | | | |
* Loss: [CosineSimilarityLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
```json
{
"loss_fct": "torch.nn.modules.loss.MSELoss"
}
```
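The following is a minimal sketch of how a `CosineSimilarityLoss` with this `loss_fct` is typically constructed for training: the cosine similarity of each `(sentence_0, sentence_1)` pair is regressed toward its `label` with an MSE objective. The model id and the single training pair are placeholders, not the actual training setup.
```python
import torch
from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder model id

# Illustrative training pair in the (sentence_0, sentence_1, label) format
# described above; the label is a similarity score in [0, 1].
train_dataset = Dataset.from_dict({
    "sentence_0": ["Hamas Urges Hizbullah to Pull Fighters Out of Syria"],
    "sentence_1": ["Hamas calls on Hezbollah to pull forces out of Syria"],
    "label": [0.95],
})

# CosineSimilarityLoss with an MSE objective, matching the parameters above.
loss = CosineSimilarityLoss(model, loss_fct=torch.nn.MSELoss())
```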
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `multi_dataset_batch_sampler`: round_robin
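Assuming the same placeholder setup as in the loss sketch above, these non-default values would be passed to `SentenceTransformerTrainingArguments` roughly as follows. The `output_dir` and the tiny train/eval datasets are illustrative, and everything not listed is left at its default.
```python
import torch
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CosineSimilarityLoss
from sentence_transformers.training_args import MultiDatasetBatchSamplers

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder model id
loss = CosineSimilarityLoss(model, loss_fct=torch.nn.MSELoss())

# Illustrative single-pair train/eval datasets (placeholders).
train_dataset = Dataset.from_dict({
    "sentence_0": ["Iran hopes nuclear talks will yield 'roadmap'"],
    "sentence_1": ["Iran Nuclear Talks in Geneva Spur High Hopes"],
    "label": [0.9],
})
eval_dataset = train_dataset

# Non-default hyperparameters listed above; all other values use the defaults.
args = SentenceTransformerTrainingArguments(
    output_dir="output",  # illustrative; not specified in this card
    eval_strategy="steps",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    multi_dataset_batch_sampler=MultiDatasetBatchSamplers.ROUND_ROBIN,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()
```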
#### All Hyperparameters