CodeBERT dense retriever

This is a sentence-transformers model fine-tuned from shubharuidas/codebert-embed-base-dense-retriever. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: shubharuidas/codebert-embed-base-dense-retriever
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'RobertaModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
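
The Pooling module above has pooling_mode_mean_tokens set to True, so a sentence embedding is the average of the token vectors, with padded positions left out. The following is a minimal, dependency-free sketch of that idea on a toy sequence. `mean_pool` is a hypothetical helper written for illustration, not the library's implementation, and the 4-dimensional vectors stand in for the model's 768 dimensions.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors of one sequence, skipping padded positions.

    token_embeddings: list of per-token vectors (list[list[float]])
    attention_mask:   list of 0/1 flags, 1 for real tokens, 0 for padding
    """
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding sequence
    return [value / count for value in total]


# Toy sequence: 3 token slots, 4 dimensions (the real model uses 768).
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0, 6.0],
    [9.0, 9.0, 9.0, 9.0],  # padding position, masked out below
]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0, 4.0, 5.0]
```

Because the mask excludes the third slot, the padding vector has no effect on the result.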

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("anaghaj111/codebert-base-code-embed-mrl-langchain-langgraph")
# Run inference
sentences = [
    'Best practices for test_list_namespaces_operations',
    'def test_list_namespaces_operations(\n    fake_embeddings: CharacterEmbeddings,\n) -> None:\n    """Test list namespaces functionality with various filters."""\n    with create_vector_store(\n        fake_embeddings, text_fields=["key0", "key1", "key3"]\n    ) as store:\n        test_pref = str(uuid.uuid4())\n        test_namespaces = [\n            (test_pref, "test", "documents", "public", test_pref),\n            (test_pref, "test", "documents", "private", test_pref),\n            (test_pref, "test", "images", "public", test_pref),\n            (test_pref, "test", "images", "private", test_pref),\n            (test_pref, "prod", "documents", "public", test_pref),\n            (test_pref, "prod", "documents", "some", "nesting", "public", test_pref),\n            (test_pref, "prod", "documents", "private", test_pref),\n        ]\n\n        # Add test data\n        for namespace in test_namespaces:\n            store.put(namespace, "dummy", {"content": "dummy"})\n\n        # Test prefix filtering\n        prefix_result = store.list_namespaces(prefix=(test_pref, "test"))\n        assert len(prefix_result) == 4\n        assert all(ns[1] == "test" for ns in prefix_result)\n\n        # Test specific prefix\n        specific_prefix_result = store.list_namespaces(\n            prefix=(test_pref, "test", "documents")\n        )\n        assert len(specific_prefix_result) == 2\n        assert all(ns[1:3] == ("test", "documents") for ns in specific_prefix_result)\n\n        # Test suffix filtering\n        suffix_result = store.list_namespaces(suffix=("public", test_pref))\n        assert len(suffix_result) == 4\n        assert all(ns[-2] == "public" for ns in suffix_result)\n\n        # Test combined prefix and suffix\n        prefix_suffix_result = store.list_namespaces(\n            prefix=(test_pref, "test"), suffix=("public", test_pref)\n        )\n        assert len(prefix_suffix_result) == 2\n        assert all(\n            ns[1] == "test" and ns[-2] == 
"public" for ns in prefix_suffix_result\n        )\n\n        # Test wildcard in prefix\n        wildcard_prefix_result = store.list_namespaces(\n            prefix=(test_pref, "*", "documents")\n        )\n        assert len(wildcard_prefix_result) == 5\n        assert all(ns[2] == "documents" for ns in wildcard_prefix_result)\n\n        # Test wildcard in suffix\n        wildcard_suffix_result = store.list_namespaces(\n            suffix=("*", "public", test_pref)\n        )\n        assert len(wildcard_suffix_result) == 4\n        assert all(ns[-2] == "public" for ns in wildcard_suffix_result)\n\n        wildcard_single = store.list_namespaces(\n            suffix=("some", "*", "public", test_pref)\n        )\n        assert len(wildcard_single) == 1\n        assert wildcard_single[0] == (\n            test_pref,\n            "prod",\n            "documents",\n            "some",\n            "nesting",\n            "public",\n            test_pref,\n        )\n\n        # Test max depth\n        max_depth_result = store.list_namespaces(max_depth=3)\n        assert all(len(ns) <= 3 for ns in max_depth_result)\n\n        max_depth_result = store.list_namespaces(\n            max_depth=4, prefix=(test_pref, "*", "documents")\n        )\n        assert len(set(res for res in max_depth_result)) == len(max_depth_result) == 5\n\n        # Test pagination\n        limit_result = store.list_namespaces(prefix=(test_pref,), limit=3)\n        assert len(limit_result) == 3\n\n        offset_result = store.list_namespaces(prefix=(test_pref,), offset=3)\n        assert len(offset_result) == len(test_namespaces) - 3\n\n        empty_prefix_result = store.list_namespaces(prefix=(test_pref,))\n        assert len(empty_prefix_result) == len(test_namespaces)\n        assert set(empty_prefix_result) == set(test_namespaces)\n\n        # Clean up\n        for namespace in test_namespaces:\n            store.delete(namespace, "dummy")',
    'def test_doubly_nested_graph_state(\n    sync_checkpointer: BaseCheckpointSaver,\n) -> None:\n    class State(TypedDict):\n        my_key: str\n\n    class ChildState(TypedDict):\n        my_key: str\n\n    class GrandChildState(TypedDict):\n        my_key: str\n\n    def grandchild_1(state: ChildState):\n        return {"my_key": state["my_key"] + " here"}\n\n    def grandchild_2(state: ChildState):\n        return {\n            "my_key": state["my_key"] + " and there",\n        }\n\n    grandchild = StateGraph(GrandChildState)\n    grandchild.add_node("grandchild_1", grandchild_1)\n    grandchild.add_node("grandchild_2", grandchild_2)\n    grandchild.add_edge("grandchild_1", "grandchild_2")\n    grandchild.set_entry_point("grandchild_1")\n    grandchild.set_finish_point("grandchild_2")\n\n    child = StateGraph(ChildState)\n    child.add_node(\n        "child_1",\n        grandchild.compile(interrupt_before=["grandchild_2"]),\n    )\n    child.set_entry_point("child_1")\n    child.set_finish_point("child_1")\n\n    def parent_1(state: State):\n        return {"my_key": "hi " + state["my_key"]}\n\n    def parent_2(state: State):\n        return {"my_key": state["my_key"] + " and back again"}\n\n    graph = StateGraph(State)\n    graph.add_node("parent_1", parent_1)\n    graph.add_node("child", child.compile())\n    graph.add_node("parent_2", parent_2)\n    graph.set_entry_point("parent_1")\n    graph.add_edge("parent_1", "child")\n    graph.add_edge("child", "parent_2")\n    graph.set_finish_point("parent_2")\n\n    app = graph.compile(checkpointer=sync_checkpointer)\n\n    # test invoke w/ nested interrupt\n    config = {"configurable": {"thread_id": "1"}}\n    assert [\n        c\n        for c in app.stream(\n            {"my_key": "my value"}, config, subgraphs=True, durability="exit"\n        )\n    ] == [\n        ((), {"parent_1": {"my_key": "hi my value"}}),\n        (\n            (AnyStr("child:"), AnyStr("child_1:")),\n            {"grandchild_1": 
{"my_key": "hi my value here"}},\n        ),\n        ((), {"__interrupt__": ()}),\n    ]\n    # get state without subgraphs\n    outer_state = app.get_state(config)\n    assert outer_state == StateSnapshot(\n        values={"my_key": "hi my value"},\n        tasks=(\n            PregelTask(\n                AnyStr(),\n                "child",\n                (PULL, "child"),\n                state={\n                    "configurable": {\n                        "thread_id": "1",\n                        "checkpoint_ns": AnyStr("child"),\n                    }\n                },\n            ),\n        ),\n        next=("child",),\n        config={\n            "configurable": {\n                "thread_id": "1",\n                "checkpoint_ns": "",\n                "checkpoint_id": AnyStr(),\n            }\n        },\n        metadata={\n            "parents": {},\n            "source": "loop",\n            "step": 1,\n        },\n        created_at=AnyStr(),\n        parent_config=None,\n        interrupts=(),\n    )\n    child_state = app.get_state(outer_state.tasks[0].state)\n    assert child_state == StateSnapshot(\n        values={"my_key": "hi my value"},\n        tasks=(\n            PregelTask(\n                AnyStr(),\n                "child_1",\n                (PULL, "child_1"),\n                state={\n                    "configurable": {\n                        "thread_id": "1",\n                        "checkpoint_ns": AnyStr(),\n                    }\n                },\n            ),\n        ),\n        next=("child_1",),\n        config={\n            "configurable": {\n                "thread_id": "1",\n                "checkpoint_ns": AnyStr("child:"),\n                "checkpoint_id": AnyStr(),\n                "checkpoint_map": AnyDict(\n                    {\n                        "": AnyStr(),\n                        AnyStr("child:"): AnyStr(),\n                    }\n                ),\n            }\n        },\n        
metadata={\n            "parents": {"": AnyStr()},\n            "source": "loop",\n            "step": 0,\n        },\n        created_at=AnyStr(),\n        parent_config=None,\n        interrupts=(),\n    )\n    grandchild_state = app.get_state(child_state.tasks[0].state)\n    assert grandchild_state == StateSnapshot(\n        values={"my_key": "hi my value here"},\n        tasks=(\n            PregelTask(\n                AnyStr(),\n                "grandchild_2",\n                (PULL, "grandchild_2"),\n            ),\n        ),\n        next=("grandchild_2",),\n        config={\n            "configurable": {\n                "thread_id": "1",\n                "checkpoint_ns": AnyStr(),\n                "checkpoint_id": AnyStr(),\n                "checkpoint_map": AnyDict(\n                    {\n                        "": AnyStr(),\n                        AnyStr("child:"): AnyStr(),\n                        AnyStr(re.compile(r"child:.+|child1:")): AnyStr(),\n                    }\n                ),\n            }\n        },\n        metadata={\n            "parents": AnyDict(\n                {\n                    "": AnyStr(),\n                    AnyStr("child:"): AnyStr(),\n                }\n            ),\n            "source": "loop",\n            "step": 1,\n        },\n        created_at=AnyStr(),\n        parent_config=None,\n        interrupts=(),\n    )\n    # get state with subgraphs\n    assert app.get_state(config, subgraphs=True) == StateSnapshot(\n        values={"my_key": "hi my value"},\n        tasks=(\n            PregelTask(\n                AnyStr(),\n                "child",\n                (PULL, "child"),\n                state=StateSnapshot(\n                    values={"my_key": "hi my value"},\n                    tasks=(\n                        PregelTask(\n                            AnyStr(),\n                            "child_1",\n                            (PULL, "child_1"),\n                            
state=StateSnapshot(\n                                values={"my_key": "hi my value here"},\n                                tasks=(\n                                    PregelTask(\n                                        AnyStr(),\n                                        "grandchild_2",\n                                        (PULL, "grandchild_2"),\n                                    ),\n                                ),\n                                next=("grandchild_2",),\n                                config={\n                                    "configurable": {\n                                        "thread_id": "1",\n                                        "checkpoint_ns": AnyStr(),\n                                        "checkpoint_id": AnyStr(),\n                                        "checkpoint_map": AnyDict(\n                                            {\n                                                "": AnyStr(),\n                                                AnyStr("child:"): AnyStr(),\n                                                AnyStr(\n                                                    re.compile(r"child:.+|child1:")\n                                                ): AnyStr(),\n                                            }\n                                        ),\n                                    }\n                                },\n                                metadata={\n                                    "parents": AnyDict(\n                                        {\n                                            "": AnyStr(),\n                                            AnyStr("child:"): AnyStr(),\n                                        }\n                                    ),\n                                    "source": "loop",\n                                    "step": 1,\n                                },\n                                created_at=AnyStr(),\n                                
parent_config=None,\n                                interrupts=(),\n                            ),\n                        ),\n                    ),\n                    next=("child_1",),\n                    config={\n                        "configurable": {\n                            "thread_id": "1",\n                            "checkpoint_ns": AnyStr("child:"),\n                            "checkpoint_id": AnyStr(),\n                            "checkpoint_map": AnyDict(\n                                {"": AnyStr(), AnyStr("child:"): AnyStr()}\n                            ),\n                        }\n                    },\n                    metadata={\n                        "parents": {"": AnyStr()},\n                        "source": "loop",\n                        "step": 0,\n                    },\n                    created_at=AnyStr(),\n                    parent_config=None,\n                    interrupts=(),\n                ),\n            ),\n        ),\n        next=("child",),\n        config={\n            "configurable": {\n                "thread_id": "1",\n                "checkpoint_ns": "",\n                "checkpoint_id": AnyStr(),\n            }\n        },\n        metadata={\n            "parents": {},\n            "source": "loop",\n            "step": 1,\n        },\n        created_at=AnyStr(),\n        parent_config=None,\n        interrupts=(),\n    )\n    # # resume\n    assert [c for c in app.stream(None, config, subgraphs=True, durability="exit")] == [\n        (\n            (AnyStr("child:"), AnyStr("child_1:")),\n            {"grandchild_2": {"my_key": "hi my value here and there"}},\n        ),\n        ((AnyStr("child:"),), {"child_1": {"my_key": "hi my value here and there"}}),\n        ((), {"child": {"my_key": "hi my value here and there"}}),\n        ((), {"parent_2": {"my_key": "hi my value here and there and back again"}}),\n    ]\n    # get state with and without subgraphs\n    assert (\n        
app.get_state(config)\n        == app.get_state(config, subgraphs=True)\n        == StateSnapshot(\n            values={"my_key": "hi my value here and there and back again"},\n            tasks=(),\n            next=(),\n            config={\n                "configurable": {\n                    "thread_id": "1",\n                    "checkpoint_ns": "",\n                    "checkpoint_id": AnyStr(),\n                }\n            },\n            metadata={\n                "parents": {},\n                "source": "loop",\n                "step": 3,\n            },\n            created_at=AnyStr(),\n            parent_config=(\n                {\n                    "configurable": {\n                        "thread_id": "1",\n                        "checkpoint_ns": "",\n                        "checkpoint_id": AnyStr(),\n                    }\n                }\n            ),\n            interrupts=(),\n        )\n    )\n\n    # get outer graph history\n    outer_history = list(app.get_state_history(config))\n    assert outer_history == [\n        StateSnapshot(\n            values={"my_key": "hi my value here and there and back again"},\n            tasks=(),\n            next=(),\n            config={\n                "configurable": {\n                    "thread_id": "1",\n                    "checkpoint_ns": "",\n                    "checkpoint_id": AnyStr(),\n                }\n            },\n            metadata={\n                "parents": {},\n                "source": "loop",\n                "step": 3,\n            },\n            created_at=AnyStr(),\n            parent_config={\n                "configurable": {\n                    "thread_id": "1",\n                    "checkpoint_ns": "",\n                    "checkpoint_id": AnyStr(),\n                }\n            },\n            interrupts=(),\n        ),\n        StateSnapshot(\n            values={"my_key": "hi my value"},\n            tasks=(\n                PregelTask(\n          
          AnyStr(),\n                    "child",\n                    (PULL, "child"),\n                    state={\n                        "configurable": {\n                            "thread_id": "1",\n                            "checkpoint_ns": AnyStr("child"),\n                        }\n                    },\n                    result=None,\n                ),\n            ),\n            next=("child",),\n            config={\n                "configurable": {\n                    "thread_id": "1",\n                    "checkpoint_ns": "",\n                    "checkpoint_id": AnyStr(),\n                }\n            },\n            metadata={\n                "parents": {},\n                "source": "loop",\n                "step": 1,\n            },\n            created_at=AnyStr(),\n            parent_config=None,\n            interrupts=(),\n        ),\n    ]\n    # get child graph history\n    child_history = list(app.get_state_history(outer_history[1].tasks[0].state))\n    assert child_history == [\n        StateSnapshot(\n            values={"my_key": "hi my value"},\n            next=("child_1",),\n            config={\n                "configurable": {\n                    "thread_id": "1",\n                    "checkpoint_ns": AnyStr("child:"),\n                    "checkpoint_id": AnyStr(),\n                    "checkpoint_map": AnyDict(\n                        {"": AnyStr(), AnyStr("child:"): AnyStr()}\n                    ),\n                }\n            },\n            metadata={\n                "source": "loop",\n                "step": 0,\n                "parents": {"": AnyStr()},\n            },\n            created_at=AnyStr(),\n            parent_config=None,\n            tasks=(\n                PregelTask(\n                    id=AnyStr(),\n                    name="child_1",\n                    path=(PULL, "child_1"),\n                    state={\n                        "configurable": {\n                            
"thread_id": "1",\n                            "checkpoint_ns": AnyStr("child:"),\n                        }\n                    },\n                    result=None,\n                ),\n            ),\n            interrupts=(),\n        ),\n    ]\n    # get grandchild graph history\n    grandchild_history = list(app.get_state_history(child_history[0].tasks[0].state))\n    assert grandchild_history == [\n        StateSnapshot(\n            values={"my_key": "hi my value here"},\n            next=("grandchild_2",),\n            config={\n                "configurable": {\n                    "thread_id": "1",\n                    "checkpoint_ns": AnyStr(),\n                    "checkpoint_id": AnyStr(),\n                    "checkpoint_map": AnyDict(\n                        {\n                            "": AnyStr(),\n                            AnyStr("child:"): AnyStr(),\n                            AnyStr(re.compile(r"child:.+|child1:")): AnyStr(),\n                        }\n                    ),\n                }\n            },\n            metadata={\n                "source": "loop",\n                "step": 1,\n                "parents": AnyDict(\n                    {\n                        "": AnyStr(),\n                        AnyStr("child:"): AnyStr(),\n                    }\n                ),\n            },\n            created_at=AnyStr(),\n            parent_config=None,\n            tasks=(\n                PregelTask(\n                    id=AnyStr(),\n                    name="grandchild_2",\n                    path=(PULL, "grandchild_2"),\n                    result=None,\n                ),\n            ),\n            interrupts=(),\n        ),\n    ]',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7789, 0.3589],
#         [0.7789, 1.0000, 0.4748],
#         [0.3589, 0.4748, 1.0000]])
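
The similarity scores above are cosine similarities, the similarity function listed in the model description. As a rough illustration of how such scores can rank code snippets against a query, here is a small sketch in plain Python. The 3-dimensional vectors are made-up stand-ins for real 768-dimensional embeddings, and `rank_by_cosine` is a hypothetical helper, not part of Sentence Transformers.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_cosine(query_vec, corpus_vecs):
    """Return (corpus_index, score) pairs sorted from most to least similar."""
    scores = [(idx, cosine(query_vec, vec)) for idx, vec in enumerate(corpus_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy embeddings: one query and three candidate snippets.
query = [1.0, 0.0, 1.0]
corpus = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query, different length
    [1.0, 1.0, 0.0],  # partial overlap with the query
]
ranking = rank_by_cosine(query, corpus)
for idx, score in ranking:
    print(idx, round(score, 4))
```

Cosine similarity ignores vector length, which is why the second candidate ranks first even though it is twice as long as the query.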

Evaluation

Metrics

Information Retrieval

One column per Matryoshka dimension, in the column order of the Training Logs (whose epoch 2.0 cosine_ndcg@10 values match the cosine_ndcg@10 row here).

Metric dim_768 dim_512 dim_256 dim_128 dim_64
cosine_accuracy@1 0.9 0.9 0.9 0.85 0.85
cosine_accuracy@3 0.9 0.9 0.9 0.9 0.9
cosine_accuracy@5 1.0 1.0 1.0 0.95 0.9
cosine_accuracy@10 1.0 1.0 1.0 0.95 1.0
cosine_precision@1 0.9 0.9 0.9 0.85 0.85
cosine_precision@3 0.3 0.3 0.3 0.3 0.3
cosine_precision@5 0.2 0.2 0.2 0.19 0.18
cosine_precision@10 0.1 0.1 0.1 0.095 0.1
cosine_recall@1 0.9 0.9 0.9 0.85 0.85
cosine_recall@3 0.9 0.9 0.9 0.9 0.9
cosine_recall@5 1.0 1.0 1.0 0.95 0.9
cosine_recall@10 1.0 1.0 1.0 0.95 1.0
cosine_ndcg@10 0.9409 0.9409 0.9409 0.8943 0.9074
cosine_mrr@10 0.9225 0.9225 0.9225 0.8767 0.8801
cosine_map@100 0.9225 0.9225 0.9225 0.88 0.8801
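
In the reported metrics, precision@k always equals recall@k divided by k (for example 0.9 / 3 = 0.3 and 0.95 / 5 = 0.19). That pattern is what one would expect if every evaluation query has exactly one relevant document, although the card does not state this. Under that assumption, the sketch below shows how accuracy@k, precision@k, recall@k and MRR@k relate. The query data is invented for illustration, and `query_metrics` and `mean_metrics` are hypothetical helpers, not the evaluator used to produce the numbers above.

```python
def query_metrics(ranked_ids, relevant_id, k):
    """Score one query that has exactly one relevant document."""
    top_k = ranked_ids[:k]
    hit = relevant_id in top_k
    return {
        "accuracy": 1.0 if hit else 0.0,         # is the relevant doc in the top k?
        "precision": (1.0 if hit else 0.0) / k,  # relevant docs in the top k, divided by k
        "recall": 1.0 if hit else 0.0,           # relevant docs found / relevant docs total (1)
        "mrr": 1.0 / (top_k.index(relevant_id) + 1) if hit else 0.0,
    }


def mean_metrics(queries, k):
    """Average the per-query metrics over (ranked_ids, relevant_id) pairs."""
    per_query = [query_metrics(ranked, relevant, k) for ranked, relevant in queries]
    return {
        name: sum(m[name] for m in per_query) / len(per_query)
        for name in per_query[0]
    }


# Toy retrieval results: the relevant document is always "a".
queries = [
    (["a", "b", "c"], "a"),  # found at rank 1
    (["b", "a", "c"], "a"),  # found at rank 2
    (["c", "b", "a"], "a"),  # found at rank 3
    (["b", "c", "d"], "a"),  # not retrieved
]
at_1 = mean_metrics(queries, k=1)
at_3 = mean_metrics(queries, k=3)
print(at_1)
print(at_3)
```

With a single relevant document per query, accuracy@k and recall@k coincide, and precision@k is capped at 1/k. This is why precision@10 cannot exceed 0.1 in the results above.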

Training Details

Training Dataset

Unnamed Dataset

  • Size: 180 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 180 samples:
    anchor: type string; min: 6 tokens, mean: 12.34 tokens, max: 117 tokens
    positive: type string; min: 14 tokens, mean: 273.18 tokens, max: 512 tokens
  • Samples:
    Sample 1
    anchor: How to implement State?
    positive:
        class State(TypedDict):
            messages: Annotated[list[str], operator.add]

    Sample 2
    anchor: Best practices for test_sql_injection_vulnerability
    positive:
        def test_sql_injection_vulnerability(store: SqliteStore) -> None:
            """Test that SQL injection via malicious filter keys is prevented."""
            # Add public and private documents
            store.put(("docs",), "public", {"access": "public", "data": "public info"})
            store.put(
                ("docs",), "private", {"access": "private", "data": "secret", "password": "123"}
            )

            # Normal query - returns 1 public document
            normal = store.search(("docs",), filter={"access": "public"})
            assert len(normal) == 1
            assert normal[0].value["access"] == "public"

            # SQL injection attempt via malicious key should raise ValueError
            malicious_key = "access') = 'public' OR '1'='1' OR json_extract(value, '$."

            with pytest.raises(ValueError, match="Invalid filter key"):
                store.search(("docs",), filter={malicious_key: "dummy"})

    Sample 3
    anchor: Example usage of put_writes
    positive:
        def put_writes(
            self,
            config: RunnableConfig,
            writes: Sequence[tuple[str, Any]],
            task_id: str,
            task_path: str = "",
        ) -> None:
            """Store intermediate writes linked to a checkpoint.

            This method saves intermediate writes associated with a checkpoint to the Postgres database.

            Args:
                config: Configuration of the related checkpoint.
                writes: List of writes to store.
                task_id: Identifier for the task creating the writes.
            """
            query = (
                self.UPSERT_CHECKPOINT_WRITES_SQL
                if all(w[0] in WRITES_IDX_MAP for w in writes)
                else self.INSERT_CHECKPOINT_WRITES_SQL
            )
            with self._cursor(pipeline=True) as cur:
                cur.executemany(
                    query,
                    self._dump_writes(
                        config["configurable"]["thread_id"],
                        config["configurable"]["checkpoint_ns"],
                        config["c...
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
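
To make the loss configuration above more concrete, here is a simplified sketch of the idea in plain Python: an in-batch-negatives ranking loss (each anchor should score highest against its own positive) is computed on embeddings truncated to each Matryoshka dimension, and the weighted results are summed. This is an illustration under assumptions, not the Sentence Transformers implementation. The `scale=20.0` value is assumed to mirror the library default, the toy vectors have 4 dimensions rather than 768, and `mnr_loss` and `matryoshka_loss` are hypothetical names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy over in-batch candidates: anchor i should pick positive i."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, positive) for positive in positives]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


def matryoshka_loss(anchors, positives, dims, weights):
    """Sum the base loss over embeddings truncated to each dimension in dims."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        truncated_anchors = [vec[:dim] for vec in anchors]
        truncated_positives = [vec[:dim] for vec in positives]
        total += weight * mnr_loss(truncated_anchors, truncated_positives)
    return total


# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
anchors = [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]]
positives = [[0.9, 0.1, 0.4, 0.0], [0.1, 0.9, 0.0, 0.6]]
dims, weights = [4, 2], [1, 1]  # the card uses [768, 512, 256, 128, 64], all weighted 1

aligned = matryoshka_loss(anchors, positives, dims, weights)
mismatched = matryoshka_loss(anchors, list(reversed(positives)), dims, weights)
print(aligned, mismatched)
```

Because every prefix of the embedding contributes its own loss term, the leading dimensions are pushed to be useful on their own, which is what allows shorter embeddings to be used at inference time.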
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • gradient_accumulation_steps: 16
  • learning_rate: 2e-05
  • num_train_epochs: 2
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
  • optim: adamw_torch
  • batch_sampler: no_duplicates
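
The batch-related settings above can be checked with a little arithmetic. The sketch below derives the effective batch size and the number of optimizer steps, then evaluates a linear-warmup, cosine-decay learning-rate curve. It assumes a single device, that a trailing partial accumulation group still counts as one optimizer step, and a common formulation of the cosine schedule. The exact behaviour of the trainer (and of the no_duplicates sampler) may differ, but the resulting 3 steps per epoch agrees with the Training Logs.

```python
import math

train_samples = 180   # from the Training Dataset section
per_device_batch = 4  # per_device_train_batch_size
grad_accum = 16       # gradient_accumulation_steps
epochs = 2            # num_train_epochs
warmup_ratio = 0.1
peak_lr = 2e-5        # learning_rate

effective_batch = per_device_batch * grad_accum                  # 64
batches_per_epoch = math.ceil(train_samples / per_device_batch)  # 45
steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)      # 3
total_steps = steps_per_epoch * epochs                           # 6
warmup_steps = math.ceil(total_steps * warmup_ratio)             # 1


def lr_at(step):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


schedule = [lr_at(step) for step in range(total_steps + 1)]
for step, lr in enumerate(schedule):
    print(step, f"{lr:.2e}")
```

With only six optimizer steps in total, warmup covers a single step and the remaining steps follow the cosine decay.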

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step dim_768_cosine_ndcg@10 dim_512_cosine_ndcg@10 dim_256_cosine_ndcg@10 dim_128_cosine_ndcg@10 dim_64_cosine_ndcg@10
1.0 3 0.9409 0.9202 0.9431 0.8412 0.9059
2.0 6 0.9409 0.9409 0.9409 0.8943 0.9074
  • The bold row denotes the saved checkpoint.
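
Since the logs report retrieval quality at 768, 512, 256, 128 and 64 dimensions, embeddings can in principle be shortened by keeping only the leading dimensions. A minimal sketch of that truncation, with re-normalization so that dot products remain cosine similarities, is shown below. The 6-dimensional vector is a made-up stand-in for a real embedding, and `truncate_and_normalize` is a hypothetical helper.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]


# Toy 6-dimensional embedding truncated to 3 dimensions
# (for this model the analogous step would be 768 -> 256, for example).
full = [0.6, 0.0, 0.8, 0.0, 0.1, -0.2]
short = truncate_and_normalize(full, 3)
print(len(short), short)
```

Sentence Transformers also accepts a `truncate_dim` argument when loading a model, which should apply the same truncation without manual slicing; check the documentation of the installed version before relying on it.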

Framework Versions

  • Python: 3.14.0
  • Sentence Transformers: 5.2.2
  • Transformers: 4.57.3
  • PyTorch: 2.9.1
  • Accelerate: 1.12.0
  • Datasets: 4.5.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}