tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:369762
  - loss:CachedMultipleNegativesRankingLoss
base_model: benjamintli/modernbert-cosqa
widget:
  - source_sentence: Return a Python AST node for `recur` occurring inside a `loop`.
    sentences:
      - |-
        def _reset(self, name=None):
                """Revert specified property to default value

                If no property is specified, all properties are returned to default.
                """
                if name is None:
                    for key in self._props:
                        if isinstance(self._props[key], basic.Property):
                            self._reset(key)
                    return
                if name not in self._props:
                    raise AttributeError("Input name '{}' is not a known "
                                         "property or attribute".format(name))
                if not isinstance(self._props[name], basic.Property):
                    raise AttributeError("Cannot reset GettableProperty "
                                         "'{}'".format(name))
                if name in self._defaults:
                    val = self._defaults[name]
                else:
                    val = self._props[name].default
                if callable(val):
                    val = val()
                setattr(self, name, val)
      - |-
        def cancel(self):
                '''
                Cancel a running workflow.

                Args:
                    None

                Returns:
                    None
                '''
                if not self.id:
                    raise WorkflowError('Workflow is not running.  Cannot cancel.')

                if self.batch_values:
                    self.workflow.batch_workflow_cancel(self.id)
                else:
                    self.workflow.cancel(self.id)
      - >-
        def __loop_recur_to_py_ast(ctx: GeneratorContext, node: Recur) ->
        GeneratedPyAST:
            """Return a Python AST node for `recur` occurring inside a `loop`."""
            assert node.op == NodeOp.RECUR

            recur_deps: List[ast.AST] = []
            recur_targets: List[ast.Name] = []
            recur_exprs: List[ast.AST] = []
            for name, expr in zip(ctx.recur_point.binding_names, node.exprs):
                expr_ast = gen_py_ast(ctx, expr)
                recur_deps.extend(expr_ast.dependencies)
                recur_targets.append(ast.Name(id=name, ctx=ast.Store()))
                recur_exprs.append(expr_ast.node)

            if len(recur_targets) == 1:
                assert len(recur_exprs) == 1
                recur_deps.append(ast.Assign(targets=recur_targets, value=recur_exprs[0]))
            else:
                recur_deps.append(
                    ast.Assign(
                        targets=[ast.Tuple(elts=recur_targets, ctx=ast.Store())],
                        value=ast.Tuple(elts=recur_exprs, ctx=ast.Load()),
                    )
                )
            recur_deps.append(ast.Continue())

            return GeneratedPyAST(node=ast.NameConstant(None), dependencies=recur_deps)
  - source_sentence: |-
      Create a :class:`~turicreate.linear_regression.LinearRegression` to
          predict a scalar target variable as a linear function of one or more
          features. In addition to standard numeric and categorical types, features
          can also be extracted automatically from list- or dictionary-type SFrame
          columns.

          The linear regression module can be used for ridge regression, Lasso, and
          elastic net regression (see References for more detail on these methods). By
          default, this model has an l2 regularization weight of 0.01.

          Parameters
          ----------
          dataset : SFrame
              The dataset to use for training the model.

          target : string
              Name of the column containing the target variable.

          features : list[string], optional
              Names of the columns containing features. 'None' (the default) indicates
              that all columns except the target variable should be used as features.

              The features are columns in the input SFrame that can be of the
              following types:

              - *Numeric*: values of numeric type integer or float.

              - *Categorical*: values of type string.

              - *Array*: list of numeric (integer or float) values. Each list element
                is treated as a separate feature in the model.

              - *Dictionary*: key-value pairs with numeric (integer or float) values
                Each key of a dictionary is treated as a separate feature and the
                value in the dictionary corresponds to the value of the feature.
                Dictionaries are ideal for representing sparse data.

              Columns of type *list* are not supported. Convert such feature
              columns to type array if all entries in the list are of numeric
              types. If the lists contain data of mixed types, separate
              them out into different columns.

          l2_penalty : float, optional
              Weight on the l2-regularizer of the model. The larger this weight, the
              more the model coefficients shrink toward 0. This introduces bias into
              the model but decreases variance, potentially leading to better
              predictions. The default value is 0.01; setting this parameter to 0
              corresponds to unregularized linear regression. See the ridge
              regression reference for more detail.

          l1_penalty : float, optional
              Weight on l1 regularization of the model. Like the l2 penalty, the
              higher the l1 penalty, the more the estimated coefficients shrink toward
              0. The l1 penalty, however, completely zeros out sufficiently small
              coefficients, automatically indicating features that are not useful for
              the model. The default weight of 0 prevents any features from being
              discarded. See the LASSO regression reference for more detail.

          solver : string, optional
              Solver to use for training the model. See the references for more detail
              on each solver.

              - *auto (default)*: automatically chooses the best solver for the data
                and model parameters.
              - *newton*: Newton-Raphson
              - *lbfgs*: limited memory BFGS
              - *fista*: accelerated gradient descent

              The model is trained using a carefully engineered collection of methods
              that are automatically picked based on the input data. The ``newton``
              method  works best for datasets with plenty of examples and few features
              (long datasets). Limited memory BFGS (``lbfgs``) is a robust solver for
              wide datasets (i.e datasets with many coefficients).  ``fista`` is the
              default solver for l1-regularized linear regression.  The solvers are
              all automatically tuned and the default options should function well.
              See the solver options guide for setting additional parameters for each
              of the solvers.

              See the user guide for additional details on how the solver is chosen.

          feature_rescaling : boolean, optional
              Feature rescaling is an important pre-processing step that ensures that
              all features are on the same scale. An l2-norm rescaling is performed
              to make sure that all features are of the same norm. Categorical
              features are also rescaled by rescaling the dummy variables that are
              used to represent them. The coefficients are returned in original scale
              of the problem. This process is particularly useful when features
              vary widely in their ranges.

          validation_set : SFrame, optional

              A dataset for monitoring the model's generalization performance.
              For each row of the progress table, the chosen metrics are computed
              for both the provided training dataset and the validation_set. The
              format of this SFrame must be the same as the training set.
              By default this argument is set to 'auto' and a validation set is
              automatically sampled and used for progress printing. If
              validation_set is set to None, then no additional metrics
              are computed. The default value is 'auto'.

          convergence_threshold : float, optional

            Convergence is tested using variation in the training objective. The
            variation in the training objective is calculated using the difference
            between the objective values between two steps. Consider reducing this
            below the default value (0.01) for a more accurately trained model.
            Beware of overfitting (i.e a model that works well only on the training
            data) if this parameter is set to a very low value.

          lbfgs_memory_level : int, optional

            The L-BFGS algorithm keeps track of gradient information from the
            previous ``lbfgs_memory_level`` iterations. The storage requirement for
            each of these gradients is the ``num_coefficients`` in the problem.
            Increasing the ``lbfgs_memory_level`` can help improve the quality of
            the model trained. Setting this to more than ``max_iterations`` has the
            same effect as setting it to ``max_iterations``.

          max_iterations : int, optional

            The maximum number of allowed passes through the data. More passes over
            the data can result in a more accurately trained model. Consider
            increasing this (the default value is 10) if the training accuracy is
            low and the *Grad-Norm* in the display is large.

          step_size : float, optional (fista only)

            The starting step size to use for the ``fista`` and ``gd`` solvers. The
            default is set to 1.0, this is an aggressive setting. If the first
            iteration takes a considerable amount of time, reducing this parameter
            may speed up model training.

          verbose : bool, optional
              If True, print progress updates.

          Returns
          -------
          out : LinearRegression
              A trained model of type
              :class:`~turicreate.linear_regression.LinearRegression`.

          See Also
          --------
          LinearRegression, turicreate.boosted_trees_regression.BoostedTreesRegression, turicreate.regression.create

          Notes
          -----
          - Categorical variables are encoded by creating dummy variables. For a
            variable with :math:`K` categories, the encoding creates :math:`K-1` dummy
            variables, while the first category encountered in the data is used as the
            baseline.

          - For prediction and evaluation of linear regression models with sparse
            dictionary inputs, new keys/columns that were not seen during training
            are silently ignored.

          - Any 'None' values in the data will result in an error being thrown.

          - A constant term is automatically added for the model intercept. This term
            is not regularized.

          - Standard errors on coefficients are only available when `solver=newton`
            or when the default `auto` solver option chooses the newton method and if
            the number of examples in the training data is more than the number of
            coefficients. If standard errors cannot be estimated, a column of `None`
            values are returned.


          References
          ----------
          - Hoerl, A.E. and Kennard, R.W. (1970) `Ridge regression: Biased Estimation
            for Nonorthogonal Problems
            <http://amstat.tandfonline.com/doi/abs/10.1080/00401706.1970.10488634>`_.
            Technometrics 12(1) pp.55-67

          - Tibshirani, R. (1996) `Regression Shrinkage and Selection via the Lasso <h
            ttp://www.jstor.org/discover/10.2307/2346178?uid=3739256&uid=2&uid=4&sid=2
            1104169934983>`_. Journal of the Royal Statistical Society. Series B
            (Methodological) 58(1) pp.267-288.

          - Zhu, C., et al. (1997) `Algorithm 778: L-BFGS-B: Fortran subroutines for
            large-scale bound-constrained optimization
            <https://dl.acm.org/citation.cfm?id=279236>`_. ACM Transactions on
            Mathematical Software 23(4) pp.550-560.

          - Barzilai, J. and Borwein, J. `Two-Point Step Size Gradient Methods
            <http://imajna.oxfordjournals.org/content/8/1/141.short>`_. IMA Journal of
            Numerical Analysis 8(1) pp.141-148.

          - Beck, A. and Teboulle, M. (2009) `A Fast Iterative Shrinkage-Thresholding
            Algorithm for Linear Inverse Problems
            <http://epubs.siam.org/doi/abs/10.1137/080716542>`_. SIAM Journal on
            Imaging Sciences 2(1) pp.183-202.

          - Zhang, T. (2004) `Solving large scale linear prediction problems using
            stochastic gradient descent algorithms
            <https://dl.acm.org/citation.cfm?id=1015332>`_. ICML '04: Proceedings of
            the twenty-first international conference on Machine learning p.116.


          Examples
          --------

          Given an :class:`~turicreate.SFrame` ``sf`` with a list of columns
          [``feature_1`` ... ``feature_K``] denoting features and a target column
          ``target``, we can create a
          :class:`~turicreate.linear_regression.LinearRegression` as follows:

          >>> data =  turicreate.SFrame('https://static.turi.com/datasets/regression/houses.csv')

          >>> model = turicreate.linear_regression.create(data, target='price',
          ...                                  features=['bath', 'bedroom', 'size'])


          For ridge regression, we can set the ``l2_penalty`` parameter higher (the
          default is 0.01). For Lasso regression, we set the l1_penalty higher, and
          for elastic net, we set both to be higher.

          .. sourcecode:: python

            # Ridge regression
            >>> model_ridge = turicreate.linear_regression.create(data, 'price', l2_penalty=0.1)

            # Lasso
            >>> model_lasso = turicreate.linear_regression.create(data, 'price', l2_penalty=0.,
                                                                         l1_penalty=1.0)

            # Elastic net regression
            >>> model_enet  = turicreate.linear_regression.create(data, 'price', l2_penalty=0.5,
                                                                       l1_penalty=0.5)
    sentences:
      - >-
        def create(dataset, target, features=None, l2_penalty=1e-2,
        l1_penalty=0.0,
            solver='auto', feature_rescaling=True,
            convergence_threshold = _DEFAULT_SOLVER_OPTIONS['convergence_threshold'],
            step_size = _DEFAULT_SOLVER_OPTIONS['step_size'],
            lbfgs_memory_level = _DEFAULT_SOLVER_OPTIONS['lbfgs_memory_level'],
            max_iterations = _DEFAULT_SOLVER_OPTIONS['max_iterations'],
            validation_set = "auto",
            verbose=True):

            """
            Create a :class:`~turicreate.linear_regression.LinearRegression` to
            predict a scalar target variable as a linear function of one or more
            features. In addition to standard numeric and categorical types, features
            can also be extracted automatically from list- or dictionary-type SFrame
            columns.

            The linear regression module can be used for ridge regression, Lasso, and
            elastic net regression (see References for more detail on these methods). By
            default, this model has an l2 regularization weight of 0.01.

            Parameters
            ----------
            dataset : SFrame
                The dataset to use for training the model.

            target : string
                Name of the column containing the target variable.

            features : list[string], optional
                Names of the columns containing features. 'None' (the default) indicates
                that all columns except the target variable should be used as features.

                The features are columns in the input SFrame that can be of the
                following types:

                - *Numeric*: values of numeric type integer or float.

                - *Categorical*: values of type string.

                - *Array*: list of numeric (integer or float) values. Each list element
                  is treated as a separate feature in the model.

                - *Dictionary*: key-value pairs with numeric (integer or float) values
                  Each key of a dictionary is treated as a separate feature and the
                  value in the dictionary corresponds to the value of the feature.
                  Dictionaries are ideal for representing sparse data.

                Columns of type *list* are not supported. Convert such feature
                columns to type array if all entries in the list are of numeric
                types. If the lists contain data of mixed types, separate
                them out into different columns.

            l2_penalty : float, optional
                Weight on the l2-regularizer of the model. The larger this weight, the
                more the model coefficients shrink toward 0. This introduces bias into
                the model but decreases variance, potentially leading to better
                predictions. The default value is 0.01; setting this parameter to 0
                corresponds to unregularized linear regression. See the ridge
                regression reference for more detail.

            l1_penalty : float, optional
                Weight on l1 regularization of the model. Like the l2 penalty, the
                higher the l1 penalty, the more the estimated coefficients shrink toward
                0. The l1 penalty, however, completely zeros out sufficiently small
                coefficients, automatically indicating features that are not useful for
                the model. The default weight of 0 prevents any features from being
                discarded. See the LASSO regression reference for more detail.

            solver : string, optional
                Solver to use for training the model. See the references for more detail
                on each solver.

                - *auto (default)*: automatically chooses the best solver for the data
                  and model parameters.
                - *newton*: Newton-Raphson
                - *lbfgs*: limited memory BFGS
                - *fista*: accelerated gradient descent

                The model is trained using a carefully engineered collection of methods
                that are automatically picked based on the input data. The ``newton``
                method  works best for datasets with plenty of examples and few features
                (long datasets). Limited memory BFGS (``lbfgs``) is a robust solver for
                wide datasets (i.e datasets with many coefficients).  ``fista`` is the
                default solver for l1-regularized linear regression.  The solvers are
                all automatically tuned and the default options should function well.
                See the solver options guide for setting additional parameters for each
                of the solvers.

                See the user guide for additional details on how the solver is chosen.

            feature_rescaling : boolean, optional
                Feature rescaling is an important pre-processing step that ensures that
                all features are on the same scale. An l2-norm rescaling is performed
                to make sure that all features are of the same norm. Categorical
                features are also rescaled by rescaling the dummy variables that are
                used to represent them. The coefficients are returned in original scale
                of the problem. This process is particularly useful when features
                vary widely in their ranges.

            validation_set : SFrame, optional

                A dataset for monitoring the model's generalization performance.
                For each row of the progress table, the chosen metrics are computed
                for both the provided training dataset and the validation_set. The
                format of this SFrame must be the same as the training set.
                By default this argument is set to 'auto' and a validation set is
                automatically sampled and used for progress printing. If
                validation_set is set to None, then no additional metrics
                are computed. The default value is 'auto'.

            convergence_threshold : float, optional

              Convergence is tested using variation in the training objective. The
              variation in the training objective is calculated using the difference
              between the objective values between two steps. Consider reducing this
              below the default value (0.01) for a more accurately trained model.
              Beware of overfitting (i.e a model that works well only on the training
              data) if this parameter is set to a very low value.

            lbfgs_memory_level : int, optional

              The L-BFGS algorithm keeps track of gradient information from the
              previous ``lbfgs_memory_level`` iterations. The storage requirement for
              each of these gradients is the ``num_coefficients`` in the problem.
              Increasing the ``lbfgs_memory_level`` can help improve the quality of
              the model trained. Setting this to more than ``max_iterations`` has the
              same effect as setting it to ``max_iterations``.

            max_iterations : int, optional

              The maximum number of allowed passes through the data. More passes over
              the data can result in a more accurately trained model. Consider
              increasing this (the default value is 10) if the training accuracy is
              low and the *Grad-Norm* in the display is large.

            step_size : float, optional (fista only)

              The starting step size to use for the ``fista`` and ``gd`` solvers. The
              default is set to 1.0, this is an aggressive setting. If the first
              iteration takes a considerable amount of time, reducing this parameter
              may speed up model training.

            verbose : bool, optional
                If True, print progress updates.

            Returns
            -------
            out : LinearRegression
                A trained model of type
                :class:`~turicreate.linear_regression.LinearRegression`.

            See Also
            --------
            LinearRegression, turicreate.boosted_trees_regression.BoostedTreesRegression, turicreate.regression.create

            Notes
            -----
            - Categorical variables are encoded by creating dummy variables. For a
              variable with :math:`K` categories, the encoding creates :math:`K-1` dummy
              variables, while the first category encountered in the data is used as the
              baseline.

            - For prediction and evaluation of linear regression models with sparse
              dictionary inputs, new keys/columns that were not seen during training
              are silently ignored.

            - Any 'None' values in the data will result in an error being thrown.

            - A constant term is automatically added for the model intercept. This term
              is not regularized.

            - Standard errors on coefficients are only available when `solver=newton`
              or when the default `auto` solver option chooses the newton method and if
              the number of examples in the training data is more than the number of
              coefficients. If standard errors cannot be estimated, a column of `None`
              values are returned.


            References
            ----------
            - Hoerl, A.E. and Kennard, R.W. (1970) `Ridge regression: Biased Estimation
              for Nonorthogonal Problems
              <http://amstat.tandfonline.com/doi/abs/10.1080/00401706.1970.10488634>`_.
              Technometrics 12(1) pp.55-67

            - Tibshirani, R. (1996) `Regression Shrinkage and Selection via the Lasso <h
              ttp://www.jstor.org/discover/10.2307/2346178?uid=3739256&uid=2&uid=4&sid=2
              1104169934983>`_. Journal of the Royal Statistical Society. Series B
              (Methodological) 58(1) pp.267-288.

            - Zhu, C., et al. (1997) `Algorithm 778: L-BFGS-B: Fortran subroutines for
              large-scale bound-constrained optimization
              <https://dl.acm.org/citation.cfm?id=279236>`_. ACM Transactions on
              Mathematical Software 23(4) pp.550-560.

            - Barzilai, J. and Borwein, J. `Two-Point Step Size Gradient Methods
              <http://imajna.oxfordjournals.org/content/8/1/141.short>`_. IMA Journal of
              Numerical Analysis 8(1) pp.141-148.

            - Beck, A. and Teboulle, M. (2009) `A Fast Iterative Shrinkage-Thresholding
              Algorithm for Linear Inverse Problems
              <http://epubs.siam.org/doi/abs/10.1137/080716542>`_. SIAM Journal on
              Imaging Sciences 2(1) pp.183-202.

            - Zhang, T. (2004) `Solving large scale linear prediction problems using
              stochastic gradient descent algorithms
              <https://dl.acm.org/citation.cfm?id=1015332>`_. ICML '04: Proceedings of
              the twenty-first international conference on Machine learning p.116.


            Examples
            --------

            Given an :class:`~turicreate.SFrame` ``sf`` with a list of columns
            [``feature_1`` ... ``feature_K``] denoting features and a target column
            ``target``, we can create a
            :class:`~turicreate.linear_regression.LinearRegression` as follows:

            >>> data =  turicreate.SFrame('https://static.turi.com/datasets/regression/houses.csv')

            >>> model = turicreate.linear_regression.create(data, target='price',
            ...                                  features=['bath', 'bedroom', 'size'])


            For ridge regression, we can set the ``l2_penalty`` parameter higher (the
            default is 0.01). For Lasso regression, we set the l1_penalty higher, and
            for elastic net, we set both to be higher.

            .. sourcecode:: python

              # Ridge regression
              >>> model_ridge = turicreate.linear_regression.create(data, 'price', l2_penalty=0.1)

              # Lasso
              >>> model_lasso = turicreate.linear_regression.create(data, 'price', l2_penalty=0.,
                                                                           l1_penalty=1.0)

              # Elastic net regression
              >>> model_enet  = turicreate.linear_regression.create(data, 'price', l2_penalty=0.5,
                                                                         l1_penalty=0.5)

            """

            # Regression model names.
            model_name = "regression_linear_regression"
            solver = solver.lower()

            model = _sl.create(dataset, target, model_name, features=features,
                                validation_set = validation_set,
                                solver = solver, verbose = verbose,
                                l2_penalty=l2_penalty, l1_penalty = l1_penalty,
                                feature_rescaling = feature_rescaling,
                                convergence_threshold = convergence_threshold,
                                step_size = step_size,
                                lbfgs_memory_level = lbfgs_memory_level,
                                max_iterations = max_iterations)

            return LinearRegression(model.__proxy__)
      - |-
        def restore(self) -> None:
                """
                Restore the backed-up (non-average) parameter values.
                """
                for name, parameter in self._parameters:
                    parameter.data.copy_(self._backups[name])
      - |-
        def _get_sdict(self, env):
                """
                Returns a dictionary mapping all of the source suffixes of all
                src_builders of this Builder to the underlying Builder that
                should be called first.

                This dictionary is used for each target specified, so we save a
                lot of extra computation by memoizing it for each construction
                environment.

                Note that this is re-computed each time, not cached, because there
                might be changes to one of our source Builders (or one of their
                source Builders, and so on, and so on...) that we can't "see."

                The underlying methods we call cache their computed values,
                though, so we hope repeatedly aggregating them into a dictionary
                like this won't be too big a hit.  We may need to look for a
                better way to do this if performance data show this has turned
                into a significant bottleneck.
                """
                sdict = {}
                for bld in self.get_src_builders(env):
                    for suf in bld.src_suffixes(env):
                        sdict[suf] = bld
                return sdict
  - source_sentence: Traverse the tree below node looking for 'yield [expr]'.
    sentences:
      - |-
        def retrieve_sources():
            """Retrieve sources using spectool
            """
            spectool = find_executable('spectool')
            if not spectool:
                log.warn('spectool is not installed')
                return
            try:
                specfile = spec_fn()
            except Exception:
                return

            cmd = [spectool, "-g", specfile]
            output = subprocess.check_output(' '.join(cmd), shell=True)
            log.warn(output)
      - "def check_subscription(self, request):\n\t\t\"\"\"Redirect to the subscribe page if the user lacks an active subscription.\"\"\"\n\t\tsubscriber = subscriber_request_callback(request)\n\n\t\tif not subscriber_has_active_subscription(subscriber):\n\t\t\tif not SUBSCRIPTION_REDIRECT:\n\t\t\t\traise ImproperlyConfigured(\"DJSTRIPE_SUBSCRIPTION_REDIRECT is not set.\")\n\t\t\treturn redirect(SUBSCRIPTION_REDIRECT)"
      - |-
        def is_generator(self, node):
                """Traverse the tree below node looking for 'yield [expr]'."""
                results = {}
                if self.yield_expr.match(node, results):
                    return True
                for child in node.children:
                    if child.type not in (syms.funcdef, syms.classdef):
                        if self.is_generator(child):
                            return True
                return False
  - source_sentence: >-
      Retrieves the content of an input given a DataSource. The input acts like
      a filter over the outputs of the DataSource.

              Args:
                  name (str): The name of the input.
                  ds (openflow.DataSource): The DataSource that will feed the data.

              Returns:
                  pandas.DataFrame: The content of the input.
    sentences:
      - |-
        def valid_state(state: str) -> bool:
            """Validate State Argument

            Checks that either 'on' or 'off' was entered as an argument to the
            CLI and make it lower case.

            :param state: state to validate.

            :returns: True if state is valid.

            .. versionchanged:: 0.0.12
                This moethod was renamed from validateState to valid_state to conform
                to PEP-8. Also removed "magic" text for state and instead reference the
                _VALID_STATES constant.
            """
            lower_case_state = state.lower()

            if lower_case_state in _VALID_STATES:
                return True
            return False
      - |-
        def get_input(self, name, ds):
                """
                Retrieves the content of an input given a DataSource. The input acts like a filter over the outputs of the DataSource.

                Args:
                    name (str): The name of the input.
                    ds (openflow.DataSource): The DataSource that will feed the data.

                Returns:
                    pandas.DataFrame: The content of the input.
                """
                columns = self.inputs.get(name)
                df = ds.get_dataframe()

                # set defaults
                for column in columns:
                    if column not in df.columns:
                        df[column] = self.defaults.get(column)

                return df[columns]
      - |-
        def get_scenario_data(scenario_id,**kwargs):
            """
                Get all the datasets from the group with the specified name
                @returns a list of dictionaries
            """
            user_id = kwargs.get('user_id')

            scenario_data = db.DBSession.query(Dataset).filter(Dataset.id==ResourceScenario.dataset_id, ResourceScenario.scenario_id==scenario_id).options(joinedload_all('metadata')).distinct().all()

            for sd in scenario_data:
               if sd.hidden == 'Y':
                   try:
                        sd.check_read_permission(user_id)
                   except:
                       sd.value      = None
                       sd.metadata = []

            db.DBSession.expunge_all()

            log.info("Retrieved %s datasets", len(scenario_data))
            return scenario_data
  - source_sentence: |-
      Split the data object along a given expression, in units.

              Parameters
              ----------
              expression : int or str
                  The expression to split along. If given as an integer, the axis at that index
                  is used.
              positions : number-type or 1D array-type
                  The position(s) to split at, in units.
              units : str (optional)
                  The units of the given positions. Default is same, which assumes
                  input units are identical to first variable units.
              parent : WrightTools.Collection (optional)
                  The parent collection in which to place the 'split' collection.
                  Default is a new Collection.
              verbose : bool (optional)
                  Toggle talkback. Default is True.

              Returns
              -------
              WrightTools.collection.Collection
                  A Collection of data objects.
                  The order of the objects is such that the axis points retain their original order.

              See Also
              --------
              chop
                  Divide the dataset into its lower-dimensionality components.
              collapse
                  Collapse the dataset along one axis.
    sentences:
      - >-
        def add_item(self, title, key, synonyms=None, description=None,
        img_url=None):
                """Adds item to a list or carousel card.

                A list must contain at least 2 items, each requiring a title and object key.

                Arguments:
                    title {str} -- Name of the item object
                    key {str} -- Key refering to the item.
                                This string will be used to send a query to your app if selected

                Keyword Arguments:
                    synonyms {list} -- Words and phrases the user may send to select the item
                                      (default: {None})
                    description {str} -- A description of the item (default: {None})
                    img_url {str} -- URL of the image to represent the item (default: {None})
                """
                item = build_item(title, key, synonyms, description, img_url)
                self._items.append(item)
                return self
      - |-
        def compare(a, b):
            """Compares two timestamps.

            ``a`` and ``b`` must be the same type, in addition to normal
            representations of timestamps that order naturally, they can be rfc3339
            formatted strings.

            Args:
              a (string|object): a timestamp
              b (string|object): another timestamp

            Returns:
              int: -1 if a < b, 0 if a == b or 1 if a > b

            Raises:
              ValueError: if a or b are not the same type
              ValueError: if a or b strings but not in valid rfc3339 format

            """
            a_is_text = isinstance(a, basestring)
            b_is_text = isinstance(b, basestring)
            if type(a) != type(b) and not (a_is_text and b_is_text):
                _logger.error(u'Cannot compare %s to %s, types differ %s!=%s',
                              a, b, type(a), type(b))
                raise ValueError(u'cannot compare inputs of differing types')

            if a_is_text:
                a = from_rfc3339(a, with_nanos=True)
                b = from_rfc3339(b, with_nanos=True)

            if a < b:
                return -1
            elif a > b:
                return 1
            else:
                return 0
      - |-
        def split(
                self, expression, positions, *, units=None, parent=None, verbose=True
            ) -> wt_collection.Collection:
                """
                Split the data object along a given expression, in units.

                Parameters
                ----------
                expression : int or str
                    The expression to split along. If given as an integer, the axis at that index
                    is used.
                positions : number-type or 1D array-type
                    The position(s) to split at, in units.
                units : str (optional)
                    The units of the given positions. Default is same, which assumes
                    input units are identical to first variable units.
                parent : WrightTools.Collection (optional)
                    The parent collection in which to place the 'split' collection.
                    Default is a new Collection.
                verbose : bool (optional)
                    Toggle talkback. Default is True.

                Returns
                -------
                WrightTools.collection.Collection
                    A Collection of data objects.
                    The order of the objects is such that the axis points retain their original order.

                See Also
                --------
                chop
                    Divide the dataset into its lower-dimensionality components.
                collapse
                    Collapse the dataset along one axis.
                """
                # axis ------------------------------------------------------------------------------------
                old_expr = self.axis_expressions
                old_units = self.units
                out = wt_collection.Collection(name="split", parent=parent)
                if isinstance(expression, int):
                    if units is None:
                        units = self._axes[expression].units
                    expression = self._axes[expression].expression
                elif isinstance(expression, str):
                    pass
                else:
                    raise TypeError("expression: expected {int, str}, got %s" % type(expression))

                self.transform(expression)
                if units:
                    self.convert(units)

                try:
                    positions = [-np.inf] + sorted(list(positions)) + [np.inf]
                except TypeError:
                    positions = [-np.inf, positions, np.inf]

                values = self._axes[0].full
                masks = [(values >= lo) & (values < hi) for lo, hi in wt_kit.pairwise(positions)]
                omasks = []
                cuts = []
                for mask in masks:
                    try:
                        omasks.append(wt_kit.mask_reduce(mask))
                        cuts.append([i == 1 for i in omasks[-1].shape])
                        # Ensure at least one axis is kept
                        if np.all(cuts[-1]):
                            cuts[-1][0] = False
                    except ValueError:
                        omasks.append(None)
                        cuts.append(None)
                for i in range(len(positions) - 1):
                    out.create_data("split%03i" % i)

                for var in self.variables:
                    for i, (imask, omask, cut) in enumerate(zip(masks, omasks, cuts)):
                        if omask is None:
                            # Zero length split
                            continue
                        omask = wt_kit.enforce_mask_shape(omask, var.shape)
                        omask.shape = tuple([s for s, c in zip(omask.shape, cut) if not c])
                        out_arr = np.full(omask.shape, np.nan)
                        imask = wt_kit.enforce_mask_shape(imask, var.shape)
                        out_arr[omask] = var[:][imask]
                        out[i].create_variable(values=out_arr, **var.attrs)

                for ch in self.channels:
                    for i, (imask, omask, cut) in enumerate(zip(masks, omasks, cuts)):
                        if omask is None:
                            # Zero length split
                            continue
                        omask = wt_kit.enforce_mask_shape(omask, ch.shape)
                        omask.shape = tuple([s for s, c in zip(omask.shape, cut) if not c])
                        out_arr = np.full(omask.shape, np.nan)
                        imask = wt_kit.enforce_mask_shape(imask, ch.shape)
                        out_arr[omask] = ch[:][imask]
                        out[i].create_channel(values=out_arr, **ch.attrs)

                if verbose:
                    for d in out.values():
                        try:
                            d.transform(expression)
                        except IndexError:
                            continue

                    print("split data into {0} pieces along <{1}>:".format(len(positions) - 1, expression))
                    for i, (lo, hi) in enumerate(wt_kit.pairwise(positions)):
                        new_data = out[i]
                        if new_data.shape == ():
                            print("  {0} : None".format(i))
                        else:
                            new_axis = new_data.axes[0]
                            print(
                                "  {0} : {1:0.2f} to {2:0.2f} {3} {4}".format(
                                    i, lo, hi, new_axis.units, new_axis.shape
                                )
                            )

                for d in out.values():
                    try:
                        d.transform(*old_expr)
                        keep = []
                        keep_units = []
                        for ax in d.axes:
                            if ax.size > 1:
                                keep.append(ax.expression)
                                keep_units.append(ax.units)
                            else:
                                d.create_constant(ax.expression, verbose=False)
                        d.transform(*keep)
                        for ax, u in zip(d.axes, keep_units):
                            ax.convert(u)
                    except IndexError:
                        continue
                    tempax = Axis(d, expression)
                    if all(
                        np.all(
                            np.sum(~np.isnan(tempax.masked), axis=tuple(set(range(tempax.ndim)) - {j}))
                            <= 1
                        )
                        for j in range(tempax.ndim)
                    ):
                        d.create_constant(expression, verbose=False)
                self.transform(*old_expr)
                for ax, u in zip(self.axes, old_units):
                    ax.convert(u)

                return out
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: SentenceTransformer based on benjamintli/modernbert-cosqa
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: eval
          type: eval
        metrics:
          - type: cosine_accuracy@1
            value: 0.9480526153529956
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.9703010995786662
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.9751824067413422
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.9806803000719351
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.9480526153529956
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.32343369985955533
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19503648134826843
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09806803000719352
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.9480526153529956
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.9703010995786662
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.9751824067413422
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.9806803000719351
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9652143122800294
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9601788099886978
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.9606213024321194
            name: Cosine Map@100

SentenceTransformer based on benjamintli/modernbert-cosqa

This is a sentence-transformers model fine-tuned from benjamintli/modernbert-cosqa. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: benjamintli/modernbert-cosqa
Maximum Sequence Length: 512 tokens
Output Dimensionality: 768 dimensions
Similarity Function: Cosine Similarity

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'OptimizedModule'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
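The Pooling module above enables only `pooling_mode_mean_tokens`, so a sequence embedding is the average of its token embeddings. The following is a minimal pure-Python sketch of masked mean pooling, not the library's implementation; the function name `mean_pool` and the toy 2-dimensional vectors are illustrative assumptions (the model itself produces 768 dimensions).

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padded positions (mask value 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                total[i] += x
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in total]


# Hypothetical toy input: two real tokens followed by one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

The padding vector is ignored because its mask entry is 0, which is why the large values in the third row do not affect the result.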

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("modernbert-codesearchnet")
# Run inference
queries = [
    "Split the data object along a given expression, in units.\n\n        Parameters\n        ----------\n        expression : int or str\n            The expression to split along. If given as an integer, the axis at that index\n            is used.\n        positions : number-type or 1D array-type\n            The position(s) to split at, in units.\n        units : str (optional)\n            The units of the given positions. Default is same, which assumes\n            input units are identical to first variable units.\n        parent : WrightTools.Collection (optional)\n            The parent collection in which to place the \u0027split\u0027 collection.\n            Default is a new Collection.\n        verbose : bool (optional)\n            Toggle talkback. Default is True.\n\n        Returns\n        -------\n        WrightTools.collection.Collection\n            A Collection of data objects.\n            The order of the objects is such that the axis points retain their original order.\n\n        See Also\n        --------\n        chop\n            Divide the dataset into its lower-dimensionality components.\n        collapse\n            Collapse the dataset along one axis.",
]
documents = [
    'def split(\n        self, expression, positions, *, units=None, parent=None, verbose=True\n    ) -> wt_collection.Collection:\n        """\n        Split the data object along a given expression, in units.\n\n        Parameters\n        ----------\n        expression : int or str\n            The expression to split along. If given as an integer, the axis at that index\n            is used.\n        positions : number-type or 1D array-type\n            The position(s) to split at, in units.\n        units : str (optional)\n            The units of the given positions. Default is same, which assumes\n            input units are identical to first variable units.\n        parent : WrightTools.Collection (optional)\n            The parent collection in which to place the \'split\' collection.\n            Default is a new Collection.\n        verbose : bool (optional)\n            Toggle talkback. Default is True.\n\n        Returns\n        -------\n        WrightTools.collection.Collection\n            A Collection of data objects.\n            The order of the objects is such that the axis points retain their original order.\n\n        See Also\n        --------\n        chop\n            Divide the dataset into its lower-dimensionality components.\n        collapse\n            Collapse the dataset along one axis.\n        """\n        # axis ------------------------------------------------------------------------------------\n        old_expr = self.axis_expressions\n        old_units = self.units\n        out = wt_collection.Collection(name="split", parent=parent)\n        if isinstance(expression, int):\n            if units is None:\n                units = self._axes[expression].units\n            expression = self._axes[expression].expression\n        elif isinstance(expression, str):\n            pass\n        else:\n            raise TypeError("expression: expected {int, str}, got %s" % type(expression))\n\n        self.transform(expression)\n        if units:\n            self.convert(units)\n\n        try:\n            positions = [-np.inf] + sorted(list(positions)) + [np.inf]\n        except TypeError:\n            positions = [-np.inf, positions, np.inf]\n\n        values = self._axes[0].full\n        masks = [(values >= lo) & (values < hi) for lo, hi in wt_kit.pairwise(positions)]\n        omasks = []\n        cuts = []\n        for mask in masks:\n            try:\n                omasks.append(wt_kit.mask_reduce(mask))\n                cuts.append([i == 1 for i in omasks[-1].shape])\n                # Ensure at least one axis is kept\n                if np.all(cuts[-1]):\n                    cuts[-1][0] = False\n            except ValueError:\n                omasks.append(None)\n                cuts.append(None)\n        for i in range(len(positions) - 1):\n            out.create_data("split%03i" % i)\n\n        for var in self.variables:\n            for i, (imask, omask, cut) in enumerate(zip(masks, omasks, cuts)):\n                if omask is None:\n                    # Zero length split\n                    continue\n                omask = wt_kit.enforce_mask_shape(omask, var.shape)\n                omask.shape = tuple([s for s, c in zip(omask.shape, cut) if not c])\n                out_arr = np.full(omask.shape, np.nan)\n                imask = wt_kit.enforce_mask_shape(imask, var.shape)\n                out_arr[omask] = var[:][imask]\n                out[i].create_variable(values=out_arr, **var.attrs)\n\n        for ch in self.channels:\n            for i, (imask, omask, cut) in enumerate(zip(masks, omasks, cuts)):\n                if omask is None:\n                    # Zero length split\n                    continue\n                omask = wt_kit.enforce_mask_shape(omask, ch.shape)\n                omask.shape = tuple([s for s, c in zip(omask.shape, cut) if not c])\n                out_arr = np.full(omask.shape, np.nan)\n                imask = wt_kit.enforce_mask_shape(imask, ch.shape)\n                out_arr[omask] = ch[:][imask]\n                out[i].create_channel(values=out_arr, **ch.attrs)\n\n        if verbose:\n            for d in out.values():\n                try:\n                    d.transform(expression)\n                except IndexError:\n                    continue\n\n            print("split data into {0} pieces along <{1}>:".format(len(positions) - 1, expression))\n            for i, (lo, hi) in enumerate(wt_kit.pairwise(positions)):\n                new_data = out[i]\n                if new_data.shape == ():\n                    print("  {0} : None".format(i))\n                else:\n                    new_axis = new_data.axes[0]\n                    print(\n                        "  {0} : {1:0.2f} to {2:0.2f} {3} {4}".format(\n                            i, lo, hi, new_axis.units, new_axis.shape\n                        )\n                    )\n\n        for d in out.values():\n            try:\n                d.transform(*old_expr)\n                keep = []\n                keep_units = []\n                for ax in d.axes:\n                    if ax.size > 1:\n                        keep.append(ax.expression)\n                        keep_units.append(ax.units)\n                    else:\n                        d.create_constant(ax.expression, verbose=False)\n                d.transform(*keep)\n                for ax, u in zip(d.axes, keep_units):\n                    ax.convert(u)\n            except IndexError:\n                continue\n            tempax = Axis(d, expression)\n            if all(\n                np.all(\n                    np.sum(~np.isnan(tempax.masked), axis=tuple(set(range(tempax.ndim)) - {j}))\n                    <= 1\n                )\n                for j in range(tempax.ndim)\n            ):\n                d.create_constant(expression, verbose=False)\n        self.transform(*old_expr)\n        for ax, u in zip(self.axes, old_units):\n            ax.convert(u)\n\n        return out',
    'def add_item(self, title, key, synonyms=None, description=None, img_url=None):\n        """Adds item to a list or carousel card.\n\n        A list must contain at least 2 items, each requiring a title and object key.\n\n        Arguments:\n            title {str} -- Name of the item object\n            key {str} -- Key refering to the item.\n                        This string will be used to send a query to your app if selected\n\n        Keyword Arguments:\n            synonyms {list} -- Words and phrases the user may send to select the item\n                              (default: {None})\n            description {str} -- A description of the item (default: {None})\n            img_url {str} -- URL of the image to represent the item (default: {None})\n        """\n        item = build_item(title, key, synonyms, description, img_url)\n        self._items.append(item)\n        return self',
    'def compare(a, b):\n    """Compares two timestamps.\n\n    ``a`` and ``b`` must be the same type, in addition to normal\n    representations of timestamps that order naturally, they can be rfc3339\n    formatted strings.\n\n    Args:\n      a (string|object): a timestamp\n      b (string|object): another timestamp\n\n    Returns:\n      int: -1 if a < b, 0 if a == b or 1 if a > b\n\n    Raises:\n      ValueError: if a or b are not the same type\n      ValueError: if a or b strings but not in valid rfc3339 format\n\n    """\n    a_is_text = isinstance(a, basestring)\n    b_is_text = isinstance(b, basestring)\n    if type(a) != type(b) and not (a_is_text and b_is_text):\n        _logger.error(u\'Cannot compare %s to %s, types differ %s!=%s\',\n                      a, b, type(a), type(b))\n        raise ValueError(u\'cannot compare inputs of differing types\')\n\n    if a_is_text:\n        a = from_rfc3339(a, with_nanos=True)\n        b = from_rfc3339(b, with_nanos=True)\n\n    if a < b:\n        return -1\n    elif a > b:\n        return 1\n    else:\n        return 0',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.9188, 0.1817, 0.1583]])
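The similarity scores above are cosine similarities, so retrieval reduces to sorting documents by score for each query. A minimal pure-Python sketch of both steps follows; `cos_sim` and `rank` are illustrative helper names rather than library functions, and the scores are the ones printed in the example output above.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(scores):
    """Return document indices ordered from most to least similar."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Scores for the single query against the three documents, as printed above.
scores = [0.9188, 0.1817, 0.1583]
order = rank(scores)
print(order)  # [0, 1, 2]
```

The first document (the `split` implementation) ranks first, which matches the query built from its docstring.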

Evaluation

Metrics

Information Retrieval

Dataset: eval
Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.9481
cosine_accuracy@3	0.9703
cosine_accuracy@5	0.9752
cosine_accuracy@10	0.9807
cosine_precision@1	0.9481
cosine_precision@3	0.3234
cosine_precision@5	0.1950
cosine_precision@10	0.0981
cosine_recall@1	0.9481
cosine_recall@3	0.9703
cosine_recall@5	0.9752
cosine_recall@10	0.9807
cosine_ndcg@10	0.9652
cosine_mrr@10	0.9602
cosine_map@100	0.9606
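The precision values in the table follow from the accuracy values if each query has exactly one relevant document, which the identical accuracy@k and recall@k figures suggest: a hit anywhere in the top k then contributes 1/k to precision@k. A quick arithmetic check under that assumption, using the rounded values from the table:

```python
# Rounded values copied from the metrics table above.
accuracy = {1: 0.9481, 3: 0.9703, 5: 0.9752, 10: 0.9807}
reported_precision = {1: 0.9481, 3: 0.3234, 5: 0.1950, 10: 0.0981}

# With one relevant document per query, precision@k = accuracy@k / k.
derived_precision = {k: acc / k for k, acc in accuracy.items()}

for k in accuracy:
    gap = abs(derived_precision[k] - reported_precision[k])
    print(k, round(derived_precision[k], 4), gap < 1e-4)
```

Each derived value agrees with the reported precision to within rounding, so the low precision@10 reflects the single-positive setup rather than poor ranking.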

Training Details

Training Dataset

Unnamed Dataset

Size: 369,762 training samples
Columns: query and positive
Approximate statistics based on the first 1000 samples:

	query	positive
type	string	string
details	min: 3 tokens mean: 71.9 tokens max: 512 tokens	min: 37 tokens mean: 236.1 tokens max: 512 tokens

Samples:

query	positive
`Returns group object for datacenter root group. >>> clc.v2.Datacenter().RootGroup() >>> print _ WA1 Hardware`	`def RootGroup(self): """Returns group object for datacenter root group. >>> clc.v2.Datacenter().RootGroup() >>> print _ WA1 Hardware """ return(clc.v2.Group(id=self.root_group_id,alias=self.alias,session=self.session))`
`Calculate the euclidean distance of all array positions in "matchArr". :param matchArr: a dictionary of numpy.arrays containing at least two entries that are treated as cartesian coordinates. :param tKey: #TODO: docstring :param mKey: #TODO: docstring :returns: #TODO: docstring {'eucDist': numpy.array([eucDistance, eucDistance, ...]), 'posPairs': numpy.array([[pos1, pos2], [pos1, pos2], ...]) }`	def calcDistMatchArr(matchArr, tKey, mKey): """Calculate the euclidean distance of all array positions in "matchArr". :param matchArr: a dictionary of numpy.arrays containing at least two entries that are treated as cartesian coordinates. :param tKey: #TODO: docstring :param mKey: #TODO: docstring :returns: #TODO: docstring {'eucDist': numpy.array([eucDistance, eucDistance, ...]), 'posPairs': numpy.array([[pos1, pos2], [pos1, pos2], ...]) } """ #Calculate all sorted list of all eucledian feature distances matchArrSize = listvalues(matchArr)[0].size distInfo = {'posPairs': list(), 'eucDist': list()} _matrix = numpy.swapaxes(numpy.array([matchArr[tKey], matchArr[mKey]]), 0, 1) for pos1 in range(matchArrSize-1): for pos2 in range(pos1+1, matchArrSize): distInfo['posPairs'].append((pos1, pos2)) distInfo['posPairs'] = numpy.array(distInfo['posPairs']) distInfo['eucD...
`Format this verifier Returns: string: A formatted string`	`def format(self, indent_level, indent_size=4): """Format this verifier Returns: string: A formatted string """ name = self.format_name('Literal', indent_size) if self.long_desc is not None: name += '\n' name += self.wrap_lines('value: %s\n' % str(self._literal), 1, indent_size) return self.wrap_lines(name, indent_level, indent_size)`

Loss: CachedMultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "mini_batch_size": 64,
    "gather_across_devices": false,
    "directions": [
        "query_to_doc"
    ],
    "partition_mode": "joint",
    "hardness_mode": null,
    "hardness_strength": 0.0
}
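With `similarity_fct` set to `cos_sim`, `scale` set to 20.0, and the single `query_to_doc` direction, the objective treats every other positive in the batch as a negative for a given query. The sketch below is a simplified pure-Python illustration of that in-batch ranking loss. It deliberately omits the gradient caching that `mini_batch_size` controls, and the function names and toy vectors are assumptions for illustration only.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def in_batch_ranking_loss(queries, docs, scale=20.0):
    """Mean cross-entropy where docs[i] is the positive for queries[i]."""
    losses = []
    for i, q in enumerate(queries):
        logits = [scale * cos_sim(q, d) for d in docs]
        # Numerically stable log-sum-exp over all documents in the batch.
        peak = max(logits)
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


# Hypothetical toy batch: two queries with orthogonal embeddings.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each positive matches its query
swapped = [[0.0, 1.0], [1.0, 0.0]]   # positives assigned to the wrong query

loss_aligned = in_batch_ranking_loss(queries, aligned)
loss_swapped = in_batch_ranking_loss(queries, swapped)
print(loss_aligned < loss_swapped)  # True
```

The scale multiplies the cosine similarities before the softmax, so a larger value sharpens the distribution and penalizes near-miss negatives more strongly.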

Evaluation Dataset

Unnamed Dataset

Size: 19,462 evaluation samples
Columns: query and positive
Approximate statistics based on the first 1000 samples:

	query	positive
type	string	string
details	min: 3 tokens mean: 71.05 tokens max: 512 tokens	min: 40 tokens mean: 236.22 tokens max: 512 tokens

Samples:

query	positive
Create a new ParticipantInstance :param unicode attributes: An optional string metadata field you can use to store any data you wish. :param unicode twilio_address: The address of the Twilio phone number that the participant is in contact with. :param datetime date_created: The date that this resource was created. :param datetime date_updated: The date that this resource was last updated. :param unicode identity: A unique string identifier for the session participant as Chat User. :param unicode user_address: The address of the participant's device. :returns: Newly created ParticipantInstance :rtype: twilio.rest.messaging.v1.session.participant.ParticipantInstance	def create(self, attributes=values.unset, twilio_address=values.unset, date_created=values.unset, date_updated=values.unset, identity=values.unset, user_address=values.unset): """ Create a new ParticipantInstance :param unicode attributes: An optional string metadata field you can use to store any data you wish. :param unicode twilio_address: The address of the Twilio phone number that the participant is in contact with. :param datetime date_created: The date that this resource was created. :param datetime date_updated: The date that this resource was last updated. :param unicode identity: A unique string identifier for the session participant as Chat User. :param unicode user_address: The address of the participant's device. :returns: Newly created ParticipantInstance :rtype: twilio.rest.messaging.v1.session.participant.ParticipantInstance """ data = values.o...
`It returns absolute url defined by node related to this page`	`def get_absolute_url(self): """ It returns absolute url defined by node related to this page """ try: node = Node.objects.select_related().filter(page=self)[0] return node.get_absolute_url() except Exception, e: raise ValueError(u"Error in {0}.{1}: {2}".format(self.__module__, self.__class__.__name__, e)) return u""`
`Return the current scaled font. :return: A new :class:ScaledFont object, wrapping an existing cairo object.`	`def get_scaled_font(self): """Return the current scaled font. :return: A new :class:ScaledFont object, wrapping an existing cairo object. """ return ScaledFont._from_pointer( cairo.cairo_get_scaled_font(self._pointer), incref=True)`

Loss: CachedMultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "mini_batch_size": 64,
    "gather_across_devices": false,
    "directions": [
        "query_to_doc"
    ],
    "partition_mode": "joint",
    "hardness_mode": null,
    "hardness_strength": 0.0
}
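
To make the parameters above concrete, here is a minimal pure-Python sketch of the in-batch ranking objective in the `query_to_doc` direction, with `scale` 20.0 and cosine similarity. It is an illustration under stated assumptions, not the library's implementation: the function names `cos_sim` and `mnr_loss` and the toy 2-dimensional embeddings are made up for this example. The cached variant cited below (Gao et al., 2021) targets the same objective but processes each batch in chunks of `mini_batch_size` to bound memory; that chunking is not reproduced here.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(queries, docs, scale=20.0):
    """Mean cross-entropy where docs[i] is the positive for queries[i]
    and every other doc in the batch acts as a negative."""
    losses = []
    for i, query in enumerate(queries):
        logits = [scale * cos_sim(query, doc) for doc in docs]
        # Numerically stable log-sum-exp over all docs in the batch.
        top = max(logits)
        log_z = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)

# Toy embeddings (hypothetical): each query matches exactly one doc.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # positives in the right slots
swapped = [[0.0, 1.0], [1.0, 0.0]]   # positives in the wrong slots

loss_aligned = mnr_loss(queries, aligned)
loss_swapped = mnr_loss(queries, swapped)
```

With `scale` at 20.0, a cosine gap of 1.0 between the positive and a negative becomes a logit gap of 20, so the loss is close to zero when positives rank first and close to 20 when they rank last.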

Training Hyperparameters

Non-Default Hyperparameters

per_device_train_batch_size: 8192
num_train_epochs: 1
learning_rate: 2e-06
warmup_steps: 0.1
bf16: True
eval_strategy: epoch
per_device_eval_batch_size: 8192
push_to_hub: True
hub_model_id: modernbert-codesearchnet
load_best_model_at_end: True
dataloader_num_workers: 4
batch_sampler: no_duplicates
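
As a rough sanity check, the batch size above can be related to the step counts reported under Training Logs. The sketch below assumes a single training device (`num_devices` is a hypothetical name, not a value stated in this card), takes the dataset size from the card's `dataset_size:369762` tag, and assumes the `no_duplicates` sampler does not change the number of batches.

```python
import math

dataset_size = 369_762          # from the card's dataset_size tag
per_device_batch_size = 8192    # per_device_train_batch_size
num_devices = 1                 # assumption: not stated in the card

# dataloader_drop_last is False, so the final partial batch still counts.
steps_per_epoch = math.ceil(dataset_size / (per_device_batch_size * num_devices))

# Fraction of the epoch completed at each logged step.
logged_epochs = [round(step / steps_per_epoch, 4) for step in (10, 20, 30, 40)]

print(steps_per_epoch)
print(logged_epochs)
```

Under these assumptions the arithmetic gives 46 steps per epoch, which is consistent with the final logged step and the epoch fractions in the Training Logs table.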

All Hyperparameters

per_device_train_batch_size: 8192
num_train_epochs: 1
max_steps: -1
learning_rate: 2e-06
lr_scheduler_type: linear
lr_scheduler_kwargs: None
warmup_steps: 0.1
optim: adamw_torch_fused
optim_args: None
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
optim_target_modules: None
gradient_accumulation_steps: 1
average_tokens_across_devices: True
max_grad_norm: 1.0
label_smoothing_factor: 0.0
bf16: True
fp16: False
bf16_full_eval: False
fp16_full_eval: False
tf32: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
use_liger_kernel: False
liger_kernel_config: None
use_cache: False
neftune_noise_alpha: None
torch_empty_cache_steps: None
auto_find_batch_size: False
log_on_each_node: True
logging_nan_inf_filter: True
include_num_input_tokens_seen: no
log_level: passive
log_level_replica: warning
disable_tqdm: False
project: huggingface
trackio_space_id: trackio
eval_strategy: epoch
per_device_eval_batch_size: 8192
prediction_loss_only: True
eval_on_start: False
eval_do_concat_batches: True
eval_use_gather_object: False
eval_accumulation_steps: None
include_for_metrics: []
batch_eval_metrics: False
save_only_model: False
save_on_each_node: False
enable_jit_checkpoint: False
push_to_hub: True
hub_private_repo: None
hub_model_id: modernbert-codesearchnet
hub_strategy: every_save
hub_always_push: False
hub_revision: None
load_best_model_at_end: True
ignore_data_skip: False
restore_callback_states_from_checkpoint: False
full_determinism: False
seed: 42
data_seed: None
use_cpu: False
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
dataloader_drop_last: False
dataloader_num_workers: 4
dataloader_pin_memory: True
dataloader_persistent_workers: False
dataloader_prefetch_factor: None
remove_unused_columns: True
label_names: None
train_sampling_strategy: random
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
ddp_backend: None
ddp_timeout: 1800
fsdp: []
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
deepspeed: None
debug: []
skip_memory_metrics: True
do_predict: False
resume_from_checkpoint: None
warmup_ratio: None
local_rank: -1
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional
router_mapping: {}
learning_rate_mapping: {}
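
The combination of `lr_scheduler_type: linear`, `learning_rate: 2e-06`, and `warmup_steps: 0.1` can be pictured with the short sketch below. It assumes that a fractional `warmup_steps` is read as a share of total training steps and rounded up, and it uses the 46 total steps reported under Training Logs. The function `linear_schedule_lr` is a hypothetical helper written for this illustration; it follows the usual linear warmup and linear decay shape rather than calling the trainer's own scheduler.

```python
import math

def linear_schedule_lr(step, total_steps, warmup_steps, base_lr):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)

total_steps = 46                         # final step in the Training Logs table
warmup = math.ceil(0.1 * total_steps)    # assumption: 0.1 is a ratio, rounded up
base_lr = 2e-06                          # learning_rate

schedule = [linear_schedule_lr(s, total_steps, warmup, base_lr)
            for s in range(total_steps + 1)]
peak_step = schedule.index(max(schedule))
```

Under these assumptions the learning rate reaches its 2e-06 peak after about five steps and decays linearly for the rest of the single epoch.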

Training Logs

Epoch	Step	Training Loss	Validation Loss	eval_cosine_ndcg@10
0.2174	10	0.9210	-	-
0.4348	20	0.6679	-	-
0.6522	30	0.5007	-	-
0.8696	40	0.4181	-	-
1.0	46	-	0.0328	0.9652

The final row (epoch 1.0, step 46) is the saved checkpoint.
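
For readers unfamiliar with the `eval_cosine_ndcg@10` column, the sketch below shows how nDCG@10 is commonly computed for a single query. It assumes binary relevance labels listed in ranked order; the evaluator and evaluation data behind the reported 0.9652 are not described in this section, so `ndcg_at_k` is an illustrative helper and not the exact metric code used.

```python
import math

def ndcg_at_k(ranked_relevance, k=10):
    """nDCG@k for one query, given relevance labels in ranked order."""
    def dcg(labels):
        # Position 0 is rank 1, so the discount is log2(rank + 1).
        return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(labels))

    actual = dcg(ranked_relevance[:k])
    ideal = dcg(sorted(ranked_relevance, reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0

# Toy rankings (hypothetical): one relevant document per query.
top_hit = ndcg_at_k([1, 0, 0])             # relevant doc ranked first
second_hit = ndcg_at_k([0, 1, 0])          # relevant doc ranked second
missed = ndcg_at_k([0] * 10 + [1])         # relevant doc ranked eleventh
```

With a single relevant document per query, the score depends only on its rank: it is 1.0 at rank 1, 1 / log2(3) at rank 2, and 0.0 once the document falls outside the top 10. The reported value is then typically the mean over all evaluation queries.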

Framework Versions

Python: 3.12.12
Sentence Transformers: 5.3.0
Transformers: 5.3.0
PyTorch: 2.10.0+cu128
Accelerate: 1.13.0
Datasets: 4.8.2
Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}