
fyaronskiy/english_code_retriever

Tags: Sentence Similarity · sentence-transformers · Safetensors · English · modernbert · code · code-retrieval · retrieval-augmented-generation · rag · python · java · go · php · feature-extraction · Generated from Trainer · loss:MultipleNegativesRankingLoss · Eval Results (legacy) · text-embeddings-inference

How to use fyaronskiy/english_code_retriever with the sentence-transformers library:

    from sentence_transformers import SentenceTransformer
    
    model = SentenceTransformer("fyaronskiy/english_code_retriever")
    
    sentences = [
        "search_query: Finds the top long, short, and absolute positions.\n\n    Parameters\n    ----------\n    positions : pd.DataFrame\n        The positions that the strategy takes over time.\n    top : int, optional\n        How many of each to find (default 10).\n\n    Returns\n    -------\n    df_top_long : pd.DataFrame\n        Top long positions.\n    df_top_short : pd.DataFrame\n        Top short positions.\n    df_top_abs : pd.DataFrame\n        Top absolute positions.",
        "search_document: def symmetric_ema(xolds, yolds, low=None, high=None, n=512, decay_steps=1., low_counts_threshold=1e-8):\n    '''\n    perform symmetric EMA (exponential moving average)\n    smoothing and resampling to an even grid with n points.\n    Does not do extrapolation, so we assume\n    xolds[0] <= low && high <= xolds[-1]\n\n    Arguments:\n\n    xolds: array or list  - x values of data. Needs to be sorted in ascending order\n    yolds: array of list  - y values of data. Has to have the same length as xolds\n\n    low: float            - min value of the new x grid. By default equals to xolds[0]\n    high: float           - max value of the new x grid. By default equals to xolds[-1]\n\n    n: int                - number of points in new x grid\n\n    decay_steps: float    - EMA decay factor, expressed in new x grid steps.\n\n    low_counts_threshold: float or int\n                          - y values with counts less than this value will be set to NaN\n\n    Returns:\n        tuple sum_ys, count_ys where\n            xs        - array with new x grid\n            ys        - array of EMA of y at each point of the new x grid\n            count_ys  - array of EMA of y counts at each point of the new x grid\n\n    '''\n    xs, ys1, count_ys1 = one_sided_ema(xolds, yolds, low, high, n, decay_steps, low_counts_threshold=0)\n    _,  ys2, count_ys2 = one_sided_ema(-xolds[::-1], yolds[::-1], -high, -low, n, decay_steps, low_counts_threshold=0)\n    ys2 = ys2[::-1]\n    count_ys2 = count_ys2[::-1]\n    count_ys = count_ys1 + count_ys2\n    ys = (ys1 * count_ys1 + ys2 * count_ys2) / count_ys\n    ys[count_ys < low_counts_threshold] = np.nan\n    return xs, ys, count_ys",
        "search_document: def project(self, from_shape, to_shape):\n        \"\"\"\n        Project the polygon onto an image with different shape.\n\n        The relative coordinates of all points remain the same.\n        E.g. a point at (x=20, y=20) on an image (width=100, height=200) will be\n        projected on a new image (width=200, height=100) to (x=40, y=10).\n\n        This is intended for cases where the original image is resized.\n        It cannot be used for more complex changes (e.g. padding, cropping).\n\n        Parameters\n        ----------\n        from_shape : tuple of int\n            Shape of the original image. (Before resize.)\n\n        to_shape : tuple of int\n            Shape of the new image. (After resize.)\n\n        Returns\n        -------\n        imgaug.Polygon\n            Polygon object with new coordinates.\n\n        \"\"\"\n        if from_shape[0:2] == to_shape[0:2]:\n            return self.copy()\n        ls_proj = self.to_line_string(closed=False).project(\n            from_shape, to_shape)\n        return self.copy(exterior=ls_proj.coords)",
        "search_document: def get_top_long_short_abs(positions, top=10):\n    \"\"\"\n    Finds the top long, short, and absolute positions.\n\n    Parameters\n    ----------\n    positions : pd.DataFrame\n        The positions that the strategy takes over time.\n    top : int, optional\n        How many of each to find (default 10).\n\n    Returns\n    -------\n    df_top_long : pd.DataFrame\n        Top long positions.\n    df_top_short : pd.DataFrame\n        Top short positions.\n    df_top_abs : pd.DataFrame\n        Top absolute positions.\n    \"\"\"\n\n    positions = positions.drop('cash', axis='columns')\n    df_max = positions.max()\n    df_min = positions.min()\n    df_abs_max = positions.abs().max()\n    df_top_long = df_max[df_max > 0].nlargest(top)\n    df_top_short = df_min[df_min < 0].nsmallest(top)\n    df_top_abs = df_abs_max.nlargest(top)\n    return df_top_long, df_top_short, df_top_abs"
    ]
    embeddings = model.encode(sentences)
    
    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # [4, 4]
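For retrieval, each `search_document: ` embedding is ranked by its similarity to the `search_query: ` embedding. A minimal ranking sketch with NumPy, assuming cosine similarity (the sentence-transformers default) and using small dummy vectors in place of real model output; in practice you would substitute the embeddings returned by `model.encode(...)`:

```python
import numpy as np

# Dummy embeddings standing in for model.encode(...) output:
# row 0 is the query, rows 1..3 are the candidate documents.
embeddings = np.array([
    [0.9, 0.1, 0.0],   # query
    [0.2, 0.9, 0.1],   # document 0
    [0.0, 0.1, 0.9],   # document 1
    [0.8, 0.2, 0.1],   # document 2
], dtype=np.float32)

# Normalize rows so the dot product equals cosine similarity.
norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
unit = embeddings / norms

query, docs = unit[0], unit[1:]
scores = docs @ query            # cosine similarity of each document to the query
ranking = np.argsort(-scores)    # document indices, best match first
print(ranking.tolist())
# [2, 0, 1]
```

The same ranking is what `model.similarity(query_embedding, doc_embeddings)` computes for you; the sketch only makes the cosine step explicit.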
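Note the prefixes in the example above: natural-language queries are encoded with a `search_query: ` prefix and code/docstrings with `search_document: `. A tiny helper (hypothetical names, not part of the model's API) keeps the convention consistent across a codebase:

```python
# Prefixes used in this model's example sentences (see the snippet above).
QUERY_PREFIX = "search_query: "
DOC_PREFIX = "search_document: "

def as_query(text: str) -> str:
    """Prefix a natural-language query before encoding."""
    return QUERY_PREFIX + text

def as_document(code: str) -> str:
    """Prefix a code snippet or docstring before encoding."""
    return DOC_PREFIX + code

print(as_query("Finds the top long, short, and absolute positions."))
# search_query: Finds the top long, short, and absolute positions.
```

Mixing up the prefixes (or omitting them) at encoding time can degrade retrieval quality, since the model was trained with this asymmetric query/document format.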
Files and versions (600 MB, 1 contributor, 8 commits)
Latest commit: Update README.md by fyaronskiy (14807ee, verified, 7 months ago)
  • 1_Pooling: Add new SentenceTransformer model, 7 months ago
  • .gitattributes (1.52 kB): initial commit, 7 months ago
  • README.md (38.7 kB): Update README.md, 7 months ago
  • Rteb_top.jpg (66.8 kB): Upload Rteb_top.jpg, 7 months ago
  • config.json (1.21 kB): Add new SentenceTransformer model, 7 months ago
  • config_sentence_transformers.json (396 Bytes): Update config_sentence_transformers.json, 7 months ago
  • model.safetensors (596 MB): Add new SentenceTransformer model, 7 months ago
  • modules.json (229 Bytes): Add new SentenceTransformer model, 7 months ago
  • sentence_bert_config.json (58 Bytes): Add new SentenceTransformer model, 7 months ago
  • special_tokens_map.json (694 Bytes): Add new SentenceTransformer model, 7 months ago
  • tokenizer.json (3.58 MB): Add new SentenceTransformer model, 7 months ago
  • tokenizer_config.json (21 kB): Add new SentenceTransformer model, 7 months ago