fyaronskiy's picture
Update README.md
9477a7d verified
---
language:
- ru
- en
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:7211755
- loss:MatryoshkaLoss
- loss:CachedMultipleNegativesRankingLoss
- loss:CoSENTLoss
widget:
- source_sentence: Returns the number of parameters in the network.
sentences:
- |-
Удаляет журнал обучения.
Параметры
-----------
kwargs : информация для логирования
Находит элементы для удаления, оставьте пустым, чтобы удалить все логи.
Примеры
---------
Сохранение журнала обучения
>>> db.save_training_log(accuracy=0.33)
>>> db.save_training_log(accuracy=0.44)
Удаление логов, соответствующих требованиям
>>> db.delete_training_log(accuracy=0.33)
Удаление всех логов
>>> db.delete_training_log()
- |-
Создайте глубокую копию объекта Polygon.
Параметры
----------
exterior : список Keypoint или список кортежей или (N,2) ndarray, необязательный
Список точек, определяющих полигон. См. `imgaug.Polygon.__init__` для деталей.
label : None или str
Если не None, то метка скопированного объекта будет установлена в этом значении.
Возвращает
-------
imgaug.Polygon
Глубокая копия.
- Возвращает количество параметров в сети.
- source_sentence: |-
Plots total amount of stocks with an active position, either short
or long. Displays daily total, daily average per month, and
all-time daily average.
Parameters
----------
returns : pd.Series
Daily returns of the strategy, noncumulative.
- See full explanation in tears.create_full_tear_sheet.
positions : pd.DataFrame, optional
Daily net position values.
- See full explanation in tears.create_full_tear_sheet.
legend_loc : matplotlib.loc, optional
The location of the legend on the plot.
ax : matplotlib.Axes, optional
Axes upon which to plot.
**kwargs, optional
Passed to plotting function.
Returns
-------
ax : matplotlib.Axes
The axes that were plotted on.
sentences:
- >-
Графики накопленных скользящих возвратов по сравнению с некоторыми
бенчмарками.
Возвраты бэктеста отображаются зеленым цветом, а возвраты за период вне выборки (живое трейдинг)
красным цветом.
Дополнительно может быть добавлен не параметрический конусный график в область возвратов вне выборки.
Параметры
----------
returns : pd.Series
Ежедневные возвраты стратегии, накапливаемые.
- Полное объяснение см. в tears.create_full_tear_sheet.
factor_returns : pd.Series, необязательный
Ежедневные ненакапливаемые возвраты бенчмарка, к которому вычисляются беты.
Обычно это бенчмарк, например, возвраты рынка.
- Этот параметр имеет тот же стиль, что и returns.
live_start_date : datetime, необязательный
Дата, когда стратегия начала торговлю в режиме реального времени, после
периода бэктеста. Эта дата должна быть нормализована.
logy : bool, необязательный
Включает ли логарифмический масштаб оси Y.
cone_std : float или кортеж, необязательный
Если float, стандартное отклонение для конусных графиков.
Если кортеж, кортеж значений стандартного отклонения для конусных графиков
- Подробнее см. timeseries.forecast_cone_bounds.
legend_loc : matplotlib.loc, необязательный
Расположение легенды на графике.
volatility_match : bool, необязательный
Нормализует ли волатильность возвратов к волатильности бенчмарка.
Это помогает сравнивать стратегии с разной волатильностью. Требуется передача benchmark_rets.
cone_function : функция, необязательная
Функция, используемая для генерации прогнозного вероятностного конуса.
Подпись функции должна соответствовать следующему формату:
def cone(in_sample_returns (pd.Series),
days_to_project_forward (int),
cone_std= (float, или кортеж),
starting_value= (int, или float))
Пример см. в timeseries.forecast_cone_bootstrap.
ax : matplotlib.Axes, необязательный
Оси, на которых будет выполнен график.
**kwargs, необязательный
Передается в функцию отрисовки.
Возвращаемое значение
-------
ax : matplotlib.Axes
Оси, на которых был выполнен график.
- >-
Обработка партии данных с помощью заданной функции с использованием
многопоточности.
Обычно используется для увеличения данных.
Параметры
-----------
data : numpy.array или другие типы
Данные, которые нужно обработать.
thread_count : int
Количество потоков для использования.
fn : function
Функция для обработки данных.
more args : аргументы для `fn`
См. Примеры ниже.
Примеры
--------
Обработка изображений.
>>> images, _, _, _ = tl.files.load_cifar10_dataset(shape=(-1, 32, 32, 3))
>>> images = tl.prepro.threading_data(images[0:32], tl.prepro.zoom, zoom_range=[0.5, 1])
Настроенная функция предварительной обработки изображений.
>>> def distort_img(x):
>>> x = tl.prepro.flip_axis(x, axis=0, is_random=True)
>>> x = tl.prepro.flip_axis(x, axis=1, is_random=True)
>>> x = tl.prepro.crop(x, 100, 100, is_random=True)
>>> return x
>>> images = tl.prepro.threading_data(images, distort_img)
Обработка изображений и масок вместе (обычно используется для задач изображений сегментации).
>>> X, Y --> [batch_size, row, col, 1]
>>> data = tl.prepro.threading_data([_ for _ in zip(X, Y)], tl.prepro.zoom_multi, zoom_range=[0.5, 1], is_random=True)
data --> [batch_size, 2, row, col, 1]
>>> X_, Y_ = data.transpose((1,0,2,3,4))
X_, Y_ --> [batch_size, row, col, 1]
>>> tl.vis.save_image(X_, 'images.png')
>>> tl.vis.save_image(Y_, 'masks.png')
Обработка изображений и масок вместе с использованием ``thread_count``.
>>> X, Y --> [batch_size, row, col, 1]
>>> data = tl.prepro.threading_data(X, tl.prepro.zoom_multi, 8, zoom_range=[0.5, 1], is_random=True)
data --> [batch_size, 2, row, col, 1]
>>> X_, Y_ = data.transpose((1,0,2,3,4))
X_, Y_ --> [batch_size, row, col, 1]
>>> tl.vis.save_image(X_, 'after.png')
>>> tl.vis.save_image(Y_, 'before.png')
Настроенная функция для обработки изображений и масок вместе.
>>> def distort_img(data):
>>> x, y = data
>>> x, y = tl.prepro.flip_axis_multi([x, y], axis=0, is_random=True)
>>> x, y = tl.prepro.flip_axis_multi([x, y], axis=1, is_random=True)
>>> x, y = tl.prepro.crop_multi([x, y], 100, 100, is_random=True)
>>> return x, y
>>> X, Y --> [batch_size, row, col, channel]
>>> data = tl.prepro.threading_data([_ for _ in zip(X, Y)], distort_img)
>>> X_, Y_ = data.transpose((1,0,2,3,4))
Возвращает
-------
list или numpyarray
Обработанные результаты.
Ссылки
----------
- `python queue <https://pymotw.com/2/Queue/index.html#module-Queue>`__
- `run with limited queue <http://effbot.org/librarybook/queue.htm>`__
- >-
Графики общего объема акций с активным позиционированием, либо коротким,
либо длинным. Отображает общий объем ежедневно, среднее ежемесячное значение
и среднее значение за все время.
Параметры
----------
returns : pd.Series
Ежедневные возвраты стратегии, не накапливаемые.
- Полное объяснение см. в tears.create_full_tear_sheet.
positions : pd.DataFrame, опционально
Ежедневные значения чистых позиций.
- Полное объяснение см. в tears.create_full_tear_sheet.
legend_loc : matplotlib.loc, опционально
Расположение легенды на графике.
ax : matplotlib.Axes, опционально
Оси, на которых будет выполнен график.
**kwargs, опционально
Передается в функцию построения графика.
Возвращает
-------
ax : matplotlib.Axes
Оси, на которых был выполнен график.
- source_sentence: >-
T.set_default(k[,d]) -> T.get(k,d), также устанавливает T[k]=d, если k не в
T
sentences:
- |-
def compute_volume_exposures(shares_held, volumes, percentile):
"""
Returns arrays of pth percentile of long, short and gross volume exposures
of an algorithm's held shares
Parameters
----------
shares_held : pd.DataFrame
Daily number of shares held by an algorithm.
- See full explanation in create_risk_tear_sheet
volume : pd.DataFrame
Daily volume per asset
- See full explanation in create_risk_tear_sheet
percentile : float
Percentile to use when computing and plotting volume exposures
- See full explanation in create_risk_tear_sheet
"""
shares_held = shares_held.replace(0, np.nan)
shares_longed = shares_held[shares_held > 0]
shares_shorted = -1 * shares_held[shares_held < 0]
shares_grossed = shares_held.abs()
longed_frac = shares_longed.divide(volumes)
shorted_frac = shares_shorted.divide(volumes)
grossed_frac = shares_grossed.divide(volumes)
# NOTE: To work around a bug in `quantile` with nan-handling in
# pandas 0.18, use np.nanpercentile by applying to each row of
# the dataframe. This is fixed in pandas 0.19.
#
# longed_threshold = 100*longed_frac.quantile(percentile, axis='columns')
# shorted_threshold = 100*shorted_frac.quantile(percentile, axis='columns')
# grossed_threshold = 100*grossed_frac.quantile(percentile, axis='columns')
longed_threshold = 100 * longed_frac.apply(
partial(np.nanpercentile, q=100 * percentile),
axis='columns',
)
shorted_threshold = 100 * shorted_frac.apply(
partial(np.nanpercentile, q=100 * percentile),
axis='columns',
)
grossed_threshold = 100 * grossed_frac.apply(
partial(np.nanpercentile, q=100 * percentile),
axis='columns',
)
return longed_threshold, shorted_threshold, grossed_threshold
- |-
def set_default(self, key, default=None):
"""T.set_default(k[,d]) -> T.get(k,d), also set T[k]=d if k not in T"""
try:
return self.get_value(key)
except KeyError:
self.insert(key, default)
return default
- |-
def find_intersections_with(self, other):
"""
Find all intersection points between the line string and `other`.
Parameters
----------
other : tuple of number or list of tuple of number or \
list of LineString or LineString
The other geometry to use during intersection tests.
Returns
-------
list of list of tuple of number
All intersection points. One list per pair of consecutive start
and end point, i.e. `N-1` lists of `N` points. Each list may
be empty or may contain multiple points.
"""
import shapely.geometry
geom = _convert_var_to_shapely_geometry(other)
result = []
for p_start, p_end in zip(self.coords[:-1], self.coords[1:]):
ls = shapely.geometry.LineString([p_start, p_end])
intersections = ls.intersection(geom)
intersections = list(_flatten_shapely_collection(intersections))
intersections_points = []
for inter in intersections:
if isinstance(inter, shapely.geometry.linestring.LineString):
inter_start = (inter.coords[0][0], inter.coords[0][1])
inter_end = (inter.coords[-1][0], inter.coords[-1][1])
intersections_points.extend([inter_start, inter_end])
else:
assert isinstance(inter, shapely.geometry.point.Point), (
"Expected to find shapely.geometry.point.Point or "
"shapely.geometry.linestring.LineString intersection, "
"actually found %s." % (type(inter),))
intersections_points.append((inter.x, inter.y))
# sort by distance to start point, this makes it later on easier
# to remove duplicate points
inter_sorted = sorted(
intersections_points,
key=lambda p: np.linalg.norm(np.float32(p) - p_start)
)
result.append(inter_sorted)
return result
- source_sentence: |-
Обертка для _log_counter_per_token.
Аргументы:
token: Токен, для которого нужно найти количество.
Возвращает:
Количество раз, когда эта функция вызывалась с *token* в качестве аргумента (начинается с 0)
sentences:
- |-
def _GetNextLogCountPerToken(token):
"""Wrapper for _log_counter_per_token.
Args:
token: The token for which to look up the count.
Returns:
The number of times this function has been called with
*token* as an argument (starting at 0)
"""
global _log_counter_per_token # pylint: disable=global-variable-not-assigned
_log_counter_per_token[token] = 1 + _log_counter_per_token.get(token, -1)
return _log_counter_per_token[token]
- |-
def remove_out_of_image(self, fully=True, partly=False):
"""
Remove all bounding boxes that are fully or partially outside of the image.
Parameters
----------
fully : bool, optional
Whether to remove bounding boxes that are fully outside of the image.
partly : bool, optional
Whether to remove bounding boxes that are partially outside of the image.
Returns
-------
imgaug.BoundingBoxesOnImage
Reduced set of bounding boxes, with those that were fully/partially outside of
the image removed.
"""
bbs_clean = [bb for bb in self.bounding_boxes
if not bb.is_out_of_image(self.shape, fully=fully, partly=partly)]
return BoundingBoxesOnImage(bbs_clean, shape=self.shape)
- |-
def noise4d(self, x, y, z, w):
"""
Generate 4D OpenSimplex noise from X,Y,Z,W coordinates.
"""
# Place input coordinates on simplectic honeycomb.
stretch_offset = (x + y + z + w) * STRETCH_CONSTANT_4D
xs = x + stretch_offset
ys = y + stretch_offset
zs = z + stretch_offset
ws = w + stretch_offset
# Floor to get simplectic honeycomb coordinates of rhombo-hypercube super-cell origin.
xsb = floor(xs)
ysb = floor(ys)
zsb = floor(zs)
wsb = floor(ws)
# Skew out to get actual coordinates of stretched rhombo-hypercube origin. We'll need these later.
squish_offset = (xsb + ysb + zsb + wsb) * SQUISH_CONSTANT_4D
xb = xsb + squish_offset
yb = ysb + squish_offset
zb = zsb + squish_offset
wb = wsb + squish_offset
# Compute simplectic honeycomb coordinates relative to rhombo-hypercube origin.
xins = xs - xsb
yins = ys - ysb
zins = zs - zsb
wins = ws - wsb
# Sum those together to get a value that determines which region we're in.
in_sum = xins + yins + zins + wins
# Positions relative to origin po.
dx0 = x - xb
dy0 = y - yb
dz0 = z - zb
dw0 = w - wb
value = 0
extrapolate = self._extrapolate4d
if in_sum <= 1: # We're inside the pentachoron (4-Simplex) at (0,0,0,0)
# Determine which two of (0,0,0,1), (0,0,1,0), (0,1,0,0), (1,0,0,0) are closest.
a_po = 0x01
a_score = xins
b_po = 0x02
b_score = yins
if a_score >= b_score and zins > b_score:
b_score = zins
b_po = 0x04
elif a_score < b_score and zins > a_score:
a_score = zins
a_po = 0x04
if a_score >= b_score and wins > b_score:
b_score = wins
b_po = 0x08
elif a_score < b_score and wins > a_score:
a_score = wins
a_po = 0x08
# Now we determine the three lattice pos not part of the pentachoron that may contribute.
# This depends on the closest two pentachoron vertices, including (0,0,0,0)
uins = 1 - in_sum
if uins > a_score or uins > b_score: # (0,0,0,0) is one of the closest two pentachoron vertices.
c = b_po if (b_score > a_score) else a_po # Our other closest vertex is the closest out of a and b.
if (c & 0x01) == 0:
xsv_ext0 = xsb - 1
xsv_ext1 = xsv_ext2 = xsb
dx_ext0 = dx0 + 1
dx_ext1 = dx_ext2 = dx0
else:
xsv_ext0 = xsv_ext1 = xsv_ext2 = xsb + 1
dx_ext0 = dx_ext1 = dx_ext2 = dx0 - 1
if (c & 0x02) == 0:
ysv_ext0 = ysv_ext1 = ysv_ext2 = ysb
dy_ext0 = dy_ext1 = dy_ext2 = dy0
if (c & 0x01) == 0x01:
ysv_ext0 -= 1
dy_ext0 += 1
else:
ysv_ext1 -= 1
dy_ext1 += 1
else:
ysv_ext0 = ysv_ext1 = ysv_ext2 = ysb + 1
dy_ext0 = dy_ext1 = dy_ext2 = dy0 - 1
if (c & 0x04) == 0:
zsv_ext0 = zsv_ext1 = zsv_ext2 = zsb
dz_ext0 = dz_ext1 = dz_ext2 = dz0
if (c & 0x03) != 0:
if (c & 0x03) == 0x03:
zsv_ext0 -= 1
dz_ext0 += 1
else:
zsv_ext1 -= 1
dz_ext1 += 1
else:
zsv_ext2 -= 1
dz_ext2 += 1
else:
zsv_ext0 = zsv_ext1 = zsv_ext2 = zsb + 1
dz_ext0 = dz_ext1 = dz_ext2 = dz0 - 1
if (c & 0x08) == 0:
wsv_ext0 = wsv_ext1 = wsb
wsv_ext2 = wsb - 1
dw_ext0 = dw_ext1 = dw0
dw_ext2 = dw0 + 1
else:
wsv_ext0 = wsv_ext1 = wsv_ext2 = wsb + 1
dw_ext0 = dw_ext1 = dw_ext2 = dw0 - 1
else: # (0,0,0,0) is not one of the closest two pentachoron vertices.
c = (a_po | b_po) # Our three extra vertices are determined by the closest two.
if (c & 0x01) == 0:
xsv_ext0 = xsv_ext2 = xsb
xsv_ext1 = xsb - 1
dx_ext0 = dx0 - 2 * SQUISH_CONSTANT_4D
dx_ext1 = dx0 + 1 - SQUISH_CONSTANT_4D
dx_ext2 = dx0 - SQUISH_CONSTANT_4D
else:
xsv_ext0 = xsv_ext1 = xsv_ext2 = xsb + 1
dx_ext0 = dx0 - 1 - 2 * SQUISH_CONSTANT_4D
dx_ext1 = dx_ext2 = dx0 - 1 - SQUISH_CONSTANT_4D
if (c & 0x02) == 0:
ysv_ext0 = ysv_ext1 = ysv_ext2 = ysb
dy_ext0 = dy0 - 2 * SQUISH_CONSTANT_4D
dy_ext1 = dy_ext2 = dy0 - SQUISH_CONSTANT_4D
if (c & 0x01) == 0x01:
ysv_ext1 -= 1
dy_ext1 += 1
else:
ysv_ext2 -= 1
dy_ext2 += 1
else:
ysv_ext0 = ysv_ext1 = ysv_ext2 = ysb + 1
dy_ext0 = dy0 - 1 - 2 * SQUISH_CONSTANT_4D
dy_ext1 = dy_ext2 = dy0 - 1 - SQUISH_CONSTANT_4D
if (c & 0x04) == 0:
zsv_ext0 = zsv_ext1 = zsv_ext2 = zsb
dz_ext0 = dz0 - 2 * SQUISH_CONSTANT_4D
dz_ext1 = dz_ext2 = dz0 - SQUISH_CONSTANT_4D
if (c & 0x03) == 0x03:
zsv_ext1 -= 1
dz_ext1 += 1
else:
zsv_ext2 -= 1
dz_ext2 += 1
else:
zsv_ext0 = zsv_ext1 = zsv_ext2 = zsb + 1
dz_ext0 = dz0 - 1 - 2 * SQUISH_CONSTANT_4D
dz_ext1 = dz_ext2 = dz0 - 1 - SQUISH_CONSTANT_4D
if (c & 0x08) == 0:
wsv_ext0 = wsv_ext1 = wsb
wsv_ext2 = wsb - 1
dw_ext0 = dw0 - 2 * SQUISH_CONSTANT_4D
dw_ext1 = dw0 - SQUISH_CONSTANT_4D
dw_ext2 = dw0 + 1 - SQUISH_CONSTANT_4D
else:
wsv_ext0 = wsv_ext1 = wsv_ext2 = wsb + 1
dw_ext0 = dw0 - 1 - 2 * SQUISH_CONSTANT_4D
dw_ext1 = dw_ext2 = dw0 - 1 - SQUISH_CONSTANT_4D
# Contribution (0,0,0,0)
attn0 = 2 - dx0 * dx0 - dy0 * dy0 - dz0 * dz0 - dw0 * dw0
if attn0 > 0:
attn0 *= attn0
value += attn0 * attn0 * extrapolate(xsb + 0, ysb + 0, zsb + 0, wsb + 0, dx0, dy0, dz0, dw0)
# Contribution (1,0,0,0)
dx1 = dx0 - 1 - SQUISH_CONSTANT_4D
dy1 = dy0 - 0 - SQUISH_CONSTANT_4D
dz1 = dz0 - 0 - SQUISH_CONSTANT_4D
dw1 = dw0 - 0 - SQUISH_CONSTANT_4D
attn1 = 2 - dx1 * dx1 - dy1 * dy1 - dz1 * dz1 - dw1 * dw1
if attn1 > 0:
attn1 *= attn1
value += attn1 * attn1 * extrapolate(xsb + 1, ysb + 0, zsb + 0, wsb + 0, dx1, dy1, dz1, dw1)
# Contribution (0,1,0,0)
dx2 = dx0 - 0 - SQUISH_CONSTANT_4D
dy2 = dy0 - 1 - SQUISH_CONSTANT_4D
dz2 = dz1
dw2 = dw1
attn2 = 2 - dx2 * dx2 - dy2 * dy2 - dz2 * dz2 - dw2 * dw2
if attn2 > 0:
attn2 *= attn2
value += attn2 * attn2 * extrapolate(xsb + 0, ysb + 1, zsb + 0, wsb + 0, dx2, dy2, dz2, dw2)
# Contribution (0,0,1,0)
dx3 = dx2
dy3 = dy1
dz3 = dz0 - 1 - SQUISH_CONSTANT_4D
dw3 = dw1
attn3 = 2 - dx3 * dx3 - dy3 * dy3 - dz3 * dz3 - dw3 * dw3
if attn3 > 0:
attn3 *= attn3
value += attn3 * attn3 * extrapolate(xsb + 0, ysb + 0, zsb + 1, wsb + 0, dx3, dy3, dz3, dw3)
# Contribution (0,0,0,1)
dx4 = dx2
dy4 = dy1
dz4 = dz1
dw4 = dw0 - 1 - SQUISH_CONSTANT_4D
attn4 = 2 - dx4 * dx4 - dy4 * dy4 - dz4 * dz4 - dw4 * dw4
if attn4 > 0:
attn4 *= attn4
value += attn4 * attn4 * extrapolate(xsb + 0, ysb + 0, zsb + 0, wsb + 1, dx4, dy4, dz4, dw4)
elif in_sum >= 3: # We're inside the pentachoron (4-Simplex) at (1,1,1,1)
# Determine which two of (1,1,1,0), (1,1,0,1), (1,0,1,1), (0,1,1,1) are closest.
a_po = 0x0E
a_score = xins
b_po = 0x0D
b_score = yins
if a_score <= b_score and zins < b_score:
b_score = zins
b_po = 0x0B
elif a_score > b_score and zins < a_score:
a_score = zins
a_po = 0x0B
if a_score <= b_score and wins < b_score:
b_score = wins
b_po = 0x07
elif a_score > b_score and wins < a_score:
a_score = wins
a_po = 0x07
# Now we determine the three lattice pos not part of the pentachoron that may contribute.
# This depends on the closest two pentachoron vertices, including (0,0,0,0)
uins = 4 - in_sum
if uins < a_score or uins < b_score: # (1,1,1,1) is one of the closest two pentachoron vertices.
c = b_po if (b_score < a_score) else a_po # Our other closest vertex is the closest out of a and b.
if (c & 0x01) != 0:
xsv_ext0 = xsb + 2
xsv_ext1 = xsv_ext2 = xsb + 1
dx_ext0 = dx0 - 2 - 4 * SQUISH_CONSTANT_4D
dx_ext1 = dx_ext2 = dx0 - 1 - 4 * SQUISH_CONSTANT_4D
else:
xsv_ext0 = xsv_ext1 = xsv_ext2 = xsb
dx_ext0 = dx_ext1 = dx_ext2 = dx0 - 4 * SQUISH_CONSTANT_4D
if (c & 0x02) != 0:
ysv_ext0 = ysv_ext1 = ysv_ext2 = ysb + 1
dy_ext0 = dy_ext1 = dy_ext2 = dy0 - 1 - 4 * SQUISH_CONSTANT_4D
if (c & 0x01) != 0:
ysv_ext1 += 1
dy_ext1 -= 1
else:
ysv_ext0 += 1
dy_ext0 -= 1
else:
ysv_ext0 = ysv_ext1 = ysv_ext2 = ysb
dy_ext0 = dy_ext1 = dy_ext2 = dy0 - 4 * SQUISH_CONSTANT_4D
if (c & 0x04) != 0:
zsv_ext0 = zsv_ext1 = zsv_ext2 = zsb + 1
dz_ext0 = dz_ext1 = dz_ext2 = dz0 - 1 - 4 * SQUISH_CONSTANT_4D
if (c & 0x03) != 0x03:
if (c & 0x03) == 0:
zsv_ext0 += 1
dz_ext0 -= 1
else:
zsv_ext1 += 1
dz_ext1 -= 1
else:
zsv_ext2 += 1
dz_ext2 -= 1
else:
zsv_ext0 = zsv_ext1 = zsv_ext2 = zsb
dz_ext0 = dz_ext1 = dz_ext2 = dz0 - 4 * SQUISH_CONSTANT_4D
if (c & 0x08) != 0:
wsv_ext0 = wsv_ext1 = wsb + 1
wsv_ext2 = wsb + 2
dw_ext0 = dw_ext1 = dw0 - 1 - 4 * SQUISH_CONSTANT_4D
dw_ext2 = dw0 - 2 - 4 * SQUISH_CONSTANT_4D
else:
wsv_ext0 = wsv_ext1 = wsv_ext2 = wsb
dw_ext0 = dw_ext1 = dw_ext2 = dw0 - 4 * SQUISH_CONSTANT_4D
else: # (1,1,1,1) is not one of the closest two pentachoron vertices.
c = (a_po & b_po) # Our three extra vertices are determined by the closest two.
if (c & 0x01) != 0:
xsv_ext0 = xsv_ext2 = xsb + 1
xsv_ext1 = xsb + 2
dx_ext0 = dx0 - 1 - 2 * SQUISH_CONSTANT_4D
dx_ext1 = dx0 - 2 - 3 * SQUISH_CONSTANT_4D
dx_ext2 = dx0 - 1 - 3 * SQUISH_CONSTANT_4D
else:
xsv_ext0 = xsv_ext1 = xsv_ext2 = xsb
dx_ext0 = dx0 - 2 * SQUISH_CONSTANT_4D
dx_ext1 = dx_ext2 = dx0 - 3 * SQUISH_CONSTANT_4D
if (c & 0x02) != 0:
ysv_ext0 = ysv_ext1 = ysv_ext2 = ysb + 1
dy_ext0 = dy0 - 1 - 2 * SQUISH_CONSTANT_4D
dy_ext1 = dy_ext2 = dy0 - 1 - 3 * SQUISH_CONSTANT_4D
if (c & 0x01) != 0:
ysv_ext2 += 1
dy_ext2 -= 1
else:
ysv_ext1 += 1
dy_ext1 -= 1
else:
ysv_ext0 = ysv_ext1 = ysv_ext2 = ysb
dy_ext0 = dy0 - 2 * SQUISH_CONSTANT_4D
dy_ext1 = dy_ext2 = dy0 - 3 * SQUISH_CONSTANT_4D
if (c & 0x04) != 0:
zsv_ext0 = zsv_ext1 = zsv_ext2 = zsb + 1
dz_ext0 = dz0 - 1 - 2 * SQUISH_CONSTANT_4D
dz_ext1 = dz_ext2 = dz0 - 1 - 3 * SQUISH_CONSTANT_4D
if (c & 0x03) != 0:
zsv_ext2 += 1
dz_ext2 -= 1
else:
zsv_ext1 += 1
dz_ext1 -= 1
else:
zsv_ext0 = zsv_ext1 = zsv_ext2 = zsb
dz_ext0 = dz0 - 2 * SQUISH_CONSTANT_4D
dz_ext1 = dz_ext2 = dz0 - 3 * SQUISH_CONSTANT_4D
if (c & 0x08) != 0:
wsv_ext0 = wsv_ext1 = wsb + 1
wsv_ext2 = wsb + 2
dw_ext0 = dw0 - 1 - 2 * SQUISH_CONSTANT_4D
dw_ext1 = dw0 - 1 - 3 * SQUISH_CONSTANT_4D
dw_ext2 = dw0 - 2 - 3 * SQUISH_CONSTANT_4D
else:
wsv_ext0 = wsv_ext1 = wsv_ext2 = wsb
dw_ext0 = dw0 - 2 * SQUISH_CONSTANT_4D
dw_ext1 = dw_ext2 = dw0 - 3 * SQUISH_CONSTANT_4D
# Contribution (1,1,1,0)
dx4 = dx0 - 1 - 3 * SQUISH_CONSTANT_4D
dy4 = dy0 - 1 - 3 * SQUISH_CONSTANT_4D
dz4 = dz0 - 1 - 3 * SQUISH_CONSTANT_4D
dw4 = dw0 - 3 * SQUISH_CONSTANT_4D
attn4 = 2 - dx4 * dx4 - dy4 * dy4 - dz4 * dz4 - dw4 * dw4
if attn4 > 0:
attn4 *= attn4
value += attn4 * attn4 * extrapolate(xsb + 1, ysb + 1, zsb + 1, wsb + 0, dx4, dy4, dz4, dw4)
# Contribution (1,1,0,1)
dx3 = dx4
dy3 = dy4
dz3 = dz0 - 3 * SQUISH_CONSTANT_4D
dw3 = dw0 - 1 - 3 * SQUISH_CONSTANT_4D
attn3 = 2 - dx3 * dx3 - dy3 * dy3 - dz3 * dz3 - dw3 * dw3
if attn3 > 0:
attn3 *= attn3
value += attn3 * attn3 * extrapolate(xsb + 1, ysb + 1, zsb + 0, wsb + 1, dx3, dy3, dz3, dw3)
# Contribution (1,0,1,1)
dx2 = dx4
dy2 = dy0 - 3 * SQUISH_CONSTANT_4D
dz2 = dz4
dw2 = dw3
attn2 = 2 - dx2 * dx2 - dy2 * dy2 - dz2 * dz2 - dw2 * dw2
if attn2 > 0:
attn2 *= attn2
value += attn2 * attn2 * extrapolate(xsb + 1, ysb + 0, zsb + 1, wsb + 1, dx2, dy2, dz2, dw2)
# Contribution (0,1,1,1)
dx1 = dx0 - 3 * SQUISH_CONSTANT_4D
dz1 = dz4
dy1 = dy4
dw1 = dw3
attn1 = 2 - dx1 * dx1 - dy1 * dy1 - dz1 * dz1 - dw1 * dw1
if attn1 > 0:
attn1 *= attn1
value += attn1 * attn1 * extrapolate(xsb + 0, ysb + 1, zsb + 1, wsb + 1, dx1, dy1, dz1, dw1)
# Contribution (1,1,1,1)
dx0 = dx0 - 1 - 4 * SQUISH_CONSTANT_4D
dy0 = dy0 - 1 - 4 * SQUISH_CONSTANT_4D
dz0 = dz0 - 1 - 4 * SQUISH_CONSTANT_4D
dw0 = dw0 - 1 - 4 * SQUISH_CONSTANT_4D
attn0 = 2 - dx0 * dx0 - dy0 * dy0 - dz0 * dz0 - dw0 * dw0
if attn0 > 0:
attn0 *= attn0
value += attn0 * attn0 * extrapolate(xsb + 1, ysb + 1, zsb + 1, wsb + 1, dx0, dy0, dz0, dw0)
elif in_sum <= 2: # We're inside the first dispentachoron (Rectified 4-Simplex)
a_is_bigger_side = True
b_is_bigger_side = True
# Decide between (1,1,0,0) and (0,0,1,1)
if xins + yins > zins + wins:
a_score = xins + yins
a_po = 0x03
else:
a_score = zins + wins
a_po = 0x0C
# Decide between (1,0,1,0) and (0,1,0,1)
if xins + zins > yins + wins:
b_score = xins + zins
b_po = 0x05
else:
b_score = yins + wins
b_po = 0x0A
# Closer between (1,0,0,1) and (0,1,1,0) will replace the further of a and b, if closer.
if xins + wins > yins + zins:
score = xins + wins
if a_score >= b_score and score > b_score:
b_score = score
b_po = 0x09
elif a_score < b_score and score > a_score:
a_score = score
a_po = 0x09
else:
score = yins + zins
if a_score >= b_score and score > b_score:
b_score = score
b_po = 0x06
elif a_score < b_score and score > a_score:
a_score = score
a_po = 0x06
# Decide if (1,0,0,0) is closer.
p1 = 2 - in_sum + xins
if a_score >= b_score and p1 > b_score:
b_score = p1
b_po = 0x01
b_is_bigger_side = False
elif a_score < b_score and p1 > a_score:
a_score = p1
a_po = 0x01
a_is_bigger_side = False
# Decide if (0,1,0,0) is closer.
p2 = 2 - in_sum + yins
if a_score >= b_score and p2 > b_score:
b_score = p2
b_po = 0x02
b_is_bigger_side = False
elif a_score < b_score and p2 > a_score:
a_score = p2
a_po = 0x02
a_is_bigger_side = False
# Decide if (0,0,1,0) is closer.
p3 = 2 - in_sum + zins
if a_score >= b_score and p3 > b_score:
b_score = p3
b_po = 0x04
b_is_bigger_side = False
elif a_score < b_score and p3 > a_score:
a_score = p3
a_po = 0x04
a_is_bigger_side = False
# Decide if (0,0,0,1) is closer.
p4 = 2 - in_sum + wins
if a_score >= b_score and p4 > b_score:
b_po = 0x08
b_is_bigger_side = False
elif a_score < b_score and p4 > a_score:
a_po = 0x08
a_is_bigger_side = False
# Where each of the two closest pos are determines how the extra three vertices are calculated.
if a_is_bigger_side == b_is_bigger_side:
if a_is_bigger_side: # Both closest pos on the bigger side
c1 = (a_po | b_po)
c2 = (a_po & b_po)
if (c1 & 0x01) == 0:
xsv_ext0 = xsb
xsv_ext1 = xsb - 1
dx_ext0 = dx0 - 3 * SQUISH_CONSTANT_4D
dx_ext1 = dx0 + 1 - 2 * SQUISH_CONSTANT_4D
else:
xsv_ext0 = xsv_ext1 = xsb + 1
dx_ext0 = dx0 - 1 - 3 * SQUISH_CONSTANT_4D
dx_ext1 = dx0 - 1 - 2 * SQUISH_CONSTANT_4D
if (c1 & 0x02) == 0:
ysv_ext0 = ysb
ysv_ext1 = ysb - 1
dy_ext0 = dy0 - 3 * SQUISH_CONSTANT_4D
dy_ext1 = dy0 + 1 - 2 * SQUISH_CONSTANT_4D
else:
ysv_ext0 = ysv_ext1 = ysb + 1
dy_ext0 = dy0 - 1 - 3 * SQUISH_CONSTANT_4D
dy_ext1 = dy0 - 1 - 2 * SQUISH_CONSTANT_4D
if (c1 & 0x04) == 0:
zsv_ext0 = zsb
zsv_ext1 = zsb - 1
dz_ext0 = dz0 - 3 * SQUISH_CONSTANT_4D
dz_ext1 = dz0 + 1 - 2 * SQUISH_CONSTANT_4D
else:
zsv_ext0 = zsv_ext1 = zsb + 1
dz_ext0 = dz0 - 1 - 3 * SQUISH_CONSTANT_4D
dz_ext1 = dz0 - 1 - 2 * SQUISH_CONSTANT_4D
if (c1 & 0x08) == 0:
wsv_ext0 = wsb
wsv_ext1 = wsb - 1
dw_ext0 = dw0 - 3 * SQUISH_CONSTANT_4D
dw_ext1 = dw0 + 1 - 2 * SQUISH_CONSTANT_4D
else:
wsv_ext0 = wsv_ext1 = wsb + 1
dw_ext0 = dw0 - 1 - 3 * SQUISH_CONSTANT_4D
dw_ext1 = dw0 - 1 - 2 * SQUISH_CONSTANT_4D
# One combination is a _permutation of (0,0,0,2) based on c2
xsv_ext2 = xsb
ysv_ext2 = ysb
zsv_ext2 = zsb
wsv_ext2 = wsb
dx_ext2 = dx0 - 2 * SQUISH_CONSTANT_4D
dy_ext2 = dy0 - 2 * SQUISH_CONSTANT_4D
dz_ext2 = dz0 - 2 * SQUISH_CONSTANT_4D
dw_ext2 = dw0 - 2 * SQUISH_CONSTANT_4D
if (c2 & 0x01) != 0:
xsv_ext2 += 2
dx_ext2 -= 2
elif (c2 & 0x02) != 0:
ysv_ext2 += 2
dy_ext2 -= 2
elif (c2 & 0x04) != 0:
zsv_ext2 += 2
dz_ext2 -= 2
else:
wsv_ext2 += 2
dw_ext2 -= 2
else: # Both closest pos on the smaller side
# One of the two extra pos is (0,0,0,0)
xsv_ext2 = xsb
ysv_ext2 = ysb
zsv_ext2 = zsb
wsv_ext2 = wsb
dx_ext2 = dx0
dy_ext2 = dy0
dz_ext2 = dz0
dw_ext2 = dw0
# Other two pos are based on the omitted axes.
c = (a_po | b_po)
if (c & 0x01) == 0:
xsv_ext0 = xsb - 1
xsv_ext1 = xsb
dx_ext0 = dx0 + 1 - SQUISH_CONSTANT_4D
dx_ext1 = dx0 - SQUISH_CONSTANT_4D
else:
xsv_ext0 = xsv_ext1 = xsb + 1
dx_ext0 = dx_ext1 = dx0 - 1 - SQUISH_CONSTANT_4D
if (c & 0x02) == 0:
ysv_ext0 = ysv_ext1 = ysb
dy_ext0 = dy_ext1 = dy0 - SQUISH_CONSTANT_4D
if (c & 0x01) == 0x01:
ysv_ext0 -= 1
dy_ext0 += 1
else:
ysv_ext1 -= 1
dy_ext1 += 1
else:
ysv_ext0 = ysv_ext1 = ysb + 1
dy_ext0 = dy_ext1 = dy0 - 1 - SQUISH_CONSTANT_4D
if (c & 0x04) == 0:
zsv_ext0 = zsv_ext1 = zsb
dz_ext0 = dz_ext1 = dz0 - SQUISH_CONSTANT_4D
if (c & 0x03) == 0x03:
zsv_ext0 -= 1
dz_ext0 += 1
else:
zsv_ext1 -= 1
dz_ext1 += 1
else:
zsv_ext0 = zsv_ext1 = zsb + 1
dz_ext0 = dz_ext1 = dz0 - 1 - SQUISH_CONSTANT_4D
if (c & 0x08) == 0:
wsv_ext0 = wsb
wsv_ext1 = wsb - 1
dw_ext0 = dw0 - SQUISH_CONSTANT_4D
dw_ext1 = dw0 + 1 - SQUISH_CONSTANT_4D
else:
wsv_ext0 = wsv_ext1 = wsb + 1
dw_ext0 = dw_ext1 = dw0 - 1 - SQUISH_CONSTANT_4D
else: # One po on each "side"
if a_is_bigger_side:
c1 = a_po
c2 = b_po
else:
c1 = b_po
c2 = a_po
# Two contributions are the bigger-sided po with each 0 replaced with -1.
if (c1 & 0x01) == 0:
xsv_ext0 = xsb - 1
xsv_ext1 = xsb
dx_ext0 = dx0 + 1 - SQUISH_CONSTANT_4D
dx_ext1 = dx0 - SQUISH_CONSTANT_4D
else:
xsv_ext0 = xsv_ext1 = xsb + 1
dx_ext0 = dx_ext1 = dx0 - 1 - SQUISH_CONSTANT_4D
if (c1 & 0x02) == 0:
ysv_ext0 = ysv_ext1 = ysb
dy_ext0 = dy_ext1 = dy0 - SQUISH_CONSTANT_4D
if (c1 & 0x01) == 0x01:
ysv_ext0 -= 1
dy_ext0 += 1
else:
ysv_ext1 -= 1
dy_ext1 += 1
else:
ysv_ext0 = ysv_ext1 = ysb + 1
dy_ext0 = dy_ext1 = dy0 - 1 - SQUISH_CONSTANT_4D
if (c1 & 0x04) == 0:
zsv_ext0 = zsv_ext1 = zsb
dz_ext0 = dz_ext1 = dz0 - SQUISH_CONSTANT_4D
if (c1 & 0x03) == 0x03:
zsv_ext0 -= 1
dz_ext0 += 1
else:
zsv_ext1 -= 1
dz_ext1 += 1
else:
zsv_ext0 = zsv_ext1 = zsb + 1
dz_ext0 = dz_ext1 = dz0 - 1 - SQUISH_CONSTANT_4D
if (c1 & 0x08) == 0:
wsv_ext0 = wsb
wsv_ext1 = wsb - 1
dw_ext0 = dw0 - SQUISH_CONSTANT_4D
dw_ext1 = dw0 + 1 - SQUISH_CONSTANT_4D
else:
wsv_ext0 = wsv_ext1 = wsb + 1
dw_ext0 = dw_ext1 = dw0 - 1 - SQUISH_CONSTANT_4D
# One contribution is a _permutation of (0,0,0,2) based on the smaller-sided po
xsv_ext2 = xsb
ysv_ext2 = ysb
zsv_ext2 = zsb
wsv_ext2 = wsb
dx_ext2 = dx0 - 2 * SQUISH_CONSTANT_4D
dy_ext2 = dy0 - 2 * SQUISH_CONSTANT_4D
dz_ext2 = dz0 - 2 * SQUISH_CONSTANT_4D
dw_ext2 = dw0 - 2 * SQUISH_CONSTANT_4D
if (c2 & 0x01) != 0:
xsv_ext2 += 2
dx_ext2 -= 2
elif (c2 & 0x02) != 0:
ysv_ext2 += 2
dy_ext2 -= 2
elif (c2 & 0x04) != 0:
zsv_ext2 += 2
dz_ext2 -= 2
else:
wsv_ext2 += 2
dw_ext2 -= 2
# Contribution (1,0,0,0)
dx1 = dx0 - 1 - SQUISH_CONSTANT_4D
dy1 = dy0 - 0 - SQUISH_CONSTANT_4D
dz1 = dz0 - 0 - SQUISH_CONSTANT_4D
dw1 = dw0 - 0 - SQUISH_CONSTANT_4D
attn1 = 2 - dx1 * dx1 - dy1 * dy1 - dz1 * dz1 - dw1 * dw1
if attn1 > 0:
attn1 *= attn1
value += attn1 * attn1 * extrapolate(xsb + 1, ysb + 0, zsb + 0, wsb + 0, dx1, dy1, dz1, dw1)
# Contribution (0,1,0,0)
dx2 = dx0 - 0 - SQUISH_CONSTANT_4D
dy2 = dy0 - 1 - SQUISH_CONSTANT_4D
dz2 = dz1
dw2 = dw1
attn2 = 2 - dx2 * dx2 - dy2 * dy2 - dz2 * dz2 - dw2 * dw2
if attn2 > 0:
attn2 *= attn2
value += attn2 * attn2 * extrapolate(xsb + 0, ysb + 1, zsb + 0, wsb + 0, dx2, dy2, dz2, dw2)
# Contribution (0,0,1,0)
dx3 = dx2
dy3 = dy1
dz3 = dz0 - 1 - SQUISH_CONSTANT_4D
dw3 = dw1
attn3 = 2 - dx3 * dx3 - dy3 * dy3 - dz3 * dz3 - dw3 * dw3
if attn3 > 0:
attn3 *= attn3
value += attn3 * attn3 * extrapolate(xsb + 0, ysb + 0, zsb + 1, wsb + 0, dx3, dy3, dz3, dw3)
# Contribution (0,0,0,1)
dx4 = dx2
dy4 = dy1
dz4 = dz1
dw4 = dw0 - 1 - SQUISH_CONSTANT_4D
attn4 = 2 - dx4 * dx4 - dy4 * dy4 - dz4 * dz4 - dw4 * dw4
if attn4 > 0:
attn4 *= attn4
value += attn4 * attn4 * extrapolate(xsb + 0, ysb + 0, zsb + 0, wsb + 1, dx4, dy4, dz4, dw4)
# Contribution (1,1,0,0)
dx5 = dx0 - 1 - 2 * SQUISH_CONSTANT_4D
dy5 = dy0 - 1 - 2 * SQUISH_CONSTANT_4D
dz5 = dz0 - 0 - 2 * SQUISH_CONSTANT_4D
dw5 = dw0 - 0 - 2 * SQUISH_CONSTANT_4D
attn5 = 2 - dx5 * dx5 - dy5 * dy5 - dz5 * dz5 - dw5 * dw5
if attn5 > 0:
attn5 *= attn5
value += attn5 * attn5 * extrapolate(xsb + 1, ysb + 1, zsb + 0, wsb + 0, dx5, dy5, dz5, dw5)
# Contribution (1,0,1,0)
dx6 = dx0 - 1 - 2 * SQUISH_CONSTANT_4D
dy6 = dy0 - 0 - 2 * SQUISH_CONSTANT_4D
dz6 = dz0 - 1 - 2 * SQUISH_CONSTANT_4D
dw6 = dw0 - 0 - 2 * SQUISH_CONSTANT_4D
attn6 = 2 - dx6 * dx6 - dy6 * dy6 - dz6 * dz6 - dw6 * dw6
if attn6 > 0:
attn6 *= attn6
value += attn6 * attn6 * extrapolate(xsb + 1, ysb + 0, zsb + 1, wsb + 0, dx6, dy6, dz6, dw6)
# Contribution (1,0,0,1)
dx7 = dx0 - 1 - 2 * SQUISH_CONSTANT_4D
dy7 = dy0 - 0 - 2 * SQUISH_CONSTANT_4D
dz7 = dz0 - 0 - 2 * SQUISH_CONSTANT_4D
dw7 = dw0 - 1 - 2 * SQUISH_CONSTANT_4D
attn7 = 2 - dx7 * dx7 - dy7 * dy7 - dz7 * dz7 - dw7 * dw7
if attn7 > 0:
attn7 *= attn7
value += attn7 * attn7 * extrapolate(xsb + 1, ysb + 0, zsb + 0, wsb + 1, dx7, dy7, dz7, dw7)
# Contribution (0,1,1,0)
dx8 = dx0 - 0 - 2 * SQUISH_CONSTANT_4D
dy8 = dy0 - 1 - 2 * SQUISH_CONSTANT_4D
dz8 = dz0 - 1 - 2 * SQUISH_CONSTANT_4D
dw8 = dw0 - 0 - 2 * SQUISH_CONSTANT_4D
attn8 = 2 - dx8 * dx8 - dy8 * dy8 - dz8 * dz8 - dw8 * dw8
if attn8 > 0:
attn8 *= attn8
value += attn8 * attn8 * extrapolate(xsb + 0, ysb + 1, zsb + 1, wsb + 0, dx8, dy8, dz8, dw8)
# Contribution (0,1,0,1)
dx9 = dx0 - 0 - 2 * SQUISH_CONSTANT_4D
dy9 = dy0 - 1 - 2 * SQUISH_CONSTANT_4D
dz9 = dz0 - 0 - 2 * SQUISH_CONSTANT_4D
dw9 = dw0 - 1 - 2 * SQUISH_CONSTANT_4D
attn9 = 2 - dx9 * dx9 - dy9 * dy9 - dz9 * dz9 - dw9 * dw9
if attn9 > 0:
attn9 *= attn9
value += attn9 * attn9 * extrapolate(xsb + 0, ysb + 1, zsb + 0, wsb + 1, dx9, dy9, dz9, dw9)
# Contribution (0,0,1,1)
dx10 = dx0 - 0 - 2 * SQUISH_CONSTANT_4D
dy10 = dy0 - 0 - 2 * SQUISH_CONSTANT_4D
dz10 = dz0 - 1 - 2 * SQUISH_CONSTANT_4D
dw10 = dw0 - 1 - 2 * SQUISH_CONSTANT_4D
attn10 = 2 - dx10 * dx10 - dy10 * dy10 - dz10 * dz10 - dw10 * dw10
if attn10 > 0:
attn10 *= attn10
value += attn10 * attn10 * extrapolate(xsb + 0, ysb + 0, zsb + 1, wsb + 1, dx10, dy10, dz10, dw10)
else: # We're inside the second dispentachoron (Rectified 4-Simplex)
a_is_bigger_side = True
b_is_bigger_side = True
# Decide between (0,0,1,1) and (1,1,0,0)
if xins + yins < zins + wins:
a_score = xins + yins
a_po = 0x0C
else:
a_score = zins + wins
a_po = 0x03
# Decide between (0,1,0,1) and (1,0,1,0)
if xins + zins < yins + wins:
b_score = xins + zins
b_po = 0x0A
else:
b_score = yins + wins
b_po = 0x05
# Closer between (0,1,1,0) and (1,0,0,1) will replace the further of a and b, if closer.
if xins + wins < yins + zins:
score = xins + wins
if a_score <= b_score and score < b_score:
b_score = score
b_po = 0x06
elif a_score > b_score and score < a_score:
a_score = score
a_po = 0x06
else:
score = yins + zins
if a_score <= b_score and score < b_score:
b_score = score
b_po = 0x09
elif a_score > b_score and score < a_score:
a_score = score
a_po = 0x09
# Decide if (0,1,1,1) is closer.
p1 = 3 - in_sum + xins
if a_score <= b_score and p1 < b_score:
b_score = p1
b_po = 0x0E
b_is_bigger_side = False
elif a_score > b_score and p1 < a_score:
a_score = p1
a_po = 0x0E
a_is_bigger_side = False
# Decide if (1,0,1,1) is closer.
p2 = 3 - in_sum + yins
if a_score <= b_score and p2 < b_score:
b_score = p2
b_po = 0x0D
b_is_bigger_side = False
elif a_score > b_score and p2 < a_score:
a_score = p2
a_po = 0x0D
a_is_bigger_side = False
# Decide if (1,1,0,1) is closer.
p3 = 3 - in_sum + zins
if a_score <= b_score and p3 < b_score:
b_score = p3
b_po = 0x0B
b_is_bigger_side = False
elif a_score > b_score and p3 < a_score:
a_score = p3
a_po = 0x0B
a_is_bigger_side = False
# Decide if (1,1,1,0) is closer.
p4 = 3 - in_sum + wins
if a_score <= b_score and p4 < b_score:
b_po = 0x07
b_is_bigger_side = False
elif a_score > b_score and p4 < a_score:
a_po = 0x07
a_is_bigger_side = False
# Where each of the two closest pos are determines how the extra three vertices are calculated.
if a_is_bigger_side == b_is_bigger_side:
if a_is_bigger_side: # Both closest pos on the bigger side
c1 = (a_po & b_po)
c2 = (a_po | b_po)
# Two contributions are _permutations of (0,0,0,1) and (0,0,0,2) based on c1
xsv_ext0 = xsv_ext1 = xsb
ysv_ext0 = ysv_ext1 = ysb
zsv_ext0 = zsv_ext1 = zsb
wsv_ext0 = wsv_ext1 = wsb
dx_ext0 = dx0 - SQUISH_CONSTANT_4D
dy_ext0 = dy0 - SQUISH_CONSTANT_4D
dz_ext0 = dz0 - SQUISH_CONSTANT_4D
dw_ext0 = dw0 - SQUISH_CONSTANT_4D
dx_ext1 = dx0 - 2 * SQUISH_CONSTANT_4D
dy_ext1 = dy0 - 2 * SQUISH_CONSTANT_4D
dz_ext1 = dz0 - 2 * SQUISH_CONSTANT_4D
dw_ext1 = dw0 - 2 * SQUISH_CONSTANT_4D
if (c1 & 0x01) != 0:
xsv_ext0 += 1
dx_ext0 -= 1
xsv_ext1 += 2
dx_ext1 -= 2
elif (c1 & 0x02) != 0:
ysv_ext0 += 1
dy_ext0 -= 1
ysv_ext1 += 2
dy_ext1 -= 2
elif (c1 & 0x04) != 0:
zsv_ext0 += 1
dz_ext0 -= 1
zsv_ext1 += 2
dz_ext1 -= 2
else:
wsv_ext0 += 1
dw_ext0 -= 1
wsv_ext1 += 2
dw_ext1 -= 2
# One contribution is a _permutation of (1,1,1,-1) based on c2
xsv_ext2 = xsb + 1
ysv_ext2 = ysb + 1
zsv_ext2 = zsb + 1
wsv_ext2 = wsb + 1
dx_ext2 = dx0 - 1 - 2 * SQUISH_CONSTANT_4D
dy_ext2 = dy0 - 1 - 2 * SQUISH_CONSTANT_4D
dz_ext2 = dz0 - 1 - 2 * SQUISH_CONSTANT_4D
dw_ext2 = dw0 - 1 - 2 * SQUISH_CONSTANT_4D
if (c2 & 0x01) == 0:
xsv_ext2 -= 2
dx_ext2 += 2
elif (c2 & 0x02) == 0:
ysv_ext2 -= 2
dy_ext2 += 2
elif (c2 & 0x04) == 0:
zsv_ext2 -= 2
dz_ext2 += 2
else:
wsv_ext2 -= 2
dw_ext2 += 2
else: # Both closest pos on the smaller side
# One of the two extra pos is (1,1,1,1)
xsv_ext2 = xsb + 1
ysv_ext2 = ysb + 1
zsv_ext2 = zsb + 1
wsv_ext2 = wsb + 1
dx_ext2 = dx0 - 1 - 4 * SQUISH_CONSTANT_4D
dy_ext2 = dy0 - 1 - 4 * SQUISH_CONSTANT_4D
dz_ext2 = dz0 - 1 - 4 * SQUISH_CONSTANT_4D
dw_ext2 = dw0 - 1 - 4 * SQUISH_CONSTANT_4D
# Other two pos are based on the shared axes.
c = (a_po & b_po)
if (c & 0x01) != 0:
xsv_ext0 = xsb + 2
xsv_ext1 = xsb + 1
dx_ext0 = dx0 - 2 - 3 * SQUISH_CONSTANT_4D
dx_ext1 = dx0 - 1 - 3 * SQUISH_CONSTANT_4D
else:
xsv_ext0 = xsv_ext1 = xsb
dx_ext0 = dx_ext1 = dx0 - 3 * SQUISH_CONSTANT_4D
if (c & 0x02) != 0:
ysv_ext0 = ysv_ext1 = ysb + 1
dy_ext0 = dy_ext1 = dy0 - 1 - 3 * SQUISH_CONSTANT_4D
if (c & 0x01) == 0:
ysv_ext0 += 1
dy_ext0 -= 1
else:
ysv_ext1 += 1
dy_ext1 -= 1
else:
ysv_ext0 = ysv_ext1 = ysb
dy_ext0 = dy_ext1 = dy0 - 3 * SQUISH_CONSTANT_4D
if (c & 0x04) != 0:
zsv_ext0 = zsv_ext1 = zsb + 1
dz_ext0 = dz_ext1 = dz0 - 1 - 3 * SQUISH_CONSTANT_4D
if (c & 0x03) == 0:
zsv_ext0 += 1
dz_ext0 -= 1
else:
zsv_ext1 += 1
dz_ext1 -= 1
else:
zsv_ext0 = zsv_ext1 = zsb
dz_ext0 = dz_ext1 = dz0 - 3 * SQUISH_CONSTANT_4D
if (c & 0x08) != 0:
wsv_ext0 = wsb + 1
wsv_ext1 = wsb + 2
dw_ext0 = dw0 - 1 - 3 * SQUISH_CONSTANT_4D
dw_ext1 = dw0 - 2 - 3 * SQUISH_CONSTANT_4D
else:
wsv_ext0 = wsv_ext1 = wsb
dw_ext0 = dw_ext1 = dw0 - 3 * SQUISH_CONSTANT_4D
else: # One po on each "side"
if a_is_bigger_side:
c1 = a_po
c2 = b_po
else:
c1 = b_po
c2 = a_po
# Two contributions are the bigger-sided po with each 1 replaced with 2.
if (c1 & 0x01) != 0:
xsv_ext0 = xsb + 2
xsv_ext1 = xsb + 1
dx_ext0 = dx0 - 2 - 3 * SQUISH_CONSTANT_4D
dx_ext1 = dx0 - 1 - 3 * SQUISH_CONSTANT_4D
else:
xsv_ext0 = xsv_ext1 = xsb
dx_ext0 = dx_ext1 = dx0 - 3 * SQUISH_CONSTANT_4D
if (c1 & 0x02) != 0:
ysv_ext0 = ysv_ext1 = ysb + 1
dy_ext0 = dy_ext1 = dy0 - 1 - 3 * SQUISH_CONSTANT_4D
if (c1 & 0x01) == 0:
ysv_ext0 += 1
dy_ext0 -= 1
else:
ysv_ext1 += 1
dy_ext1 -= 1
else:
ysv_ext0 = ysv_ext1 = ysb
dy_ext0 = dy_ext1 = dy0 - 3 * SQUISH_CONSTANT_4D
if (c1 & 0x04) != 0:
zsv_ext0 = zsv_ext1 = zsb + 1
dz_ext0 = dz_ext1 = dz0 - 1 - 3 * SQUISH_CONSTANT_4D
if (c1 & 0x03) == 0:
zsv_ext0 += 1
dz_ext0 -= 1
else:
zsv_ext1 += 1
dz_ext1 -= 1
else:
zsv_ext0 = zsv_ext1 = zsb
dz_ext0 = dz_ext1 = dz0 - 3 * SQUISH_CONSTANT_4D
if (c1 & 0x08) != 0:
wsv_ext0 = wsb + 1
wsv_ext1 = wsb + 2
dw_ext0 = dw0 - 1 - 3 * SQUISH_CONSTANT_4D
dw_ext1 = dw0 - 2 - 3 * SQUISH_CONSTANT_4D
else:
wsv_ext0 = wsv_ext1 = wsb
dw_ext0 = dw_ext1 = dw0 - 3 * SQUISH_CONSTANT_4D
# One contribution is a _permutation of (1,1,1,-1) based on the smaller-sided po
xsv_ext2 = xsb + 1
ysv_ext2 = ysb + 1
zsv_ext2 = zsb + 1
wsv_ext2 = wsb + 1
dx_ext2 = dx0 - 1 - 2 * SQUISH_CONSTANT_4D
dy_ext2 = dy0 - 1 - 2 * SQUISH_CONSTANT_4D
dz_ext2 = dz0 - 1 - 2 * SQUISH_CONSTANT_4D
dw_ext2 = dw0 - 1 - 2 * SQUISH_CONSTANT_4D
if (c2 & 0x01) == 0:
xsv_ext2 -= 2
dx_ext2 += 2
elif (c2 & 0x02) == 0:
ysv_ext2 -= 2
dy_ext2 += 2
elif (c2 & 0x04) == 0:
zsv_ext2 -= 2
dz_ext2 += 2
else:
wsv_ext2 -= 2
dw_ext2 += 2
# Contribution (1,1,1,0)
dx4 = dx0 - 1 - 3 * SQUISH_CONSTANT_4D
dy4 = dy0 - 1 - 3 * SQUISH_CONSTANT_4D
dz4 = dz0 - 1 - 3 * SQUISH_CONSTANT_4D
dw4 = dw0 - 3 * SQUISH_CONSTANT_4D
attn4 = 2 - dx4 * dx4 - dy4 * dy4 - dz4 * dz4 - dw4 * dw4
if attn4 > 0:
attn4 *= attn4
value += attn4 * attn4 * extrapolate(xsb + 1, ysb + 1, zsb + 1, wsb + 0, dx4, dy4, dz4, dw4)
# Contribution (1,1,0,1)
dx3 = dx4
dy3 = dy4
dz3 = dz0 - 3 * SQUISH_CONSTANT_4D
dw3 = dw0 - 1 - 3 * SQUISH_CONSTANT_4D
attn3 = 2 - dx3 * dx3 - dy3 * dy3 - dz3 * dz3 - dw3 * dw3
if attn3 > 0:
attn3 *= attn3
value += attn3 * attn3 * extrapolate(xsb + 1, ysb + 1, zsb + 0, wsb + 1, dx3, dy3, dz3, dw3)
# Contribution (1,0,1,1)
dx2 = dx4
dy2 = dy0 - 3 * SQUISH_CONSTANT_4D
dz2 = dz4
dw2 = dw3
attn2 = 2 - dx2 * dx2 - dy2 * dy2 - dz2 * dz2 - dw2 * dw2
if attn2 > 0:
attn2 *= attn2
value += attn2 * attn2 * extrapolate(xsb + 1, ysb + 0, zsb + 1, wsb + 1, dx2, dy2, dz2, dw2)
# Contribution (0,1,1,1)
dx1 = dx0 - 3 * SQUISH_CONSTANT_4D
dz1 = dz4
dy1 = dy4
dw1 = dw3
attn1 = 2 - dx1 * dx1 - dy1 * dy1 - dz1 * dz1 - dw1 * dw1
if attn1 > 0:
attn1 *= attn1
value += attn1 * attn1 * extrapolate(xsb + 0, ysb + 1, zsb + 1, wsb + 1, dx1, dy1, dz1, dw1)
# Contribution (1,1,0,0)
dx5 = dx0 - 1 - 2 * SQUISH_CONSTANT_4D
dy5 = dy0 - 1 - 2 * SQUISH_CONSTANT_4D
dz5 = dz0 - 0 - 2 * SQUISH_CONSTANT_4D
dw5 = dw0 - 0 - 2 * SQUISH_CONSTANT_4D
attn5 = 2 - dx5 * dx5 - dy5 * dy5 - dz5 * dz5 - dw5 * dw5
if attn5 > 0:
attn5 *= attn5
value += attn5 * attn5 * extrapolate(xsb + 1, ysb + 1, zsb + 0, wsb + 0, dx5, dy5, dz5, dw5)
# Contribution (1,0,1,0)
dx6 = dx0 - 1 - 2 * SQUISH_CONSTANT_4D
dy6 = dy0 - 0 - 2 * SQUISH_CONSTANT_4D
dz6 = dz0 - 1 - 2 * SQUISH_CONSTANT_4D
dw6 = dw0 - 0 - 2 * SQUISH_CONSTANT_4D
attn6 = 2 - dx6 * dx6 - dy6 * dy6 - dz6 * dz6 - dw6 * dw6
if attn6 > 0:
attn6 *= attn6
value += attn6 * attn6 * extrapolate(xsb + 1, ysb + 0, zsb + 1, wsb + 0, dx6, dy6, dz6, dw6)
# Contribution (1,0,0,1)
dx7 = dx0 - 1 - 2 * SQUISH_CONSTANT_4D
dy7 = dy0 - 0 - 2 * SQUISH_CONSTANT_4D
dz7 = dz0 - 0 - 2 * SQUISH_CONSTANT_4D
dw7 = dw0 - 1 - 2 * SQUISH_CONSTANT_4D
attn7 = 2 - dx7 * dx7 - dy7 * dy7 - dz7 * dz7 - dw7 * dw7
if attn7 > 0:
attn7 *= attn7
value += attn7 * attn7 * extrapolate(xsb + 1, ysb + 0, zsb + 0, wsb + 1, dx7, dy7, dz7, dw7)
# Contribution (0,1,1,0)
dx8 = dx0 - 0 - 2 * SQUISH_CONSTANT_4D
dy8 = dy0 - 1 - 2 * SQUISH_CONSTANT_4D
dz8 = dz0 - 1 - 2 * SQUISH_CONSTANT_4D
dw8 = dw0 - 0 - 2 * SQUISH_CONSTANT_4D
attn8 = 2 - dx8 * dx8 - dy8 * dy8 - dz8 * dz8 - dw8 * dw8
if attn8 > 0:
attn8 *= attn8
value += attn8 * attn8 * extrapolate(xsb + 0, ysb + 1, zsb + 1, wsb + 0, dx8, dy8, dz8, dw8)
# Contribution (0,1,0,1)
dx9 = dx0 - 0 - 2 * SQUISH_CONSTANT_4D
dy9 = dy0 - 1 - 2 * SQUISH_CONSTANT_4D
dz9 = dz0 - 0 - 2 * SQUISH_CONSTANT_4D
dw9 = dw0 - 1 - 2 * SQUISH_CONSTANT_4D
attn9 = 2 - dx9 * dx9 - dy9 * dy9 - dz9 * dz9 - dw9 * dw9
if attn9 > 0:
attn9 *= attn9
value += attn9 * attn9 * extrapolate(xsb + 0, ysb + 1, zsb + 0, wsb + 1, dx9, dy9, dz9, dw9)
# Contribution (0,0,1,1)
dx10 = dx0 - 0 - 2 * SQUISH_CONSTANT_4D
dy10 = dy0 - 0 - 2 * SQUISH_CONSTANT_4D
dz10 = dz0 - 1 - 2 * SQUISH_CONSTANT_4D
dw10 = dw0 - 1 - 2 * SQUISH_CONSTANT_4D
attn10 = 2 - dx10 * dx10 - dy10 * dy10 - dz10 * dz10 - dw10 * dw10
if attn10 > 0:
attn10 *= attn10
value += attn10 * attn10 * extrapolate(xsb + 0, ysb + 0, zsb + 1, wsb + 1, dx10, dy10, dz10, dw10)
# First extra vertex
attn_ext0 = 2 - dx_ext0 * dx_ext0 - dy_ext0 * dy_ext0 - dz_ext0 * dz_ext0 - dw_ext0 * dw_ext0
if attn_ext0 > 0:
attn_ext0 *= attn_ext0
value += attn_ext0 * attn_ext0 * extrapolate(xsv_ext0, ysv_ext0, zsv_ext0, wsv_ext0, dx_ext0, dy_ext0, dz_ext0, dw_ext0)
# Second extra vertex
attn_ext1 = 2 - dx_ext1 * dx_ext1 - dy_ext1 * dy_ext1 - dz_ext1 * dz_ext1 - dw_ext1 * dw_ext1
if attn_ext1 > 0:
attn_ext1 *= attn_ext1
value += attn_ext1 * attn_ext1 * extrapolate(xsv_ext1, ysv_ext1, zsv_ext1, wsv_ext1, dx_ext1, dy_ext1, dz_ext1, dw_ext1)
# Third extra vertex
attn_ext2 = 2 - dx_ext2 * dx_ext2 - dy_ext2 * dy_ext2 - dz_ext2 * dz_ext2 - dw_ext2 * dw_ext2
if attn_ext2 > 0:
attn_ext2 *= attn_ext2
value += attn_ext2 * attn_ext2 * extrapolate(xsv_ext2, ysv_ext2, zsv_ext2, wsv_ext2, dx_ext2, dy_ext2, dz_ext2, dw_ext2)
return value / NORM_CONSTANT_4D
- source_sentence: |-
Method which returns a dictionary of field statistics received from the
input source.
Returns:
fieldStats: dict of dicts where the first level is the field name and
the second level is the statistic. ie. fieldStats['pounds']['min']
sentences:
- |-
def customize(func):
"""
Decorator to set plotting context and axes style during function call.
"""
@wraps(func)
def call_w_context(*args, **kwargs):
set_context = kwargs.pop('set_context', True)
if set_context:
with plotting_context(), axes_style():
return func(*args, **kwargs)
else:
return func(*args, **kwargs)
return call_w_context
- |-
def Vgg19_simple_api(rgb):
"""
Build the VGG 19 Model
Parameters
-----------
rgb : rgb image placeholder [batch, height, width, 3] values scaled [0, 1]
"""
start_time = time.time()
print("build model started")
rgb_scaled = rgb * 255.0
# Convert RGB to BGR
red, green, blue = tf.split(rgb_scaled, 3, 3)
if red.get_shape().as_list()[1:] != [224, 224, 1]:
raise Exception("image size unmatch")
if green.get_shape().as_list()[1:] != [224, 224, 1]:
raise Exception("image size unmatch")
if blue.get_shape().as_list()[1:] != [224, 224, 1]:
raise Exception("image size unmatch")
bgr = tf.concat([
blue - VGG_MEAN[0],
green - VGG_MEAN[1],
red - VGG_MEAN[2],
], axis=3)
if bgr.get_shape().as_list()[1:] != [224, 224, 3]:
raise Exception("image size unmatch")
# input layer
net_in = InputLayer(bgr, name='input')
# conv1
net = Conv2d(net_in, 64, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv1_1')
net = Conv2d(net, n_filter=64, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv1_2')
net = MaxPool2d(net, filter_size=(2, 2), strides=(2, 2), padding='SAME', name='pool1')
# conv2
net = Conv2d(net, n_filter=128, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv2_1')
net = Conv2d(net, n_filter=128, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv2_2')
net = MaxPool2d(net, filter_size=(2, 2), strides=(2, 2), padding='SAME', name='pool2')
# conv3
net = Conv2d(net, n_filter=256, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv3_1')
net = Conv2d(net, n_filter=256, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv3_2')
net = Conv2d(net, n_filter=256, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv3_3')
net = Conv2d(net, n_filter=256, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv3_4')
net = MaxPool2d(net, filter_size=(2, 2), strides=(2, 2), padding='SAME', name='pool3')
# conv4
net = Conv2d(net, n_filter=512, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv4_1')
net = Conv2d(net, n_filter=512, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv4_2')
net = Conv2d(net, n_filter=512, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv4_3')
net = Conv2d(net, n_filter=512, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv4_4')
net = MaxPool2d(net, filter_size=(2, 2), strides=(2, 2), padding='SAME', name='pool4')
# conv5
net = Conv2d(net, n_filter=512, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv5_1')
net = Conv2d(net, n_filter=512, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv5_2')
net = Conv2d(net, n_filter=512, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv5_3')
net = Conv2d(net, n_filter=512, filter_size=(3, 3), strides=(1, 1), act=tf.nn.relu, padding='SAME', name='conv5_4')
net = MaxPool2d(net, filter_size=(2, 2), strides=(2, 2), padding='SAME', name='pool5')
# fc 6~8
net = FlattenLayer(net, name='flatten')
net = DenseLayer(net, n_units=4096, act=tf.nn.relu, name='fc6')
net = DenseLayer(net, n_units=4096, act=tf.nn.relu, name='fc7')
net = DenseLayer(net, n_units=1000, act=None, name='fc8')
print("build model finished: %fs" % (time.time() - start_time))
return net
- |-
def _getFieldStats(self):
"""
Method which returns a dictionary of field statistics received from the
input source.
Returns:
fieldStats: dict of dicts where the first level is the field name and
the second level is the statistic. ie. fieldStats['pounds']['min']
"""
fieldStats = dict()
fieldNames = self._inputSource.getFieldNames()
for field in fieldNames:
curStats = dict()
curStats['min'] = self._inputSource.getFieldMin(field)
curStats['max'] = self._inputSource.getFieldMax(field)
fieldStats[field] = curStats
return fieldStats
datasets:
- fyaronskiy/cornstack_python_ru_en
- fyaronskiy/code_search_net_ru_en
- ai-forever/solyanka
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
model-index:
- name: SentenceTransformer
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: Unknown
type: unknown
metrics:
- type: cosine_accuracy@1
value: 0.8683666666666666
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.9439333333333333
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.9566333333333333
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9668333333333333
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.8683666666666666
name: Cosine Precision@1
- type: cosine_recall@1
value: 0.8683666666666666
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.9439333333333333
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.9566333333333333
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.9668333333333333
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9224025873017736
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.9076358333333253
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.9082959802184539
name: Cosine Map@100
- type: cosine_accuracy@1
value: 0.8741666666666666
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.9425
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.9548666666666666
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9644333333333334
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.8741666666666666
name: Cosine Precision@1
- type: cosine_recall@1
value: 0.8741666666666666
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.9425
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.9548666666666666
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.9644333333333334
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9234437208756444
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.9098453571428485
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.9105416505961587
name: Cosine Map@100
license: apache-2.0
base_model:
- deepvk/RuModernBERT-base
---
# SentenceTransformer
This is a [sentence-transformers](https://www.SBERT.net) model trained on the [cornstack_python](https://huggingface.co/datasets/fyaronskiy/cornstack_python_ru_en),
cornstack_python_pairs, [codesearchnet](https://huggingface.co/datasets/fyaronskiy/code_search_net_ru_en), [codesearchnet_pairs](https://huggingface.co/datasets/fyaronskiy/code_search_net_ru_en) and [solyanka_qa](https://huggingface.co/datasets/ai-forever/solyanka) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space.
Model can be used for text-to-code, code-to-text retrieval tasks where text is in **Russian/English** and code is in **Python/Java/Javascript/Go/Php/Ruby**. Queries, documents also can be mix of natural language text and code. Perfomance of code-to-code tasks wasn't measured.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [RuModernBERT-base](https://huggingface.co/deepvk/RuModernBERT-base)
- **Maximum Sequence Length:** 8192 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Datasets:**
- [cornstack_python](https://huggingface.co/datasets/fyaronskiy/cornstack_python_ru_en)
- cornstack_python_pairs
- [codesearchnet](https://huggingface.co/datasets/fyaronskiy/code_search_net_ru_en)
- [codesearchnet_pairs](https://huggingface.co/datasets/fyaronskiy/code_search_net_ru_en)
- [solyanka_qa](https://huggingface.co/datasets/ai-forever/solyanka)
<!-- - **License:** Unknown -->
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
import torch
from sentence_transformers import SentenceTransformer, util
device = "cuda" if torch.cuda.is_available() else "cpu"
model = SentenceTransformer("fyaronskiy/code_retriever_ru_en").to(device)
queries_ru = [
"Напиши функцию на Python, которая рекурсивно вычисляет факториал числа.",
"Как проверить, является ли строка палиндромом?",
"Объедини два отсортированных списка в один отсортированный список."
]
corpus_ru = [
# Релевантный для Q1
"""def factorial(n):
if n == 0:
return 1
return n * factorial(n - 1)""",
# Hard negative для Q1
"""def sum_recursive(n):
if n == 0:
return 0
return n + sum_recursive(n - 1)""",
# Релевантный для Q2
"""def is_palindrome(s: str) -> bool:
s = s.lower().replace(" ", "")
return s == s[::-1]""",
# Hard negative для Q2
"""def reverse_string(s: str) -> str:
return s[::-1]""",
# Релевантный для Q3
"""def merge_sorted_lists(a, b):
result = []
i = j = 0
while i < len(a) and j < len(b):
if a[i] < b[j]:
result.append(a[i])
i += 1
else:
result.append(b[j])
j += 1
result.extend(a[i:])
result.extend(b[j:])
return result""",
# Hard negative для Q3
"""def add_lists(a, b):
return [x + y for x, y in zip(a, b)]"""
]
doc_embeddings = model.encode(corpus_ru, convert_to_tensor=True, device=device)
query_embeddings = model.encode(queries_ru, convert_to_tensor=True, device=device)
# Выполняем поиск по каждому запросу
for i, query in enumerate(queries_ru):
scores = util.cos_sim(query_embeddings[i], doc_embeddings)[0]
best_idx = torch.argmax(scores).item()
print(f"\nЗапрос {i+1}: {query}")
print('Скоры всех документов в корпусе: ', scores)
print(f"Наиболее подходящий документ (Скор={scores[best_idx]:.4f}):\n{corpus_ru[best_idx]}")
```
Model was trained with Matryoshka Loss with dims: 768, 512, 256, 128, 64.
So for decreasing memory for your vector databaset and make inference faster you can truncate embeddings.
To do this you need to initialize model as follows:
```python
matryoshka_dim = 128
model = SentenceTransformer("fyaronskiy/code_retriever_ru_en", truncate_dim=matryoshka_dim).to(device)
```
<!--
### Direct Usage (Transformers)
<details><summary>Click to see the direct usage in Transformers</summary>
</details>
-->
<!--
### Downstream Usage (Sentence Transformers)
You can finetune this model on your own dataset.
<details><summary>Click to expand</summary>
</details>
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
## Evaluation
### Metrics
#### Information Retrieval
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.8684 |
| cosine_accuracy@3 | 0.9439 |
| cosine_accuracy@5 | 0.9566 |
| cosine_accuracy@10 | 0.9668 |
| cosine_precision@1 | 0.8684 |
| cosine_precision@3 | 0.3146 |
| cosine_precision@5 | 0.1913 |
| cosine_precision@10 | 0.0967 |
| cosine_recall@1 | 0.8684 |
| cosine_recall@3 | 0.9439 |
| cosine_recall@5 | 0.9566 |
| cosine_recall@10 | 0.9668 |
| **cosine_ndcg@10** | **0.9224** |
| cosine_mrr@10 | 0.9076 |
| cosine_map@100 | 0.9083 |
#### Information Retrieval
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.8742 |
| cosine_accuracy@3 | 0.9425 |
| cosine_accuracy@5 | 0.9549 |
| cosine_accuracy@10 | 0.9644 |
| cosine_precision@1 | 0.8742 |
| cosine_precision@3 | 0.3142 |
| cosine_precision@5 | 0.191 |
| cosine_precision@10 | 0.0964 |
| cosine_recall@1 | 0.8742 |
| cosine_recall@3 | 0.9425 |
| cosine_recall@5 | 0.9549 |
| cosine_recall@10 | 0.9644 |
| **cosine_ndcg@10** | **0.9234** |
| cosine_mrr@10 | 0.9098 |
| cosine_map@100 | 0.9105 |
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details
### Training Datasets
<details><summary>cornstack_python</summary>
#### cornstack_python
* Dataset: cornstack_python
* Size: 2,869,969 training samples
* Columns: <code>ru_query</code>, <code>document</code>, <code>negative_0</code>, <code>negative_1</code>, <code>negative_2</code>, <code>negative_3</code>, <code>negative_4</code>, <code>negative_5</code>, <code>negative_6</code>, <code>negative_7</code>, <code>negative_8</code>, <code>negative_9</code>, <code>negative_10</code>, <code>negative_11</code>, <code>negative_12</code>, <code>negative_13</code>, <code>negative_14</code>, and <code>negative_15</code>
* Approximate statistics based on the first 1000 samples:
| | ru_query | document | negative_0 | negative_1 | negative_2 | negative_3 | negative_4 | negative_5 | negative_6 | negative_7 | negative_8 | negative_9 | negative_10 | negative_11 | negative_12 | negative_13 | negative_14 | negative_15 |
|:--------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
| type | string | string | string | string | string | string | string | string | string | string | string | string | string | string | string | string | string | string |
| details | <ul><li>min: 7 tokens</li><li>mean: 27.46 tokens</li><li>max: 162 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 304.38 tokens</li><li>max: 5574 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 237.08 tokens</li><li>max: 3627 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 229.94 tokens</li><li>max: 6691 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 230.06 tokens</li><li>max: 6229 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 230.7 tokens</li><li>max: 4876 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 220.57 tokens</li><li>max: 4876 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 236.08 tokens</li><li>max: 5880 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 247.91 tokens</li><li>max: 6621 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 207.62 tokens</li><li>max: 3350 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 222.54 tokens</li><li>max: 6863 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 221.53 tokens</li><li>max: 4976 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 216.06 tokens</li><li>max: 4876 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 197.03 tokens</li><li>max: 4763 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 200.83 tokens</li><li>max: 8192 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 204.94 tokens</li><li>max: 3210 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 188.51 tokens</li><li>max: 2754 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 188.27 tokens</li><li>max: 4876 tokens</li></ul> |
* Samples:
| ru_query | document | negative_0 | negative_1 | negative_2 | negative_3 | negative_4 | negative_5 | negative_6 | negative_7 | negative_8 | negative_9 | negative_10 | negative_11 | negative_12 | negative_13 | negative_14 | negative_15 |
|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>установите значение business_id сообщения данных в конкретное значение</code> | <code>def step_impl_the_ru_is_set_to(context, business_id):<br> context.bdd_helper.message_data["business_id"] = business_id</code> | <code>def business_id(self, business_id):<br><br> self._business_id = business_id</code> | <code>def business_phone(self, business_phone):<br><br> self._business_phone = business_phone</code> | <code>def business_phone_number(self, business_phone_number):<br><br> self._business_phone_number = business_phone_number</code> | <code>def bus_ob_id(self, bus_ob_id):<br><br> self._bus_ob_id = bus_ob_id</code> | <code>def bus_ob_id(self, bus_ob_id):<br><br> self._bus_ob_id = bus_ob_id</code> | <code>def _set_id(self, value):<br> pass</code> | <code>def business_email(self, business_email):<br><br> self._business_email = business_email</code> | <code>def mailing_id(self, val: str):<br> self._mailing_id = val</code> | <code>def message_id(self, val: str):<br> self._message_id = val</code> | <code>def business_model(self, business_model):<br><br> self._business_model = business_model</code> | <code>def business_account(self, business_account):<br><br> self._business_account = business_account</code> | <code>def update_business(current_user, businessId):<br> business = Business.query.get(int(businessId))<br><br> if not business:<br> return make_json_reply('message', 'Business id does not exist'), 404<br><br> if business.user_id != current_user.id:<br> return make_json_reply('message', 'Cannot update business'), 400<br><br> data = request.get_json(force=True)<br> name = location = category = description = None<br><br> if 'name' in data.keys():<br> name = data['name']<br><br> if 'location' in data.keys():<br> location = data['location']<br><br> if 'category' in data.keys():<br> category = data['category']<br><br> if 'description' in data.keys():<br> description = data['description']<br><br> if check_validity_of_input(name=name):<br> business.name = name<br><br> if check_validity_of_input(location=location):<br> business.location = location<br><br> if check_validity_of_input(category=category):<br> business.category = category<br><br> if check_validity_of_input(description=description):<br> ...</code> | <code>def set_company_id_value(self, company_id_value):<br> self.company_id_value = company_id_value</code> | <code>def id(self, value):<br> self._id = value</code> | <code>def set_bribe(self, bribe_amount):
<br> self.bribe = bribe_amount</code> | <code>def business_owner(self, business_owner):<br><br> self._business_owner = business_owner</code> |
| <code>Установить состояние правил sid</code> | <code>def set_state_sid_request(ruleset_name, sid):<br> message = json.loads(request.stream.read().decode('utf-8'))<br> message['sid'] = sid<br> result = host.patch_state(ruleset_name, message)<br> return jsonify(result)</code> | <code>def sid(self, sid):<br> self._sid = sid</code> | <code>def set_state(self,s):<br> self.state = s</code> | <code>def set_state(self, state: int):</code> | <code>def __setstate__(self, state):<br><br> self.set(DER = state)</code> | <code>def set_rule(self, rule):<br> self.rule.load_state_dict(rule, strict=True)</code> | <code>def _set_state(self, state):<br> #print("** set state from %d to %d" % (self.state, state))<br> self.state = state</code> | <code>def set_state( self ):</code> | <code>def set_ident(self, new_ident: int):<br> if not isinstance(new_ident, int):<br> raise TypeError("Spectrum set identifiers may ONLY be positive integers")<br> self._set_ident = new_ident</code> | <code>def set_state(self, state):<br> #print("ComponentBase.set_state")<br> for k,v in state.items():<br> #print(" Set {:14s} to {:s}".format(k,str(v)))<br> if k == "connectors":<br> for con_state in v:<br> self.add_connector() <br> self.connectors[-1].set_state(con_state)<br> else:<br> setattr(self, k, v)</code> | <code>def __setstate__(self, state):<br><br> self.list = state</code> | <code>def __setstate__(self, state):<br><br> self.list = state</code> | <code>def state_id(self, state_id):<br><br> self._state_id = state_id</code> | <code>def set_state(self, state: int):<br> self.state = state</code> | <code>def set_domain_sid(self, sid):<br> dsdb._samdb_set_domain_sid(self, sid)</code> | <code>def set_state(self,state):<br> self.__state = state</code> | <code>def set_srid(self, srid: ir.IntegerValue) -> GeoSpatialValue:<br> return ops.GeoSetSRID(self, srid=srid).to_expr()</code> |
| <code>Отправить события sid в ruleset</code> | <code>def post_sid_events(ruleset_name, sid):<br> message = json.loads(request.stream.read().decode('utf-8'))<br> message['sid'] = sid<br> result = host.post(ruleset_name, message)<br> return jsonify(result)</code> | <code>def post_events(ruleset_name):<br> message = json.loads(request.stream.read().decode('utf-8'))<br> result = host.post(ruleset_name, message)<br> return jsonify(result)</code> | <code>def set_state_sid_request(ruleset_name, sid):<br> message = json.loads(request.stream.read().decode('utf-8'))<br> message['sid'] = sid<br> result = host.patch_state(ruleset_name, message)<br> return jsonify(result)</code> | <code>def sid(self, sid):<br> self._sid = sid</code> | <code>def post(self, request, *args, **kwargs):<br> <br> id = args[0] if args else list(kwargs.values())[0]<br> try:<br> ssn = Subscription.objects.get(id=id)<br> except Subscription.DoesNotExist:<br> logger.error(<br> f'Received unwanted subscription {id} POST request! Sending status '<br> '410 back to hub.'<br> )<br> return Response('Unwanted subscription', status=410)<br> <br> ssn.update(time_last_event_received=now())<br> self.handler_task.delay(request.data)<br> return Response('') # TODO</code> | <code>def informed_consent_on_post_save(sender, instance, raw, created, **kwargs):<br> if not raw:<br> if created:<br> pass<br> # instance.registration_update_or_create()<br> # update_model_fields(instance=instance,<br> # model_cls=['subject_identifier', instance.subject_identifier])<br> try:<br> OnSchedule.objects.get(<br> subject_identifier=instance.subject_identifier, )<br> except OnSchedule.DoesNotExist:<br> onschedule_model = 'training_subject.onschedule'<br> put_on_schedule(schedule_name='training_subject_visit_schedule', instance=instance, onschedule_model=onschedule_model)</code> | <code>def post_event(self, event):
<br> from evennia.scripts.models import ScriptDB
<br>
<br> if event.public_event:
<br> event_manager = ScriptDB.objects.get(db_key="Event Manager")
<br> event_manager.post_event(event, self.owner.player, event.display())</code> | <code>def post(self, event, *args, **kwargs):<br> self.inq.Signal((event, args, kwargs))</code> | <code>def post(self, request):<br> return self.serviceHandler.addEvent(request.data)</code> | <code>def register_to_event(request):<br> pass</code> | <code>def setFilterOnRule(request):<br> <br> logger = logging.getLogger(__name__)<br> <br> # Get some initial post values for processing.<br> ruleIds = request.POST.getlist('id')<br> sensors = request.POST.getlist('sensors')<br> commentString = request.POST['comment']<br> force = request.POST['force']<br> response = []<br> <br> # If the ruleIds list is empty, it means a SID has been entered manually.<br> if len(ruleIds) == 0:<br> # Grab the value from the POST.<br> ruleSID = request.POST['sid']<br> <br> # Match the GID:SID pattern, if its not there, throw exception.<br> try:<br> matchPattern = r"(\d+):(\d+)"<br> pattern = re.compile(matchPattern)<br> result = pattern.match(ruleSID)<br> <br> ruleGID = result.group(1)<br> ruleSID = result.group(2)<br> except:<br> response.append({'response': 'invalidGIDSIDFormat', 'text': 'Please format in the GID:SID syntax.'})<br> logger.warning("Invalid GID:SID syntax provided: "+str(ruleSID)+".")<br> return HttpResponse(json.dumps(response))<br> <br> # Try to find a generator object with the GID supplied, if it does...</code> | <code>def store_event(self, violations):<br> current_time = datetime.now().strftime("%Y/%m/%d %H:%M:%S")<br> insert_query = """INSERT INTO social_distancing (Location, Local_Time, Violations) VALUES ('{}', '{}', {})""".format(self.location, current_time, violations)<br> self.off_chain.insert(insert_query)<br><br> event_id = self.off_chain.select("""SELECT LAST_INSERT_ID() FROM social_distancing""")[0][0]<br> self.on_chain.store_hash(event_id, self.location, current_time, violations)</code> | <code>def test_post_event_on_schedule_page(self):<br> json_data = {<br> 'title': 'Test Event',<br> 'start': '2017-8-8T12:00:00',<br> 'end': '2017-8-8T12:00:00',<br> 'group': '3'<br> }<br><br> response = self.app.post("/saveEvent", data=json.dumps(json_data),<br> content_type='application/json')<br> self.assertTrue(response.status_code, 200)</code> | <code>def _push(self, server):<br> defns = [self.get_id(ident) for ident in list(self.ids)]<br> #for ident in list(self.ids):<br> # defn = self.get_id(ident)<br> if len(defns) == 0:<br> return<br> self.app.logger.info(f"Updating {server} with {len(defns)} records")<br> url = f"{server}/add_record"<br> try:<br> resp = requests.post(url, json=defns)<br> except Exception as e:<br> self.app.logger.error(str(e))<br> return<br> if not resp.ok:<br> self.app.logger.error(f"{resp.reason} {resp.content}")<br> return<br> self._server_updated[server] = True</code> | <code>def post(self, slug = None, eid = None):<br> uid = self.request.form.get("uid")<br> status = self.request.form.get("status") # can be join, maybe, notgoubg<br> event = self.barcamp.get_event(eid)<br> <br> user = self.app.module_map.userbase.get_user_by_id(uid)<br><br> reg = RegistrationService(self, user)<br> try:<br> status = reg.set_status(eid, status, force=True)<br> except RegistrationError, e:<br> print "a registration error occurred", e<br> raise ProcessingError(str(e))<br> return <br><br> return {'status' : 'success', 'reload' : True}</code> | <code>def events(self):</code> | <code>def post(self):<br><br> # we need a unique tx number so we can look these back up again<br> # as well as for logging<br> # FIXME: how can we guarantee uniqueness here?<br> tx = int(time.time() * 100000) + random.randrange(10000, 99999)<br><br> log.info("EVENTS [{}]: Creating events".format(tx))<br><br> try:<br> user = self.jbody["user"]<br> if not EMAIL_REGEX.match(user):<br> user += "@" + self.domain<br> event_type_id = self.jbody.get("eventTypeId", None)<br> category = self.jbody.get("category", None)<br> state = self.jbody.get("state", None)<br> note = self.jbody.get("note", None)<br> except KeyError as err:<br> raise exc.BadRequest(<br> "Missing Required Argument: {}".format(err.message)<br> )<br> except ValueError as err:<br> raise exc.BadRequest(err.message)<br><br> if not event_type_id and (not category and not state):<br> raise exc.BadRequest(<br> ...</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
```json
{
"loss": "CachedMultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
```
</details>
<details><summary>cornstack_python_pairs</summary>
#### cornstack_python_pairs
* Dataset: cornstack_python_pairs
* Size: 1,434,984 training samples
* Columns: <code>en_query</code>, <code>ru_query</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
| | en_query | ru_query | label |
|:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:--------------------------------------------------------------|
| type | string | string | float |
| details | <ul><li>min: 7 tokens</li><li>mean: 26.96 tokens</li><li>max: 150 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 27.46 tokens</li><li>max: 162 tokens</li></ul> | <ul><li>min: 1.0</li><li>mean: 1.0</li><li>max: 1.0</li></ul> |
* Samples:
| en_query | ru_query | label |
|:------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-----------------|
| <code>set the message data business_id to a specific value</code> | <code>установите значение business_id сообщения данных в конкретное значение</code> | <code>1.0</code> |
| <code>Set ruleset state sid</code> | <code>Установить состояние правил sid</code> | <code>1.0</code> |
| <code>Post sid events to the ruleset</code> | <code>Отправить события sid в ruleset</code> | <code>1.0</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
```json
{
"loss": "CoSENTLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
```
</details>
<details><summary>codesearchnet</summary>
#### codesearchnet
* Dataset: [codesearchnet](https://huggingface.co/datasets/fyaronskiy/code_search_net_ru_en) at [3f90200](https://huggingface.co/datasets/fyaronskiy/code_search_net_ru_en/tree/3f9020072f2e6d5ac5445b39e566e5b669a1661b)
* Size: 1,880,853 training samples
* Columns: <code>ru_func_documentation_string</code> and <code>func_code_string</code>
* Approximate statistics based on the first 1000 samples:
| | ru_func_documentation_string | func_code_string |
|:--------|:----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
| type | string | string |
| details | <ul><li>min: 5 tokens</li><li>mean: 95.0 tokens</li><li>max: 619 tokens</li></ul> | <ul><li>min: 62 tokens</li><li>mean: 522.56 tokens</li><li>max: 8192 tokens</li></ul> |
* Samples:
| ru_func_documentation_string | func_code_string |
|:--------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>Мультипроцессинг-целевой объект для устройства очереди zmq</code> | <code>def zmq_device(self):<br> '''<br> Multiprocessing target for the zmq queue device<br> '''<br> self.__setup_signals()<br> salt.utils.process.appendproctitle('MWorkerQueue')<br> self.context = zmq.Context(self.opts['worker_threads'])<br> # Prepare the zeromq sockets<br> self.uri = 'tcp://{interface}:{ret_port}'.format(**self.opts)<br> self.clients = self.context.socket(zmq.ROUTER)<br> if self.opts['ipv6'] is True and hasattr(zmq, 'IPV4ONLY'):<br> # IPv6 sockets work for both IPv6 and IPv4 addresses<br> self.clients.setsockopt(zmq.IPV4ONLY, 0)<br> self.clients.setsockopt(zmq.BACKLOG, self.opts.get('zmq_backlog', 1000))<br> self._start_zmq_monitor()<br> self.workers = self.context.socket(zmq.DEALER)<br><br> if self.opts.get('ipc_mode', '') == 'tcp':<br> self.w_uri = 'tcp://127.0.0.1:{0}'.format(<br> self.opts.get('tcp_master_workers', 4515)<br> )<br> else:<br> self.w_uri = 'ipc:...</code> |
| <code>Чисто завершите работу сокета роутера</code> | <code>def close(self):<br> '''<br> Cleanly shutdown the router socket<br> '''<br> if self._closing:<br> return<br> log.info('MWorkerQueue under PID %s is closing', os.getpid())<br> self._closing = True<br> # pylint: disable=E0203<br> if getattr(self, '_monitor', None) is not None:<br> self._monitor.stop()<br> self._monitor = None<br> if getattr(self, '_w_monitor', None) is not None:<br> self._w_monitor.stop()<br> self._w_monitor = None<br> if hasattr(self, 'clients') and self.clients.closed is False:<br> self.clients.close()<br> if hasattr(self, 'workers') and self.workers.closed is False:<br> self.workers.close()<br> if hasattr(self, 'stream'):<br> self.stream.close()<br> if hasattr(self, '_socket') and self._socket.closed is False:<br> self._socket.close()<br> if hasattr(self, 'context') and self.context.closed is False:<br> self.context.term()</code> |
| <code>До форка нам нужно создать устройство zmq роутера<br><br> :param func process_manager: Экземпляр класса salt.utils.process.ProcessManager</code> | <code>def pre_fork(self, process_manager):<br> '''<br> Pre-fork we need to create the zmq router device<br><br> :param func process_manager: An instance of salt.utils.process.ProcessManager<br> '''<br> salt.transport.mixins.auth.AESReqServerMixin.pre_fork(self, process_manager)<br> process_manager.add_process(self.zmq_device)</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
```json
{
"loss": "CachedMultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
```
</details>
<details><summary>codesearchnet_pairs</summary>
#### codesearchnet_pairs
* Dataset: [codesearchnet_pairs](https://huggingface.co/datasets/fyaronskiy/code_search_net_ru_en) at [3f90200](https://huggingface.co/datasets/fyaronskiy/code_search_net_ru_en/tree/3f9020072f2e6d5ac5445b39e566e5b669a1661b)
* Size: 940,426 training samples
* Columns: <code>en_func_documentation_string</code>, <code>ru_func_documentation_string</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
| | en_func_documentation_string | ru_func_documentation_string | label |
|:--------|:-------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:--------------------------------------------------------------|
| type | string | string | float |
| details | <ul><li>min: 5 tokens</li><li>mean: 102.69 tokens</li><li>max: 1485 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 95.0 tokens</li><li>max: 619 tokens</li></ul> | <ul><li>min: 1.0</li><li>mean: 1.0</li><li>max: 1.0</li></ul> |
* Samples:
| en_func_documentation_string | ru_func_documentation_string | label |
|:-----------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
| <code>Multiprocessing target for the zmq queue device</code> | <code>Мультипроцессинг-целевой объект для устройства очереди zmq</code> | <code>1.0</code> |
| <code>Cleanly shutdown the router socket</code> | <code>Чисто завершите работу сокета роутера</code> | <code>1.0</code> |
| <code>Pre-fork we need to create the zmq router device<br><br> :param func process_manager: An instance of salt.utils.process.ProcessManager</code> | <code>До форка нам нужно создать устройство zmq роутера<br><br> :param func process_manager: Экземпляр класса salt.utils.process.ProcessManager</code> | <code>1.0</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
```json
{
"loss": "CoSENTLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
```
</details>
<details><summary>solyanka_qa</summary>
#### solyanka_qa
* Dataset: [solyanka_qa](https://huggingface.co/datasets/ai-forever/solyanka) at [deeac62](https://huggingface.co/datasets/ai-forever/solyanka/tree/deeac621d4142d2754fa28f0eb58502b966383c3)
* Size: 85,523 training samples
* Columns: <code>anchor</code> and <code>positive</code>
* Approximate statistics based on the first 1000 samples:
| | anchor | positive |
|:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
| type | string | string |
| details | <ul><li>min: 19 tokens</li><li>mean: 202.49 tokens</li><li>max: 518 tokens</li></ul> | <ul><li>min: 16 tokens</li><li>mean: 196.36 tokens</li><li>max: 524 tokens</li></ul> |
* Samples:
| anchor | positive |
|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>Как происходит взаимодействие нескольких языков программирования? Понятно, что большинство (если не все) крупные энтерпрайз сервисы, приложения и тд. (не только веб) написаны с использованием не одного языка программирования, а нескольких. И эти составные части, написанные на разных языках, как-то взаимодействуют между собой (фронт, бизнес-логика, еще что-то).<br>Опыта разработки подобных систем у меня нет, поэтому не совсем могу представить, как это происходит. Подозреваю, что взаимодействие идет через независимые от языков средства. Например, нечто написанное на одном языке, шлет через TCP-IP пакет, который ловится и обрабатывается чем-то написанным на другом языке. Либо через HTTP запросы. Либо через запись/чтение из БД. Либо через файловый обмен, XML например.<br>Хотелось бы, чтобы знающие люди привели пару примеров, как это обычно происходит. Не просто в двух словах, мол "фронт на яваскрипте, бэк на яве", а с техническими нюансами. Заранее спасибо.</code> | <code>Несколько языков могут сосуществовать как в рамках одного процесса, так и в рамках нескольких.<br>Проще всего сосуществовать в рамках нескольких процессов: если процессы обмениваются данными, то совершенно всё равно (ну, в известных рамках), на каком языке эти данные были созданы, и какой язык их читает. Например, вы можете генерировать данные в виде HTML сервером на ASP.NET, а читать браузером, написанным на C++. (Да, пара из сервера и клиента — тоже взаимодействие языков.)<br>Теперь, если мы хотим взаимодействие в рамках одного процесса, нам нужно уметь вызывать друг друга. Для этого нужен общий стандарт вызова. Часто таким общим стандартом являются бинарные соглашения C (`extern "C"`, экспорт из DLL в Windows).<br>Ещё пример общего стандарта — COM: COM-объекты можно писать на многих языках, так что если в языке есть часть, реализующая стандарт COM, он может вполне пользоваться им.<br>Отдельная возможность, популярная сейчас — языки, компилирующиеся в общий промежуточный код. Например, Java и Sc...</code> |
| <code>Слэши и ковычки после использования stringify Есть подобный скрипт:<br>[code]<br> var output = {<br> lol: [<br> {name: "hahaha"}<br> ]<br> };<br> console.log(output);<br> output = JSON.stringify(output);<br> console.log(output);<br>[/code]<br>в итоге получаем<br>почему он вставил слэши и кавычки там, где не надо?</code> | <code>Может сразу сделать валидный JSON<br>[code]<br> var output = {<br> lol: {name: "hahaha"}<br> };<br> console.log(output);<br> output = JSON.stringify(output);<br> console.log(output);<br>[/code]<br>Правда я незнаю что за переменная `name`</code> |
| <code>Оптимизация поиска числа в списке Есть функция. Она принимает число от 1 до 9 (мы ищем, есть ли оно в списке), и список, в котором мы его ищем)<br>[code]<br> def is_number_already_in(number, line):<br> equality = False<br> for i in line:<br> if i == number:<br> equality = True<br> if equality:<br> return True<br> else:<br> return False<br>[/code]<br>Как можно этот код оптимизировать и как называется способ (тема) оптимизации, чтобы я мог загуглить<br>Только не через лямбду, пожалуйста)</code> | <code>><br>[code]<br>> if equality:<br>> return True<br>> else:<br>> return False<br>><br>[/code]<br>[code]<br> return equality<br>[/code]<br>><br>[code]<br>> equality = False<br>> for i in line:<br>> if i == number:<br>> equality = True<br>><br>[/code]<br>[code]<br> equality = any(i == number for i in line)<br>[/code]<br>Всё целиком:<br>[code]<br> def is_number_already_in(number, line):<br> return any(i == number for i in line)<br>[/code]<br>Хотя на самом деле вроде бы можно гораздо проще<br>[code]<br> def is_number_already_in(number, line):<br> return number in line<br>[/code]<br>PS: Не проверял, но в любом случае идея должна быть понятна.</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
```json
{
"loss": "CachedMultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
```
</details>
### Evaluation Datasets
<details><summary>codesearchnet</summary>
#### codesearchnet
* Dataset: [codesearchnet](https://huggingface.co/datasets/fyaronskiy/code_search_net_ru_en) at [3f90200](https://huggingface.co/datasets/fyaronskiy/code_search_net_ru_en/tree/3f9020072f2e6d5ac5445b39e566e5b669a1661b)
* Size: 30,000 evaluation samples
* Columns: <code>ru_func_documentation_string</code> and <code>func_code_string</code>
* Approximate statistics based on the first 1000 samples:
| | ru_func_documentation_string | func_code_string |
|:--------|:-------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
| type | string | string |
| details | <ul><li>min: 6 tokens</li><li>mean: 194.76 tokens</li><li>max: 1278 tokens</li></ul> | <ul><li>min: 58 tokens</li><li>mean: 580.66 tokens</li><li>max: 8192 tokens</li></ul> |
* Samples:
| ru_func_documentation_string | func_code_string |
|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>Обучить модель deepq.<br><br> Параметры<br> -------<br> env: gym.Env<br> среда для обучения<br> network: строка или функция<br> нейронная сеть, используемая в качестве аппроксиматора функции Q. Если строка, она должна быть одной из имен зарегистрированных моделей в baselines.common.models<br> (mlp, cnn, conv_only). Если функция, она должна принимать тензор наблюдения и возвращать тензор скрытой переменной, которая<br> будет отображена в головы функции Q (см. build_q_func в baselines.deepq.models для деталей по этому поводу)<br> seed: int или None<br> seed генератора случайных чисел. Запуски с одинаковым seed "должны" давать одинаковые результаты. Если None, используется отсутствие семени.<br> lr: float<br> скорость обучения для оптимизатора Adam<br> total_timesteps: int<br> количество шагов среды для оптимизации<br> buffer_size: int<br> размер буфера воспроизведения<br> exploration_fraction: float<br> доля всего периода обучения, в течение которого прои...</code> | <code>def learn(env,<br> network,<br> seed=None,<br> lr=5e-4,<br> total_timesteps=100000,<br> buffer_size=50000,<br> exploration_fraction=0.1,<br> exploration_final_eps=0.02,<br> train_freq=1,<br> batch_size=32,<br> print_freq=100,<br> checkpoint_freq=10000,<br> checkpoint_path=None,<br> learning_starts=1000,<br> gamma=1.0,<br> target_network_update_freq=500,<br> prioritized_replay=False,<br> prioritized_replay_alpha=0.6,<br> prioritized_replay_beta0=0.4,<br> prioritized_replay_beta_iters=None,<br> prioritized_replay_eps=1e-6,<br> param_noise=False,<br> callback=None,<br> load_path=None,<br> **network_kwargs<br> ):<br> """Train a deepq model.<br><br> Parameters<br> -------<br> env: gym.Env<br> environment to train on<br> network: string or a function<br> neural network to use as a q function approximator. If string, has to be one of the ...</code> |
| <code>Сохранить модель в pickle, расположенный по пути `path`</code> | <code>def save_act(self, path=None):<br> """Save model to a pickle located at `path`"""<br> if path is None:<br> path = os.path.join(logger.get_dir(), "model.pkl")<br><br> with tempfile.TemporaryDirectory() as td:<br> save_variables(os.path.join(td, "model"))<br> arc_name = os.path.join(td, "packed.zip")<br> with zipfile.ZipFile(arc_name, 'w') as zipf:<br> for root, dirs, files in os.walk(td):<br> for fname in files:<br> file_path = os.path.join(root, fname)<br> if file_path != arc_name:<br> zipf.write(file_path, os.path.relpath(file_path, td))<br> with open(arc_name, "rb") as f:<br> model_data = f.read()<br> with open(path, "wb") as f:<br> cloudpickle.dump((model_data, self._act_params), f)</code> |
| <code>CNN из статьи Nature.</code> | <code>def nature_cnn(unscaled_images, **conv_kwargs):<br> """<br> CNN from Nature paper.<br> """<br> scaled_images = tf.cast(unscaled_images, tf.float32) / 255.<br> activ = tf.nn.relu<br> h = activ(conv(scaled_images, 'c1', nf=32, rf=8, stride=4, init_scale=np.sqrt(2),<br> **conv_kwargs))<br> h2 = activ(conv(h, 'c2', nf=64, rf=4, stride=2, init_scale=np.sqrt(2), **conv_kwargs))<br> h3 = activ(conv(h2, 'c3', nf=64, rf=3, stride=1, init_scale=np.sqrt(2), **conv_kwargs))<br> h3 = conv_to_fc(h3)<br> return activ(fc(h3, 'fc1', nh=512, init_scale=np.sqrt(2)))</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
```json
{
"loss": "CachedMultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
```
</details>
<details><summary>codesearchnet_en</summary>
#### codesearchnet_en
* Dataset: [codesearchnet_en](https://huggingface.co/datasets/fyaronskiy/code_search_net_ru_en) at [3f90200](https://huggingface.co/datasets/fyaronskiy/code_search_net_ru_en/tree/3f9020072f2e6d5ac5445b39e566e5b669a1661b)
* Size: 30,000 evaluation samples
* Columns: <code>en_func_documentation_string</code> and <code>func_code_string</code>
* Approximate statistics based on the first 1000 samples:
| | en_func_documentation_string | func_code_string |
|:--------|:-------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
| type | string | string |
| details | <ul><li>min: 6 tokens</li><li>mean: 200.33 tokens</li><li>max: 2498 tokens</li></ul> | <ul><li>min: 58 tokens</li><li>mean: 580.66 tokens</li><li>max: 8192 tokens</li></ul> |
* Samples:
| en_func_documentation_string | func_code_string |
|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>Train a deepq model.<br><br> Parameters<br> -------<br> env: gym.Env<br> environment to train on<br> network: string or a function<br> neural network to use as a q function approximator. If string, has to be one of the names of registered models in baselines.common.models<br> (mlp, cnn, conv_only). If a function, should take an observation tensor and return a latent variable tensor, which<br> will be mapped to the Q function heads (see build_q_func in baselines.deepq.models for details on that)<br> seed: int or None<br> prng seed. The runs with the same seed "should" give the same results. If None, no seeding is used.<br> lr: float<br> learning rate for adam optimizer<br> total_timesteps: int<br> number of env steps to optimizer for<br> buffer_size: int<br> size of the replay buffer<br> exploration_fraction: float<br> fraction of entire training period over which the exploration rate is annealed<br> exploration_final_eps: float<br> final value of ra...</code> | <code>def learn(env,<br> network,<br> seed=None,<br> lr=5e-4,<br> total_timesteps=100000,<br> buffer_size=50000,<br> exploration_fraction=0.1,<br> exploration_final_eps=0.02,<br> train_freq=1,<br> batch_size=32,<br> print_freq=100,<br> checkpoint_freq=10000,<br> checkpoint_path=None,<br> learning_starts=1000,<br> gamma=1.0,<br> target_network_update_freq=500,<br> prioritized_replay=False,<br> prioritized_replay_alpha=0.6,<br> prioritized_replay_beta0=0.4,<br> prioritized_replay_beta_iters=None,<br> prioritized_replay_eps=1e-6,<br> param_noise=False,<br> callback=None,<br> load_path=None,<br> **network_kwargs<br> ):<br> """Train a deepq model.<br><br> Parameters<br> -------<br> env: gym.Env<br> environment to train on<br> network: string or a function<br> neural network to use as a q function approximator. If string, has to be one of the ...</code> |
| <code>Save model to a pickle located at `path`</code> | <code>def save_act(self, path=None):<br> """Save model to a pickle located at `path`"""<br> if path is None:<br> path = os.path.join(logger.get_dir(), "model.pkl")<br><br> with tempfile.TemporaryDirectory() as td:<br> save_variables(os.path.join(td, "model"))<br> arc_name = os.path.join(td, "packed.zip")<br> with zipfile.ZipFile(arc_name, 'w') as zipf:<br> for root, dirs, files in os.walk(td):<br> for fname in files:<br> file_path = os.path.join(root, fname)<br> if file_path != arc_name:<br> zipf.write(file_path, os.path.relpath(file_path, td))<br> with open(arc_name, "rb") as f:<br> model_data = f.read()<br> with open(path, "wb") as f:<br> cloudpickle.dump((model_data, self._act_params), f)</code> |
| <code>CNN from Nature paper.</code> | <code>def nature_cnn(unscaled_images, **conv_kwargs):<br> """<br> CNN from Nature paper.<br> """<br> scaled_images = tf.cast(unscaled_images, tf.float32) / 255.<br> activ = tf.nn.relu<br> h = activ(conv(scaled_images, 'c1', nf=32, rf=8, stride=4, init_scale=np.sqrt(2),<br> **conv_kwargs))<br> h2 = activ(conv(h, 'c2', nf=64, rf=4, stride=2, init_scale=np.sqrt(2), **conv_kwargs))<br> h3 = activ(conv(h2, 'c3', nf=64, rf=3, stride=1, init_scale=np.sqrt(2), **conv_kwargs))<br> h3 = conv_to_fc(h3)<br> return activ(fc(h3, 'fc1', nh=512, init_scale=np.sqrt(2)))</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
```json
{
"loss": "CachedMultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
```
</details>
<details><summary>codesearchnet_pairs</summary>
#### codesearchnet_pairs
* Dataset: [codesearchnet_pairs](https://huggingface.co/datasets/fyaronskiy/code_search_net_ru_en) at [3f90200](https://huggingface.co/datasets/fyaronskiy/code_search_net_ru_en/tree/3f9020072f2e6d5ac5445b39e566e5b669a1661b)
* Size: 30,000 evaluation samples
* Columns: <code>en_func_documentation_string</code>, <code>ru_func_documentation_string</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
| | en_func_documentation_string | ru_func_documentation_string | label |
|:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:--------------------------------------------------------------|
| type | string | string | float |
| details | <ul><li>min: 6 tokens</li><li>mean: 200.33 tokens</li><li>max: 2498 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 194.76 tokens</li><li>max: 1278 tokens</li></ul> | <ul><li>min: 1.0</li><li>mean: 1.0</li><li>max: 1.0</li></ul> |
* Samples:
| en_func_documentation_string | ru_func_documentation_string | label |
|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
| <code>Train a deepq model.<br><br> Parameters<br> -------<br> env: gym.Env<br> environment to train on<br> network: string or a function<br> neural network to use as a q function approximator. If string, has to be one of the names of registered models in baselines.common.models<br> (mlp, cnn, conv_only). If a function, should take an observation tensor and return a latent variable tensor, which<br> will be mapped to the Q function heads (see build_q_func in baselines.deepq.models for details on that)<br> seed: int or None<br> prng seed. The runs with the same seed "should" give the same results. If None, no seeding is used.<br> lr: float<br> learning rate for adam optimizer<br> total_timesteps: int<br> number of env steps to optimizer for<br> buffer_size: int<br> size of the replay buffer<br> exploration_fraction: float<br> fraction of entire training period over which the exploration rate is annealed<br> exploration_final_eps: float<br> final value of ra...</code> | <code>Обучить модель deepq.<br><br> Параметры<br> -------<br> env: gym.Env<br> среда для обучения<br> network: строка или функция<br> нейронная сеть, используемая в качестве аппроксиматора функции Q. Если строка, она должна быть одной из имен зарегистрированных моделей в baselines.common.models<br> (mlp, cnn, conv_only). Если функция, она должна принимать тензор наблюдения и возвращать тензор скрытой переменной, которая<br> будет отображена в головы функции Q (см. build_q_func в baselines.deepq.models для деталей по этому поводу)<br> seed: int или None<br> seed генератора случайных чисел. Запуски с одинаковым seed "должны" давать одинаковые результаты. Если None, используется отсутствие семени.<br> lr: float<br> скорость обучения для оптимизатора Adam<br> total_timesteps: int<br> количество шагов среды для оптимизации<br> buffer_size: int<br> размер буфера воспроизведения<br> exploration_fraction: float<br> доля всего периода обучения, в течение которого прои...</code> | <code>1.0</code> |
| <code>Save model to a pickle located at `path`</code> | <code>Сохранить модель в pickle, расположенный по пути `path`</code> | <code>1.0</code> |
| <code>CNN from Nature paper.</code> | <code>CNN из статьи Nature.</code> | <code>1.0</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
```json
{
"loss": "CoSENTLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
```
</details>
<details><summary>solyanka_qa</summary>
#### solyanka_qa
* Dataset: [solyanka_qa](https://huggingface.co/datasets/ai-forever/solyanka) at [deeac62](https://huggingface.co/datasets/ai-forever/solyanka/tree/deeac621d4142d2754fa28f0eb58502b966383c3)
* Size: 5,000 evaluation samples
* Columns: <code>anchor</code> and <code>positive</code>
* Approximate statistics based on the first 1000 samples:
| | anchor | positive |
|:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
| type | string | string |
| details | <ul><li>min: 17 tokens</li><li>mean: 200.35 tokens</li><li>max: 533 tokens</li></ul> | <ul><li>min: 19 tokens</li><li>mean: 202.53 tokens</li><li>max: 525 tokens</li></ul> |
* Samples:
| anchor | positive |
|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>Atom IDE произвольное изменение строк Пользуюсь Atom IDE, установлены плагины для GIT'а, использую тему Material theme (может быть кому то это что то даст), в общем проблема такая, что в php файлах при сохранении файла, даже если я изменил всего один символ, он добавляет изменения очень странные,берет 2-3 строки (хз как выбирает) и удаляет их, а потом вставялет их же, без каких то либо изменений. При этом GIT фиксирует это изменение...<br>Вот скрин в blob формате: "blob:https://web.telegram.org/04094604-204d-47b0-a083-f8cd090bdfa0"</code> | <code>Проблема заключалась в том, что все IDE испльзуют свой символ перехода на следующую строку, если в команде разработчики используют разные IDE, у которых разный перенос строки, то при сохранении файла чужие переносы строк будут заменяться на свои :)</code> |
| <code>print() с частью текста и форматированием как переменная Python3 Есть повторяющаяся функция `print('\n' + f'{" ЗАПУСКАЕМ ТЕСТ ":=^120}' + '\n')`<br>на выходе получаем чтото типа<br>================ ЗАПУСКАЕМ ТЕСТ ================<br>или с другим текстом<br>================= КОНЕЦ ТЕСТА ==================<br>Текст внутри может меняться, форматирование - нет.<br>Как обернуть `print('\n' + f'{"":=^120}' + '\n')` в переменную, с возможностью подставлять нужный текст, типа `print_var('ПРИМЕР ТЕКСТА')`?</code> | <code>[code]<br> def print_var(str):<br> print(f'\n{" " + str + " ":=^120}\n')<br>[/code]<br>В результате:<br>[code]<br> >>> print_var('КАКОЙ_ТО ТЕКСТ')<br> ===================================================== КАКОЙ_ТО ТЕКСТ =====================================================<br>[/code]</code> |
| <code>Не получается перегрузить оператор присваивания в шаблонном классе Нужно перегрузить оператор присваивания в шаблонном классе, не могу понять, почему не работает стандартный синтаксис, при реализации выдает эту ошибку (/home/anton/Programming/tree/tree.h:96: ошибка: overloaded 'operator=' must be a binary operator (has 1 parameter)). Объявление и реализация в одном .h файле.<br>Объявление:<br>[code]<br> tree<T>& operator = (tree<T> &other);<br>[/code]<br>реалицация:<br>[code]<br> template <class T><br> tree<T>& operator = (tree<T> &other)<br> {<br> }<br>[/code]</code> | <code>Ну надо указать, какому классу он принадлежит... А так вы пытались реализовать унарный оператор `=`...<br>[code]<br> template <class T><br> tree<T>& tree<T>::operator = (tree<T> &other)<br> {<br> }<br>[/code]<br>И еще - вы точно планируете при присваивании менять присваиваемое? Может, лучше<br>[code]<br> template <class T><br> tree<T>& tree<T>::operator = (const tree<T> &other)<br> {<br> }<br>[/code]</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
```json
{
"loss": "CachedMultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
```
</details>
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 4
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 32
- `learning_rate`: 2e-05
- `num_train_epochs`: 2
- `warmup_ratio`: 0.1
- `bf16`: True
- `resume_from_checkpoint`: ../models/RuModernBERT-base_bs128_lr_2e-05_2nd_epoch/checkpoint-27400
- `auto_find_batch_size`: True
- `batch_sampler`: no_duplicates
#### All Hyperparameters
<details><summary>Click to expand</summary>
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 4
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 32
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 2
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: ../models/RuModernBERT-base_bs128_lr_2e-05_2nd_epoch/checkpoint-27400
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: True
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}
</details>
### Framework Versions
- Python: 3.10.11
- Sentence Transformers: 5.1.2
- Transformers: 4.52.3
- PyTorch: 2.6.0+cu124
- Accelerate: 1.12.0
- Datasets: 4.0.0
- Tokenizers: 0.21.4
## Citation
### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
```
#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
#### CachedMultipleNegativesRankingLoss
```bibtex
@misc{gao2021scaling,
title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
year={2021},
eprint={2101.06983},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
#### CoSENTLoss
```bibtex
@article{10531646,
author={Huang, Xiang and Peng, Hao and Zou, Dongcheng and Liu, Zhiwei and Li, Jianxin and Liu, Kay and Wu, Jia and Su, Jianlin and Yu, Philip S.},
journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
title={CoSENT: Consistent Sentence Embedding via Similarity Ranking},
year={2024},
doi={10.1109/TASLP.2024.3402087}
}
```
<!--
## Glossary
*Clearly define terms in order to be accessible across audiences.*
-->
<!--
## Model Card Authors
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->
<!--
## Model Card Contact
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->