SentenceTransformer based on BAAI/bge-small-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-small-en-v1.5. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-small-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: https://github.com/UKPLab/sentence-transformers
Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
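The pipeline above reduces to two simple operations after the transformer: module (1) keeps only the first ([CLS]) token's hidden state, and module (2) rescales it to unit L2 norm. A minimal numpy sketch of those two modules, using a random tensor as a stand-in for the BertModel output:

```python
import numpy as np

# Stand-in for module (0)'s output: 3 sequences, 10 tokens, 384-dim states.
hidden = np.random.default_rng(0).normal(size=(3, 10, 384))

# Module (1): pooling_mode_cls_token=True -> keep only the first ([CLS]) token.
cls = hidden[:, 0, :]

# Module (2): Normalize() -> rescale each sentence vector to unit L2 norm.
emb = cls / np.linalg.norm(cls, axis=1, keepdims=True)

print(emb.shape)                     # (3, 384)
print(np.linalg.norm(emb, axis=1))   # all 1.0
```

Because of module (2), every embedding this model produces has length 1, which is what makes cosine similarity and dot-product similarity interchangeable downstream.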

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("JALLAJ/5epo")
# Run inference
sentences = [
    'What is required to access soliton states in terms of the pump laser frequency tuning direction?',
    'microcomb into a THz wave. The optical power of the auxiliary light is monitored by a third photodiode (not shown in Fig. 1(a)). The frequency of the auxiliary laser is tuned into its resonance from the blue detuned side and fixed on the blue side of the resonance. Note that, the auxiliary laser is free-running without feedback control on the laser frequency or the power during the soliton generation. By optimizing the laser detuning and optical power of the auxiliary laser, soliton states can be accessed by slowly tuning the pump laser frequency into a soliton regime from the blue detuned side. A detailed description of the soliton generation process can be found in the Ref. [21].',
    'To gain further insights into the CW and CCW fields, we define modified detuning as the frequency difference between the pump laser and the XPM shifted resonance, given by  \n$$\n\\Delta\\omega_{\\mathrm{mod,CW}}=\\Delta\\omega-(2-f_{R})P_{\\mathrm{CCW}}\n$$  \n$$\n\\Delta\\omega_{\\mathrm{mod,CCW}}=\\Delta\\omega-(2-f_{R})P_{\\mathrm{CW}}\n$$  \nwhere $P_{\\mathrm{CCW}}={\\overline{{|B|^{2}}}}$ and $P_{\\mathrm{CW}}={\\overline{{|A|^{2}}}}$ are the average power in the corresponding directions. By substituting the modified detunings into Eqs. (4) and (5), the equations become a form similar to the unidirectionally driven Lugiato-Lefever equation [31].  \nFor a general model with unidirectional pump, the soliton peak power is mainly determined by the cold cavity detuning [31,32]. A larger detuning corresponds to a higher soliton peak power. As illustrated in Fig. 4, if the CW direction has a larger pump power than the CCW direction, the CW intracavity power will be higher, i.e., $P_{\\mathrm{CW}}{>}P_{\\mathrm{CCW}}$ . Thus the modified detuning $\\Delta\\omega_{\\mathrm{mod,CW}}{>}\\Delta\\omega_{\\mathrm{mod,CCW}}$ . As a consequence, the soliton peak power in the CW direction becomes larger than that in the CCW direction. Therefore, by tuning the MZI to change the pump splitting ratio, different modified detunings are introduced in the CW and CCW directions through the XPM effect, leading to different soliton peak power in the two directions. The evolution of the average intracavity power and soliton peak power shown in Figs. 3(i) and 3(j) is consistent with the above analysis. To further validate the conclusion, we run simulations with different pump spliting ratios and calculate the modified detunings. Figure 5(a) illustrates the relationship between the pump splitting ratio and the modified detuning. Figure 5(b) illustrates the relationship between the soliton peak power and the modified detuning. 
Due to symmetry, the curves are degenerated in the CW and CCW directions.  \nIt has been known that the soliton group velocity and repetition rate can be changed due to Raman induced soliton self-frequency shift [17]. Therefore, the different soliton peak power in the CW and CCW directions will cause different Raman self-frequency shifts and different soliton repetition rates. Theoretically, the normalized repetition rate difference is related to the detuning by [17]  \n$$',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5185, 0.1030],
#         [0.5185, 1.0000, 0.2032],
#         [0.1030, 0.2032, 1.0000]])
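Since the model's outputs are unit-normalized (the Normalize() module above), the cosine similarity matrix that model.similarity computes by default is just the dot product of the embedding matrix with itself. A sketch of that equivalence with random unit vectors standing in for real embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
emb = rng.normal(size=(3, 384))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # unit vectors, like this model's output

# For unit vectors, cosine similarity is just a dot product, so this matrix
# has the same structure as the tensor printed above.
sims = emb @ emb.T
print(np.allclose(np.diag(sims), 1.0))   # True: each vector's self-similarity is 1
print(np.allclose(sims, sims.T))         # True: the matrix is symmetric
```

This is why the printed tensor above has ones on the diagonal and mirrored off-diagonal entries.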

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,466 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:

    |      | sentence_0   | sentence_1    | sentence_2    |
    |------|--------------|---------------|---------------|
    | type | string       | string        | string        |
    | min  | 13 tokens    | 90 tokens     | 25 tokens     |
    | mean | 27.84 tokens | 402.23 tokens | 394.46 tokens |
    | max  | 77 tokens    | 512 tokens    | 512 tokens    |
  • Samples:
    Sample 1:
      • sentence_0: How does the behavior of front solutions differ between high and low drive powers in a normal dispersion Kerr resonator without spectral filtering?
      • sentence_1: Here, we develop some insight into the difference between dark solitons in the LLE and LLE-F as well as into the formation of chirped-pulse solitons unique to the LLE-F. For fixed detuning, Fig. 7 indicates that the drive power is a suitable parameter for traversing between the different solution types. We therefore examine the variation of steady-state solutions along the dashed line in Fig. 7. For a normal dispersion Kerr resonator without spectral filtering, front solutions (also known as domain walls or switching waves) often move in the reference frame of the driving field [19,54,55,66]. To examine the moving properties of front solutions, we initialize the simulation with a two-front intensity variation in the time domain. The equation is numerically solved with this initial condition and examined as a function of propagation distance until the waveform converges. At large drive powers without a filter [Fig. 8(a) and a from Fig. 7], the front solutions move together and vanish to...
      • sentence_2: Dissipative solitons are self-localised structures resulting from the double balance of dispersion by nonlinearity and dissipation by a driving force arising in numerous systems. In Kerr-nonlinear optical resonators, temporal solitons permit the formation of light pulses in the cavity and the generation of coherent optical frequency combs. Apart from shape-invariant stationary solitons, these systems can support breathing dissipative solitons exhibiting a periodic oscillatory behaviour. Here, we generate and study single and multiple breathing solitons in coherently driven microresonators. We present a deterministic route to induce soliton breathing, allowing a detailed exploration of the breathing dynamics in two microresonator platforms. We measure the relation between the breathing frequency and two control parameters, pump laser power and effective detuning, and observe transitions to higher periodicity, irregular oscillations and switching, in agreement with numerical predictions....

    Sample 2:
      • sentence_0: What are the key advantages of microcombs that make them suitable for portable applications?
      • sentence_1: Microresonator based optical frequency comb (often termed "microcomb" or "Kerr comb") generation was first demonstrated in 2007 [1]. It quickly attracted great interest and evolved into a hot research area. Microcombs are very promising for portable applications because they have many unique advantages including the capability of generating ultra-broad comb spectra (even more than one octave [2,3]), chip-level integration [4,5], and low power consumption. The basic scheme of microcomb generation is shown in Fig. 1(a). The frequency of a pump laser is tuned into the resonance of a high-quality-factor $(\boldsymbol{Q})$ microresonator which is made of Kerr nonlinear material. When the pump power exceeds some threshold, new frequency lines grow due to parametric gain. More lines are generated through cascaded four-wave mixing between the pump and initial lines, forming a broad frequency comb [6]. Intense studies have been performed to investigate microcomb generation. Various mate...
      • sentence_2: We briefly review the physics of the parametric process in microresonators, discussed in detail in (30, 80). Kerr frequency combs were initially discovered in silica microtoroids, and experiments proved that the parametrically generated (11, 81) sidebands were equidistant to at least one part in $10^{-17}$ as compared with the optical carrier. In these early experiments, the comb's repetition rate was in the terahertz range, and a femtosecond-laser frequency comb was used to bridge and verify the equidistant nature of the teeth spacing. It is today understood that such highly coherent combs only exist in certain regimes.

    Sample 3:
      • sentence_0: What is the formula for determining the number of rolls that appear in the azimuthal direction when the cavity is pumped just above the threshold of modulational instability?
      • sentence_1: Roll patterns emerge from noise after the breakdown of an unstable flat background through modulational instability, when the resonator is pumped above a certain threshold. This mechanism preferably occurs in the regime of anomalous GVD, but rolls can also be sustained in the normal GVD regime, although under very marginal conditions (typically, very large detuning, see refs. [9, 18, 47]). When the pump is below the threshold, there is only one excited mode in the resonator $(l = 0)$, while all the sidemode amplitudes $\mathcal{A}_{l}$ with $l\neq0$ are null. From the spatiotemporal standpoint, the intracavity field is constant (flat background). Under certain conditions, when the pump $F$ is increased beyond a certain threshold value $F_{\mathrm{th}}$, the flat background solution becomes unstable and breaks down into a roll pattern characterized by a periodic modulation of the intracavity power as a function of the azimuthal angle (see Fig. 6). This phenome...
      • sentence_2: Note that $\mathcal{N}_{h}=S\mathcal{M}_{h}S^{-1}$ and $\mathcal{I N}_{h}=S\mathcal{I M}_{h}S^{-1}$, so the spectrum of the full linearized operator, $\mathcal{I N}_{h}$, is equivalent to that of $\mathcal{I M}_{h}$. Also, $\sigma(\mathcal{N}_{h})$ is equivalent to $\sigma(\mathcal{M}_{h})$. Since the two problems are equivalent, we note that the form (4.3) of the eigenvalue problem is more suggestive of our approach. For $h=0$, we have a two-dimensional $\mathrm{Ker}[\mathcal{M}_{0}]$, spanned by the vectors $\big(\begin{array}{c}{\varphi_{0}^{\prime}}\\{0}\end{array}\big)$ and $\big(\begin{array}{c}{0}\\{\varphi_{0}}\end{array}\big)$. We need to see what the evolution of the modulational eigenvalue is as $h:0
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
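MultipleNegativesRankingLoss uses the other positives in a batch as in-batch negatives: cosine scores between each anchor and every candidate are multiplied by the scale (20.0 here) and passed to cross-entropy, with the matching pair on the diagonal as the target. A minimal numpy sketch under those parameters (the mnr_loss helper is illustrative, not the library's implementation):

```python
import numpy as np

def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine scores; row i's positive is column i."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                    # "scale": 20.0, cos_sim
    scores -= scores.max(axis=1, keepdims=True)   # stabilize the softmax
    log_softmax = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_softmax)))  # diagonal = matching pairs

rng = np.random.default_rng(0)
a = rng.normal(size=(8, 384))
# Perfectly matching positives give a near-zero loss; unrelated positives
# give a loss around log(batch_size).
print(mnr_loss(a, a) < mnr_loss(a, rng.normal(size=(8, 384))))  # True
```

With the three-column dataset above, the library additionally appends each sentence_2 as an extra hard-negative candidate column, so every anchor is scored against both the in-batch positives and the explicit negatives.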
    

Training Hyperparameters

Non-Default Hyperparameters

  • num_train_epochs: 5
  • fp16: True
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

| Epoch  | Step | Training Loss |
|--------|------|---------------|
| 2.7174 | 500  | 0.2217        |

Framework Versions

  • Python: 3.9.19
  • Sentence Transformers: 5.1.0
  • Transformers: 4.51.0
  • PyTorch: 2.5.0+cu124
  • Accelerate: 0.34.2
  • Datasets: 2.19.0
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Model: JALLAJ/5epo (finetuned from BAAI/bge-small-en-v1.5)
Model size: 33.4M parameters
Tensor type: F32 (Safetensors)