---
base_model:
- answerdotai/ModernBERT-large
---
SoFairVerifier is a model for validating software mention candidates.
## Input
The model expects a text passage containing the software mention candidate to be verified. The candidate must be enclosed in `<verify> </verify>` tags. If available, the search results retrieved for the candidate are appended to the input.
An example input for the model:
```
GOSeq (52). RPKM calculations in features were done using <verify> bedtools </verify> (53). CLIPseq/RNA-seq coverage plots were calculated using cover-ageBed from the bedtools toolset and visualized using custom scripts in
Search results for the target candidate are:
{"title":"bedtools: a powerful toolset for genome arithmetic —","link":"https://bedtools.readthedocs.io/en/latest/","snippet":"As of version 2.18, bedtools is substantially more scalable thanks to improvements we have made in the algorithm used to process datasets that are ..."}
{"title":"The BEDTools suite — bedtools 2.31.0 documentation","link":"https://bedtools.readthedocs.io/en/latest/content/bedtools-suite.html","snippet":"bedtools consists of a suite of sub-commands that are invoked as follows: ... The full list of bedtools sub-commands."}
{"title":"intersect — bedtools 2.31.0 documentation","link":"https://bedtools.readthedocs.io/en/latest/content/tools/intersect.html","snippet":"By default, bedtools intersect will report an overlap between A and B so long as there is at least one base pair is overlapping."}
{"title":"Installation — bedtools 2.31.0 documentation","link":"https://bedtools.readthedocs.io/en/latest/content/installation.html","snippet":"Installing bedtools involves either downloading the source code and compiling it manually, or installing stable release from package managers such as ..."}
{"title":"coverage — bedtools 2.31.0 documentation","link":"https://bedtools.readthedocs.io/en/latest/content/tools/coverage.html","snippet":"For example, bedtools coverage can compute the coverage of sequence alignments (file B) across 1 kilobase (arbitrary) windows (file A) tiling a ..."}
{"title":"GitHub - arq5x/bedtools2: bedtools - the swiss army knife for","link":"https://github.com/arq5x/bedtools2","snippet":"As of version 2.18, bedtools is substantially more scalable thanks to improvements we have made in the algorithm used to process datasets that are ..."}
{"title":"'bedtools' tag wiki - Bioinformatics Stack Exchange","link":"https://bioinformatics.stackexchange.com/tags/bedtools/info","snippet":"For example, bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic ..."}
{"title":"BEDTools: a flexible suite of utilities for comparing genomic","link":"https://pubmed.ncbi.nlm.nih.gov/20110278/","snippet":"BEDTools can be combined with one another as well as with standard UNIX commands, thus facilitating routine genomics tasks as well as pipelines that ..."}
{"title":"BEDTools: The Swiss-Army Tool for Genome Feature Analysis","link":"https://pubmed.ncbi.nlm.nih.gov/25199790/","snippet":"Several protocols are presented for common genomic analyses, demonstrating how simple BEDTools operations may be combined to create bespoke pipelines ..."}
{"title":"dna sequence - How to use bedtools coverage to assess genome","link":"https://stackoverflow.com/questions/53733419/how-to-use-bedtools-coverage-to-assess-genome-assembly","snippet":"I have been attempting to use \\" bedtools coverage\\" command to try and assess the coverage of my genomic assembly and the existance of any inversions ..."}
```
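The tagging step can be sketched in Python as follows. This is a minimal illustration, not the authors' preprocessing code: the helper name `mark_candidate` and the use of character offsets are assumptions, and the single space on each side of the candidate simply mirrors the example input above.

```python
def mark_candidate(text: str, start: int, end: int) -> str:
    """Wrap text[start:end] in <verify> tags (hypothetical helper).

    The spaces inside the tags follow the example input; whether the
    model is sensitive to them is not stated in the model card.
    """
    if not (0 <= start < end <= len(text)):
        raise ValueError("start/end must delimit a non-empty span inside text")
    return f"{text[:start]}<verify> {text[start:end]} </verify>{text[end:]}"


if __name__ == "__main__":
    sentence = "RPKM calculations in features were done using bedtools (53)."
    begin = sentence.index("bedtools")
    print(mark_candidate(sentence, begin, begin + len("bedtools")))
```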
The following Jinja2 template was used to build the inputs for training the model:
```
{{marked_input_text}}
{% if search_results | length > 0 %}
Search results for the target candidate are:
{% for r in search_results %}
{{r | model_dump_json}}
{% endfor %}
{% endif %}
```
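For readers who want to assemble inputs without Jinja2, the template can be approximated in plain Python. This is a sketch under stated assumptions: `build_model_input` is a hypothetical helper, `json.dumps` with compact separators stands in for the `model_dump_json` filter (which presumably wraps Pydantic's `model_dump_json()` method), and the one-result-per-line layout is inferred from the example input. The exact whitespace produced by the original training pipeline may differ.

```python
import json

# Header line taken verbatim from the template.
HEADER = "Search results for the target candidate are:"


def build_model_input(marked_input_text: str, search_results: list[dict]) -> str:
    """Approximate the training template in plain Python (hypothetical helper).

    Each search result is assumed to be a dict with "title", "link" and
    "snippet" keys, matching the JSON objects in the example input.
    """
    lines = [marked_input_text]
    if search_results:  # mirrors `{% if search_results | length > 0 %}`
        lines.append(HEADER)
        for result in search_results:
            # Compact separators and raw non-ASCII output imitate the
            # JSON style seen in the example input.
            lines.append(json.dumps(result, separators=(",", ":"), ensure_ascii=False))
    return "\n".join(lines)


if __name__ == "__main__":
    marked = "done using <verify> bedtools </verify> (53)."
    results = [
        {
            "title": "GitHub - arq5x/bedtools2: bedtools - the swiss army knife for",
            "link": "https://github.com/arq5x/bedtools2",
            "snippet": "As of version 2.18, bedtools is substantially more scalable ...",
        }
    ]
    print(build_model_input(marked, results))
```

When no search results are available, the function returns the marked text alone, which corresponds to the template skipping its `if` branch.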