Create README.md
README.md
ADDED
@@ -0,0 +1,53 @@
---
base_model:
- answerdotai/ModernBERT-large
---

A model for validating software mention candidates.

## Input
The model expects a text passage containing the software mention to be verified. The candidate must be enclosed in `<verify> </verify>` tags. If available, the search results retrieved for that candidate are appended to the input as well.

An example of model input:

```
GOSeq (52). RPKM calculations in features were done using <verify> bedtools </verify> (53). CLIPseq/RNA-seq coverage plots were calculated using cover-ageBed from the bedtools toolset and visualized using custom scripts in


Search results for the target candidate are:

{"title":"bedtools: a powerful toolset for genome arithmetic —","link":"https://bedtools.readthedocs.io/en/latest/","snippet":"As of version 2.18, bedtools is substantially more scalable thanks to improvements we have made in the algorithm used to process datasets that are ..."}

{"title":"The BEDTools suite — bedtools 2.31.0 documentation","link":"https://bedtools.readthedocs.io/en/latest/content/bedtools-suite.html","snippet":"bedtools consists of a suite of sub-commands that are invoked as follows: ... The full list of bedtools sub-commands."}

{"title":"intersect — bedtools 2.31.0 documentation","link":"https://bedtools.readthedocs.io/en/latest/content/tools/intersect.html","snippet":"By default, bedtools intersect will report an overlap between A and B so long as there is at least one base pair is overlapping."}

{"title":"Installation — bedtools 2.31.0 documentation","link":"https://bedtools.readthedocs.io/en/latest/content/installation.html","snippet":"Installing bedtools involves either downloading the source code and compiling it manually, or installing stable release from package managers such as ..."}

{"title":"coverage — bedtools 2.31.0 documentation","link":"https://bedtools.readthedocs.io/en/latest/content/tools/coverage.html","snippet":"For example, bedtools coverage can compute the coverage of sequence alignments (file B) across 1 kilobase (arbitrary) windows (file A) tiling a ..."}

{"title":"GitHub - arq5x/bedtools2: bedtools - the swiss army knife for","link":"https://github.com/arq5x/bedtools2","snippet":"As of version 2.18, bedtools is substantially more scalable thanks to improvements we have made in the algorithm used to process datasets that are ..."}

{"title":"'bedtools' tag wiki - Bioinformatics Stack Exchange","link":"https://bioinformatics.stackexchange.com/tags/bedtools/info","snippet":"For example, bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic ..."}

{"title":"BEDTools: a flexible suite of utilities for comparing genomic","link":"https://pubmed.ncbi.nlm.nih.gov/20110278/","snippet":"BEDTools can be combined with one another as well as with standard UNIX commands, thus facilitating routine genomics tasks as well as pipelines that ..."}

{"title":"BEDTools: The Swiss-Army Tool for Genome Feature Analysis","link":"https://pubmed.ncbi.nlm.nih.gov/25199790/","snippet":"Several protocols are presented for common genomic analyses, demonstrating how simple BEDTools operations may be combined to create bespoke pipelines ..."}

{"title":"dna sequence - How to use bedtools coverage to assess genome","link":"https://stackoverflow.com/questions/53733419/how-to-use-bedtools-coverage-to-assess-genome-assembly","snippet":"I have been attempting to use \\" bedtools coverage\\" command to try and assess the coverage of my genomic assembly and the existance of any inversions ..."}

```
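
The `<verify>` marking shown in the example above could be produced with a small helper like the one below. This is a minimal sketch, not the authors' preprocessing code: the function name `mark_candidate` and the use of character offsets are assumptions made for illustration, and the single space on each side of the candidate simply follows the spacing seen in the example.

```python
def mark_candidate(text: str, start: int, end: int) -> str:
    """Wrap text[start:end] in <verify> tags (hypothetical helper).

    The single spaces inside the tags mirror the README example;
    whether the model requires them is an assumption.
    """
    if not 0 <= start < end <= len(text):
        raise ValueError("candidate span must lie inside the text")
    return f"{text[:start]}<verify> {text[start:end]} </verify>{text[end:]}"


sentence = "RPKM calculations in features were done using bedtools (53)."
start = sentence.find("bedtools")
marked = mark_candidate(sentence, start, start + len("bedtools"))
print(marked)
```

Only one candidate is marked per input here, matching the example, where a later occurrence of "bedtools" in the same passage is left untagged.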

The following Jinja2 template was used to format the inputs for training the model:

{{marked_input_text}}

{% if search_results | length > 0 %}
Search results for the target candidate are:
{% for r in search_results %}
{{r | model_dump_json}}
{% endfor %}
{% endif %}
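
For inference it may help to assemble inputs in the same layout as the training template. The sketch below does this in plain Python rather than rendering the template itself. Several things here are assumptions: the function name `build_model_input`, representing each search result as a plain `dict`, using compact `json.dumps` as a stand-in for the `model_dump_json` filter, and the exact blank-line layout, which is inferred from the example input above.

```python
import json

HEADER = "Search results for the target candidate are:"


def build_model_input(marked_input_text: str, search_results: list[dict]) -> str:
    """Assemble a model input string (hypothetical helper).

    Layout assumed from the README example: the marked text, two blank
    lines, the header, then one compact JSON object per result, each
    separated by a blank line.
    """
    parts = [marked_input_text, "\n\n"]
    if len(search_results) > 0:
        parts.append("\n" + HEADER + "\n")
        for result in search_results:
            # Compact separators and ensure_ascii=False approximate the
            # JSON lines in the example (no spaces, non-ASCII kept as-is).
            line = json.dumps(result, separators=(",", ":"), ensure_ascii=False)
            parts.append("\n" + line + "\n")
        parts.append("\n")
    return "".join(parts)


example = build_model_input(
    "RPKM calculations in features were done using <verify> bedtools </verify> (53).",
    [{"title": "bedtools docs", "link": "https://bedtools.readthedocs.io/en/latest/", "snippet": "..."}],
)
print(example)
```

When no search results are available, the header and result lines are omitted, mirroring the `{% if %}` guard in the template. Trailing whitespace may differ from what the original rendering produced.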