Data Sources & API Endpoints
K R&D Lab Cancer Research Suite
Author: Oksana Kolisnyk | kosatiks-group.pp.ua
Repo: github.com/TEZv/K-RnD-Lab-PHYLO-03_2026
Generated: 2026-03-07
Real Data APIs (Group A Tabs)
1. PubMed E-utilities (NCBI)
| Property | Value |
|---|---|
| Base URL | https://eutils.ncbi.nlm.nih.gov/entrez/eutils |
| Auth | None required (free, no API key) |
| Rate limit | 3 requests/sec without key; enforced via time.sleep(0.34) |
| Endpoints used | esearch.fcgi — search & count; esummary.fcgi — fetch metadata |
| Used in tabs | A1 (paper counts per process), A2 (gene paper counts), A4 (papers per year) |
| Docs | https://www.ncbi.nlm.nih.gov/books/NBK25501/ |
| Terms of use | https://www.ncbi.nlm.nih.gov/home/about/policies/ |
Example call (paper count):

    GET https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi
      ?db=pubmed
      &term="ferroptosis" AND "GBM"[tiab]
      &rettype=count
      &retmode=json
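The call above can be sketched in Python as follows. This is a minimal illustration, not the code in app.py: the helper names `build_esearch_url`, `parse_count`, and `throttle` are hypothetical, and the network request itself is left out so the block runs offline. The 0.34 s interval is the one stated in the table.

```python
import json
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"
MIN_INTERVAL_S = 0.34  # keeps calls under 3 requests/sec without an API key


def build_esearch_url(term: str, db: str = "pubmed") -> str:
    """Return an esearch.fcgi URL that asks only for the hit count."""
    params = {"db": db, "term": term, "rettype": "count", "retmode": "json"}
    # urlencode escapes the quotes, spaces and brackets in the search term.
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def parse_count(payload: str) -> int:
    """Extract the hit count from an esearch JSON response body."""
    return int(json.loads(payload)["esearchresult"]["count"])


def throttle(last_call: float, now: float) -> float:
    """Seconds still to wait so two calls are at least MIN_INTERVAL_S apart."""
    return max(0.0, MIN_INTERVAL_S - (now - last_call))
```

A caller would pass the result of `throttle(...)` to `time.sleep` before each request, fetch the URL with `urllib.request.urlopen`, and hand the response body to `parse_count`.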
2. ClinVar E-utilities (NCBI)
| Property | Value |
|---|---|
| Base URL | https://eutils.ncbi.nlm.nih.gov/entrez/eutils |
| Auth | None required |
| Rate limit | Same as PubMed (3 req/sec) |
| Endpoints used | esearch.fcgi?db=clinvar — variant search; esummary.fcgi?db=clinvar — classification |
| Used in tabs | A3 (Real Variant Lookup) |
| Docs | https://www.ncbi.nlm.nih.gov/clinvar/docs/api_http/ |
| Data policy | All ClinVar data is public domain |
Example call:

    GET https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi
      ?db=clinvar
      &term=NM_007294.4:c.5266dupC
      &retmode=json
      &retmax=5
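The table lists two steps: esearch to find matching records, then esummary to fetch their details. A rough sketch of the hand-off between the two is shown below. The function names are hypothetical, and the sketch stops at building the esummary URL because the fields read from the esummary response are not described here.

```python
import json
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def extract_ids(esearch_payload: str) -> list[str]:
    """Return the record IDs listed in an esearch JSON response body."""
    return json.loads(esearch_payload)["esearchresult"].get("idlist", [])


def build_esummary_url(ids: list[str], db: str = "clinvar") -> str:
    """Return an esummary.fcgi URL for the given IDs (comma-separated)."""
    params = {"db": db, "id": ",".join(ids), "retmode": "json"}
    return f"{EUTILS_BASE}/esummary.fcgi?{urlencode(params)}"
```

An empty `idlist` means the search term matched no ClinVar record, so the esummary call can be skipped.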
3. OpenTargets Platform GraphQL API
| Property | Value |
|---|---|
| Base URL | https://api.platform.opentargets.org/api/v4/graphql |
| Auth | None required (free, open access) |
| Rate limit | No hard limit; reasonable use expected |
| Endpoints used | GraphQL POST — disease associations, tractability, known drugs |
| Used in tabs | A1 (process associations), A2 (target gap index), A5 (druggable orphans) |
| Docs | https://platform-docs.opentargets.org/data-access/graphql-api |
| Data release | Updated quarterly; cite as "Open Targets Platform [release date]" |
| License | CC0 (public domain) |
Example query (disease-associated targets):

    query AssocTargets($efoId: String!, $size: Int!) {
      disease(efoId: $efoId) {
        associatedTargets(page: {index: 0, size: $size}) {
          rows {
            target { approvedSymbol approvedName }
            score
          }
        }
      }
    }
EFO IDs used:
| Cancer | EFO ID |
|---|---|
| GBM | EFO_0000519 |
| PDAC | EFO_0002618 |
| SCLC | EFO_0000702 |
| UVM | EFO_0004339 |
| DIPG | EFO_0009708 |
| ACC | EFO_0003060 |
| MCC | EFO_0005558 |
| PCNSL | EFO_0005543 |
| Pediatric AML | EFO_0000222 |
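One way to send the query above and read its result is sketched here. The helpers `build_payload` and `top_targets` are hypothetical names, the default page size of 25 is an arbitrary choice, and the HTTP request is omitted so the block runs offline.

```python
import json

OT_URL = "https://api.platform.opentargets.org/api/v4/graphql"

ASSOC_QUERY = """
query AssocTargets($efoId: String!, $size: Int!) {
  disease(efoId: $efoId) {
    associatedTargets(page: {index: 0, size: $size}) {
      rows {
        target { approvedSymbol approvedName }
        score
      }
    }
  }
}
"""


def build_payload(efo_id: str, size: int = 25) -> bytes:
    """Serialize the query and its variables as a GraphQL POST body."""
    body = {"query": ASSOC_QUERY, "variables": {"efoId": efo_id, "size": size}}
    return json.dumps(body).encode("utf-8")


def top_targets(response: dict) -> list[tuple[str, float]]:
    """Return (symbol, score) pairs; an unknown disease ID yields []."""
    disease = (response.get("data") or {}).get("disease") or {}
    rows = (disease.get("associatedTargets") or {}).get("rows", [])
    return [(row["target"]["approvedSymbol"], row["score"]) for row in rows]
```

The payload can be posted with `urllib.request.Request(OT_URL, data=build_payload("EFO_0000519"), headers={"Content-Type": "application/json"})`, using any EFO ID from the table.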
4. gnomAD GraphQL API
| Property | Value |
|---|---|
| Base URL | https://gnomad.broadinstitute.org/api |
| Auth | None required |
| Rate limit | No hard limit; reasonable use expected |
| Endpoints used | GraphQL POST — variantSearch query |
| Dataset | gnomad_r4 (v4, 807,162 individuals) |
| Used in tabs | A3 (Real Variant Lookup — allele frequency) |
| Docs | https://gnomad.broadinstitute.org/api |
| License | ODC Open Database License (ODbL) |
Example query:

    query VariantSearch($query: String!, $dataset: DatasetId!) {
      variantSearch(query: $query, dataset: $dataset) {
        variant_id
        rsids
        exome { af }
        genome { af }
      }
    }
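A variant may be present in only one of the exome and genome callsets, so either block in the result can be missing or null. The sketch below shows one way to reduce a single result, shaped like the query above, to one allele frequency. The function name and the exome-first preference are assumptions for illustration, not a description of what the app does.

```python
from typing import Optional


def allele_frequency(variant: dict) -> Optional[float]:
    """Return the first available AF, checking exome data before genome data."""
    for source in ("exome", "genome"):  # assumed preference order
        block = variant.get(source)
        if block and block.get("af") is not None:
            return block["af"]
    return None  # the variant has no frequency in either callset
```

Returning `None` rather than 0.0 keeps "not observed in the dataset" distinct from "observed at frequency zero".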
5. ClinicalTrials.gov API v2
| Property | Value |
|---|---|
| Base URL | https://clinicaltrials.gov/api/v2 |
| Auth | None required |
| Rate limit | No hard limit documented; polite use recommended |
| Endpoints used | GET /studies — trial search by gene + cancer type |
| Used in tabs | A2 (trial counts per gene), A5 (orphan target trial check) |
| Docs | https://clinicaltrials.gov/data-api/api |
| Data policy | Public domain (US government) |
Example call:

    GET https://clinicaltrials.gov/api/v2/studies
      ?query.term=KRAS GBM
      &pageSize=1
      &format=json
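The example call can be built programmatically as sketched below. The helper names are hypothetical. The `countTotal` parameter and the `totalCount` response field are not part of the example call above; they are included on the assumption that the app needs a total count while requesting only one study per page, so check them against the API docs before relying on them.

```python
import json
from urllib.parse import urlencode

CTGOV_STUDIES = "https://clinicaltrials.gov/api/v2/studies"


def build_studies_url(gene: str, cancer: str) -> str:
    """Return a /studies URL that searches for a gene and cancer type together."""
    params = {
        "query.term": f"{gene} {cancer}",
        "pageSize": 1,
        "format": "json",
        "countTotal": "true",  # assumption: not in the documented example call
    }
    # urlencode turns the space in "KRAS GBM" into "+".
    return f"{CTGOV_STUDIES}?{urlencode(params)}"


def trial_count(payload: str) -> int:
    """Read the total number of matching studies, defaulting to 0 if absent."""
    return int(json.loads(payload).get("totalCount", 0))
```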
6. DepMap Public Data
| Property | Value |
|---|---|
| Source | Broad Institute DepMap Portal |
| URL | https://depmap.org/portal/download/all/ |
| File | CRISPR_gene_effect.csv (Chronos scores) |
| Auth | None required (public download) |
| Used in tabs | A2 (essentiality scores for gap index) |
| Score convention | Negative = essential (0 = no effect; −1 = median effect of common essential genes); inverted in app per know-how guide |
| License | CC BY 4.0 |
| Citation | Broad Institute DepMap, [release]. DepMap Public [release]. figshare. |
Implementation note: The app uses a curated reference gene set with representative scores as a lightweight proxy. For full analysis, download the complete CRISPR_gene_effect.csv (~500 MB) from depmap.org and replace _load_depmap_sample() in app.py.
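A replacement loader might look roughly like the sketch below. It rests on several assumptions that should be checked against the downloaded file: that the first column holds the cell-line ID, that gene columns are named like "KRAS (3845)", and that "inverted" means a simple sign flip. The function names are hypothetical, and the sketch takes CSV text rather than a path so it can be run on a small sample.

```python
import csv
import io
from statistics import mean


def mean_gene_effect(csv_text: str) -> dict[str, float]:
    """Average each gene column across cell lines, skipping empty cells."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Column 0 is assumed to be the cell-line ID; keep only the gene symbol
    # that precedes the space in headers such as "KRAS (3845)".
    genes = [name.split(" ")[0] for name in header[1:]]
    columns: list[list[float]] = [[] for _ in genes]
    for row in reader:
        for values, cell in zip(columns, row[1:]):
            if cell:
                values.append(float(cell))
    return {gene: mean(vals) for gene, vals in zip(genes, columns) if vals}


def essentiality(gene_effect: float) -> float:
    """Flip the sign so larger values mean more essential (assumed inversion)."""
    return -gene_effect
```

For the full ~500 MB file, passing an open file handle to `csv.reader` instead of an in-memory string avoids holding the whole text in memory.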
Simulated Data Sources (Group B Tabs)
All Group B tabs use rule-based computational models — no external APIs.
| Tab | Model Type | Basis |
|---|---|---|
| B1 — miRNA Explorer | Curated lookup table | Published miRNA-target databases (miRDB, TargetScan concepts) |
| B2 — siRNA Targets | Curated efficacy estimates | Published siRNA screen literature |
| B3 — LNP Corona | Langmuir adsorption model | Corona proteomics literature (Monopoli et al. 2012; Lundqvist et al. 2017) |
| B4 — Flow Corona | Competitive Langmuir kinetics | Vroman effect literature (Vroman 1962; Hirsh et al. 2013) |
| B5 — Variant Concepts | ACMG/AMP 2015 rule set | Richards et al. 2015 ACMG guidelines |
⚠️ All Group B outputs are labeled SIMULATED in the UI and must not be used for clinical or research decisions.
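For readers unfamiliar with the Langmuir models named for B3 and B4, the equilibrium forms are sketched below. This covers only the textbook isotherms, not the app's code and not the time-dependent kinetics of B4. The parameter names and the protein labels in the usage line are placeholders.

```python
def langmuir_coverage(conc: float, k_eq: float) -> float:
    """Single-species Langmuir isotherm: theta = K*c / (1 + K*c)."""
    return k_eq * conc / (1.0 + k_eq * conc)


def competitive_coverage(
    conc: dict[str, float], k_eq: dict[str, float]
) -> dict[str, float]:
    """Competitive Langmuir: theta_i = K_i*c_i / (1 + sum_j K_j*c_j)."""
    denom = 1.0 + sum(k_eq[name] * c for name, c in conc.items())
    return {name: k_eq[name] * c / denom for name, c in conc.items()}
```

For example, `competitive_coverage({"albumin": 1.0, "apoE": 1.0}, {"albumin": 1.0, "apoE": 2.0})` gives the higher-affinity species twice the surface share of the other, and the fractions always sum to less than 1.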
RAG Chatbot (Tab A6)
| Property | Value |
|---|---|
| Embedding model | all-MiniLM-L6-v2 (sentence-transformers) |
| Model size | ~80 MB, CPU-compatible |
| Vector index | FAISS IndexFlatIP (cosine similarity on L2-normalized vectors) |
| Corpus | 20 curated paper abstracts (see PAPER_CORPUS in chatbot.py) |
| Source | PubMed abstracts (public domain) |
| External API | None; fully offline after model download |
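The "cosine similarity on L2-normalized vectors" entry relies on the fact that the inner product of two unit-length vectors equals their cosine similarity, which is why an inner-product index can rank by cosine. The pure-Python sketch below demonstrates that equivalence on toy vectors. It deliberately avoids FAISS and sentence-transformers so it runs without downloads, and every function name in it is illustrative.

```python
import math


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def inner_product(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity computed directly from the definition."""
    return inner_product(a, b) / math.sqrt(inner_product(a, a) * inner_product(b, b))


def rank(query: list[float], corpus: list[list[float]]) -> list[int]:
    """Corpus indices ordered by inner product of normalized vectors, best first."""
    q = l2_normalize(query)
    scores = [inner_product(q, l2_normalize(doc)) for doc in corpus]
    return sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
```

In the real pipeline the vectors would be the 384-dimensional embeddings produced by all-MiniLM-L6-v2, normalized once before being added to the index.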
20 Indexed PMIDs (all verified against PubMed esummary + efetch, 2026-03-07):
| PMID | First Author | Topic | Journal | Year |
|---|---|---|---|---|
| 34394960 | Hou X | LNP mRNA delivery review | Nat Rev Mater | 2021 |
| 32251383 | Cheng Q | SORT LNPs organ selectivity | Nat Nanotechnol | 2020 |
| 29653760 | Sabnis S | Novel amino lipid series for mRNA | Mol Ther | 2018 |
| 22782619 | Jayaraman M | Ionizable lipid siRNA LNP potency | Angew Chem Int Ed | 2012 |
| 33208369 | Rosenblum D | CRISPR-Cas9 LNP cancer therapy | Sci Adv | 2020 |
| 18809927 | Lundqvist M | Nanoparticle size/surface protein corona | PNAS | 2008 |
| 22086677 | Walkey CD | Nanomaterial-protein interactions | Chem Soc Rev | 2012 |
| 31565943 | Park M | Accessible surface area nanoparticle corona | Nano Lett | 2019 |
| 33754708 | Sebastiani F | ApoE binding drives LNP rearrangement | ACS Nano | 2021 |
| 20461061 | Akinc A | Endogenous ApoE-mediated LNP liver delivery | Mol Ther | 2010 |
| 30096302 | Bailey MH | Cancer driver genes TCGA pan-cancer | Cell | 2018 |
| 30311387 | Landrum MJ | ClinVar at five years | Hum Mutat | 2018 |
| 32461654 | Karczewski KJ | gnomAD mutational constraint 141,456 humans | Nature | 2020 |
| 27328919 | Bouaoun L | TP53 variations IARC database | Hum Mutat | 2016 |
| 31820981 | Lanman BA | KRAS G12C covalent inhibitor AMG 510 | J Med Chem | 2020 |
| 28678784 | Sahin U | Personalized RNA mutanome vaccines | Nature | 2017 |
| 31348638 | Kozma GT | Anti-PEG IgM complement activation LNP | ACS Nano | 2019 |
| 33016924 | Cafri G | mRNA neoantigen T cell immunity GI cancer | J Clin Invest | 2020 |
| 31142840 | Cristiano S | Genome-wide cfDNA fragmentation in cancer | Nature | 2019 |
| 33883548 | Larson MH | Cell-free transcriptome tissue biomarkers | Nat Commun | 2021 |
Caching System
All real API calls are cached locally to reduce latency and respect rate limits.
| Property | Value |
|---|---|
| Cache directory | ./cache/ |
| TTL | 24 hours |
| Key format | {endpoint}_{md5(query)}.json |
| Format | JSON |
| Invalidation | Automatic on TTL expiry; manual by deleting ./cache/ |
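The key format and TTL in the table could be implemented along the lines below. This is a sketch under stated assumptions, not the app's cache code: the function names are hypothetical, and entry age is judged by file modification time, which the table does not specify.

```python
import hashlib
import json
import os
import time

CACHE_DIR = "./cache"
TTL_SECONDS = 24 * 60 * 60  # 24 hours


def cache_path(endpoint: str, query: str, cache_dir: str = CACHE_DIR) -> str:
    """Build the {endpoint}_{md5(query)}.json path for a request."""
    digest = hashlib.md5(query.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, f"{endpoint}_{digest}.json")


def cache_get(endpoint, query, cache_dir=CACHE_DIR, now=None):
    """Return the cached JSON value, or None if it is missing or expired."""
    path = cache_path(endpoint, query, cache_dir)
    if not os.path.exists(path):
        return None
    now = time.time() if now is None else now
    # Assumption: age is measured from the file's last modification time.
    if now - os.path.getmtime(path) > TTL_SECONDS:
        return None
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def cache_set(endpoint, query, data, cache_dir=CACHE_DIR):
    """Write a JSON-serializable value to the cache, creating the directory."""
    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_path(endpoint, query, cache_dir), "w", encoding="utf-8") as fh:
        json.dump(data, fh)
```

MD5 serves here only to produce a short, filesystem-safe file name from an arbitrary query string, not as a security measure.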
Lab Journal
| Property | Value |
|---|---|
| File | ./lab_journal.csv |
| Format | CSV (timestamp, tab, action, result_summary, note) |
| Auto-logged | Every tab run automatically logs an entry |
| Manual notes | Via sidebar note field |
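Appending a journal row with the five columns listed above might look like this. The function name, the UTC ISO-8601 timestamp format, and the header row are assumptions, since the table gives only the column names.

```python
import csv
import os
from datetime import datetime, timezone

JOURNAL_PATH = "./lab_journal.csv"
FIELDS = ["timestamp", "tab", "action", "result_summary", "note"]


def log_entry(tab, action, result_summary, note="", path=JOURNAL_PATH):
    """Append one row to the journal, writing a header if the file is new."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
            "tab": tab,
            "action": action,
            "result_summary": result_summary,
            "note": note,
        })
```

Using the `csv` module rather than manual string joins keeps commas and quotes inside summaries or notes from corrupting the row layout.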
Data Sources documented by K R&D Lab Cancer Research Suite | 2026-03-07