Data Schema: DDI Checker
27 normalised tables produced by parser/run_all.py from the DrugBank XML.
Key Statistics
| Metric | Full (step1) | Approved-only (step3) |
|---|---|---|
| Drugs | 19,842 | 4,795 |
| DDI pairs (directed) | 2,911,156 | - |
| DDI pairs (undirected) | 1,455,878 | 824,249 |
| Products | 475,225 | 473,660 |
| Polypeptides | 5,394 | 3,439 |
| Interactants (BE-IDs) | 5,449 | 3,458 |
| Pathways | 48,627 | 48,622 |
| References | 43,553 | 35,721 |
| Drug-protein links | 34,931 | 20,700 |
FK Convention
Every drugbank_id column is a FK to drugs.drugbank_id unless noted.
Tables
1. drugs
One row per drug. Scalar fields + inlined ClassyFire classification.
| Column | Description |
|---|---|
| drugbank_id | PK: primary DrugBank ID (e.g. DB00001) |
| name | Drug name |
| drug_type | small molecule or biotech |
| description | Full drug description |
| cas_number | CAS Registry Number |
| unii | FDA Unique Ingredient Identifier |
| average_mass / monoisotopic_mass | Molecular masses (float) |
| state | solid / liquid / gas |
| indication | Therapeutic indications |
| pharmacodynamics | Pharmacodynamics description |
| mechanism_of_action | Mechanism of action |
| toxicity | Toxicity and overdose information |
| metabolism | Metabolic pathway description |
| absorption / half_life / protein_binding | Pharmacokinetic parameters |
| route_of_elimination / volume_of_distribution / clearance | Pharmacokinetic parameters |
| classification_description | ClassyFire description |
| classification_direct_parent / kingdom / superclass / class / subclass | ClassyFire hierarchy |
| created_date / updated_date | Record timestamps |
2. drug_ids
| Column | Description |
|---|---|
| drugbank_id | FK → drugs |
| legacy_id | ID value (DB#####, BIOD#####, BTD#####, APRD#####, EXPT#####, NUTR#####) |
| is_primary | True for the canonical ID that serves as the PK in drugs |
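
One use of drug_ids is mapping a legacy identifier back to the primary DrugBank ID. The sketch below assumes the table has been loaded as a list of dicts with the three columns above. The sample rows and the helper name `build_legacy_index` are hypothetical, and the ID values are illustrative rather than taken from the data.

```python
def build_legacy_index(rows):
    """Map every legacy_id (primary or not) to its primary drugbank_id."""
    return {row["legacy_id"]: row["drugbank_id"] for row in rows}

# Illustrative rows only; real values come from the drug_ids table.
drug_id_rows = [
    {"drugbank_id": "DB00001", "legacy_id": "DB00001", "is_primary": True},
    {"drugbank_id": "DB00001", "legacy_id": "BTD99999", "is_primary": False},
]

legacy_index = build_legacy_index(drug_id_rows)
primary = legacy_index["BTD99999"]  # resolves to the primary ID of the same drug
```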
3. drug_attributes
Catch-all for 9 multi-valued list fields. Filter by attr_type.
| attr_type | value | value2 | value3 |
|---|---|---|---|
| group | approved / withdrawn / experimental / investigational / illicit / nutraceutical / vet_approved | - | - |
| synonym | synonym text | language code | coder |
| affected_organism | organism name | - | - |
| food_interaction | description | - | - |
| sequence | FASTA string | format | - |
| ahfs_code | AHFS code | - | - |
| pdb_entry | PDB ID | - | - |
| classification_alt_parent | ClassyFire alt parent | - | - |
| classification_substituent | ClassyFire substituent | - | - |
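
Because all nine list fields share one table, most queries start by filtering on attr_type. A minimal sketch using only the standard library follows. It assumes the table is a CSV with a drugbank_id column alongside attr_type and value (in line with the FK convention, but not spelled out above). The in-memory sample rows and the helper name `attribute_values` are hypothetical.

```python
import csv
import io

# Tiny in-memory stand-in for the drug_attributes table; rows are illustrative.
SAMPLE = """drugbank_id,attr_type,value,value2,value3
DB00001,group,approved,,
DB00001,synonym,Example synonym,english,
DB00002,group,experimental,,
"""

def attribute_values(rows, attr_type):
    """Map drugbank_id -> list of `value` entries for a single attr_type."""
    out = {}
    for row in rows:
        if row["attr_type"] == attr_type:
            out.setdefault(row["drugbank_id"], []).append(row["value"])
    return out

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
groups = attribute_values(rows, "group")

# Drugs whose group list contains "approved".
approved = sorted(d for d, g in groups.items() if "approved" in g)
```

The same helper works for any other attr_type, for example `attribute_values(rows, "synonym")`.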
4. drug_properties
| Column | Description |
|---|---|
| drugbank_id | FK → drugs |
| property_class | calculated (ChemAxon/ALOGPS) or experimental |
| kind | Property name (logP, SMILES, Melting Point, Water Solubility, IUPAC Name, …) |
| value | Property value |
| source | Source tool (ChemAxon, ALOGPS, …) |
5. external_identifiers
| Column | Description |
|---|---|
| entity_type | drug / polypeptide / salt / drug_link |
| entity_id | PK of the entity |
| resource | Database name (ChEBI, ChEMBL, PubChem, KEGG, BindingDB, PharmGKB, ZINC, RxCUI, HGNC, …) |
| identifier | ID value or URL |
6. references + 7. reference_associations
Globally deduplicated bibliography. Dedup keys: articles → pubmed_id; textbooks → isbn + citation; links → url; attachments → title + url.
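
The dedup keys above can be expressed as a small key function. This is a sketch of the idea, not the parser's actual code. The field name `ref_type` and its values (`article`, `textbook`, `link`, `attachment`) are assumptions made for illustration, as are the sample records.

```python
def dedup_key(ref):
    """Return the tuple that identifies a reference, per reference type."""
    kind = ref["ref_type"]  # hypothetical column name
    if kind == "article":
        return (kind, ref["pubmed_id"])
    if kind == "textbook":
        return (kind, ref["isbn"], ref["citation"])
    if kind == "link":
        return (kind, ref["url"])
    if kind == "attachment":
        return (kind, ref["title"], ref["url"])
    raise ValueError(f"unknown reference type: {kind!r}")

def dedup_references(refs):
    """Keep the first occurrence of each reference, preserving input order."""
    seen = {}
    for ref in refs:
        seen.setdefault(dedup_key(ref), ref)
    return list(seen.values())

# Illustrative records: the two articles share a pubmed_id and collapse to one.
refs = [
    {"ref_type": "article", "pubmed_id": "111", "citation": "cited by drug A"},
    {"ref_type": "article", "pubmed_id": "111", "citation": "cited by drug B"},
    {"ref_type": "link", "url": "https://example.org/x", "title": "X"},
]
unique_refs = dedup_references(refs)
```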
8. salts, 9. products, 10. drug_commercial_entities, 11. mixtures, 12. prices
Standard commercial/formulation tables.
13. categories + 14. drug_categories
Normalized MeSH pharmacological categories.
15. dosages, 16. atc_codes, 17. patents
atc_codes has the full 4-level hierarchy: l1_code/l1_name … l4_code/l4_name.
18. drug_interactions
Core edge table (directed). Both A→B and B→A are stored. Use drug_interactions_dedup.csv for undirected edges.
| Column | Description |
|---|---|
| drugbank_id | Source drug FK |
| interacting_drugbank_id | Target drug FK |
| description | Interaction description text |
19. drug_snp_data, 20. pathways, 21. pathway_members, 22. reactions
Pharmacogenomics (SNP), SMPDB pathways, metabolic reactions.
23. interactants + 24. drug_interactants
Binding entities (targets, enzymes, carriers, transporters).
| Column | Description |
|---|---|
| interactant_id | BE-ID (e.g. BE0000048) |
| role | target / enzyme / carrier / transporter |
| known_action | yes / no / unknown |
| actions | Pipe-delimited (e.g. inhibitor\|substrate) |
| inhibition_strength / induction_strength | Enzyme-only |
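
Since actions is stored as a single pipe-delimited string, it usually needs splitting before filtering. The sketch below assumes rows shaped like the columns above. The helper names and sample rows are hypothetical, and apart from BE0000048 (the example ID given above) the IDs are made up.

```python
def parse_actions(actions):
    """Split the pipe-delimited `actions` field into a list of action names."""
    return [part.strip() for part in (actions or "").split("|") if part.strip()]

def with_action(rows, role, action):
    """Return interactant_ids of rows with the given role that list `action`."""
    return [
        row["interactant_id"]
        for row in rows
        if row["role"] == role and action in parse_actions(row["actions"])
    ]

# Illustrative rows only.
interactant_rows = [
    {"interactant_id": "BE0000048", "role": "enzyme", "actions": "inhibitor|substrate"},
    {"interactant_id": "BE9999998", "role": "enzyme", "actions": "substrate"},
    {"interactant_id": "BE9999999", "role": "target", "actions": ""},
]

inhibited_enzymes = with_action(interactant_rows, "enzyme", "inhibitor")
```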
25. polypeptides, 26. interactant_polypeptides, 27. polypeptide_attributes
UniProt protein records, globally deduplicated. Attributes: synonyms, Pfam domains, GO classifiers.
drug_interactions_dedup.csv
Undirected DDI pairs with integer PK. Present in step2_dedup/ and step3_approved/.
| Column | Description |
|---|---|
| interaction_id | PK: auto-increment integer (1-based) |
| drugbank_id_a | Drug A (lexicographically smaller ID) |
| drugbank_id_b | Drug B (lexicographically larger ID) |
| description | Merged description (both directions joined with \| if they differ) |
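
To make the relationship between drug_interactions and this file concrete, here is one way the collapse from directed to undirected pairs could be done. It is a sketch under assumptions, not the pipeline's actual implementation. In particular, numbering the pairs in sorted order is a choice made here for determinism, and the sample edges are invented.

```python
def dedup_interactions(directed):
    """Collapse directed (source, target, description) rows into undirected pairs."""
    merged = {}
    for src, dst, desc in directed:
        a, b = sorted((src, dst))       # lexicographically smaller ID first
        descs = merged.setdefault((a, b), [])
        if desc not in descs:           # identical descriptions are kept once
            descs.append(desc)
    return [
        {
            "interaction_id": i,        # 1-based integer key
            "drugbank_id_a": a,
            "drugbank_id_b": b,
            "description": "|".join(descs),
        }
        for i, ((a, b), descs) in enumerate(sorted(merged.items()), start=1)
    ]

# Illustrative directed edges: the first pair has identical text in both
# directions, the second pair has differing text.
directed = [
    ("DB00002", "DB00001", "X"),
    ("DB00001", "DB00002", "X"),
    ("DB00001", "DB00003", "A raises B"),
    ("DB00003", "DB00001", "B raised by A"),
]
pairs = dedup_interactions(directed)
```

A pair that appears in only one direction still yields exactly one undirected row, so the function does not require the input to be symmetric.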