Research Areas
Here are some resources on drugs, drug effects, pharmacological pathways, and genetic interactions. All are free and open for academic use (and for most other uses as well). Please acknowledge and cite our work. If you have any questions, please do not hesitate to contact us.
You can browse our publicly available code here and our publicly available resources here.
OnSIDES, side effects extracted from FDA Structured Product Labels
OnSIDES (ON label SIDE effectS resource) is the newest member of the NSIDES family. The initial release (v01) of the OnSIDES database contains adverse reactions and boxed warnings extracted from FDA structured product labels. All labels available for download from DailyMed as of April 2022 were processed in this analysis. In total, 2.7 million adverse reactions were extracted from 42,000 labels for just under 2,000 drug ingredients or combinations of ingredients.
We created OnSIDES using the ClinicalBERT language model and 200 manually curated labels available from Demner-Fushman et al. The model achieves an F1 score of 0.86, AUROC of 0.88, and AUPR of 0.91 at extracting effects from the ADVERSE REACTIONS section of the label, and an F1 score of 0.66, AUROC of 0.71, and AUPR of 0.60 at extracting effects from the BOXED WARNINGS section.
Browse the data at onsidesdb.org.
Read more at nsides.io.
Tanaka, Y., Chen, H.Y., Belloni, P., Gisladottir, U., Kefeli, J., Patterson, J., Srinivasan, A., Zietz, M., Sirdeshmukh, G., Berkowitz, J., Larow Brown, K., Tatonetti, N. (2024). OnSIDES (ON-label SIDE effectS resource) Database: Extracting Adverse Drug Events from Drug Labels using Natural Language Processing Models. medRxiv. doi:10.1101/2024.03.22.24304724
Sex-specific side effects (AwareDX)
Adverse drug effects posing sex-specific risks. Risks in this database were predicted by applying AwareDX to publicly available data. The database contains over 20,000 sex-specific risks spanning over 800 drugs and 300 side effects.
Browse the repository.
Chandak P, Tatonetti NP. Using Machine Learning to Identify Adverse Drug Effects Posing Increased Risk to Women. Patterns (NY). 2020;1(7):100108. doi:10.1016/j.patter.2020.100108
Drug side effects and drug-drug interactions were mined from publicly available data. OffSIDES is a database of drug side effects that were found but are not listed on the official FDA label. TwoSIDES is the only comprehensive database of drug-drug-effect relationships. Over 3,300 drugs and 63,000 combinations are connected to millions of potential adverse reactions.
Read more and access the data at nsides.io.
Tatonetti NP, Ye PP, Daneshjou R, Altman RB. Data-driven prediction of drug effects and interactions. Sci Transl Med. 2012;4(125):125ra31. doi:10.1126/scitranslmed.3003377
Family Relationship and Disease Data
De-identified family data on over 3,000 conditions at two sites. Data are from approximately 1.5 million patients across the two sites, and all identifying information has been removed. Further, each age has been replaced with a random draw from a Poisson distribution with lambda set to the patient's actual age. Data are compatible with the observational heritability estimation software (https://github.com/tatonetti-lab/h2o).
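The age obfuscation described above can be illustrated with a minimal sketch. This is not the code used to prepare the released data: the function names `poisson_draw` and `obfuscate_ages` are hypothetical, and the sampler (Knuth's multiplication method from the standard library's uniform generator) is an assumption chosen only to keep the example self-contained.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) sample using Knuth's multiplication method.

    Suitable for the moderate lambda values that human ages imply.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    # Multiply uniform draws until the running product falls to exp(-lam).
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def obfuscate_ages(ages, seed=None):
    """Replace each age with a Poisson draw whose lambda is the true age.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    return [poisson_draw(age, rng) for age in ages]


if __name__ == "__main__":
    true_ages = [5, 34, 61, 80]
    print(obfuscate_ages(true_ages, seed=0))
```

Because a Poisson distribution has mean and variance both equal to lambda, the replaced ages stay centered on the true values while individual records are perturbed, and older ages receive larger absolute noise.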
If you would like the 500 significant traits as reported in Polubriaginof et al. in Cell, go to this page.
All code to generate the relationships from hospital data is publicly available in our RIFTEHR GitHub repository.
Polubriaginof FCG, Vanguri R, Quinnies K, et al. Disease Heritability Inferred from Familial Relationships Reported in Medical Records. Cell. 2018;173(7):1692-1704.e11. doi:10.1016/j.cell.2018.04.032
DATE
Understanding the downstream effects of targeted proteins is essential to drug design. We introduce a data-driven method named DATE, which integrates drug-target relationships with gene expression, protein-protein interaction, and pathway annotation data to connect Drugs to target pAthways by the Tissue Expression. DATE links drugs to tissue-specific target pathways.
467,396 connections for 1,034 drugs and 954 pathways in 259 tissues/cell lines available.
Hao Y, Quinnies K, Realubit R, Karan C, Tatonetti NP. Tissue-Specific Analysis of Pharmacological Pathways. CPT Pharmacometrics Syst Pharmacol. 2018;7(7):453-463. doi:10.1002/psp4.12305
GOTE
G protein-coupled receptors (GPCRs) are central to how cells respond to their environment and are a major class of pharmacological targets. We developed a data-driven method named GOTE that connects Gpcrs to dOwnstream cellular pathways by the Tissue Expression. GOTE links G protein-coupled receptors to tissue-specific molecular pathways.
93,012 connections for 213 GPCRs and 654 pathways in 196 tissues/cell types available. Code available here.
Hao Y, Tatonetti NP. Predicting G protein-coupled receptor downstream signaling by tissue expression. Bioinformatics. 2016;32(22):3435-3443. doi:10.1093/bioinformatics/btw510
A network analysis framework that identifies adverse event (AE) neighborhoods within the human interactome (protein-protein interaction network). Drugs targeting proteins within such a neighborhood are predicted to be involved in mediating the AE. Links drugs to seed sets of proteins and phenotypes, such as drug side effects and diseases.
A description of the algorithm is available here. Code in Python available on GitHub.
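To make the neighborhood idea concrete, here is a toy sketch. It does not reproduce the published algorithm or its scoring: it assumes a neighborhood can be approximated as all proteins within a fixed number of hops of a seed set, and the names `neighborhood`, `drugs_in_neighborhood`, the protein labels, and the drug labels are all hypothetical.

```python
from collections import deque


def neighborhood(interactome, seeds, max_hops=1):
    """Return proteins within max_hops of any seed protein (seeds included).

    interactome: dict mapping each protein to an iterable of its interactors.
    Hop distance is an assumed stand-in for a real connectivity score.
    """
    distance = {seed: 0 for seed in seeds if seed in interactome}
    queue = deque(distance)
    while queue:
        protein = queue.popleft()
        if distance[protein] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in interactome.get(protein, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[protein] + 1
                queue.append(neighbor)
    return set(distance)


def drugs_in_neighborhood(drug_targets, proteins):
    """Return drugs with at least one target inside the given protein set."""
    return {
        drug
        for drug, targets in drug_targets.items()
        if proteins & set(targets)
    }


if __name__ == "__main__":
    # Toy undirected interactome: P1 - P2 - P3 - P4, with P5 isolated.
    toy_interactome = {
        "P1": ["P2"],
        "P2": ["P1", "P3"],
        "P3": ["P2", "P4"],
        "P4": ["P3"],
        "P5": [],
    }
    toy_drug_targets = {"drugA": ["P2"], "drugB": ["P4"], "drugC": ["P5"]}

    ae_neighborhood = neighborhood(toy_interactome, seeds={"P1"}, max_hops=1)
    print(sorted(ae_neighborhood))
    print(sorted(drugs_in_neighborhood(toy_drug_targets, ae_neighborhood)))
```

A hop cutoff is the simplest possible definition of a neighborhood; a graded connectivity score relative to the seed set would let drugs be ranked rather than just included or excluded.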
Lorberbaum T, Nasir M, Keiser MJ, Vilar S, Hripcsak G, Tatonetti NP. Systems pharmacology augments drug safety surveillance. Clin Pharmacol Ther. 2015;97(2):151-158. doi:10.1002/cpt.2
VenomKB
The world’s first comprehensive knowledge base for therapeutic uses of venoms. As of its original release, it contains 39,000 venom/effect associations mined from MEDLINE describing potentially therapeutic effects of venoms on the human body. Links venom compounds to physiological effects.
39K venom/effect associations in three databases available for download.
Code available on GitHub.
Romano JD, Tatonetti NP. VenomKB, a new knowledge base for facilitating the validation of putative venom therapies. Sci Data. 2015;2:150065. Published 2015 Nov 24. doi:10.1038/sdata.2015.65
SINaTRA
Interspecies, network-based predictions of synthetic lethality and the first genome-wide scale prediction of synthetic lethality in humans. Scores were validated against three independent databases of synthetic lethal pairs in human, mouse, and yeast. The original release contains ~109 million gene pairs with their associated synthetic lethality scores.
Human synthetic lethal gene pairs are available in 3 parts: part 1, part 2, part 3. Mouse pairs are also available.
Jacunski A, Dixon SJ, Tatonetti NP. Connectivity Homology Enables Inter-Species Network Models of Synthetic Lethality. PLoS Comput Biol. 2015;11(10):e1004506. Published 2015 Oct 9. doi:10.1371/journal.pcbi.1004506