Find out which data we used and how we accessed it,
including our main dataset:
BindingDB
"BindingDB is a public, web-accessible database of measured binding affinities, focusing chiefly on the interactions of proteins considered to be candidate drug-targets with ligands that are small, drug-like molecules."
- Wikipedia

DrugBank
Accessed via a download.
Used to retrieve the generic names of commercialised drugs, which is one of the steps in linking prescription data to the dataset.


ZINC
We use the ZINC Docking API to retrieve clinical trial data for ligands that have a ZINC ID in the BindingDB database. The endpoint is https://zinc.docking.org/substances/{id}/trials.json?count=all, where {id} is a specific ZINC ID. In Phase 2, we processed the 3,000 most common ZINC IDs to preview the available clinical trial data. Retrieving trial data for all ZINC IDs, a task estimated to take approximately 120 hours, was carried out in Phase 3 of the project.
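
The retrieval loop described above can be sketched roughly as follows. This is a minimal illustration, not our actual pipeline: the helper names (`most_common_ids`, `trials_url`, `fetch_trials`, `fetch_many`), the one-second pause between requests, and the choice to store `None` for failed requests are all assumptions made for the example. Only the endpoint URL comes from the text.

```python
import json
import time
import urllib.request
from collections import Counter

# Endpoint quoted in the text; {id} is replaced by a ZINC ID.
ZINC_TRIALS_URL = "https://zinc.docking.org/substances/{id}/trials.json?count=all"


def most_common_ids(zinc_ids, n=3000):
    """Return the n most frequent ZINC IDs (hypothetical helper for the preview step)."""
    return [zinc_id for zinc_id, _ in Counter(zinc_ids).most_common(n)]


def trials_url(zinc_id):
    """Build the trials URL for one ZINC ID."""
    return ZINC_TRIALS_URL.format(id=zinc_id)


def fetch_trials(zinc_id, timeout=30.0):
    """Download and decode the JSON trial list for one ZINC ID."""
    with urllib.request.urlopen(trials_url(zinc_id), timeout=timeout) as response:
        return json.load(response)


def fetch_many(zinc_ids, fetch=fetch_trials, delay_s=1.0):
    """Fetch trials for several IDs, pausing between requests (assumed politeness delay)."""
    results = {}
    for zinc_id in zinc_ids:
        try:
            results[zinc_id] = fetch(zinc_id)
        except (OSError, ValueError):
            # Network errors (URLError is an OSError) or invalid JSON: record the failure.
            results[zinc_id] = None
        time.sleep(delay_s)
    return results
```

Passing the `fetch` function as a parameter makes it possible to try the loop on a stub before spending hours on real requests.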

UniProt
We use the UniProt API to gather detailed information about the disease areas associated with ligands.
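
As a rough sketch of what such a lookup could look like, the example below assumes that each ligand has already been mapped to the UniProt accession of its target, and that the entry is fetched from the UniProt REST endpoint `https://rest.uniprot.org/uniprotkb/{accession}.json`. The helper names and the decision to keep only the `diseaseId` field of `DISEASE` comments are assumptions for illustration, not a description of our exact code.

```python
import json
import urllib.request

# UniProtKB entry endpoint (assumed to be the one used; not stated in the text).
UNIPROT_ENTRY_URL = "https://rest.uniprot.org/uniprotkb/{accession}.json"


def fetch_entry(accession, timeout=30.0):
    """Download one UniProtKB entry as a decoded JSON dictionary."""
    url = UNIPROT_ENTRY_URL.format(accession=accession)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def disease_names(entry):
    """Collect the disease names listed in the entry's DISEASE comments."""
    names = []
    for comment in entry.get("comments", []):
        if comment.get("commentType") != "DISEASE":
            continue
        disease_id = comment.get("disease", {}).get("diseaseId")
        # Skip comments without a named disease and avoid duplicates.
        if disease_id and disease_id not in names:
            names.append(disease_id)
    return names
```

For example, `disease_names(fetch_entry(accession))` would return the list of disease names for one target accession, or an empty list if the entry has no disease annotations.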


Article DOI Metadata
Accessed via the Crossref API, which returns publication metadata for a given DOI (year of publication, journal, publisher, number of citations, authors, …)
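
A minimal sketch of such a lookup is shown below, assuming the public Crossref REST endpoint `https://api.crossref.org/works/{doi}`, whose JSON response wraps the record in a `message` object. The function names and the exact set of fields kept are illustrative choices, not our actual implementation.

```python
import json
import urllib.parse
import urllib.request


def crossref_url(doi):
    """Build the Crossref works URL for a DOI (slashes in the DOI are kept)."""
    return "https://api.crossref.org/works/" + urllib.parse.quote(doi)


def fetch_message(doi, timeout=30.0):
    """Download the Crossref record for a DOI and return its 'message' object."""
    with urllib.request.urlopen(crossref_url(doi), timeout=timeout) as response:
        return json.load(response)["message"]


def summarise(message):
    """Extract year, journal, publisher, citation count and authors from a record."""
    date_parts = message.get("issued", {}).get("date-parts") or [[None]]
    titles = message.get("container-title") or [None]
    authors = [
        " ".join(part for part in (a.get("given"), a.get("family")) if part)
        for a in message.get("author", [])
    ]
    return {
        "year": date_parts[0][0] if date_parts[0] else None,
        "journal": titles[0],
        "publisher": message.get("publisher"),
        "citations": message.get("is-referenced-by-count"),
        "authors": authors,
    }
```

Calling `summarise(fetch_message(doi))` yields one flat dictionary per article, which is convenient to load into a table alongside the binding data.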
