As introduced in 2013, we offer users a selection of attributed target lists extracted from the literature, either as supplementary data or as downloads from various databases. Although these have seen some user traffic, we have decided not to update them regularly (in the way that we do for every release of our own database). This is mainly because the proliferation of new sets makes it difficult to keep on top of new versions. Anyone who is specifically interested in an update of any of the lists below, but is having difficulty with the in situ downloads, is welcome to contact us.
The criterion for inclusion in these lists is coverage of human proteins as drug targets. However, the exact definition varies between lists, as explained in the metadata below. This includes different terminology (e.g. "successful", "approved" or "proven"). There are also differences between primary target mappings (~1:1 drug:protein) and secondary or subunit mappings (1:many).
There are many utilities you can explore, but two that you might consider are a) following the database links and b) comparing the lists for intersects (protein IDs in common) and differentials (protein IDs unique to particular lists or subsets). This extends to comparisons with lists you may generate in the course of your own work or take from other published work (e.g. expression data or disease association gene candidates). We would be interested to hear from you a) what other utilities you find valuable and b) which other recently published target lists you recommend for inclusion. If you have an unpublished but openly provenanced new list (e.g. on figshare), we would be pleased to consider it. We may eventually need to cap the number we host, but new lists can displace older ones.
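For readers who prefer to script such comparisons, the intersect and differential operations described above can be sketched with Python sets. This is a minimal illustration and not a description of how the hosted lists were produced: the function name `compare_lists` and the two short example lists are assumptions made for the example, and the accessions are there only to show the mechanics.

```python
def compare_lists(list_a, list_b):
    """Return the intersect and the two differentials of two ID lists."""
    # Strip stray whitespace (common after copying from a spreadsheet)
    # and de-duplicate by converting to sets.
    a = {item.strip() for item in list_a if item.strip()}
    b = {item.strip() for item in list_b if item.strip()}
    return {
        "intersect": sorted(a & b),  # IDs present in both lists
        "only_a": sorted(a - b),     # IDs unique to the first list
        "only_b": sorted(b - a),     # IDs unique to the second list
    }


# Illustrative input only: a column copied from one of the Excel sheets
# and a list from your own work would take the place of these.
published = ["P35354", "P23219", "P08172", "P35354 "]
own_work = ["P35354", "P00533"]

result = compare_lists(published, own_work)
for name, ids in result.items():
    print(name, len(ids), ids)
```

The same function works for HGNC symbols or ChEMBL IDs, provided both lists use the same identifier type.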
Our metadata descriptions are minimal since context is provided in the references and/or the download descriptions for the appropriate databases. The lists are Excel sheets of live links to UniProtKB, HGNC and ChEMBL. You should be able to get to most other sources from these three entry points. In addition, if you paste the UniProtKB list into the ID mapping interface, you can select different intersects using Boolean selections or post-query display options.
Lists that are not UniProtKB accessions in the first place are normalised to these (e.g. by mapping HUGO Gene Nomenclature Committee (HGNC) symbols or Entrez Gene IDs (EGID) to UniProtKB). They are then filtered to human and Swiss-Prot entries (i.e. any TrEMBL entries are removed) and to approved drug targets if this is an option in the original list. In such cases, the lists we host become transformations, rather than direct facsimiles, of the primary sources. Given that such ID cross-mappings are not perfect, we cannot guarantee their absolute correctness. However, our versions are supplied in good faith and the originals are available in every case. If you need the cross-mapping details for any particular list, you are welcome to contact us.
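The normalise-then-filter steps described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually used for the hosted lists: the `MAPPING` table, its field names and the `normalise` function are hypothetical, and the two placeholder strings are not real accessions. In practice the cross-mapping would come from a download such as the UniProt ID mapping output.

```python
HUMAN_TAXON_ID = 9606  # NCBI Taxonomy ID for Homo sapiens

# Hypothetical cross-mapping table: gene symbol -> candidate UniProtKB records.
# "reviewed" is True for Swiss-Prot entries and False for TrEMBL entries.
MAPPING = {
    "PTGS2": [
        {"accession": "P35354", "taxon": 9606, "reviewed": True},
        {"accession": "TREMBL_PLACEHOLDER", "taxon": 9606, "reviewed": False},
    ],
    "EGFR": [
        {"accession": "P00533", "taxon": 9606, "reviewed": True},
        {"accession": "MOUSE_PLACEHOLDER", "taxon": 10090, "reviewed": True},
    ],
}


def normalise(symbols, mapping):
    """Map symbols to human Swiss-Prot accessions; report what fails to map."""
    accessions = set()
    unmapped = []
    for symbol in symbols:
        hits = [
            record["accession"]
            for record in mapping.get(symbol.strip(), [])
            if record["taxon"] == HUMAN_TAXON_ID and record["reviewed"]
        ]
        if hits:
            accessions.update(hits)
        else:
            # Keep a record of failures, since cross-mappings are imperfect.
            unmapped.append(symbol)
    return sorted(accessions), unmapped


accessions, unmapped = normalise(["PTGS2", "EGFR", "NOT_A_GENE"], MAPPING)
print("mapped:", accessions)
print("unmapped:", unmapped)
```

Returning the unmapped symbols alongside the result makes the losses in a transformation visible, which is useful when comparing a normalised list against its primary source.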
If you are unfamiliar with protein list "slicing and dicing" we recommend the following:
Attributions and brief descriptions for the Excel sheets:
In parallel with the target lists, we offer users a selection of attributed drug lists, in the broad sense of encompassing approved, clinical or research small molecules, together with biologicals in some cases. These have been downloaded from databases or extracted as supplementary data from journal papers. We much appreciate their availability but note that they are supplied as-is. The lists have a variety of utilities but are particularly useful for name, synonym and identifier look-ups. There can be some inter-list discordance in name-to-structure mappings for technical reasons, but you can resolve the individual entries you are interested in. Those with SMILES can be processed with cheminformatics tools (there is a general description of comparing the chemistry and targets between databases in this recent paper). While our own database has full search capability, we strategically curate a smaller, more concise set of term-to-structure mappings between our ligands and proteins. Thus, some of the posted sets include nominally approved prescription compounds that we have chosen not to activity-map (e.g. nutraceuticals, vitamins and some non-INNs) since their relationships are multiplexed.
As ever, we would be pleased to hear your views on utility and additional sources you might recommend for inclusion. Since our own approved drug lists are being consolidated and will imminently be updated in PubChem, we shall surface these in due course. In the interim, drug entries can be browsed and retrieved from our ligand list. Attributions and brief descriptions for the Excel sheets are:
Page last updated 27th January 2017