Data Integration

The ChemoProfiling platform integrates data from the PubChem BioAssay repository to annotate input compounds with targets. Details are presented in Figure 1 and Figure 2. All screenings deposited in the PubChem BioAssay repository that report active molecules are collected. In most target-oriented screens the target protein ID is provided. In most assays the type of activity is given in the title (inhibitors, activators, agonists, antagonists, etc.). If the type is not provided, the general type "interactors" is assigned. Different assays related to the same target and the same type are merged. In addition, assays related to the same target but with different types (e.g. inhibitor, activator) are merged under the type "all", which covers cases where acting on the target is important while the direction of the action might not be. At the moment, ChemoProfiling target data covers ~500 000 unique compounds that are annotated as active against at least one target. About 500 protein targets are covered (each with at least 100 molecules reported active).
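
The merging rules described above can be illustrated with a minimal Python sketch. It is not the platform's actual code: the keyword table, the function names `activity_type` and `merge_assays`, and the input layout (title, target ID, set of active compound IDs) are assumptions made for illustration only. The sketch also assumes that "interactors" assays are included in the "all" set, which the text does not state explicitly.

```python
from collections import defaultdict

# Hypothetical keyword table: the platform's real vocabulary is not given in the text.
# "antagonist" is listed before "agonist" because the latter is a substring of the former.
TYPE_KEYWORDS = {
    "inhibitor": "inhibitors",
    "activator": "activators",
    "antagonist": "antagonists",
    "agonist": "agonists",
}


def activity_type(title):
    """Guess the activity type from an assay title; fall back to 'interactors'."""
    lowered = title.lower()
    for keyword, label in TYPE_KEYWORDS.items():
        if keyword in lowered:
            return label
    return "interactors"


def merge_assays(assays):
    """Merge assays into compound sets keyed by (target_id, activity type).

    `assays` is an iterable of (title, target_id, active_compound_ids).
    Every assay also contributes to the (target_id, "all") set.
    """
    merged = defaultdict(set)
    for title, target_id, actives in assays:
        merged[(target_id, activity_type(title))].update(actives)
        merged[(target_id, "all")].update(actives)  # type-independent union
    return dict(merged)
```

For example, two inhibitor assays and one activator assay against the same target yield three sets for that target: "inhibitors", "activators" and "all".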

Figure 1. ChemoProfiling platform: integrating PubChem BioAssay data

Integrating Pathway data

The proteins (~500) covered by ChemoProfiling target data can be grouped into pathways and functional groups. ChemoProfiling integrates pathway-related or functionally related targets, thus creating ChemoProfiling pathway data. In this case the 500 000 molecules from ChemoProfiling target data are regrouped into sets that inhibit (activate, etc.) a pathway or a group of functionally related proteins. The grouping of proteins into pathways is based on the REACTOME database; the grouping of functionally related proteins is based on Gene Ontology.
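
The regrouping step can be sketched in Python as follows. This is an illustration under stated assumptions, not the platform's implementation: `build_pathway_sets` is a hypothetical name, target-level data is assumed to be a dictionary keyed by (target ID, activity type), and the pathway membership table is assumed to be a plain mapping from a pathway (or GO term) name to its member target IDs.

```python
from collections import defaultdict


def build_pathway_sets(target_sets, pathway_members):
    """Regroup target-level compound sets into pathway-level compound sets.

    target_sets:     {(target_id, activity_type): set of compound IDs}
    pathway_members: {pathway_name: set of target IDs}  (assumed input layout)
    Returns          {(pathway_name, activity_type): set of compound IDs}
    """
    # Invert the membership table so each target points to its pathways.
    target_to_pathways = defaultdict(set)
    for pathway, targets in pathway_members.items():
        for target_id in targets:
            target_to_pathways[target_id].add(pathway)

    pathway_sets = defaultdict(set)
    for (target_id, act_type), compounds in target_sets.items():
        # Targets without a pathway assignment are simply skipped.
        for pathway in target_to_pathways.get(target_id, ()):
            pathway_sets[(pathway, act_type)].update(compounds)
    return dict(pathway_sets)
```

A compound that inhibits any member protein of a pathway thus ends up in that pathway's "inhibitors" set.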

Figure 2. ChemoProfiling platform: integrating Pathway data

Inference of Target/Pathway and multi-target enrichment

The input molecules (active and inactive) are annotated with target/pathway attributes. Enrichment analysis is applied to find targets/pathways significantly associated with the input active molecules. Advanced enrichment analysis extends the standard one to find combinations (pairs and triplets) of targets, joined by the logical AND operator, that are overrepresented among active molecules. Such a combination points to a potential multi-target mechanism. Similarly, enrichment of a pathway/GO term indicates a potential role of that pathway in the drug's mode of action (Figure 3).
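
A minimal Python sketch of single-target and multi-target enrichment is given below. The text does not name the statistical test used by the platform, so the one-sided hypergeometric test here is an assumption, as are the function names `hypergeom_pvalue` and `enrich` and the input layout. Target combinations are joined by AND, i.e. a molecule counts only if it is annotated with every target in the combination.

```python
from itertools import combinations
from math import comb


def hypergeom_pvalue(k, n_active, n_annotated, n_total):
    """One-sided hypergeometric p-value P(X >= k).

    n_total molecules, n_annotated of them carry the annotation,
    n_active are active, and k active molecules carry the annotation.
    """
    upper = min(n_active, n_annotated)
    tail = sum(
        comb(n_annotated, i) * comb(n_total - n_annotated, n_active - i)
        for i in range(k, upper + 1)
    )
    return tail / comb(n_total, n_active)


def enrich(annotations, actives, all_molecules, max_size=3):
    """Score single targets and AND-combinations of up to max_size targets.

    annotations: {target: set of molecule IDs annotated with that target}
    Returns a list of (target_combination, active_hits, p_value),
    sorted by p-value. No multiple-testing correction is applied here.
    """
    universe = set(all_molecules)
    actives = set(actives) & universe
    results = []
    for size in range(1, max_size + 1):
        for combo in combinations(sorted(annotations), size):
            hit = set(universe)
            for target in combo:
                hit &= annotations[target]  # logical AND across targets
            k = len(hit & actives)
            if k == 0:
                continue  # combination never seen among active molecules
            p = hypergeom_pvalue(k, len(actives), len(hit), len(universe))
            results.append((combo, k, p))
    results.sort(key=lambda item: item[2])
    return results
```

The same function can be applied to pathway or GO term annotations by passing pathway-level sets instead of target-level sets. Because many combinations are tested, a real analysis would normally add a multiple-testing correction, which this sketch omits.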

Figure 3. ChemoProfiling platform: mapping input molecules and enrichment analysis